mirror of
https://github.com/instructkr/claw-code.git
synced 2026-06-22 14:48:32 +02:00
Compare commits
26 Commits
9c8375da99
...
pr-3221
| Author | SHA1 | Date | |
|---|---|---|---|
| 96fd46cfbe | |||
| b119afcaca | |||
| 4619375c14 | |||
| 10fe72498a | |||
| 5b22bc0480 | |||
| ae7da0ec74 | |||
| 7dd17c6344 | |||
| d8535bf938 | |||
| b45c61eff9 | |||
| 7cfd83f66a | |||
| b5bead9028 | |||
| 41678eb097 | |||
| ecd3e4ceb9 | |||
| 22fdaeae2c | |||
| 4522490bd5 | |||
| cd58c054ca | |||
| 94579eace5 | |||
| 2ab2f44e1d | |||
| fa35018769 | |||
| 94be902ce1 | |||
| bcc5bfde9c | |||
| 9522674c87 | |||
| c91a3062d5 | |||
| 54d785d0c0 | |||
| 36218ac1b1 | |||
| 6388a2ba3f |
+22
-23
@@ -1244,8 +1244,7 @@ Model name prefix now wins unconditionally over env-var presence. Regression tes
|
||||
|
||||
45. **`claw dump-manifests` fails with opaque "No such file or directory"** — dogfooded 2026-04-09. `claw dump-manifests` emits `error: failed to extract manifests: No such file or directory (os error 2)` with no indication of which file or directory is missing. **Partial fix at `47aa1a5`+1**: error message now includes `looked in: <path>` so the build-tree path is visible, what manifests are, or how to fix it. Fix shape: (a) surface the missing path in the error message; (b) add a pre-check that explains what manifests are and where they should be (e.g. `.claw/manifests/` or the plugins directory); (c) if the command is only valid after `claw init` or after installing plugins, say so explicitly. Source: Jobdori dogfood 2026-04-09.
|
||||
|
||||
45. **`claw dump-manifests` fails with opaque `No such file or directory`** — **done (verified 2026-04-12):** current `main` now accepts `claw dump-manifests --manifests-dir PATH`, pre-checks for the required upstream manifest files (`src/commands.ts`, `src/tools.ts`, `src/entrypoints/cli.tsx`), and replaces the opaque os error with guidance that points users to `CLAUDE_CODE_UPSTREAM` or `--manifests-dir`. Fresh proof: parser coverage for both flag forms, unit coverage for missing-manifest and explicit-path flows, and `output_format_contract` JSON coverage via the new flag all pass. **Original filing below.**
|
||||
45. **`claw dump-manifests` fails with opaque `No such file or directory`** — **done (verified 2026-04-12):** current `main` now accepts `claw dump-manifests --manifests-dir PATH`, pre-checks for the required upstream manifest files (`src/commands.ts`, `src/tools.ts`, `src/entrypoints/cli.tsx`), and replaces the opaque os error with guidance that points users to `CLAUDE_CODE_UPSTREAM` or `--manifests-dir`. Fresh proof: parser coverage for both flag forms, unit coverage for missing-manifest and explicit-path flows, and `output_format_contract` JSON coverage via the new flag all pass. **Original filing below.**
|
||||
45. **`claw dump-manifests` fails with opaque `No such file or directory`** — **done (verified 2026-06-03):** current `main` now emits a self-contained Rust resolver inventory for `claw dump-manifests` without requiring upstream TypeScript files, build-machine paths, or `CLAUDE_CODE_UPSTREAM`. Explicit `--manifests-dir PATH` scopes resolver discovery to another directory and missing/not-directory values emit typed `missing_manifests` guidance. Fresh proof: parser coverage for both flag forms, unit coverage for self-contained default, explicit-directory, and missing-directory flows, plus `output_format_contract` JSON coverage all pass. **Original filing below.**
|
||||
46. **`/tokens`, `/cache`, `/stats` were dead spec — parse arms missing** — dogfooded 2026-04-09. All three had spec entries with `resume_supported: true` but no parse arms, producing the circular error "Unknown slash command: /tokens — Did you mean /tokens". Also `SlashCommand::Stats` existed but was unimplemented in both REPL and resume dispatch. **Done at `60ec2ae` 2026-04-09**: `"tokens" | "cache"` now alias to `SlashCommand::Stats`; `Stats` is wired in both REPL and resume path with full JSON output. Source: Jobdori dogfood.
|
||||
|
||||
47. **`/diff` fails with cryptic "unknown option 'cached'" outside a git repo; resume /diff used wrong CWD** — dogfooded 2026-04-09. `claw --resume <session> /diff` in a non-git directory produced `git diff --cached failed: error: unknown option 'cached'` because git falls back to `--no-index` mode outside a git tree. Also resume `/diff` used `session_path.parent()` (the `.claw/sessions/<id>/` dir) as CWD for the diff — never a git repo. **Done at `aef85f8` 2026-04-09**: `render_diff_report_for()` now checks `git rev-parse --is-inside-work-tree` first and returns a clear "no git repository" message; resume `/diff` uses `std::env::current_dir()`. Source: Jobdori dogfood.
|
||||
@@ -1547,7 +1546,7 @@ Original filing (2026-04-13): user requested a `-acp` parameter to support ACP p
|
||||
|
||||
**Source.** Jobdori dogfood 2026-04-17 against `/tmp/cd3` on main HEAD `e58c194` in response to Clawhip pinpoint nudge at `1494653681222811751`. Distinct from #80/#81/#82 (status/error surfaces lie about *static* runtime state): this is a surface that lies about *time itself*, and the lie is smeared into every live-agent system prompt, not just a single error string or status field.
|
||||
|
||||
84. **`claw dump-manifests` default search path is the build machine's absolute filesystem path baked in at compile time — broken and information-leaking for any user running a distributed binary** — dogfooded 2026-04-17 on main HEAD `70a0f0c` from `/tmp/cd4` (fresh workspace). Running `claw dump-manifests` with no arguments emits:
|
||||
84. **DONE — `claw dump-manifests` no longer bakes or leaks the build machine's absolute filesystem path** — fixed 2026-06-03 in `fix: make dump-manifests self-contained`. The runtime default now uses the current workspace and emits the Rust resolver inventory directly, so distributed binaries no longer depend on upstream TypeScript source files or compile-time `CARGO_MANIFEST_DIR` paths. Explicit `--manifests-dir` remains as a discovery-root override and invalid roots return typed `missing_manifests` diagnostics. Original filing below: running `claw dump-manifests` with no arguments emitted:
|
||||
```
|
||||
error: Manifest source files are missing.
|
||||
repo root: /Users/yeongyu/clawd/claw-code
|
||||
@@ -6303,7 +6302,7 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
|
||||
|
||||
422. **`export --output-format json` and `--resume latest` report the same "no managed sessions" scenario using two different `kind` codes — `no_managed_sessions` vs `session_load_failed` — making "no session found" undetectable by a single kind-code check** — dogfooded 2026-04-30 KST (UTC+9) by Jobdori on `e939777f`. Running `claw export --output-format json` with no session present returns (on stderr, exit 1): `{"error":"no managed sessions found in .claw/sessions/<fingerprint>/","hint":"Start \`claw\` to create a session, then rerun with \`--resume latest\`.\nNote: claw partitions sessions per workspace fingerprint; sessions from other CWDs are invisible.","kind":"no_managed_sessions","type":"error"}`. Running `claw --resume latest /status --output-format json` with no session present returns (on stderr, exit 1): `{"error":"failed to restore session: no managed sessions found in .claw/sessions/<fingerprint>/","hint":"Start \`claw\` to create a session, then rerun with \`--resume latest\`.\nNote: claw partitions sessions per workspace fingerprint; sessions from other CWDs are invisible.","kind":"session_load_failed","type":"error"}`. Both describe the same root condition — there are no sessions to operate on — but they expose it via different `kind` discriminants. Automation that checks `kind == "no_managed_sessions"` to detect a cold workspace will miss the `--resume` path's `session_load_failed`, and vice versa. A wrapper that guards "run with --resume only if a session exists" must special-case both codes. The hint text is identical between them, suggesting the messages are logically equivalent. Additionally neither code matches the proposed canonical names `session_not_found` / `session_load_failed` as stable `ErrorKind` discriminants described in ROADMAP #77's fix shape, which explicitly proposes typed error-kind codes for session lifecycle failures. **Required fix shape:** (a) unify "no sessions found for this workspace fingerprint" under a single canonical `kind` code — either `no_managed_sessions` or `session_not_found` — used consistently by every command path that encounters an empty session registry; (b) if `session_load_failed` is a more general category (covering e.g. corrupt session files, IO errors, schema version mismatches), it should nest a concrete `reason:"no_managed_sessions"` or `reason:"session_not_found"` sub-field so callers can distinguish "empty registry" from "found but unreadable"; (c) align with the canonical error-kind contract proposed in #77; (d) add regression coverage proving `export` and `--resume latest` in an empty workspace both return an error with the same top-level `kind` code. **Why this matters:** session guard-rails in orchestration need a single stable `kind` to detect cold workspaces without enumerating all possible no-session synonyms. Two divergent codes for the same condition make defensive automation brittle and contradict the promise of machine-readable error envelopes. Source: Jobdori live dogfood, `e939777f`, 2026-04-30 KST (UTC+9).
|
||||
|
||||
407. **`config --output-format json` returns `files[].loaded:false` with no `load_error`, `not_found`, or `skip_reason` field — automation cannot distinguish "file does not exist", "file exists but parse failed", and "file exists but was skipped by policy" from the same `loaded:false` value; also `loaded_files` and `merged_keys` are bare integers with no per-file attribution** — dogfooded 2026-04-30 by Jobdori on `e939777f`. Running `./claw --output-format json config` on a workspace with 5 discovered config files returns `{"kind":"config","cwd":"...","files":[{"loaded":false,"path":"/Users/yeongyu/.claw.json","source":"user"},{"loaded":true,"path":"/Users/yeongyu/.claw/settings.json","source":"user"},{"loaded":true,"path":"/Users/yeongyu/clawd/claw-code/.claw.json","source":"project"},{"loaded":false,"path":"/Users/yeongyu/clawd/claw-code/.claw/settings.json","source":"project"},{"loaded":false,"path":"/Users/yeongyu/clawd/claw-code/.claw/settings.local.json","source":"local"}],"loaded_files":2,"merged_keys":2}`. Three of five files have `loaded:false` with no accompanying `not_found:true`, `parse_error`, `io_error`, or `skip_reason`; automation must stat each path separately to guess why. Also `loaded_files:2` and `merged_keys:2` are bare counts — ambiguous whether `merged_keys:2` means 2 total top-level JSON keys across all files or 2 unique merged settings. **Required fix shape:** (a) add `not_found: bool` and optional `load_error: string` to each `files[]` entry so callers can distinguish missing, parse-broken, and policy-skipped files without filesystem probing; (b) document or rename `merged_keys` as `merged_setting_count` or `total_merged_keys` to remove the int-semantics ambiguity; (c) optionally add `merged_keys_by_file: [{path, keys}]` for attribution; (d) add regression coverage proving `files[]` entries with `loaded:false` carry at minimum `not_found` distinguishing non-existent paths from load failures. Source: Jobdori live dogfood, `e939777f`, 2026-04-30.
|
||||
407. **DONE — `config --output-format json` returns structured file load states instead of bare `loaded:false` ambiguity** — fixed 2026-06-03 in `fix: report config file load statuses`. `claw config --output-format json` now emits `files[].status` (`loaded`, `not_found`, `skipped`, `load_error`), `reason`/`skip_reason`, and `detail` where applicable, plus top-level `load_error`, `merged_key_count`, and `merged_keys_meaning`. The config list surface is best-effort so one broken settings file no longer erases the rest of the discovery report, while section-specific config requests still preserve the typed nonzero parse-error envelope. Regression coverage: `inspect_classifies_missing_loaded_and_legacy_skipped_files`, `inspect_reports_parse_errors_but_keeps_valid_merged_config`, `config_json_reports_structured_unloaded_file_reasons_407`, `config_json_list_reports_parse_errors_without_dropping_file_statuses_407`, and existing `config_parse_error_has_typed_error_kind_and_hint_764`.
|
||||
|
||||
|
||||
408. **`status --output-format json` `workspace.changed_files` is ambiguous — on a workspace with 5 untracked files, `changed_files:5`, `staged_files:0`, `unstaged_files:0`, `untracked_files:5`; it is unclear whether `changed_files` is the sum of all four git-status categories or only a subset; automation cannot tell if `changed_files:5` means "5 tracked modified" or "5 total non-clean files including untracked"** — dogfooded 2026-04-30 by Jobdori on `e939777f`. Running `./claw --output-format json status` returns `{"workspace":{"changed_files":5,"staged_files":0,"unstaged_files":0,"untracked_files":5,...}}` — `changed_files==untracked_files==5` with staged and unstaged both zero. The field name `changed_files` implies "modified tracked files" but the value equals the untracked count, not `staged+unstaged`. Without a comment or documented definition, automation must probe whether `changed_files = staged + unstaged` (excludes untracked) or `changed_files = staged + unstaged + untracked + conflicted` (total dirty). Also `git_state:"dirty · 5 files · 5 untracked"` repeats the same data as a prose string alongside the structured integer fields — redundant human-readable string alongside machine-readable integers. **Required fix shape:** (a) document and stabilize `changed_files` as either `tracked_dirty_count` (staged+unstaged only) or `total_non-clean_count` (staged+unstaged+untracked+conflicted) and rename to remove the ambiguity; (b) ensure a machine consumer can compute `is_clean` as a single boolean field without interpreting `git_state` prose; (c) deprecate or remove `git_state` prose string now that all its constituent counts are available as integers; (d) add regression coverage proving `changed_files` semantics against a workspace with staged, unstaged, untracked, and conflicted files. Source: Jobdori live dogfood, `e939777f`, 2026-04-30.
|
||||
@@ -6345,58 +6344,58 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
|
||||
422. **Unknown top-level subcommands fall through to chat prompt path instead of returning `unknown_subcommand` error — typos silently send the subcommand string as a chat message to the configured LLM** — dogfooded 2026-05-11 by Jobdori on `b98b9a71` in response to Clawhip pinpoint nudge at `1503215095088676956`. Reproduction: `unset ANTHROPIC_AUTH_TOKEN; export ANTHROPIC_API_KEY=fake-key-for-routing-test; claw completely-bogus-subcommand --output-format json` returns `{"error":"api returned 401 Unauthorized (authentication_error) [trace req_011...]: invalid x-api-key","kind":"api_http_error"}` — proving the unknown token reached the Anthropic API endpoint as a chat prompt. With valid credentials, the bogus subcommand string would be silently consumed as a chat message, billing the user for a typo and producing whatever continuation the LLM generates. **Pre-error path:** `claw <unknown> --output-format json` with no creds returns `kind:"missing_credentials"` (the auth gate fires first), masking the routing bug. Only with creds present does the fallthrough manifest as the actual prompt being sent. **Sibling exit-code bug:** when the chat-path 401 returns, the JSON envelope is `kind:"api_http_error"` but exit code is **0**, while `cli_parse` errors (e.g. `--no-such-flag`) and `missing_credentials` errors correctly exit **1**. Exit-code parity between error envelopes is broken — automation that gates on `$?` will treat the 401-as-chat as success. **Required fix shape:** (a) reserve unknown top-level tokens that match no registered subcommand and emit `kind:"unknown_subcommand"` with `unknown:<token>` field and exit code 1, BEFORE the chat fallback path; (b) when a token is intended as a chat prompt, require an explicit verb (`prompt`, `chat`, `ask`) or `--prompt` flag; (c) ensure exit codes are non-zero for all `kind:*_error` envelopes; (d) regression test: `claw <bogus> --output-format json` with valid auth returns `kind:"unknown_subcommand"` exit 1, never reaches the API. **Why this matters:** automation that calls `claw <subcommand>` with a programmatically constructed verb (typo, version drift, refactored command) silently bills tokens and produces hallucinated output instead of a typed error. Cross-cluster with #108 (CLI fallthrough discovered earlier) — #422 is the post-#108 audit confirming the routing bug still bites with valid credentials. Source: Jobdori live dogfood, `b98b9a71`, 2026-05-11.
|
||||
|
||||
|
||||
423. **`claw prompt` does not read prompt text from stdin when no positional prompt arg is provided — `echo "what is 2+2" | claw prompt --output-format json` returns `kind:"unknown" error:"prompt subcommand requires a prompt string"` instead of consuming stdin** — dogfooded 2026-05-11 by Jobdori on `3c563fa1` in response to Clawhip pinpoint nudge at `1503222644739276951`. Reproduction: `echo "what is 2+2" | claw prompt --output-format json` → `{"error":"prompt subcommand requires a prompt string","hint":null,"kind":"unknown","type":"error"}` exit 1. Same for `claw prompt --output-format json` with stdin redirected from a file. The most common Unix automation pattern (`cmd | claw prompt`) is broken because the prompt subcommand only reads the positional argument, never falls through to stdin. **Sibling envelope-kind bug:** the error `kind` is `"unknown"` instead of a typed `"missing_argument"` or `"validation_error"`. The `unknown` discriminator is the catch-all bucket — automation that switches on `kind` to differentiate input-validation errors from runtime errors gets no signal here. **Required fix shape:** (a) when `prompt` subcommand has no positional prompt arg AND stdin is not a TTY (i.e., piped or redirected), read stdin to EOF and use that as the prompt; (b) emit `kind:"missing_argument"` (not `"unknown"`) when both positional arg and stdin are absent; (c) add `--prompt-stdin` or `--stdin` opt-in flag for explicit control; (d) regression tests: `echo X | claw prompt --output-format json` reaches the runtime with prompt=X, AND `claw prompt < /dev/null` returns `kind:"missing_argument"` exit 1. **Why this matters:** Unix pipelines are the foundation of CLI automation. Every other major CLI (curl, jq, gh, kubectl) accepts stdin as the primary input when no positional arg is given. Breaking this convention forces automation to either inline the prompt as a shell-quoted string (escaping nightmare for multiline/code) or write to a temp file first. The `kind:"unknown"` error category compounds the problem by making the failure indistinguishable from a runtime crash. Source: Jobdori live dogfood, `3c563fa1`, 2026-05-11.
|
||||
423. **DONE — `claw prompt` reads prompt text from stdin when no positional prompt arg is provided** — fixed 2026-06-03 in `fix: read prompt subcommand input from stdin`. `parse_args()` now treats non-empty piped stdin as the prompt body for `claw prompt` when the positional prompt is empty, and supports `--stdin` / `--prompt-stdin` to append piped context to an explicit positional prompt. The existing `missing_prompt` JSON/stdout contract is preserved for closed or whitespace-only stdin. User docs now show `printf '...' | ./target/debug/claw prompt --output-format json`, and regression coverage verifies both a pure stdin prompt and explicit stdin context reach the mock Anthropic provider request and return structured JSON output.
|
||||
|
||||
|
||||
424. **`--model` rejects bare canonical Anthropic model names (`claude-opus-4-7`, `claude-opus-4-6`, `claude-sonnet-4-6`) as `invalid_model_syntax` — only short aliases (`opus`, `sonnet`, `haiku`) and full prefixed form (`anthropic/claude-opus-4-7`) work; sibling: error message stale-suggests `claude-opus-4-6` not `4-7`** — dogfooded 2026-05-11 by Jobdori on `6c0c305a` in response to Clawhip pinpoint nudge at `1503230194889134103`. Reproduction: `claw --model claude-opus-4-7 status --output-format json` → `{"error":"invalid model syntax: 'claude-opus-4-7'. Expected provider/model (e.g., anthropic/claude-opus-4-6) or known alias (opus, sonnet, haiku)","kind":"invalid_model_syntax"}`. Same for `claude-opus-4-6`, `claude-sonnet-4-6`. Forcing `--model anthropic/claude-opus-4-7` works (`model:"anthropic/claude-opus-4-7"`, `model_source:"flag"`). Three problems compounded: (a) Anthropic-canonical model names without provider prefix are rejected even though the `claude-` prefix unambiguously identifies the provider; (b) the error suggests `anthropic/claude-opus-4-6` as the example — `4-7` shipped 2026-04-16 and is the current production Anthropic frontier model, the suggestion is one model behind; (c) the alias list `opus, sonnet, haiku` doesn't disambiguate version (which `opus` does the alias resolve to — `opus-4-6` or `opus-4-7`?). **Required fix shape:** (a) accept bare `claude-*` and `gpt-*` model names as canonical-named-without-prefix and route via name-prefix detection (already implemented for prefix-routed mode); (b) update the example in `invalid_model_syntax` error to current frontier (`anthropic/claude-opus-4-7`); (c) document or expose `opus` → exact-version mapping in the error message and in `claw doctor`/`status` output (`model_alias_resolved_to: "claude-opus-4-7"`); (d) regression test: `claw --model claude-opus-4-7 status --output-format json` returns `model_source:"flag"`, not `kind:"invalid_model_syntax"`. **Sibling bug observed in same probe:** `enabledPlugins` deprecation warning repeats 3 times in stderr for the same `~/.claw/settings.json` load — config file is being loaded/parsed 3 times during a single `status` invocation. **Why this matters:** every Anthropic doc, every CCAPI route, every internal tooling references models by their bare canonical name (`claude-opus-4-7`). Forcing the `anthropic/` prefix breaks copy-paste from Anthropic's own examples and adds a redundant token to every invocation. The stale `4-6` suggestion in the error message actively misdirects users away from the current model. Source: Jobdori live dogfood, `6c0c305a`, 2026-05-11.
|
||||
424. **DONE — `--model` accepts bare canonical provider model names and Anthropic routing prefixes are stripped before provider calls** — fixed 2026-06-03 in `fix: normalize Anthropic model routing`. `validate_model_syntax()` now accepts unambiguous bare `claude-*` and `gpt-*` model IDs while preserving raw model provenance, and Anthropic `/v1/messages` plus `/v1/messages/count_tokens` request bodies strip the CLI-only `anthropic/` routing prefix so default/alias models do not reach Anthropic as `anthropic/claude-*`. Existing `qwen-*`/`grok-*` prefix-hint behavior remains intentionally unchanged for provider families whose bare names are ambiguous with DashScope/xAI routing. Regression coverage: `standard_messages_body_strips_anthropic_routing_prefix`, `send_message_strips_anthropic_routing_prefix_on_wire`, `default_model_alias_uses_anthropic_routing_prefix`, and the bare `--model=claude-opus-4-6 status` / `--model gpt-4 prompt` parser assertions in `parses_single_word_command_aliases_without_falling_back_to_prompt_mode`.
|
||||
|
||||
|
||||
425. **Config file precedence (`.claw/settings.json` always wins over `.claw.json`) is undocumented in user-facing surfaces — `config --output-format json` reports both files as `loaded:true` with no `precedence_rank` or `wins_for_keys` attribution; sibling: deprecation warning fires 4× per status invocation (was 3× in #424, regression upward)** — dogfooded 2026-05-11 by Jobdori on `d7dbe951` in response to Clawhip pinpoint nudge at `1503237744451649537`. Reproduction: create `.claw.json` with `{"model":"anthropic/claude-sonnet-4-6"}` and `.claw/settings.json` with `{"model":"anthropic/claude-opus-4-7"}` in the same workspace. `claw status --output-format json` returns `model:"anthropic/claude-opus-4-7", model_source:"config"`. Reverse the files (.claw.json=opus, settings.json=sonnet) → `model:"anthropic/claude-sonnet-4-6"`. Confirmed: `.claw/settings.json` **always** wins over `.claw.json` for conflicting keys, regardless of file mtime or alphabetical order. `claw config --output-format json` reports both as `loaded:true` with no `precedence_rank`, `effective_for_keys`, or `shadowed_keys` attribution. The only signal of precedence is the final merged value in `status` — automation cannot programmatically discover which file contributed which key without re-implementing the merge logic. **Sibling bug (regression from #424):** the `enabledPlugins` deprecation warning now fires **4 times** in stderr per single `status` invocation (was 3× in #424's probe at HEAD `6c0c305a`; current HEAD `d7dbe951` shows 4×). Config load count went up by 1. **Sibling bug observed in config-section probe:** `claw config model --output-format json` with a `.claw.json` that contains a benign unknown key (e.g., `"alpha":"x"`) returns `{"error":"/path/.claw.json: unknown key \"alpha\" (line 1)","kind":"unknown"}` — the entire config command fails with a generic `unknown` kind instead of (a) tolerating unrecognized keys with a warning, or (b) emitting a typed `kind:"unknown_key"` error scoped to the offending file/key. **Required fix shape:** (a) document precedence order in `USAGE.md` (`.claw/settings.local.json > .claw/settings.json > .claw.json` for project scope; `user`/`system` scope at each layer); (b) add `precedence_rank:int` and optional `wins_for_keys:[string]` / `shadowed_keys:[string]` to each entry in `config --output-format json` `files[]`; (c) dedupe the deprecation warning to fire **once per discovered file** instead of N× per load pass; (d) make `config <section> --output-format json` tolerate unknown keys with warnings, OR emit `kind:"unknown_key"` with `path:` and `key:` fields scoped to the offending file. **Why this matters:** users mixing legacy `.claw.json` with new `.claw/settings.json` have no way to verify which file is actually controlling their runtime. The undocumented precedence + missing per-key attribution forces trial-and-error to debug config drift. Cross-references #407 (config files no load_error) and #415 (config section returns merged_keys count not values). Source: Jobdori live dogfood, `d7dbe951`, 2026-05-11.
|
||||
425. **DONE — config JSON exposes file precedence attribution and unknown config keys are warnings** — fixed 2026-06-03 in `fix: attribute config precedence in JSON`. Runtime config inspection now reports every discovered file with `precedence_rank`, `wins_for_keys`, and `shadowed_keys`, so `.claw/settings.json` overriding legacy `.claw.json` is visible without reimplementing merge order. Unknown keys are tolerated as structured validation warnings, including `claw config <section> --output-format json`, while wrong-type errors still fail. The deprecation-warning path remains deduplicated once per process for text-mode `status`, and JSON config surfaces collect warnings structurally without stderr duplication. Docs in `USAGE.md` now spell out the precedence chain and JSON attribution fields. Regression coverage: `config_json_attributes_precedence_and_shadowed_keys_425`, `config_section_json_tolerates_unknown_keys_as_warnings_425`, `status_deduplicates_config_deprecation_warnings_per_invocation_425`, and the runtime validator unknown-key warning tests.
|
||||
|
||||
|
||||
426. **`ANTHROPIC_MODEL` env var bypasses the `invalid_model_syntax` validator that `--model` enforces — bogus model strings are accepted with `status:"ok"`, deferred-failing only when the first API call is made** — dogfooded 2026-05-11 by Jobdori on `3730b459` in response to Clawhip pinpoint nudge at `1503245298800136296`. Reproduction (asymmetric validation): `claw --model bogus-model-xyz status --output-format json` returns `kind:"invalid_model_syntax"` exit 1; `ANTHROPIC_MODEL=bogus-model-xyz claw status --output-format json` returns `model:"bogus-model-xyz", model_raw:"bogus-model-xyz", model_source:"env", status:"ok"` — the doctor surface lies that the configured model is valid when it is not. The bogus model only manifests as a failure when the first prompt fires and the API rejects it with 404/400. Three sibling discoveries in the same probe: (a) **alias indirection invisible**: `ANTHROPIC_MODEL=opus claw status --output-format json` returns `model:"claude-opus-4-6", model_raw:"opus", model_source:"env"` — the `opus` alias resolves to `claude-opus-4-6` (the *previous* frontier, not the current `claude-opus-4-7` released 2026-04-16). Users typing `opus` get yesterday's model with no warning. (b) **`CLAW_MODEL` env var silently ignored**: `CLAW_MODEL=opus claw status` shows `model:"claude-opus-4-6" model_source:"default"` — the `CLAW_MODEL` env var (the project-namespaced equivalent that users expect) does not exist; only `ANTHROPIC_MODEL` is honored. No warning when a `CLAW_*` env var that looks like it should work is set. (c) **`ANTHROPIC_DEFAULT_MODEL` also silently ignored**: the longer-named env var that some Anthropic SDKs use is not recognized. **Required fix shape:** (a) symmetric validation: `ANTHROPIC_MODEL` env value must pass the same `invalid_model_syntax` check that `--model` does, and `claw status` must return `kind:"invalid_model"` / `status:"warn"` (not `status:"ok"`) when the resolved model is unrecognized; (b) expose alias resolution in `status`: add `model_alias_resolved_to:string|null` field so automation can see `opus → claude-opus-4-6`; (c) bump the `opus` alias to `claude-opus-4-7` (current frontier) or document the alias-to-version mapping policy explicitly; (d) accept `CLAW_MODEL` and `ANTHROPIC_DEFAULT_MODEL` env vars with parity to `ANTHROPIC_MODEL`, OR emit a warning when those env vars are set but unrecognized. **Why this matters:** the most common automation pattern is `export ANTHROPIC_MODEL=...` in a shell rc file. Bogus values pass silently, alias indirection hides the actual model in use, and `CLAW_MODEL` looking like a working name but doing nothing is a footgun. Cross-references #424 (bare canonical names rejected at validator level) — together #424 + #426 make model selection inconsistent across CLI flag, env var, and alias paths. Source: Jobdori live dogfood, `3730b459`, 2026-05-11.
|
||||
426. **DONE — environment model selection is validated and status exposes alias/env provenance** — fixed 2026-06-03 in `fix: validate env model selection`. `CLAW_MODEL`, `ANTHROPIC_MODEL`, and `ANTHROPIC_DEFAULT_MODEL` now share the same env-model path before config/default fallback; prompt/REPL startup validates the resolved model before provider construction; and `status --output-format json` reports invalid env/config models as `status:"warn"` with `model_validation_error_kind:"invalid_model"` while preserving workspace/config/sandbox context. Status JSON now includes `model_alias_resolved_to` and `model_env_var`, making alias expansion and the winning env var auditable. The built-in/default `opus` alias now targets `anthropic/claude-opus-4-7` / `claude-opus-4-7`, with docs updated in `USAGE.md` and `rust/README.md`; the API alias table keeps token-limit metadata for both `claude-opus-4-7` and legacy `claude-opus-4-6`. Regression coverage: `status_json_accepts_namespaced_model_env_and_surfaces_alias_426`, `status_json_warns_on_invalid_model_env_426`, model alias/unit tests, and provider alias tests.
|
||||
|
||||
|
||||
427. **Subcommand `--help` paths (`resume`, `session`, `compact`) hit the auth gate and trigger config validation before returning static help — `claw resume --help` with no credentials returns `missing_credentials` error instead of help text** — dogfooded 2026-05-11 by Jobdori on `1fecdf09` in response to Clawhip pinpoint nudge at `1503252843669491892`. Reproduction (no env vars, isolated `CLAW_CONFIG_HOME`): `claw resume --help` returns `{"error":"missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY..."}` instead of usage text. Same for `claw session --help`, `claw compact --help`. By contrast, `claw prompt --help` and `claw --help` (top-level) return proper usage text without auth. Even worse: with a broken `.claw.json` discovered up the parent directory tree (e.g., `mcpServers.missing-command: missing string field command`), the subcommand `--help` paths fail with `[error-kind: unknown]` from config validation — config load is happening before `--help` is parsed. **Sibling exit-code bug:** `claw resume --help --output-format json` returns `kind:"missing_credentials"` but exits **0** (the exit-code parity bug from #422 reproduces on this path too — only `cli_parse` exits 1 consistently). **Sibling: `claw resume <bogus-id>` should be local-only** but also hits `missing_credentials` — `resume` of a session that doesn't exist on disk should return `kind:"session_not_found"` from a local lookup, not require API credentials. Same class as ROADMAP #357 (session list requires creds) and #369 (session help/fork require credentials) — now confirmed for `resume`. **Required fix shape:** (a) `--help` MUST short-circuit before any auth check, config load, or session resolution — emit static usage text from a compiled-in string table, no I/O; (b) `resume <id>` must check the local session store first; if the id is absent on disk, emit `kind:"session_not_found"` with `sessions_dir` field; only require auth when resuming a known-on-disk session that requires re-establishing API context; (c) ensure exit code 1 for all error envelopes including `missing_credentials` returned from a `--help` path that should never have reached the auth gate; (d) regression test: with empty `CLAW_CONFIG_HOME` and no env vars, every `claw <subcommand> --help` returns usage text on stdout, exit 0, no `kind:*_error` envelope. **Why this matters:** `--help` is the universal CLI discovery primitive. Failing `--help` because of missing API credentials or broken config files makes claw undiscoverable to users debugging an already-broken setup. Cross-references #357 (session list), #369 (session help/fork), #422 (exit code parity), #108 (subcommand fallthrough). Source: Jobdori live dogfood, `1fecdf09`, 2026-05-11.
|
||||
427. **DONE — resume/session/compact help short-circuits locally and missing resume sessions report the session store** — fixed 2026-06-03 in `fix: keep session help local`. `claw resume --help`, `claw --resume --help`, `claw session --help`, and `claw compact --help` now route through static `LocalHelpTopic` output before config loading, session resolution, credential checks, provider startup, or slash-command interactive-only fallthrough. The direct `claw resume <session>` alias now shares the existing `--resume` restore parser, so `claw resume <missing> --output-format json` returns a local `session_not_found` restore envelope with `sessions_dir` and exit code 1 instead of reaching provider credentials. Regression coverage: `resume_session_compact_help_short_circuits_before_config_or_auth_427`, `resume_missing_session_json_reports_local_store_before_auth_427`, local help parser tests, and resume parser tests.
|
||||
|
||||
|
||||
428. **Default `permission_mode` is `danger-full-access` — claw runs with FULL filesystem + network + tool access out of the box, with no opt-in flag and no warning from `doctor`** — dogfooded 2026-05-11 by Jobdori on `72048449` in response to Clawhip pinpoint nudge at `1503260393622212628`. Reproduction (no env vars, isolated `CLAW_CONFIG_HOME`, no config files, no CLI flags): `claw status --output-format json` returns `permission_mode:"danger-full-access"` as the default. The three supported modes per the validator error message are `read-only`, `workspace-write`, `danger-full-access` — and `danger-full-access` is chosen with zero user opt-in. `claw doctor --output-format json` produces a `sandbox` check with `status:"warn", summary:"sandbox was requested but is not currently active"` (because macOS lacks Linux `unshare`), but **emits no warning, info, or summary about the permission_mode itself being danger-full-access**. There is no `permissions` check in `doctor` output at all. **Required fix shape:** (a) change default `permission_mode` to `workspace-write` (safe-by-default: filesystem write limited to cwd, network limited to LLM endpoints, no arbitrary command exec); (b) require explicit `--permission-mode danger-full-access` or `--dangerously-skip-permissions` to opt into full access; (c) add a `permissions` check to `doctor --output-format json` that emits `status:"warn"` when `permission_mode == "danger-full-access"` without explicit source (flag/env/config), with details like `mode:"danger-full-access", source:"default", message:"running with full access without explicit opt-in"`; (d) document the three modes and the default in USAGE.md with one-paragraph descriptions of what each mode allows. **Sibling typed-error bug:** `claw --permission-mode bogus-mode status --output-format json` returns `kind:"unknown"` instead of `kind:"invalid_permission_mode"` — same catch-all problem as #424, #426. **Sibling flag-name asymmetry:** `--dangerously-skip-permissions` works but `--skip-permissions` (Claude Code's flag) returns `kind:"cli_parse"` `unknown option`. Users migrating from Claude Code lose the short flag name. **Why this matters:** every other security-conscious CLI (Docker, kubectl, terraform) requires explicit opt-in for dangerous modes. Defaulting to `danger-full-access` is a footgun for first-time users who pipe `curl install.sh | sh` and immediately get a tool with full filesystem write and arbitrary command exec. The doctor surface is the only diagnostic users consult before trusting the tool, and it stays silent about the most permissive setting. Cross-references #50, #87, #91, #94, #97, #101, #106, #115, #123 (permission-audit sweep) — those all cover permission *rule* and *list* surfaces; #428 covers the *mode default* itself. Source: Jobdori live dogfood, `72048449`, 2026-05-11.
|
||||
428. **DONE — default permission mode is workspace-write with auditable permission provenance** — fixed 2026-06-03 in `fix: default to workspace-write permissions`. Fresh invocations now resolve the fallback permission mode to `workspace-write` instead of `danger-full-access`; `danger-full-access` requires an explicit `--permission-mode danger-full-access`, `--dangerously-skip-permissions`, `--skip-permissions`, env, or config opt-in. `status --output-format json` includes `permission_mode_source` and `permission_mode_env_var`, and `doctor --output-format json` includes a `permissions` check with `mode`, `source`, `source_explicit`, `message`, and tool allow/gate lists. Invalid CLI permission modes now emit typed `invalid_permission_mode` JSON errors, and docs describe the three modes plus the safe default. Regression coverage: `default_permission_mode_is_workspace_write_and_audited_428`, `explicit_danger_permission_mode_is_audited_and_alias_supported_428`, `invalid_permission_mode_json_is_typed_428`, parser default tests, classifier coverage, and `given_workspace_write_enforcer_when_web_tools_then_denied`.
|
||||
|
||||
|
||||
429. **No global `--cwd`/`-C`/`--directory` flag — `claw` cannot be invoked against an arbitrary working directory without first `cd`-ing into it; `--cwd` only exists as a subcommand option for `system-prompt`, and the `cli_parse` "Did you mean --acp?" suggestion is misleading (the `--acp` flag is unrelated to directory selection)** — dogfooded 2026-05-11 by Jobdori on `ec882f4c` in response to Clawhip pinpoint nudge at `1503267943285264394`. Reproduction: `claw --cwd /tmp/claw-dog-cwd status --output-format json` → `{"error":"unknown option: --cwd","hint":"Did you mean --acp?\nRun `claw --help` for usage.","kind":"cli_parse"}`. Same error for `--cwd <relative>`, `--cwd <nonexistent>`, `--cwd <file-not-dir>`, `--cwd ""`. Inspecting `claw --help`: `--cwd PATH` appears ONLY in the usage line `claw system-prompt [--cwd PATH] [--date YYYY-MM-DD]` — it is not a global flag and is not accepted by `status`, `doctor`, `mcp list`, `init`, or any other subcommand. Users programmatically running claw against multiple workspaces must `cd` into each one before invoking, breaking the `subprocess.run(['claw', 'status', '--cwd', ws], cwd=other_dir)` pattern that every other major CLI (cargo `-C`, git `-C`, npm `--prefix`, gh `--repo` semantically, kubectl `--kubeconfig`+`--context`) supports. **Sibling misleading-suggestion bug:** the `cli_parse` error's `hint` field suggests `Did you mean --acp?` for `--cwd`. `--acp` is the alias for ACP/Zed editor integration (entirely unrelated to working directory). The Levenshtein-distance auto-complete is matching on first-character similarity without considering semantic relatedness. Users following the hint get a totally orthogonal feature. **Required fix shape:** (a) add a global `--cwd PATH` / `-C PATH` flag accepted before any subcommand, parsed in the global flag pre-pass; (b) validate the path exists and is a directory; emit `kind:"invalid_cwd"` with `path:` and `reason:` (`"not_found"`/`"not_a_directory"`/`"empty"`) when validation fails; (c) document the precedence: `--cwd` flag > `$PWD` > `env::current_dir()`; (d) fix the "Did you mean" hint algorithm to filter suggestions by semantic category (don't suggest `--acp` for `--cwd`; suggest `claw system-prompt --cwd PATH` if the user clearly wants `cwd` override but used the wrong scope); (e) regression test: `claw --cwd /tmp status --output-format json` from any `$PWD` returns `workspace.cwd:"/private/tmp"` (or `cwd:"/tmp"` after #421 fix). **Why this matters:** every claw automation orchestrator runs claw against multiple workspaces from a single parent process. Forcing `cd` before each invocation breaks parallelism (can't use shared cwd across concurrent invocations), breaks subprocess wrappers that want to pass cwd explicitly, and breaks `xargs`/`parallel`-style pipelines. Cross-references #421 (cwd canonicalization leak — fix should canonicalize but report user-input via `--cwd`). Source: Jobdori live dogfood, `ec882f4c`, 2026-05-11.
|
||||
429. **DONE — global workspace directory override is accepted and validated before dispatch** — fixed 2026-06-03 in `fix: add global cwd override`. `claw --cwd PATH ...`, `claw -C PATH ...`, and `claw --directory PATH ...` now run as if launched from the selected workspace before config, status, doctor, MCP, skills, and other command dispatch. The override takes precedence over process `$PWD`; invalid values emit typed `invalid_cwd` JSON errors with `path` and `reason` (`not_found`, `not_a_directory`, or `empty`) instead of the old misleading `Did you mean --acp?` CLI parse path. Help/usage docs list the global flags and the precedence/validation contract. Regression coverage: `global_cwd_flag_routes_status_workspace_and_short_alias_429`, `global_cwd_flag_reports_typed_invalid_paths_429`, and classifier coverage for `invalid_cwd`.
|
||||
|
||||
|
||||
430. **`dump-manifests` is documented as "emit every skill/agent/tool manifest the resolver would load for the current cwd" but actually requires the upstream Claude Code TypeScript source files (`src/commands.ts`, `src/tools.ts`, `src/entrypoints/cli.tsx`) — the command is unusable for any user who installed claw without cloning the original Claude Code repo** — dogfooded 2026-05-11 by Jobdori on `075c2144` in response to Clawhip pinpoint nudge at `1503275502046023690`. Reproduction: `claw dump-manifests --output-format json` returns `{"error":"Manifest source files are missing.","hint":"repo root: /private/tmp/claw-dog-0530\n missing: src/commands.ts, src/tools.ts, src/entrypoints/cli.tsx\n Hint: set CLAUDE_CODE_UPSTREAM=/path/to/upstream or pass \`claw dump-manifests --manifests-dir /path/to/upstream\`.","kind":"missing_manifests"}`. The fresh-main worktree at `/private/tmp/claw-dog-0530` does not contain these TypeScript files because the Rust port doesn't include the upstream TS source. The `--help` text says the command works against "the current cwd" but in practice it requires `CLAUDE_CODE_UPSTREAM=` pointing at an unshipped TS source tree. **Three sibling problems compounded:** (a) **derivative-work disclosure leak**: the error message exposes that `claw-code` is a port of Claude Code (`CLAUDE_CODE_UPSTREAM` env var name) — even if true, surfacing this in a casual diagnostic message couples user-facing behavior to upstream provenance details. (b) **kind drift**: `claw dump-manifests --manifests-dir /tmp/nonexistent --output-format json` returns `kind:"unknown"`, while `claw dump-manifests` (no override) returns `kind:"missing_manifests"`. Same root cause (no usable upstream), two different `kind` discriminators — automation cannot switch on a single error type. (c) **export-positional-arg silently dropped**: probed in the same run — `claw export <bogus-positional>` ignores the path and returns `kind:"no_managed_sessions"` regardless of what positional arg was passed. The `--help` advertises `[PATH]` as the output-file destination but the path is discarded before validation, indistinguishable from invocation with no args. **Required fix shape:** (a) make `dump-manifests` emit the manifests claw-code itself ships with (Rust-resolver-discovered skills/agents/tools), independent of any upstream TS source — that matches the `--help` description; (b) if upstream-comparison is genuinely needed for parity work, move it to a separate command like `parity dump-upstream-manifests` and remove the upstream dependency from `dump-manifests`; (c) standardize on one error `kind` for the manifest-missing failure mode (`missing_manifests` is more descriptive than `unknown`); (d) `claw export <PATH>` must validate the path positional arg before the session-discovery check, so users see `kind:"invalid_output_path"` (or similar) when the path is malformed instead of always seeing `kind:"no_managed_sessions"`. **Why this matters:** `dump-manifests` is the inventory surface a downstream automation lane would call to learn what claw can do in the current workspace. If it's broken without upstream TS source, downstream lanes can't introspect — they have to fall back to `agents list`/`skills list`/`mcp list` separately and re-aggregate. Cross-references #422 (kind:unknown for unknown_subcommand), #423 (kind:unknown for missing_argument), #428 (kind:unknown for invalid_permission_mode) — `kind:"unknown"` keeps appearing as the catch-all for surfaces that should have typed kinds. Source: Jobdori live dogfood, `075c2144`, 2026-05-11.
|
||||
430. **DONE — `dump-manifests` emits the self-contained Rust resolver inventory instead of requiring upstream Claude Code TypeScript source files** — fixed 2026-06-03 in `fix: make dump-manifests self-contained`. `claw dump-manifests --output-format json` now succeeds from an installed workspace with `source:"rust-resolver"`, command/tool/agent/skill/bootstrap manifests, no `CLAUDE_CODE_UPSTREAM` hint, and no `src/commands.ts` dependency. Explicit `--manifests-dir` scopes resolver discovery to another directory and missing/not-directory roots emit typed `missing_manifests` JSON. Sibling export diagnostics now validate explicit positional/`--output` paths before session discovery and return typed `invalid_output_path` JSON with `path` and `reason`. Regression coverage: `dump_manifests_defaults_to_rust_resolver_inventory`, `dump_manifests_scopes_explicit_manifest_dir_without_upstream_ts`, `dump_manifests_missing_explicit_dir_has_typed_kind`, `dump_manifests_and_init_emit_json_when_requested`, `local_json_surfaces_have_non_empty_action_contract_714`, and `export_invalid_output_path_reports_typed_json_430`.
|
||||
|
||||
|
||||
431. **`skills uninstall <name>` requires Anthropic credentials despite being a local filesystem operation — `claw skills uninstall nonexistent-skill-xyz --output-format json` returns `kind:"missing_credentials"` instead of resolving locally that the skill doesn't exist** — dogfooded 2026-05-11 by Jobdori on `328fd114` in response to Clawhip pinpoint nudge at `1503275502046023690` (sibling probe to #430). Reproduction (no creds, isolated `CLAW_CONFIG_HOME`): `claw skills uninstall nonexistent-skill-xyz --output-format json` returns `{"error":"missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY...","kind":"missing_credentials"}`. Uninstalling a skill is a pure local filesystem operation: read the skills directory, find the named skill, remove its files. There is no semantic reason to require API credentials. Same class of bug as #357 (`session list` requires creds), #369 (`session help/fork` require creds), and #427 (`resume <bogus-id>` requires creds). **Three sibling findings in same probe:** (a) `claw skills install <bogus-name>` returns `{"error":"No such file or directory (os error 2)","kind":"unknown"}` — leaks raw OS error string with no hint about expected install source format (path vs name vs URL?), and the catch-all `kind:"unknown"` again instead of typed `kind:"skill_install_source_not_found"`. (b) `claw skills install` (no args) returns `action:"help"` with `unexpected:"install"` — but `install` IS a documented subcommand. The handler treats it as "unknown action" instead of "missing required argument". Should emit `kind:"missing_argument"` with `argument:"install_source"`. (c) `claw agents create my-agent` returns `action:"help"` with `unexpected:"create my-agent"` — there is no agent-creation surface at all. Users must hand-craft `.claw/agents/<name>.md` files with no scaffolding command, while `claw init` only creates the top-level `.claw/` skeleton. **Required fix shape:** (a) `skills uninstall <name>` must be local-first: enumerate the local skills dir, return `kind:"skill_not_found"` (with `skills_dir:` and `available_names:[]` fields) for missing, or remove the files and return `kind:"skills"` with `action:"uninstall", removed:<name>` for present skills; (b) `skills install <source>` must distinguish source forms (`path:`, `name:`, `url:`) and emit `kind:"invalid_install_source"` with the parsed-and-failed reason; (c) `skills install` (no args) emits `kind:"missing_argument"` with `argument:"install_source"`; (d) add `claw agents create <name>` (or `claw init agent <name>`) that scaffolds `.claw/agents/<name>.md` with a stub frontmatter; or document explicitly that agents are user-authored only. **Why this matters:** lifecycle commands (`uninstall`, `install`, `create`) are the primary surface for managing claw's extension surface area. If `uninstall` requires API creds, an offline user who fat-fingered an install can't undo it. If `install` returns a raw OS error, automation can't programmatically recover. If `agents create` doesn't exist, agent authoring is undocumented file-touching only. Cross-references #357, #369, #427 (auth-gate-on-local-ops cluster), and #422/#423/#428/#430 (`kind:"unknown"` catch-all cluster). Source: Jobdori live dogfood, `328fd114`, 2026-05-11.
|
||||
431. **DONE — `skills uninstall <name>` resolves locally instead of requiring Anthropic credentials** — fixed 2026-06-03 in `fix: keep skills lifecycle local`. `claw skills uninstall nonexistent-skill-xyz --output-format json` now stays on the local skills lifecycle surface and emits `kind:"skills"`, `action:"uninstall"`, `error_kind:"skill_not_found"`, `skills_dir`, `available_names`, and a hint without provider credentials. `claw skills install` no-arg emits typed `missing_argument` with `argument:"install_source"`; `claw skills install <bogus-name>` emits typed `invalid_install_source` with `source`, `source_kind`, `reason`, and a recovery hint. Installed skill roundtrips remove the installed files through the shared local lifecycle helper. `claw agents create <name>` now scaffolds `.claw/agents/<name>.toml` and lists through the existing TOML agent discovery surface. Regression coverage: `skills_lifecycle_errors_have_typed_local_json_795_431`, `skills_install_uninstall_roundtrip_stays_local_431`, `agents_create_scaffolds_toml_and_lists_locally_431`, local command routing tests, parser discriminant tests, and command help/docs assertions.
|
||||
|
||||
|
||||
432. **`--allowedTools` validator inconsistency: tool name list is half snake_case (`bash`, `read_file`, `write_file`, `edit_file`, `glob_search`, `grep_search`) and half PascalCase (`WebFetch`, `WebSearch`, `TodoWrite`, `Skill`, `Agent`, `Sleep`) with three UPPERCASE entries (`REPL`, `LSP`, `MCP`); accepts undocumented CamelCase aliases (`Read`, `Write`, `Edit`) and silently translates them to snake_case; argument parsing consumes the next positional when value is missing** — dogfooded 2026-05-11 by Jobdori on `fad53e2d` in response to Clawhip pinpoint nudge at `1503283046856655029`. Reproduction: `claw --allowedTools status --output-format json` → `{"error":"unsupported tool in --allowedTools: status (expected one of: bash, read_file, write_file, edit_file, glob_search, grep_search, WebFetch, WebSearch, TodoWrite, Skill, Agent, ToolSearch, NotebookEdit, Sleep, SendUserMessage, Config, EnterPlanMode, ExitPlanMode, StructuredOutput, REPL, PowerShell, AskUserQuestion, TaskCreate, RunTaskPacket, TaskGet, TaskList, TaskStop, TaskUpdate, TaskOutput, WorkerCreate, WorkerGet, WorkerObserve, WorkerResolveTrust, WorkerAwaitReady, WorkerSendPrompt, WorkerRestart, WorkerTerminate, WorkerObserveCompletion, TeamCreate, TeamDelete, CronCreate, CronDelete, CronList, LSP, ListMcpResources, ReadMcpResource, McpAuth, RemoteTrigger, MCP, TestingPermission)","kind":"unknown"}`. The `status` subcommand was consumed as the `--allowedTools` value because the flag parser doesn't distinguish missing-value from end-of-flag-args. The error reveals **the supported tool list mixes naming conventions inconsistently within a single error message**: snake_case (`bash`, `read_file`, `write_file`, `edit_file`, `glob_search`, `grep_search`), PascalCase (`WebFetch`, `WebSearch`, `TodoWrite`, `Skill`, `Agent`, `Sleep`, `Config`, `PowerShell`, `AskUserQuestion`, `TaskCreate`, `WorkerCreate`, `TeamCreate`, `CronCreate`), UPPERCASE (`REPL`, `LSP`, `MCP`), and CamelCase compounds (`McpAuth`, `RemoteTrigger`). **Hidden alias mapping**: `claw --allowedTools Read,Write,Edit status --output-format json` is accepted and returns `allowed_tools.entries:["edit_file","read_file","write_file"]` — proving the validator has an undocumented CamelCase→snake_case alias map (`Read`→`read_file`, `Write`→`write_file`, `Edit`→`edit_file`) that is not surfaced in the error message. Users who copy-paste tool names from Claude Code documentation work, users who copy from the validator error don't. **Sibling missing-value bug:** `claw --allowedTools status` with `status` as a positional subcommand is interpreted as `--allowedTools=status`, swallowing the subcommand. The flag parser must require a value for `--allowedTools` and emit `kind:"missing_argument"` when followed by a recognized subcommand or `--`-prefixed flag instead of silently treating the next arg as a tool name. **Sibling typed-kind bug:** both errors use `kind:"unknown"` instead of typed `kind:"invalid_tool_name"` / `kind:"missing_argument"` — the catch-all keeps appearing (#422/#423/#424/#428/#430/#431/#432). **Required fix shape:** (a) standardize the canonical tool-name registry on one casing convention (snake_case is most CLI-ergonomic) and update both the registry and all CamelCase aliases; (b) document and expose the alias map (`tool_aliases:{Read:"read_file",...}`) in `claw doctor`/`status` and in the validator error; (c) flag parser must require a value for `--allowedTools` and refuse to consume a recognized subcommand or `-`/`--`-prefixed token as the value, emit `kind:"missing_argument"` with `argument:"--allowedTools"`; (d) emit `kind:"invalid_tool_name"` with `tool_name:` and `available:[]` fields instead of `kind:"unknown"`; (e) regression test that `claw --allowedTools <subcommand>` rejects with `missing_argument`, and that the canonical name list in errors uses the same casing as the alias map. **Why this matters:** `--allowedTools` is the primary surface for restricting claw's tool surface area (security-relevant). Inconsistent naming between the validator error and the alias map means users following the error message guidance pick names that work in some places and fail in others. The missing-value bug silently swallows a subcommand, leading to confusing "unsupported tool: status" errors when the user actually wanted to run `claw status`. Cross-references #94/#97/#101/#106/#115/#123 (permission-rule audit), #428 (default permission_mode), #422/#423/#424/#428/#430/#431 (`kind:"unknown"` catch-all). Source: Jobdori live dogfood, `fad53e2d`, 2026-05-11.
|
||||
432. **DONE — `--allowedTools` uses a canonical snake_case registry with typed diagnostics and documented aliases** — fixed 2026-06-04 in `fix: type allowed tools validation`. `GlobalToolRegistry::normalize_allowed_tools` now normalizes built-in, plugin, runtime, and MCP wrapper tool names to canonical snake_case allow-list entries while still accepting documented aliases such as `read`, `Read`, and legacy provider-facing names like `WebFetch`/`MCPTool`. Provider tool definitions and CLI/subagent executors compare against canonical names, so aliases do not break internal dispatch. `claw --allowedTools status --output-format json` now refuses to consume `status` as a value and emits typed `missing_argument` JSON with `argument:"--allowedTools"`; unsupported names emit typed `invalid_tool_name` JSON with `tool_name`, `available`, and `tool_aliases`. `status --output-format json` exposes `allowed_tools.available` and `allowed_tools.aliases`, and help/usage docs describe canonical names plus aliases. Regression coverage: `parses_allowed_tools_flags_with_aliases_and_lists`, `rejects_allowed_tools_followed_by_subcommand_or_flag_432`, `rejects_unknown_allowed_tools`, `allowed_tools_errors_have_typed_json_and_alias_map_432`, `allowed_tools_normalize_to_canonical_snake_case_and_aliases_432`, status JSON alias assertions, MCP wrapper normalization coverage, and classifier coverage for `invalid_tool_name`.
|
||||
|
||||
|
||||
433. **Repeated `--output-format` flag silently takes the last value without warning — `claw --output-format json --output-format text status` produces text output, no signal that the prior `json` was overridden; sibling: `--output-format` value is case-sensitive (`JSON` rejected as `kind:"unknown"`); sibling: no `CLAW_OUTPUT_FORMAT` env var for default format override** — dogfooded 2026-05-11 by Jobdori on `ce39d5c5` in response to Clawhip pinpoint nudge at `1503290592556220488`. Reproduction: `claw --output-format json --output-format text status` returns the text-format `Status\n Model claude-opus-4-6...` table — the first `--output-format json` was silently overridden. No warning, no `format_overridden:true` field, no stderr message. Scripts that compose flag arrays from multiple sources (`flags=("${BASE_FLAGS[@]}" --output-format json)` while `BASE_FLAGS` already contains `--output-format text`) silently get the wrong format. **Three sibling findings in same probe:** (a) **case-sensitivity drift**: `claw --output-format JSON status` returns `{"error":"unsupported value for --output-format: JSON (expected text or json)","kind":"unknown"}` — error message tells user to use lowercase `json` but doesn't accept the uppercase form that users often type from muscle memory. Most CLI flag-value validators (cargo, kubectl, gh) are case-insensitive for enum values or accept both forms with normalization. (b) **`kind:"unknown"` for invalid format value**: same catch-all bucket bug as #422/#423/#424/#428/#430/#431/#432 — should be `kind:"invalid_output_format"` with `value:` and `expected:["text","json"]` fields. (c) **no env-var default for output format**: `CLAW_OUTPUT_FORMAT=json claw status` silently ignored — no env override for the global default, forcing scripts to repeat `--output-format json` on every invocation. Other major CLIs honor `KUBECTL_OUTPUT=`, `AWS_DEFAULT_OUTPUT=`, `GH_NO_PROMPT=` etc. (d) **silently-ignored env vars `CLAW_LOG`/`RUST_LOG`**: no env-based log level control surfaced in `claw doctor` — debug logging requires undocumented `RUST_LOG=` (Rust convention) but `claw --help` doesn't mention either. **Required fix shape:** (a) repeated `--output-format` (or any flag that takes a value, not a count flag) emits a warning to stderr (`warning: --output-format specified multiple times; using last value 'text'`) and adds a `format_source:"flag", format_overridden:[]` field to the JSON envelope; (b) accept case-insensitive enum values for `--output-format` (`JSON`, `Json`, `json` all work), document the canonical lowercase form in `--help`; (c) emit `kind:"invalid_output_format"` (not `kind:"unknown"`) when value is invalid; (d) accept `CLAW_OUTPUT_FORMAT` env var as the default for `--output-format`, with flag-overrides-env precedence documented; (e) document `RUST_LOG` / `CLAW_LOG` in `--help` or doctor output as the log-level env vars; (f) regression test: repeated flag emits stderr warning + JSON metadata field; case-insensitive enum accepts all three casings; env-var default is honored when flag is absent. **Why this matters:** scripts that compose flag arrays from multiple sources (CI envs + per-invocation flags) silently get the wrong output format. Case-sensitive enum values trip up users typing from muscle memory. Missing env-var defaults force per-invocation flag repetition. Cross-references #422/#423/#424/#428/#430/#431/#432 (`kind:"unknown"` catch-all cluster). Source: Jobdori live dogfood, `ce39d5c5`, 2026-05-11.
|
||||
433. **DONE — `--output-format` selection is typed, case-insensitive, env-configurable, and auditable** — fixed 2026-06-04 in `fix: type output format selection`. `CliOutputFormat::parse` now accepts `text`/`json` in any casing, `CLAW_OUTPUT_FORMAT` seeds the default output format when no CLI output-format flag is present, explicit flags override the env default, and repeated flags emit `warning: --output-format specified multiple times; using last value '...'`. `status --output-format json` exposes `format_source`, `format_raw`, and `format_overridden`; invalid values return typed `invalid_output_format` JSON with `value`, `expected:["text","json"]`, and a recovery hint instead of `kind:"unknown"`. Top-level help documents `CLAW_OUTPUT_FORMAT`, `CLAW_LOG`, and `RUST_LOG`, and doctor system JSON surfaces those env values. Regression coverage: `output_format_flags_and_env_have_typed_contract_433` and `classify_error_kind_returns_correct_discriminants`. Verification: `cargo fmt --manifest-path rust/Cargo.toml --all -- --check`, focused output-format and classifier tests, `scripts/roadmap-check-ids.sh`, `git diff --check`, `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli --no-run`, and `cargo build --manifest-path rust/Cargo.toml --workspace --locked`.
|
||||
|
||||
|
||||
434. **POSIX `--` end-of-flags separator is not recognized — `claw -- "-prompt-with-dash"` returns `{"error":"unknown option: --","hint":"Did you mean -V?","kind":"cli_parse"}` instead of treating subsequent args as positional; shorthand prompt mode cannot accept dash-prefixed prompts at all** — dogfooded 2026-05-11 by Jobdori on `0e5f6958` in response to Clawhip pinpoint nudge at `1503298142286905484`. Reproduction: `claw -- "-prompt-with-dash" --output-format json` returns `{"error":"unknown option: --","hint":"Did you mean -V?\nRun \`claw --help\` for usage.","kind":"cli_parse"}`. The POSIX/GNU CLI convention — universally honored by cargo, git, npm, gh, kubectl, grep, ls, find, etc. — is that `--` terminates flag parsing and treats everything after it as positional arguments. claw rejects `--` itself as an unknown flag. **Sibling misleading-suggestion bug (recurring from #429):** the `cli_parse` hint suggests `Did you mean -V?` for `--`. `-V` is the version flag; `--` is the end-of-flags separator. They have no semantic relationship; the auto-complete is matching on prefix-character similarity only. **Sibling shorthand-prompt limitation:** `claw "-just a prompt" --output-format json` returns `{"error":"unknown option: -just a prompt","kind":"cli_parse"}` and `claw "--bogus-flag-like" --output-format json` returns the same. The shorthand non-interactive prompt mode (documented as `claw [--model MODEL] [--output-format text|json] TEXT`) cannot accept any TEXT that starts with `-` or `--`, even when the entire string is shell-quoted as a single token. Users must use the explicit `prompt` verb (`claw prompt "-prompt-with-dash"` works) to escape this, but the explicit verb is documented as alternative not required. **Required fix shape:** (a) accept POSIX `--` as the end-of-flags marker globally — every arg after `--` is positional; (b) shorthand prompt mode must distinguish "this looks like a flag" from "this is a quoted positional that happens to start with `-`" by looking at whether the token matches any registered flag name (`-h`, `-V`, `--help`, `--version`, etc.) — strings that don't match any flag should be treated as prompt text; (c) fix the "Did you mean" hint algorithm to filter by semantic category (don't suggest `-V` for `--`, suggest "use \`--\` to terminate flag parsing" if the user types just `--`); (d) regression test: `claw -- "-foo"` reaches the runtime with prompt=`-foo`; `claw "-not-a-flag"` is treated as shorthand prompt when no registered flag matches; canonical `--` is recognized. **Why this matters:** POSIX `--` is the universal mechanism for passing arbitrary text (filenames starting with `-`, prompts containing flag-like syntax, log lines, etc.) to a CLI. Failing on `--` makes claw fundamentally unergonomic in shell pipelines (`echo "-q for quiet" | xargs claw` fails). The shorthand-prompt limitation forces users to remember the `prompt` verb specifically when their prompt happens to start with `-`. Cross-references #422 (unknown subcommand fallthrough), #423 (stdin not consumed by prompt), #429 ("Did you mean --acp" misleading suggestion). Source: Jobdori live dogfood, `0e5f6958`, 2026-05-11.
|
||||
434. **DONE — POSIX `--` and dash-prefixed shorthand prompts stay on the prompt path** — fixed 2026-06-04 in CI recovery after `41678eb` turned Rust CI red. Global argument parsing now treats `--` as an end-of-flags separator, stops JSON-output pre-scans before the separator, and forwards every following token as positional prompt text. Shorthand prompt mode accepts dash-prefixed text that is not a registered or near-miss CLI flag (`-not-a-flag`, `--bogus-flag-like literal`) while still rejecting real typo-like options such as `--resum` with the `--resume` suggestion. Direct `/status` invocation routes to the local status action per the resume-safe slash command contract, matching the CI-red parser regression tests restored after `7cfd83f`. Help, USAGE, and rust README document the `claw -- "-prompt-with-dash"` form. Regression coverage: `parses_dash_prefixed_prompt_text_434`, plus rerun CI-red parser tests `parses_bare_prompt_and_json_output_flag` and `parses_direct_agents_mcp_and_skills_slash_commands`. Verification: `cargo fmt --manifest-path rust/Cargo.toml --all -- --check`, focused parser tests, `scripts/roadmap-check-ids.sh`, `git diff --check -- USAGE.md rust/README.md rust/crates/rusty-claude-cli/src/main.rs ROADMAP.md`, docs source-of-truth/release-readiness/unit helper checks, `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli --no-run`, `cargo build --manifest-path rust/Cargo.toml --workspace --locked`, and `cargo clippy --manifest-path rust/Cargo.toml --workspace`. Local `cargo test --manifest-path rust/Cargo.toml --workspace` still hits the pre-existing Darwin-only `runtime::worker_boot::startup_preflight_warns_when_git_metadata_is_not_writable` permission assertion after all CLI parser tests pass; the red GitHub jobs were parser failures on head `41678eb`.
|
||||
|
||||
|
||||
435. **`claw --resume latest` on a fresh workspace exit code is 0 in text mode but 1 in JSON mode (text mode lies about success); sibling: failed `--resume` creates the `.claw/sessions/<fingerprint>/` directory tree as a filesystem side effect of the failure** — dogfooded 2026-05-11 by Jobdori on `e29010ed` in response to Clawhip pinpoint nudge at `1503305692566655096`. Reproduction (fresh empty dir, no `.claw/`, no sessions): `claw --resume latest` (text mode) prints `failed to restore session: no managed sessions found in .claw/sessions/0ead448127a2de44/` and exits **0**. Same invocation with `--output-format json` correctly exits **1** with `kind:"session_load_failed"`. Exit-code parity broken on the same input depending on format flag. **Sibling filesystem-side-effect bug:** after the failed `--resume latest` on a fresh empty workspace, the directory `.claw/sessions/0ead448127a2de44/` (the workspace-fingerprint partition) is created on disk despite the operation failing. The user did not opt into creating workspace metadata — they asked to resume an existing session, the resume failed, and now there's a partition directory hanging around. The fingerprint directory ought to be created lazily on first successful session save, not as a side effect of every resume attempt. **Three sibling findings in the same probe:** (a) **`claw --compact` alone (no other args) drops into the interactive REPL with the ANSI welcome banner** — `--compact` is documented as a modifier that strips tool call details in text mode for piping (`--compact ... useful for piping`), not as a verb that activates the REPL. Running `claw --compact` with no positional should be a no-op or an error explaining the flag needs a subcommand or prompt; entering the REPL is the wrong default. (b) **`claw --compact "hello"` (shorthand prompt) returns `{"error":"unknown subcommand: hello.","hint":"Did you mean help","kind":"unknown"}` — `--compact` disables shorthand prompt mode entirely**, treating the positional as a subcommand instead of as prompt text. Users must use the explicit `prompt` verb (`claw --compact prompt "hello"`) which contradicts the `claw [flags] TEXT` usage line in `--help`. (c) `kind:"unknown"` again for the unknown-subcommand error in --compact path — same catch-all bucket bug appearing for the 11th time across pinpoints. **Required fix shape:** (a) exit code 1 for all `failed_to_restore` / `session_load_failed` text-mode failures; text mode should print to stderr and exit non-zero, not print to stdout and exit 0; (b) defer `.claw/sessions/<fingerprint>/` creation to first successful save; failed `--resume` must not leave filesystem droppings; (c) `claw --compact` alone (no positional, no subcommand, stdin is TTY) should emit `kind:"missing_argument"` with `argument:"prompt or subcommand"` rather than activating the REPL; (d) `--compact` must be transparent to shorthand prompt mode parsing — `claw --compact "hello"` is equivalent to `claw --compact prompt "hello"`, both should reach the prompt path; (e) emit typed `kind:"unknown_subcommand"` not `kind:"unknown"` for fallthrough cases. **Why this matters:** scripts that gate on `$?` after `claw --resume latest` see success on text mode and failure on JSON mode — the same operation, two outcomes. The filesystem side effect pollutes a user's worktree with workspace partitions they didn't ask for, and CI pipelines that snapshot `.claw/` size silently grow on every failed `--resume`. Cross-references #422 (exit-code parity across error envelopes), #423 (`kind:"unknown"` for `missing_argument`), #434 (shorthand prompt limitations). Source: Jobdori live dogfood, `e29010ed`, 2026-05-11.
|
||||
435. **DONE — failed resume is non-zero and side-effect free; `--compact` stays a prompt modifier** — fixed 2026-06-04 in `fix: keep failed resume side-effect free`. Fresh-workspace `claw --resume latest` exits 1 in text and JSON modes; text writes the restore failure to stderr, JSON writes a typed `no_managed_sessions` restore envelope to stdout, and failed lookup no longer creates `.claw/sessions/<fingerprint>/`. `SessionStore::from_cwd`/`from_data_dir` now only derive the fingerprinted path; session save remains responsible for creating it. Global `--compact` no longer starts the REPL when it has no prompt or stdin: it returns typed `missing_argument` with `argument:"prompt or subcommand"`. `claw --compact "hello"` remains shorthand prompt mode and reaches provider/auth validation rather than command-not-found. Regression coverage: `session_store_from_cwd_is_side_effect_free_until_save`, `resume_latest_missing_session_fails_without_creating_session_dirs_435`, `compact_flag_missing_argument_and_shorthand_prompt_contract_435`, and `parses_compact_flag_for_prompt_mode`; broader checks reran `runtime session_control`, `resume_slash_commands`, `output_format_contract`, `claw` bin tests, `cargo fmt --all -- --check`, `scripts/roadmap-check-ids.sh`, `git diff --check`, and `cargo build --workspace --locked`.
|
||||
|
||||
|
||||
436. **`claw init` shipped `.claw.json` template explicitly sets `permissions.defaultMode:"dontAsk"` — every user who runs `claw init` gets a config file that disables permission prompts by default; sibling: `init` creates an empty `.claw/` directory with no settings.json template inside, and when `.claw/` already exists it skips the whole artifact (no settings template materialized)** — dogfooded 2026-05-11 by Jobdori on `b8f989b6` in response to Clawhip pinpoint nudge at `1503313241751949335`. Reproduction: `mkdir /tmp/probe && cd /tmp/probe && claw init --output-format json` returns `artifacts:[{name:".claw/",status:"created"},{name:".claw.json",status:"created"},...]`. Inspecting the created `.claw.json`: `{"permissions":{"defaultMode":"dontAsk"}}`. This is the polar opposite of safe-by-default: every user who follows the documented onboarding flow (`claw init` after `curl install.sh`) ships their workspace with permission prompts disabled. Compounds with **#428** (default runtime permission_mode is `danger-full-access`) — between the runtime default and the init template, a fresh claw setup has zero user-facing safety friction. **Sibling: `.claw/` artifact is an empty directory.** After `claw init`, `find .claw -type f` returns nothing. No `settings.json`, no template, no scaffolding — just `mkdir .claw`. The `--help` description implies init produces a usable workspace, but `.claw/settings.json` (the project-scope counterpart of `~/.claw/settings.json`) is never templated. **Sibling: `.claw/` skip-on-exists drops the entire artifact.** If `.claw/` already exists (e.g., from a partial setup, a `--resume` failure side effect per #435, or manual creation), `claw init` returns `.claw/: skipped` and does not materialize any expected sub-content. The other artifacts (`.claw.json`, `.gitignore`, `CLAUDE.md`) are still created, but a future `claw skills install` or `claw plugins enable` may expect `.claw/` to contain template files that are now missing. **Required fix shape:** (a) the shipped `.claw.json` template must default to `permissions.defaultMode:"acceptEdits"` or `"plan"` (safe-by-default modes per #428 spec) — `"dontAsk"` requires explicit opt-in; (b) `claw init` must materialize `.claw/settings.json` with documented schema defaults inside `.claw/` so the directory is useful on its own; (c) when `.claw/` already exists, `init` must report `partial` status (not `skipped`) and still try to create missing sub-files like `.claw/settings.json` without overwriting existing files; (d) emit per-sub-file artifact entries for `.claw/settings.json` and `.claw/sessions/` (skipped status if absent, deferred-to-first-save acceptable) so automation knows what's present; (e) regression test: `claw init` produces a `.claw.json` whose `permissions.defaultMode` is NOT `dontAsk`; `.claw/` contains at least one templated file. **Why this matters:** init is the primary onboarding surface. Every first-time user piping `curl install.sh | sh && claw init` gets a workspace pre-configured to skip permission prompts — and that workspace gets committed to the user's repo via the `init`-added entry. The `.claw/` empty-directory bug means feature discovery (skills, plugins) lacks the scaffolding it implies. Cross-references #428 (runtime default permission_mode), #50/#87/#91/#94/#97/#101/#106/#115/#123 (permission-rule audit), #435 (filesystem side effects on failed resume). Source: Jobdori live dogfood, `b8f989b6`, 2026-05-11.
|
||||
436. **DONE — `claw init` scaffolds safe project settings and reports partial/deferred artifacts** — fixed 2026-06-04 in `fix: scaffold safe init settings`. The starter `.claw.json` and new `.claw/settings.json` template now both use `permissions.defaultMode:"acceptEdits"` instead of unsafe `dontAsk`. Fresh init materializes `.claw/settings.json`, keeps `.claw/sessions/` deferred until the first successful session save, and emits per-artifact entries for `.claw/`, `.claw/settings.json`, `.claw/sessions/`, `.claw.json`, `.gitignore`, and `CLAUDE.md`. When `.claw/` already exists but its settings template is missing, init creates `.claw/settings.json` without overwriting existing files and reports `.claw/` as `partial` rather than `skipped`; idempotent reruns keep existing artifacts skipped and session storage deferred. JSON init output now includes `partial[]` and `deferred[]` alongside `created[]`, `updated[]`, and `skipped[]`, and init help/USAGE document the artifact statuses. Regression coverage: `initialize_repo_creates_expected_files_and_gitignore_entries`, `initialize_repo_is_idempotent_and_preserves_existing_files`, `artifacts_with_status_partitions_fresh_and_idempotent_runs`, and `init_json_envelope_has_hint_and_already_initialized_783`.
|
||||
|
||||
|
||||
437. **`version --output-format json` omits build provenance fields — no `is_dirty`, `branch`, `commit_date`, `commit_timestamp`, `rustc_version`; `git_sha` is truncated to 7 chars instead of full 40-char hash; sibling: `executable_path` leaks the build host's path (`/tmp/claw-dog-0530/...`) into runtime output** — dogfooded 2026-05-11 by Jobdori on `8cf628a5` in response to Clawhip pinpoint nudge at `1503320791582900344`. Reproduction: `claw version --output-format json` returns `{"build_date":"2026-05-11","executable_path":"/tmp/claw-dog-0530/rust/target/release/claw","git_sha":"b98b9a7","kind":"version","message":"Claw Code\n Version 0.1.0\n Git SHA b98b9a7\n Target aarch64-apple-darwin\n Build date 2026-05-11","target":"aarch64-apple-darwin","version":"0.1.0"}`. Critical provenance fields missing: (a) **`is_dirty`** — was the working tree clean at build time? Automation that pins on build provenance cannot tell if the binary was built from a clean commit or includes uncommitted changes; (b) **`branch`** — was this built from `main`, `dev/rust`, a release tag, or a feature branch? The `git_sha` alone doesn't reveal the integration point; (c) **`commit_date` / `commit_timestamp`** — only `build_date` (when the binary was compiled) is exposed; the commit itself might be days/weeks older if the build happened later. Reproducibility audits need both; (d) **`rustc_version`** — what Rust compiler version produced this binary? Critical for security advisories (e.g., known regressions in specific rustc versions); (e) **`git_sha` truncated to 7 chars** ("b98b9a7" instead of full "b98b9a71..."): 7-char shas have known collision rates in large repos and prevent unambiguous git rev-parse round-trip. **Sibling: `executable_path` leaks build-host path.** The `executable_path` field returns `/tmp/claw-dog-0530/rust/target/release/claw` — the directory where the binary was compiled, embedded into the binary metadata. For a binary copied/installed/symlinked to a different location, this field still reports the build path, not the actual invocation path. Either the field should reflect the runtime path via `std::env::current_exe()` at runtime (not compile-time), or it should be dropped to avoid leaking compile-host filesystem layout. **Sibling: prose `message` field duplicates structured data.** The `message` field still contains the entire text-mode prose version block (`"Claw Code\n Version 0.1.0\n Git SHA b98b9a7\n..."`) — every field present as structured JSON (`version`, `git_sha`, `target`, `build_date`) is also embedded in the prose. Same issue as #391 (`version json includes prose message field`) which was closed as "fixed" — the prose remains. **Required fix shape:** (a) add `is_dirty:bool`, `branch:string|null`, `commit_date:string` (ISO-8601), `commit_timestamp:int` (Unix epoch), `rustc_version:string` to the JSON envelope; (b) preserve full 40-char `git_sha` and add `git_sha_short:string` as a derived field if 7-char form is needed for UX; (c) `executable_path` should be `std::env::current_exe()` at runtime, not the compile-time path; (d) drop the prose `message` field from JSON or rename it `human_readable:string` and make it explicitly secondary to the structured fields; (e) re-verify #391 closure — the prose `message` is still present, the fix didn't fully land. **Why this matters:** version surface is the canonical provenance probe for security audits, build reproducibility, and bug-report metadata. Missing `is_dirty` means automated triage cannot distinguish "issue against a clean main commit" from "issue against a developer's uncommitted hack". Truncated `git_sha` blocks unambiguous git lookup. Leaked `executable_path` exposes build-host layout. Cross-references #391 (version prose duplication — apparently not fully fixed), #334 (version json omits build_date — fixed, but partial scope), #100 (commit identity audit). Source: Jobdori live dogfood, `8cf628a5`, 2026-05-11.
|
||||
437. **DONE — `version --output-format json` exposes complete build provenance without duplicating prose** — fixed 2026-06-04 in `fix: expose complete version provenance`. Build metadata now records the full 40-character `git_sha`, separate derived `git_sha_short`, `is_dirty`, `branch`, ISO-8601 `commit_date`, Unix `commit_timestamp`, `rustc_version`, target, and build date. Version JSON exposes those fields at top level and mirrors them under `binary_provenance`; `workspace_git_sha` is also a full SHA and `workspace_match` now compares full commit identities. `executable_path` is resolved at runtime with `std::env::current_exe()` instead of reporting a compile-host path. The prose report is no longer duplicated in JSON as `message`; JSON callers get the secondary text block as `human_readable`. Docs in `USAGE.md` and `rust/README.md` describe the provenance contract. Regression coverage: `version_emits_json_when_requested`, `version_status_doctor_include_binary_provenance_797`, `resumed_version_and_init_emit_structured_json_when_requested`, and `resumed_version_command_emits_structured_json`.
|
||||
|
||||
|
||||
438. **Memory file discovery only recognizes `CLAUDE.md` — `AGENTS.md` (industry convention used by OpenCode/Codex/Aider/Cursor) and `CLAW.md` (project's own brand name) are silently ignored despite being present in the workspace** — dogfooded 2026-05-11 by Jobdori on `d3a982dd` in response to Clawhip pinpoint nudge at `1503328341422244012`. Reproduction (fresh empty dir, isolated `CLAW_CONFIG_HOME`): create three files in cwd — `CLAUDE.md` (marker `MARKER-FROM-CLAUDE-MD`), `AGENTS.md` (marker `MARKER-FROM-AGENTS-MD`), `CLAW.md` (marker `MARKER-FROM-CLAW-MD`). Run `claw status --output-format json` → `workspace.memory_file_count: 1`. Run `claw system-prompt --output-format json` and search the `message` field for each marker: only `MARKER-FROM-CLAUDE-MD` is found; `MARKER-FROM-AGENTS-MD` and `MARKER-FROM-CLAW-MD` are absent. `claw-code` exclusively recognizes the Claude-branded filename inherited from upstream Claude Code; the project's own `CLAW.md` brand name and the cross-tool industry convention `AGENTS.md` are both silently dropped. **Three sibling implications:** (a) **brand-consistency gap**: a project rebranded from Claude Code to Claw Code that introduces `CLAUDE.md` as its only memory file is internally inconsistent. Users naturally expect `claw <subcommand>` to read `CLAW.md`. (b) **industry-convention gap**: `AGENTS.md` is the convergent convention for OpenCode (oh-my-opencode/sisyphus), OpenAI Codex CLI, Aider, Cursor, Continue.dev, and most ACP harnesses. Users with mixed-tool workflows maintain a shared `AGENTS.md` and expect every AI coding tool to honor it. (c) **silent failure mode**: there is no warning when `AGENTS.md` or `CLAW.md` exist but are not loaded. Users who copy-paste `AGENTS.md` from another tool's docs see `memory_file_count` stay at 0 or 1 and have to guess why their instructions aren't applied. **Required fix shape:** (a) discover and load **`CLAUDE.md`, `CLAW.md`, `AGENTS.md`** in that priority order (existing config-precedence pattern); (b) all three contribute to `memory_file_count` with `memory_files:[{path, source:"claude_md"|"claw_md"|"agents_md", chars}]` array exposed in `status --output-format json`; (c) when multiple files exist, merge or document the precedence: project-specific `CLAUDE.md`/`CLAW.md` overrides industry-shared `AGENTS.md`; (d) `claw doctor --output-format json` adds a `memory` check that warns when `AGENTS.md` exists but is not the loaded variant (alerting users that they may be relying on the wrong file); (e) regression test: workspace with all three files results in `memory_file_count >= 1` and the system prompt contains markers from at least the highest-precedence file. **Why this matters:** `AGENTS.md` is the lingua-franca instruction file for cross-tool AI coding workflows. A team using OpenCode for one project and Claw Code for another keeps their conventions in a shared `AGENTS.md`. Forcing them to also maintain a `CLAUDE.md` for claw-code (with identical content) is friction that breaks the value proposition of a fork. Cross-references #438 itself (the multi-file convention), and AGENTS.md ecosystem references in oh-my-opencode/sisyphus docs. Source: Jobdori live dogfood, `d3a982dd`, 2026-05-11.
|
||||
438. **DONE — memory discovery loads `CLAUDE.md`, `CLAW.md`, and `AGENTS.md` with structured provenance** — fixed 2026-06-04 in `fix: load Claw and Agents memory files`. Project memory discovery now checks root instruction files in `CLAUDE.md`, `CLAW.md`, then `AGENTS.md` order for each discovered directory, preserves existing scoped `.claw/CLAUDE.md`, `.claude/CLAUDE.md`, `.claw/instructions.md`, and rules-directory imports, and exposes each loaded file's `path`, `source`, `chars`, and `contributes` in `status --output-format json` as `workspace.memory_files[]`. `system-prompt --output-format json` returns the same memory metadata alongside the rendered `message`/`sections`, and all non-duplicate loaded files contribute to the prompt so CLAUDE/CLAW/AGENTS markers are visible together. `claw doctor --output-format json` now includes a dedicated `memory` check with loaded memory metadata and `unloaded_memory_files[]` warnings for present `CLAW.md`/`AGENTS.md` candidates that were skipped (for example empty or duplicate-content variants). Docs in `USAGE.md` and `rust/README.md` describe the priority and JSON contracts. Regression coverage: `discovers_claude_claw_agents_and_dot_claude_instruction_files_together`, `memory_files_load_claude_claw_agents_and_surface_json_438`, and `memory_health_surfaces_loaded_and_unloaded_files_438`.
|
||||
|
||||
|
||||
439. **Memory file discovery walks ALL ancestor directories up to `$HOME` boundary, silently loading any `CLAUDE.md` it finds — `/tmp/CLAUDE.md` left from a previous test silently bleeds into every project under `/tmp/*/`; no `--no-parent-memory` flag, no `.no-claude-md-boundary` marker file to limit discovery scope** — dogfooded 2026-05-11 by Jobdori on `f4a96740` in response to Clawhip pinpoint nudge at `1503335892461293675`. Reproduction: create three nested `CLAUDE.md` files with unique markers — `/tmp/claw-nested-probe/CLAUDE.md` (`PARENT_CLAUDE`), `subproj/CLAUDE.md` (`CHILD_CLAUDE`), `subproj/deep/CLAUDE.md` (`DEEP_CLAUDE`). Run `claw system-prompt --output-format json` from `subproj/deep/nest/` (note: `nest` has no `CLAUDE.md`). The `message` field contains **all three markers** (PARENT + CHILD + DEEP) and `status --output-format json` reports `memory_file_count: 3`. Boundary tests: (a) `$HOME/CLAUDE.md` is NOT picked up from `/tmp/no-claude-dir` (discovery stops at `$HOME` boundary, good); (b) From `/tmp/deep` (no nested CLAUDE.md), `/tmp/CLAUDE.md` IS picked up (count: 1); (c) git-root is NOT a discovery boundary — running from a git subdir still walks above the git root. **Ambient-context-bleed footgun:** any stale `/tmp/CLAUDE.md` (or `/home/<user>/projects/CLAUDE.md`, or any ancestor-path CLAUDE.md left over from a previous experiment, copy-paste, or AI-generated example) silently bleeds into every workspace nested below it. The user has no signal in `status --output-format json` indicating which ancestor file is contributing — only the aggregate `memory_file_count`. **Three required fixes:** (a) **expose discovery list**: `status --output-format json` and `system-prompt --output-format json` must include `memory_files:[{path, source:"workspace"|"ancestor"|"parent_dir"|"home", chars, contributes:bool}]` so users can see what's leaking in; (b) **add `--no-parent-memory` flag** to limit discovery to cwd only (no ancestor walk), or add a boundary marker (`.claude-no-walk`, `.claw-root`, or honor `.git` as the boundary by default — most users expect repo-root scope); (c) **`doctor` warns** when ancestor `CLAUDE.md` files are loaded from outside the current git repo (suggests they may be unintentional). **Sibling discovery scope question:** discovery walks up to `$HOME` — but for a user with a project at `/Users/foo/work/proj`, that's `/Users/foo/work/CLAUDE.md` + `/Users/foo/CLAUDE.md` (if it exists) both load. The home boundary is exclusive, but the entire `/Users/foo` tree under home is in scope. **Why this matters:** test workspaces, scratch dirs, AI-generated example projects, and shared `/tmp` workdirs are full of stale `CLAUDE.md` files. The current discovery rule means every claw invocation can silently inherit context from arbitrary ancestor paths. Cross-references #438 (memory discovery only finds CLAUDE.md, not AGENTS.md or CLAW.md), #421 (cwd canonicalization leak — the canonicalized form determines which ancestor walk path is used). Source: Jobdori live dogfood, `f4a96740`, 2026-05-11.
|
||||
439. **DONE — memory discovery is git-root bounded and reports memory origins** — fixed 2026-06-04 in `fix: bound parent memory discovery`. Project memory discovery now walks only from the current directory up to the nearest git root when one exists, and otherwise stays cwd-local, so stale parent `CLAUDE.md` files outside the project no longer bleed into scratch workspaces. Loaded memory JSON in both `status --output-format json` and `system-prompt --output-format json` now includes `origin`, `scope_path`, and `outside_project` alongside `path`, `source`, `chars`, and `contributes`; origins distinguish `workspace`, `parent_dir`, `ancestor`, `home`, and `outside_project`. The `memory` doctor check warns if an outside-project memory file is ever loaded, while still listing loaded and skipped memory candidates structurally. Docs in `USAGE.md` and `rust/README.md` describe the git-root boundary and expanded JSON fields. Regression coverage: `discovery_stops_at_git_root_boundary_439`, `discovery_without_git_root_stays_cwd_local_439`, and `memory_discovery_stops_at_git_root_and_reports_origins_439`.
|
||||
|
||||
|
||||
440. **One invalid `mcpServers` entry blocks ALL OTHER valid MCP servers from loading — `mcp list --output-format json` returns `configured_servers: 0, servers: []` when even one server has a missing/invalid `command` field, despite other servers in the same config being well-formed; sibling: config parser halts on first invalid entry, never reports the remaining invalid entries** — dogfooded 2026-05-11 by Jobdori on `bd126905` in response to Clawhip pinpoint nudge at `1503343442904879156`. Reproduction: write `.claw.json` containing six `mcpServers` entries — one valid (`valid-server: {command:"/bin/echo", args:["hello"]}`) and five with progressive defects (missing-command, empty-command, null-command, wrong-type-command, extra-unknown-field). Run `claw mcp list --output-format json` → `{"action":"list","config_load_error":"/private/tmp/claw-mcp-probe/.claw.json: mcpServers.missing-command-server: missing string field command","configured_servers":0,"kind":"mcp","servers":[],"status":"degraded"}`. The error mentions only `missing-command-server` (the first invalid entry in JSON-object iteration order); the other four invalid entries are never surfaced. The valid `valid-server` entry is silently dropped because the parser bails on the first error. `status --output-format json` correctly propagates the same `config_load_error` and sets `status:"degraded"`, but no field tells automation which servers are valid vs broken — `servers:[]` is the only signal. **Three problems compounded:** (a) **all-or-nothing loading**: ROADMAP product principle #5 says "partial success is first-class," but mcp config loading is binary. One bad server kills the entire MCP plane; (b) **first-error-only reporting**: a `.claw.json` with five invalid entries surfaces only one error message — the user fixes that one and runs again, gets the next error, and so on. Five iterations needed to discover all errors; (c) **no per-server status**: even with the partial-success fix, the JSON envelope needs `servers:[{name, valid:bool, error?, command?, args?}]` so automation can see which entries are usable. **Required fix shape:** (a) the MCP config parser must collect ALL invalid entries into an `invalid_servers:[{name, error_field, reason}]` array and load all valid ones into `servers:[]`; do not abort on first error; (b) `configured_servers` reflects the count of *valid* loaded servers (not zero) when there are valid entries alongside invalid ones; (c) expose `total_configured:int` (count of entries in source `.claw.json`) AND `valid_count:int` (loaded), AND `invalid_count:int` (rejected) — three distinct counts; (d) `doctor --output-format json` adds an `mcp_validation` check that lists each invalid entry with its error message; (e) regression test: `.claw.json` with one valid + one invalid entry results in `configured_servers: 1, invalid_servers: [{name:"...", reason:"..."}]`. **Why this matters:** users iterate on MCP server lists during onboarding — one typo kills the entire plane, including servers they got working previously. The first-error-only reporting forces N iterations through N invalid entries instead of a single fix-everything-at-once pass. Cross-references #407 (config files no load_error per-file), #415 (config section merged_keys count only), #416 (plugins list prose), #428 (default permission mode), and Product Principle #5. Source: Jobdori live dogfood, `bd126905`, 2026-05-11.
|
||||
440. **DONE — invalid `mcpServers` siblings no longer drop valid MCP servers** — fixed 2026-06-04 in `fix: load partial MCP configs`. MCP config loading now records every invalid server entry as `invalid_servers:[{name, scope, path, error_field, reason, valid:false}]` while retaining valid siblings in `servers[]`; valid entries carry `valid:true`, `configured_servers` and `valid_count` report loaded valid servers, `invalid_count` reports rejected entries, and `total_configured` reports all discovered entries. `status --output-format json` mirrors the `mcp_validation` summary, and `doctor --output-format json` includes an `mcp validation` check for one-pass repair. Empty stdio commands and unknown per-transport fields are per-server validation errors instead of global config failures. Regression coverage: `loads_valid_mcp_servers_and_collects_all_invalid_siblings_440`, `records_invalid_mcp_server_shapes_without_rejecting_config_440`, `mcp_loads_valid_servers_and_reports_invalid_siblings_440`, and `mcp_degraded_config_and_failed_usage_are_distinct_json_contracts`.
|
||||
|
||||
|
||||
441. **`hooks` config schema diverges from Claude Code documented format — claw-code expects `{"hooks":{"PreToolUse":["command-string"]}}` (array of command strings) while Claude Code documentation specifies `{"hooks":{"PreToolUse":[{"matcher":"Read","hooks":[{"type":"command","command":"..."}]}]}}` (structured matcher objects); users copy-pasting from Claude Code docs see `field "hooks.PreToolUse" must be an array of strings`** — dogfooded 2026-05-11 by Jobdori on `86ff83c2` in response to Clawhip pinpoint nudge at `1503350990680887418`. Reproduction: write `.claw.json` with the Claude-Code-documented hook format `{"hooks":{"PreToolUse":[{"matcher":"Read","hooks":[{"type":"command","command":"/bin/echo pretool"}]}]}}`. Run `claw status --output-format json` → `config_load_error: "/private/tmp/claw-hook-probe/.claw.json: field \"hooks.PreToolUse\" must be an array of strings, got an array (line 3)"`, `status: "degraded"`. The error wording ("must be an array of strings, got an array") is confusingly tautological — the user did provide an array; the parser objects that the array contains objects instead of strings. Replacing with the claw-code-actual format `{"hooks":{"PreToolUse":["/bin/echo pretool"]}}` succeeds: `config_load_error: null, status: "ok"`. The two formats are fundamentally incompatible: claw-code drops the `matcher` field (no tool-specific filtering at the config layer), drops the `type:"command"` discriminator (no future expansion to other hook types), and treats each entry as a bare command string instead of a structured hook spec. **Sibling: PR #3000 (justcode049) was attempting to tolerate object-style hook entries** — that PR's title `fix: tolerate object-style hook entries in config parser` confirms this is a known user complaint, but the PR is still conflicting and unmerged. **Three sibling findings in same probe:** (a) **unknown event names reject entire hooks config**: `.claw.json` with `hooks.InvalidEvent` (not a real event name like `PreToolUse`/`PostToolUse`/`Stop`/`Notification`) triggers `config_load_error: "unknown key \"hooks.InvalidEvent\""` and rejects ALL hooks in the same file, even valid ones — same "one bad apple kills all" pattern as #440 (MCP servers). (b) **`kind:"unknown"` for the validation error** — should be `kind:"invalid_hooks_config"` or `kind:"unknown_hook_event"` (catch-all cluster #422/#423/#424/#428/#430/#431/#432/#433/#435 — 13th occurrence). (c) **first-error-only halting**: a `.claw.json` with `hooks.Stop:"not-an-array"` (type mismatch) AND `hooks.InvalidEvent` (unknown name) AND `hooks.Notification:[{}]` (empty entry) surfaces only the FIRST error in iteration order — user must fix one at a time across 3 iterations. **Required fix shape:** (a) **adopt Claude Code's structured hook format as the canonical**: support `{matcher, hooks:[{type, command}]}` natively, with `matcher` for tool-filtering, `type` for hook-type discriminator (future-proof for `inline`/`webhook`/etc beyond just `command`); (b) **keep backward compat for bare command strings**: legacy `["command-string"]` arrays still load, but emit a deprecation warning suggesting migration to the structured form; (c) **partial-success loading**: invalid hook entries surface in `invalid_hooks:[{event, index, reason}]` while valid ones load — same fix as #440 for MCP; (d) **typed `kind:"invalid_hooks_config"` envelope** instead of `kind:"unknown"`; (e) **rebase and merge PR #3000** which addresses this directly; (f) regression test: Claude-Code-documented hook config loads without error on claw-code. **Why this matters:** users migrating from Claude Code to Claw Code hit this on their first `.claw.json` write. The error message ("array of strings, got an array") is unhelpful; the documentation doesn't surface the schema divergence; and Claude Code's structured format is strictly more expressive (matchers, types) than claw-code's bare-string format. Cross-references #407 (config files no load_error), #410 (list-envelope schema drift), #428 (default permission mode), #440 (one invalid MCP entry blocks all), PR #3000 (justcode049's pending fix). Source: Jobdori live dogfood, `86ff83c2`, 2026-05-11.
|
||||
@@ -7756,7 +7755,7 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
|
||||
|
||||
794. **`claw plugins install /nonexistent/path` returned `error_kind:"unknown"` + `hint:null`** — dogfooded 2026-05-27 on `57a57ef7`. The error message `"plugin source '/path' was not found"` had no classifier arm, falling to `"unknown"`. Fix: added `plugin_source_not_found` classifier arm (`message.contains("plugin source") && message.contains("was not found")`); added `"plugin_source_not_found"` → `"Check that the path or URL is correct..."` to `fallback_hint_for_error_kind`. Unit test assertion added to `test_classify_error_kind`; integration test `plugins_install_not_found_path_returns_typed_kind_794` added. 56 CLI contract tests pass. [SCOPE: claw-code] Source: Jobdori plugins install probe on `57a57ef7`, 2026-05-27.
|
||||
|
||||
795. **`claw skills install /nonexistent` returned `skill_not_found + hint:null` and `claw skills uninstall x` returned `unsupported_skills_action + hint:null`** — dogfooded 2026-05-27 on `491f179a`. Both error kinds were missing from `fallback_hint_for_error_kind` table, so even though classify returned a typed kind, the hint field was always null. Fix: added `"skill_not_found"` → hint suggesting `claw skills list` / `claw skills install`; added `"unsupported_skills_action"` → hint listing supported actions. Integration test `skills_install_not_found_and_unsupported_action_have_hints_795` covers both paths. 57 CLI contract tests pass. [SCOPE: claw-code] Source: Jobdori skills lifecycle probe on `491f179a`, 2026-05-27.
|
||||
795. **`claw skills install /nonexistent` returned `skill_not_found + hint:null` and `claw skills uninstall x` returned `unsupported_skills_action + hint:null`** — dogfooded 2026-05-27 on `491f179a`. Both error kinds were missing from `fallback_hint_for_error_kind` table, so even though classify returned a typed kind, the hint field was always null. Fix: added `skill_not_found` and `unsupported_skills_action` fallback hints. ROADMAP #431 later moved the lifecycle surface fully local: install failures now emit typed `invalid_install_source`, uninstall failures emit local `skill_not_found` with `skills_dir` and `available_names`, and the combined regression is covered by `skills_lifecycle_errors_have_typed_local_json_795_431` plus the install/uninstall roundtrip test. [SCOPE: claw-code] Source: Jobdori skills lifecycle probe on `491f179a`, 2026-05-27.
|
||||
|
||||
796. **`claw agents show <name> <extra>` and `claw skills show <name> <extra>` returned confusing `agent_not_found`/`skill_not_found` for the concatenated "name extra" string** — dogfooded 2026-05-27 on `18b4cee5`. `join_optional_args` passes all tokens as a space-joined string; both `show` handlers called `split_once(' ')` to extract the name but did not check if the remainder (after the first split) contained additional tokens. Extra positional args (including `--flags`) became part of the "name", silently mangling the lookup. Fix: added second `split_once(' ')` on the extracted name; if the result has two parts, return `unexpected_extra_args` with a usage hint. Valid single-name lookups are unaffected. Two new integration tests `agents_show_extra_positional_arg_returns_unexpected_extra_796`, `skills_show_extra_positional_arg_returns_unexpected_extra_796`. 59 CLI contract tests pass. [SCOPE: claw-code] Source: Jobdori agents/skills show extra-arg probe on `18b4cee5`, 2026-05-27.
|
||||
797. **DONE — Installed `claw version --output-format json` reports `git_sha:null` / `Git SHA unknown`, so dogfood cannot tie the binary under test to a source revision** — dogfooded 2026-05-27 from `#clawcode-building-in-public` using an installed binary in a clean `ultraworkers/claw-code` checkout. The gap was that version/status/doctor did not provide a structured executable-vs-workspace provenance object when build metadata was missing or stale. [SCOPE: claw-code]
|
||||
|
||||
@@ -51,26 +51,27 @@ cd rust
|
||||
```
|
||||
|
||||
**Note:** Diagnostic verbs (`doctor`, `status`, `sandbox`, `version`) support `--output-format json` for machine-readable output. Invalid suffix arguments (e.g., `--json`) are now rejected at parse time rather than falling through to prompt dispatch.
|
||||
`version --output-format json` reports structured build provenance including full `git_sha`, derived `git_sha_short`, `is_dirty`, `branch`, `commit_date`, `commit_timestamp`, `rustc_version`, runtime `executable_path`, and `binary_provenance`; JSON keeps the prose report in `human_readable` instead of duplicating it under `message`. `status --output-format json` exposes `workspace.memory_files[]` with `path`, `source`, `origin`, `scope_path`, `outside_project`, `chars`, and `contributes` for every loaded project memory file.
|
||||
|
||||
### Initialize a repository
|
||||
|
||||
Set up a new repository with `.claw` config, `.claw.json`, `.gitignore` entries, and a `CLAUDE.md` guidance file:
|
||||
Set up a new repository with `.claw/settings.json`, `.claw.json`, `.gitignore` entries, and a `CLAUDE.md` guidance file:
|
||||
|
||||
```bash
|
||||
cd /path/to/your/repo
|
||||
./target/debug/claw init
|
||||
```
|
||||
|
||||
Text mode (human-readable) shows artifact creation summary with project path and next steps. Idempotent — running multiple times in the same repo marks already-created files as "skipped".
|
||||
Text mode (human-readable) shows artifact creation summary with project path and next steps. Idempotent — running multiple times in the same repo marks already-created files as "skipped", reports `.claw/` as "partial" when missing sub-files are materialized, and keeps `.claw/sessions/` deferred until the first successful session save.
|
||||
|
||||
JSON mode for scripting:
|
||||
```bash
|
||||
./target/debug/claw init --output-format json
|
||||
```
|
||||
|
||||
Returns structured output with `project_path`, `created[]`, `updated[]`, `skipped[]` arrays (one per artifact), and `artifacts[]` carrying each file's `name` and machine-stable `status` tag. The legacy `message` field preserves backward compatibility.
|
||||
Returns structured output with `project_path`, `created[]`, `updated[]`, `partial[]`, `deferred[]`, and `skipped[]` arrays (one per artifact status), and `artifacts[]` carrying each file's `name` and machine-stable `status` tag. The legacy `message` field preserves backward compatibility.
|
||||
|
||||
**Why structured fields matter:** Claws can detect per-artifact state (`created` vs `updated` vs `skipped`) without substring-matching human prose. Use the `created[]`, `updated[]`, and `skipped[]` arrays for conditional follow-up logic (e.g., only commit if files were actually created, not just updated).
|
||||
**Why structured fields matter:** Claws can detect per-artifact state (`created`, `updated`, `partial`, `deferred`, or `skipped`) without substring-matching human prose. Use the status arrays for conditional follow-up logic (e.g., only commit if files were actually created, not just updated).
|
||||
|
||||
### Interactive REPL
|
||||
|
||||
@@ -86,6 +87,12 @@ cd rust
|
||||
./target/debug/claw prompt "summarize this repository"
|
||||
```
|
||||
|
||||
Pipe prompt text through stdin when automation already produces the prompt body:
|
||||
|
||||
```bash
|
||||
printf 'summarize this repository\n' | ./target/debug/claw prompt --output-format json
|
||||
```
|
||||
|
||||
### Shorthand prompt mode
|
||||
|
||||
```bash
|
||||
@@ -93,6 +100,12 @@ cd rust
|
||||
./target/debug/claw "explain rust/crates/runtime/src/lib.rs"
|
||||
```
|
||||
|
||||
Use the POSIX `--` end-of-flags separator when the shorthand prompt itself begins with `-` or `--`:
|
||||
|
||||
```bash
|
||||
./target/debug/claw -- "-summarize this dash-prefixed text"
|
||||
```
|
||||
|
||||
### JSON output for scripting
|
||||
|
||||
```bash
|
||||
@@ -187,17 +200,24 @@ cd rust
|
||||
./target/debug/claw --permission-mode read-only prompt "summarize Cargo.toml"
|
||||
./target/debug/claw --permission-mode workspace-write prompt "update README.md"
|
||||
./target/debug/claw --allowedTools read,glob "inspect the runtime crate"
|
||||
./target/debug/claw --cwd ../other-workspace status --output-format json
|
||||
```
|
||||
|
||||
Supported permission modes:
|
||||
Global workspace override flags: `--cwd PATH`, `-C PATH`, and `--directory PATH` are accepted before any subcommand. They are validated before command dispatch and take precedence over the process `$PWD`; invalid paths return typed `invalid_cwd` JSON errors in JSON mode.
|
||||
|
||||
- `read-only`
|
||||
- `workspace-write`
|
||||
- `danger-full-access`
|
||||
`--allowedTools` accepts canonical snake_case tool names (for example `read_file`, `glob_search`, `web_fetch`) plus documented aliases such as `read`, `glob`, `Read`, and `WebFetch`. `claw status --output-format json` exposes `allowed_tools.available` and `allowed_tools.aliases`, and invalid values return typed `invalid_tool_name` JSON with `tool_name`, `available`, and `tool_aliases`. A missing value before a subcommand or another flag returns `missing_argument` with `argument:"--allowedTools"`.
|
||||
|
||||
`--output-format` accepts `text` or `json` case-insensitively and normalizes to the canonical lowercase modes. `CLAW_OUTPUT_FORMAT=json` sets the default output format for scripts, while an explicit `--output-format` flag takes precedence. Repeating the flag emits a stderr warning and JSON status envelopes expose `format_source`, `format_raw`, and `format_overridden` so composed flag arrays are auditable; invalid values return typed `invalid_output_format` JSON with `value` and `expected:["text","json"]`.
|
||||
|
||||
Supported permission modes (default: `workspace-write`):
|
||||
|
||||
- `read-only` allows inspection-only local tools such as file reads, glob/grep searches, local skills, and status-style reporting. It does not allow workspace mutation, network-fetch/search tools, or arbitrary command execution.
|
||||
- `workspace-write` is the safe default. It allows reads plus direct file-editing tools inside the current workspace, including write/edit/notebook/config/plan-mode updates, while still gating network-fetch/search tools, arbitrary shell execution, subagent launches, REPL subprocesses, and other full-access tools behind an explicit escalation.
|
||||
- `danger-full-access` allows every registered tool requirement, including arbitrary command execution, web fetch/search, subagent launches, subprocess REPLs, and unrestricted tool access. Select it only with an explicit `--permission-mode danger-full-access`, `--dangerously-skip-permissions`, `--skip-permissions`, env, or config opt-in.
|
||||
|
||||
Model aliases currently supported by the CLI:
|
||||
|
||||
- `opus` → `claude-opus-4-6`
|
||||
- `opus` → `claude-opus-4-7`
|
||||
- `sonnet` → `claude-sonnet-4-6`
|
||||
- `haiku` → `claude-haiku-4-5-20251213`
|
||||
|
||||
@@ -292,6 +312,18 @@ cd rust
|
||||
./target/debug/claw --model "llama3.2" prompt "summarize this repository in one sentence"
|
||||
```
|
||||
|
||||
For Ollama tags with punctuation (for example `qwen2.5-coder:7b`), `OPENAI_BASE_URL` selects the local OpenAI-compatible route even when `OPENAI_API_KEY` is unset:
|
||||
|
||||
```bash
|
||||
export OPENAI_BASE_URL="http://127.0.0.1:11434/v1"
|
||||
unset OPENAI_API_KEY
|
||||
|
||||
cd rust
|
||||
./target/debug/claw --model "qwen2.5-coder:7b" prompt "reply with ready"
|
||||
```
|
||||
|
||||
If the local server exposes a slash-containing model ID, prefix it with `local/` so Claw selects the OpenAI-compatible transport while sending the remainder verbatim on the wire: `--model "local/Qwen/Qwen3.6-27B-FP8"`.
|
||||
|
||||
### OpenRouter
|
||||
|
||||
```bash
|
||||
@@ -334,7 +366,7 @@ Reasoning variants (`qwen-qwq-*`, `qwq-*`, `*-thinking`) automatically strip `te
|
||||
|
||||
The OpenAI-compatible backend also serves as the gateway for **OpenRouter**, **Ollama**, and any other service that speaks the OpenAI `/v1/chat/completions` wire format — just point `OPENAI_BASE_URL` at the service.
|
||||
|
||||
**Model-name prefix routing:** If a model name starts with `openai/`, `gpt-`, `qwen/`, `qwen-`, `kimi/`, or `kimi-`, the provider is selected by the prefix regardless of which env vars are set. This prevents accidental misrouting to Anthropic when multiple credentials exist in the environment. For the default OpenAI API, `openai/` is a routing prefix and is stripped before the request hits the wire. For a custom `OPENAI_BASE_URL`, slash-containing OpenAI-compatible slugs (for example OpenRouter-style `openai/gpt-4.1-mini`) are preserved so the gateway receives the model ID it expects.
|
||||
**Model-name prefix routing:** If a model name starts with `openai/`, `local/`, `gpt-`, `qwen/`, `qwen-`, `kimi/`, or `kimi-`, the provider is selected by the prefix regardless of which env vars are set. This prevents accidental misrouting to Anthropic when multiple credentials exist in the environment. For the default OpenAI API and local/private OpenAI-compatible endpoints, `openai/` is a routing prefix and is stripped before the request hits the wire. For non-local custom `OPENAI_BASE_URL` gateways, slash-containing OpenAI-compatible slugs (for example OpenRouter-style `openai/gpt-4.1-mini`) are preserved so the gateway receives the model ID it expects. The `local/` prefix is an explicit escape hatch for local slash-containing model IDs: it is stripped while the rest of the model ID is sent verbatim.
|
||||
|
||||
### Tested models and aliases
|
||||
|
||||
@@ -342,7 +374,7 @@ These are the models registered in the built-in alias table with known token lim
|
||||
|
||||
| Alias | Resolved model name | Provider | Max output tokens | Context window |
|
||||
|---|---|---|---|---|
|
||||
| `opus` | `claude-opus-4-6` | Anthropic | 32 000 | 200 000 |
|
||||
| `opus` | `claude-opus-4-7` | Anthropic | 32 000 | 200 000 |
|
||||
| `sonnet` | `claude-sonnet-4-6` | Anthropic | 64 000 | 200 000 |
|
||||
| `haiku` | `claude-haiku-4-5-20251213` | Anthropic | 64 000 | 200 000 |
|
||||
| `grok` / `grok-3` | `grok-3` | xAI | 64 000 | 131 072 |
|
||||
@@ -354,7 +386,7 @@ These are the models registered in the built-in alias table with known token lim
|
||||
| `gpt-4.1` / `gpt-4.1-mini` / `gpt-4.1-nano` | same | OpenAI-compatible | 32 768 | 1 047 576 |
|
||||
| `gpt-5.4` / `gpt-5.4-mini` / `gpt-5.4-nano` | same | OpenAI-compatible | 128 000 | 1 000 000 / 400 000 |
|
||||
|
||||
Any model name that does not match an alias is passed through verbatim after provider routing is resolved. This is how you use OpenRouter model slugs (`openai/gpt-4.1-mini` with a custom `OPENAI_BASE_URL`), Ollama tags (`llama3.2`), or full Anthropic model IDs (`claude-sonnet-4-20250514`).
|
||||
Any model name that does not match an alias is passed through verbatim after provider routing is resolved. This is how you use OpenRouter model slugs (`openai/gpt-4.1-mini` with a custom `OPENAI_BASE_URL`), Ollama tags (`llama3.2` or `qwen2.5-coder:7b`), slash-containing local IDs (`local/Qwen/Qwen3.6-27B-FP8`), or full Anthropic model IDs (`claude-sonnet-4-20250514`).
|
||||
|
||||
### User-defined aliases
|
||||
|
||||
@@ -364,7 +396,7 @@ You can add custom aliases in any settings file (`~/.claw/settings.json`, `.claw
|
||||
{
|
||||
"aliases": {
|
||||
"fast": "claude-haiku-4-5-20251213",
|
||||
"smart": "claude-opus-4-6",
|
||||
"smart": "claude-opus-4-7",
|
||||
"cheap": "grok-3-mini"
|
||||
}
|
||||
}
|
||||
@@ -372,13 +404,15 @@ You can add custom aliases in any settings file (`~/.claw/settings.json`, `.claw
|
||||
|
||||
Local project settings override user-level settings. Aliases resolve through the built-in table, so `"fast": "haiku"` also works.
|
||||
|
||||
Model selection precedence is CLI flag, environment, config, then default. The environment model slot accepts `CLAW_MODEL`, `ANTHROPIC_MODEL`, and `ANTHROPIC_DEFAULT_MODEL` in that order; aliases from those variables are resolved and validated before provider startup. `claw --output-format json status` exposes `model_raw`, `model_alias_resolved_to`, and `model_env_var` so automation can see the winning value.
|
||||
|
||||
### How provider detection works
|
||||
|
||||
1. If the resolved model name starts with `claude` → Anthropic.
|
||||
2. If it starts with `grok` → xAI.
|
||||
3. If it starts with `openai/` or `gpt-` → OpenAI-compatible.
|
||||
3. If it starts with `openai/`, `local/`, or `gpt-` → OpenAI-compatible.
|
||||
4. If it starts with `qwen/`, `qwen-`, `kimi/`, or `kimi-` → DashScope-compatible OpenAI wire format.
|
||||
5. If `OPENAI_BASE_URL` and `OPENAI_API_KEY` are set, unknown model names route to the OpenAI-compatible client for local/gateway servers.
|
||||
5. If `OPENAI_BASE_URL` is set, local-looking unknown model names such as `llama3.2` or `qwen2.5-coder:7b` route to the OpenAI-compatible client for local/gateway servers.
|
||||
6. Otherwise, `claw` checks which credential is set: Anthropic first, then OpenAI, then xAI. If only `OPENAI_BASE_URL` is set, it still routes to OpenAI-compatible for authless local servers.
|
||||
7. If nothing matches, it defaults to Anthropic.
|
||||
|
||||
@@ -419,6 +453,9 @@ The name "codex" appears in the Claw Code ecosystem but it does **not** refer to
|
||||
export HTTPS_PROXY="http://proxy.corp.example:3128"
|
||||
export HTTP_PROXY="http://proxy.corp.example:3128"
|
||||
export NO_PROXY="localhost,127.0.0.1,.corp.example"
|
||||
export CLAW_OUTPUT_FORMAT="json" # default non-interactive output format; flags override it
|
||||
export CLAW_LOG="debug" # claw-specific log level selector surfaced by help/doctor
|
||||
export RUST_LOG="claw=debug" # Rust logging convention surfaced by help/doctor
|
||||
|
||||
cd rust
|
||||
./target/debug/claw prompt "hello via the corporate proxy"
|
||||
@@ -454,11 +491,12 @@ let client = build_http_client_with(&config).expect("proxy client");
|
||||
|
||||
## Skills
|
||||
|
||||
Use `/skills list` in the interactive REPL or `claw skills --output-format json` from the direct CLI to inspect installed skills. For offline/local installs, install the directory that contains `SKILL.md`, then verify the discovered name before invoking it:
|
||||
Use `/skills list` in the interactive REPL or `claw skills --output-format json` from the direct CLI to inspect installed skills. For offline/local installs, install the directory that contains `SKILL.md`, then verify the discovered name before invoking it. `skills install`, `skills uninstall`, and `agents create` are local filesystem lifecycle commands; they do not require provider credentials.
|
||||
|
||||
```text
|
||||
/skills install /absolute/path/to/my-skill
|
||||
/skills list
|
||||
/skills uninstall my-skill
|
||||
/skills my-skill
|
||||
```
|
||||
|
||||
@@ -471,6 +509,7 @@ cd rust
|
||||
./target/debug/claw status
|
||||
./target/debug/claw sandbox
|
||||
./target/debug/claw agents
|
||||
./target/debug/claw agents create my-agent
|
||||
./target/debug/claw mcp
|
||||
./target/debug/claw skills
|
||||
./target/debug/claw system-prompt --cwd .. --date 2026-04-04
|
||||
@@ -490,6 +529,7 @@ git clone https://github.com/Xquik-dev/tweetclaw
|
||||
cd claw-code/rust
|
||||
./target/debug/claw skills install ../../tweetclaw/skills/tweetclaw
|
||||
./target/debug/claw skills show tweetclaw
|
||||
./target/debug/claw skills uninstall tweetclaw
|
||||
```
|
||||
|
||||
TweetClaw gives `claw` users a local skill guide for OpenClaw/Xquik workflows
|
||||
@@ -497,6 +537,15 @@ such as tweet search, reply search, follower export, monitors, webhooks, and
|
||||
approval-gated posting. Configure any Xquik credentials outside the prompt and
|
||||
avoid pasting API keys into chat.
|
||||
|
||||
## Author a local agent
|
||||
|
||||
`claw agents create <name>` scaffolds a local `.claw/agents/<name>.toml` file for the current workspace. The scaffold is intentionally small so you can edit the description, model, and reasoning effort before listing or invoking agents:
|
||||
|
||||
```bash
|
||||
./target/debug/claw agents create release-checker
|
||||
./target/debug/claw agents list
|
||||
```
|
||||
|
||||
## Session management
|
||||
|
||||
REPL turns are persisted under `.claw/sessions/` in the current workspace.
|
||||
@@ -519,13 +568,63 @@ Runtime config is loaded in this order, with later entries overriding earlier on
|
||||
4. `<repo>/.claw/settings.json`
|
||||
5. `<repo>/.claw/settings.local.json`
|
||||
|
||||
The list is also the precedence chain: project-local settings override project settings, project settings override the legacy project `.claw.json`, and project files override user files. `claw --output-format json config` includes each discovered file's `precedence_rank`, `wins_for_keys`, and `shadowed_keys` so automation can see which file controls each effective key without reimplementing the merge order.
|
||||
|
||||
## MCP server validation
|
||||
|
||||
`claw mcp --output-format json` loads valid `mcpServers` entries even when sibling entries are malformed. The JSON list envelope distinguishes the total configured entries from the valid and invalid subsets:
|
||||
|
||||
```json
|
||||
{
|
||||
"configured_servers": 1,
|
||||
"total_configured": 2,
|
||||
"valid_count": 1,
|
||||
"invalid_count": 1,
|
||||
"servers": [{ "name": "valid-server", "valid": true }],
|
||||
"invalid_servers": [
|
||||
{
|
||||
"name": "missing-command",
|
||||
"error_field": "command",
|
||||
"reason": ".claw.json: mcpServers.missing-command: missing string field command",
|
||||
"valid": false
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
`status --output-format json` mirrors this under `mcp_validation`, and `doctor --output-format json` includes an `mcp validation` check so automation can repair every rejected server entry without losing usable MCP servers.
|
||||
|
||||
## Hook configuration
|
||||
|
||||
`hooks.PreToolUse`, `hooks.PostToolUse`, and `hooks.PostToolUseFailure` accept either legacy command strings or object-style entries with a `matcher` and nested command hooks:
|
||||
|
||||
```json
|
||||
{
|
||||
"hooks": {
|
||||
"PreToolUse": [
|
||||
"echo legacy hook",
|
||||
{
|
||||
"matcher": "Bash",
|
||||
"hooks": [
|
||||
{ "type": "command", "command": "scripts/audit-bash.sh" }
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Object-style matchers are optional. When present, they match tool names case-insensitively and support `*` wildcards plus comma or pipe separated alternatives. Nested hook `type` may be omitted or set to `"command"`; each nested command runs in configuration order.
|
||||
|
||||
## Project instruction rules
|
||||
|
||||
In addition to root instruction files such as `CLAUDE.md`, `AGENTS.md`, `.claw/CLAUDE.md`, `.claude/CLAUDE.md`, and `.claw/instructions.md`, `claw` loads sorted Markdown/text rule files from:
|
||||
In addition to root instruction files such as `CLAUDE.md`, `CLAW.md`, `AGENTS.md`, `.claw/CLAUDE.md`, `.claude/CLAUDE.md`, and `.claw/instructions.md`, `claw` loads sorted Markdown/text rule files from:
|
||||
|
||||
- `<repo>/.claw/rules/` (`.md`, `.txt`, `.mdc`) for shared project rules.
|
||||
- `<repo>/.claw/rules.local/` for personal local rules; this path is gitignored.
|
||||
|
||||
Root instruction-file priority is `CLAUDE.md`, then `CLAW.md`, then `AGENTS.md` for each discovered directory. Discovery is bounded to the current git root when one exists, otherwise to the current directory only, so stale parent files outside the project do not silently bleed into the prompt. All loaded files contribute to the system prompt and to `status --output-format json` as `workspace.memory_files:[{path, source, origin, scope_path, outside_project, chars, contributes}]`; `claw doctor --output-format json` includes a `memory` check so automation can detect loaded and unexpected unloaded memory-file candidates without parsing prompt text.
|
||||
|
||||
By default, `claw` also imports detected rules from common AI coding tools such as Cursor (`.cursorrules`, `.cursor/rules/`), GitHub Copilot (`.github/copilot-instructions.md`), Windsurf, Plandex, and Crush. Control this with `rulesImport` in any settings file:
|
||||
|
||||
```json
|
||||
|
||||
@@ -148,12 +148,12 @@ pub const DEFAULT_DASHSCOPE_BASE_URL: &str = "https://dashscope.aliyuncs.com/com
|
||||
**Affected models:** Slash-containing model IDs routed through the OpenAI-compatible provider, especially custom gateways configured with `OPENAI_BASE_URL` such as OpenRouter, local routers, or other `/v1/chat/completions` services.
|
||||
|
||||
**Behavior:**
|
||||
- The default OpenAI API treats `openai/` as a routing prefix and sends the bare model name on the wire.
|
||||
- Custom OpenAI-compatible base URLs preserve slash-containing slugs such as `openai/gpt-4.1-mini` so the gateway receives the exact model ID it expects.
|
||||
- The default OpenAI API and local/private OpenAI-compatible base URLs treat `openai/` as a routing prefix and send the bare model name on the wire.
|
||||
- Non-local custom OpenAI-compatible base URLs preserve slash-containing slugs such as `openai/gpt-4.1-mini` so gateways like OpenRouter receive the exact model ID they expect. Local slash-containing model IDs can use `local/`, which strips only that escape-hatch prefix and sends the remainder verbatim.
|
||||
- `MessageRequest::extra_body` passes through custom request JSON after core fields are populated. This supports provider-specific options such as `web_search_options` and `parallel_tool_calls`.
|
||||
- Protected core fields (`model`, `messages`, `stream`, `tools`, `tool_choice`, `max_tokens`, `max_completion_tokens`) cannot be overridden through `extra_body`.
|
||||
|
||||
**Testing:** See `custom_openai_gateway_preserves_slash_model_ids_and_extra_body_params` in `openai_compat_integration.rs` and `extra_body_params_are_passed_through_without_overriding_core_fields` in `openai_compat.rs`.
|
||||
**Testing:** See `custom_openai_gateway_preserves_slash_model_ids_and_extra_body_params` in `openai_compat_integration.rs`, `wire_model_strips_openai_prefix_for_default_and_local_preserves_custom_gateways`, `local_routing_prefix_strips_only_escape_hatch`, and `extra_body_params_are_passed_through_without_overriding_core_fields` in `openai_compat.rs`.
|
||||
|
||||
## Implementation Details
|
||||
|
||||
|
||||
@@ -13,7 +13,7 @@ If you need the most polished daily-driver experience for a specific non-Claude
|
||||
|
||||
## OpenAI-compatible routing basics
|
||||
|
||||
Set `OPENAI_BASE_URL` to the server’s `/v1` endpoint and set `OPENAI_API_KEY` to either the required token or a harmless placeholder for local servers that expect an Authorization header. The model name must match what the server exposes.
|
||||
Set `OPENAI_BASE_URL` to the server’s `/v1` endpoint and set `OPENAI_API_KEY` to either the required token or a harmless placeholder for local servers that expect an Authorization header. Authless local/private OpenAI-compatible servers can leave `OPENAI_API_KEY` unset. The model name must match what the server exposes.
|
||||
|
||||
```bash
|
||||
export OPENAI_BASE_URL="http://127.0.0.1:11434/v1"
|
||||
@@ -24,8 +24,8 @@ claw --model "qwen3:latest" prompt "Reply exactly HELLO_WORLD_123"
|
||||
Routing notes:
|
||||
|
||||
- Use the `openai/` prefix for OpenAI-compatible gateways when you need prefix routing to win over ambient Anthropic credentials, for example `--model "openai/gpt-4.1-mini"` with OpenRouter.
|
||||
- For local servers, prefer the exact model ID reported by the server (`qwen3:latest`, `llama3.2`, `Qwen/Qwen2.5-Coder-7B-Instruct`, etc.). If your local gateway exposes slash-containing IDs, use that exact slug.
|
||||
- If you have multiple provider keys in your environment, remove unrelated keys while smoke-testing a local route or choose a model prefix that unambiguously selects the intended provider.
|
||||
- For local servers, prefer the exact model ID reported by the server (`qwen3:latest`, `llama3.2`, etc.). If your local gateway exposes slash-containing IDs, prefix the exact slug with `local/` so Claw routes through OpenAI-compatible transport while sending the rest verbatim, for example `--model "local/Qwen/Qwen2.5-Coder-7B-Instruct"`.
|
||||
- If you have multiple provider keys in your environment, `OPENAI_BASE_URL` plus local-looking tags such as `llama3.2` or `qwen2.5-coder:7b` selects the local OpenAI-compatible route; use `local/` for slash-containing local IDs.
|
||||
- Tool workflows need model/server support for OpenAI-compatible tool calls. Plain prompt smoke tests can pass even when slash/tool workflows still fail because the server returns an incompatible tool-call shape.
|
||||
|
||||
## Raw `/v1/chat/completions` smoke test
|
||||
@@ -58,11 +58,11 @@ In another shell:
|
||||
|
||||
```bash
|
||||
export OPENAI_BASE_URL="http://127.0.0.1:11434/v1"
|
||||
export OPENAI_API_KEY="local-dev-token"
|
||||
unset OPENAI_API_KEY
|
||||
claw --model "qwen3:latest" prompt "Reply exactly HELLO_WORLD_123"
|
||||
```
|
||||
|
||||
If Ollama is running without auth and your build accepts authless local OpenAI-compatible servers, `unset OPENAI_API_KEY` is also acceptable. Use a placeholder token rather than a real cloud API key for local testing.
|
||||
If Ollama is running without auth, `unset OPENAI_API_KEY` is acceptable. Use a placeholder token rather than a real cloud API key if your local server requires an Authorization header.
|
||||
|
||||
## llama.cpp server
|
||||
|
||||
|
||||
Generated
-1
@@ -2244,7 +2244,6 @@ version = "0.1.3"
|
||||
dependencies = [
|
||||
"api",
|
||||
"commands",
|
||||
"compat-harness",
|
||||
"crossterm",
|
||||
"log",
|
||||
"mock-anthropic-service",
|
||||
|
||||
+20
-13
@@ -15,7 +15,7 @@ cargo run -p rusty-claude-cli -- --help
|
||||
cargo build --workspace
|
||||
|
||||
# Run the interactive REPL
|
||||
cargo run -p rusty-claude-cli -- --model claude-opus-4-6
|
||||
cargo run -p rusty-claude-cli -- --model claude-opus-4-7
|
||||
|
||||
# One-shot prompt
|
||||
cargo run -p rusty-claude-cli -- prompt "explain this codebase"
|
||||
@@ -87,7 +87,7 @@ Primary artifacts:
|
||||
| Sub-agent / agent surfaces | ✅ |
|
||||
| Todo tracking | ✅ |
|
||||
| Notebook editing | ✅ |
|
||||
| CLAUDE.md / project memory | ✅ |
|
||||
| CLAUDE.md / CLAW.md / AGENTS.md project memory | ✅ |
|
||||
| Config file hierarchy (`.claw.json` + merged config sections) | ✅ |
|
||||
| Permission system | ✅ |
|
||||
| MCP server lifecycle + inspection | ✅ |
|
||||
@@ -100,7 +100,7 @@ Primary artifacts:
|
||||
| Slash commands (including `/skills`, `/agents`, `/mcp`, `/doctor`, `/plugin`, `/subagent`) | ✅ |
|
||||
| Hooks (`/hooks`, config-backed lifecycle hooks) | ✅ |
|
||||
| Plugin management surfaces | ✅ |
|
||||
| Skills inventory / install surfaces | ✅ |
|
||||
| Skills inventory / install / uninstall surfaces | ✅ |
|
||||
| Machine-readable JSON output across core CLI surfaces | ✅ |
|
||||
|
||||
## Model Aliases
|
||||
@@ -109,7 +109,7 @@ Short names resolve to the latest model versions:
|
||||
|
||||
| Alias | Resolves To |
|
||||
|-------|------------|
|
||||
| `opus` | `claude-opus-4-6` |
|
||||
| `opus` | `claude-opus-4-7` |
|
||||
| `sonnet` | `claude-sonnet-4-6` |
|
||||
| `haiku` | `claude-haiku-4-5-20251213` |
|
||||
|
||||
@@ -122,10 +122,11 @@ claw [OPTIONS] [COMMAND]
|
||||
|
||||
Flags:
|
||||
--model MODEL
|
||||
--output-format text|json
|
||||
--output-format text|json (case-insensitive; CLAW_OUTPUT_FORMAT supplies the default, flags override env)
|
||||
--permission-mode MODE
|
||||
--dangerously-skip-permissions
|
||||
--allowedTools TOOLS
|
||||
--cwd PATH, -C PATH, --directory PATH
|
||||
--dangerously-skip-permissions, --skip-permissions
|
||||
--allowedTools TOOLS canonical snake_case names or aliases; status JSON exposes allowed_tools.available/aliases
|
||||
--resume [SESSION.jsonl|session-id|latest]
|
||||
--version, -V
|
||||
|
||||
@@ -146,6 +147,12 @@ Top-level commands:
|
||||
```
|
||||
|
||||
`claw acp` is a local discoverability surface for editor-first users: it reports the current ACP/Zed status without starting the runtime. As of April 16, 2026, claw-code does **not** ship an ACP/Zed daemon or JSON-RPC entrypoint yet, and `claw acp serve` is only a status alias until the real protocol surface lands. Status queries exit 0 and expose the same machine-readable contract via `--output-format json`; malformed ACP invocations exit 1 with `kind: unsupported_acp_invocation`.
|
||||
`--output-format` accepts `text` or `json` in any casing. `CLAW_OUTPUT_FORMAT=json` selects JSON as the default for non-interactive commands, explicit flags override it, repeated flags warn on stderr, and status JSON exposes `format_source`, `format_raw`, and `format_overridden`. Help and doctor output also surface `CLAW_LOG` / `RUST_LOG` as the logging environment knobs.
|
||||
`claw version --output-format json` is the provenance probe for automation: it reports full `git_sha`, derived `git_sha_short`, `is_dirty`, `branch`, `commit_date`, `commit_timestamp`, `rustc_version`, runtime `executable_path`, and `binary_provenance`; the text report is available as `human_readable` instead of a duplicate `message` field.
|
||||
`status --output-format json` reports loaded project memory files under `workspace.memory_files[]` with each file's `path`, `source` (`claude_md`, `claw_md`, `agents_md`, or scoped/rule sources), `origin`, `scope_path`, `outside_project`, `chars`, and `contributes`; `claw doctor --output-format json` includes a dedicated `memory` check. Root instruction-file priority is `CLAUDE.md`, then `CLAW.md`, then `AGENTS.md`, discovery is bounded to the current git root when present (otherwise cwd only), and all non-duplicate loaded files contribute to the rendered system prompt.
|
||||
`claw mcp --output-format json` reports partial MCP config success: valid servers remain in `servers[]` while malformed siblings appear in `invalid_servers[]`, with `total_configured`, `valid_count`, and `invalid_count` split out for automation. `status` mirrors this as `mcp_validation`, and doctor includes an `mcp validation` check.
|
||||
Shorthand prompt mode honors the POSIX `--` end-of-flags separator, so `claw -- "-prompt-with-dash"` and unknown dash-prefixed non-flag text stay on the prompt path instead of being treated as CLI options.
|
||||
`claw dump-manifests` is self-contained: it emits the Rust resolver inventory for the selected workspace (commands, tools, agents, skills, and bootstrap phases) without requiring an upstream Claude Code TypeScript checkout. Use `--manifests-dir PATH` only to scope resolver discovery to another directory.
|
||||
|
||||
The command surface is moving quickly. For the canonical live help text, run:
|
||||
|
||||
@@ -166,8 +173,8 @@ The REPL now exposes a much broader surface than the original minimal shell:
|
||||
- plugin management: `/plugin` (with aliases `/plugins`, `/marketplace`)
|
||||
|
||||
Notable claw-first surfaces now available directly in slash form:
|
||||
- `/skills [list|install <path>|help]`
|
||||
- `/agents [list|help]`
|
||||
- `/skills [list|show <name>|install <path>|uninstall <name>|help]`
|
||||
- `/agents [list|show <name>|create <name>|help]`
|
||||
- `/mcp [list|show <server>|help]`
|
||||
- `/doctor`
|
||||
- `/plugin [list|install <path>|enable <name>|disable <name>|uninstall <id>|update <id>]`
|
||||
@@ -184,7 +191,7 @@ rust/
|
||||
└── crates/
|
||||
├── api/ # Provider clients + streaming + request preflight
|
||||
├── commands/ # Shared slash-command registry + help rendering
|
||||
├── compat-harness/ # TS manifest extraction harness
|
||||
├── compat-harness/ # Compatibility/parity harness utilities
|
||||
├── mock-anthropic-service/ # Deterministic local Anthropic-compatible mock
|
||||
├── plugins/ # Plugin metadata, manager, install/enable/disable surfaces
|
||||
├── runtime/ # Session, config, permissions, MCP, prompts, auth/runtime loop
|
||||
@@ -197,7 +204,7 @@ rust/
|
||||
|
||||
- **api** — provider clients, SSE streaming, request/response types, auth (`ANTHROPIC_API_KEY` + bearer-token support), request-size/context-window preflight
|
||||
- **commands** — slash command definitions, parsing, help text generation, JSON/text command rendering
|
||||
- **compat-harness** — extracts tool/prompt manifests from upstream TS source
|
||||
- **compat-harness** — compatibility and parity helpers for comparing behavior with upstream fixtures
|
||||
- **mock-anthropic-service** — deterministic `/v1/messages` mock for CLI parity tests and local harness runs
|
||||
- **plugins** — plugin metadata, install/enable/disable/update flows, plugin tool definitions, hook integration surfaces
|
||||
- **runtime** — `ConversationRuntime`, config loading, session persistence, permission policy, MCP client lifecycle, system prompt assembly, usage tracking
|
||||
@@ -210,8 +217,8 @@ rust/
|
||||
- **~20K lines** of Rust
|
||||
- **9 crates** in workspace
|
||||
- **Binary name:** `claw`
|
||||
- **Default model:** `claude-opus-4-6`
|
||||
- **Default permissions:** `danger-full-access`
|
||||
- **Default model:** `claude-opus-4-7`
|
||||
- **Default permissions:** `workspace-write`
|
||||
|
||||
## License
|
||||
|
||||
|
||||
@@ -38,6 +38,7 @@ fn create_sample_request(message_count: usize) -> MessageRequest {
|
||||
id: format!("call_{}", i),
|
||||
name: "read_file".to_string(),
|
||||
input: json!({"path": format!("/tmp/file{}", i)}),
|
||||
thought_signature: None,
|
||||
},
|
||||
],
|
||||
}),
|
||||
@@ -57,6 +58,7 @@ fn create_sample_request(message_count: usize) -> MessageRequest {
|
||||
id: format!("call_{}", i),
|
||||
name: "write_file".to_string(),
|
||||
input: json!({"path": format!("/tmp/out{}", i), "content": "data"}),
|
||||
thought_signature: None,
|
||||
}],
|
||||
}),
|
||||
}
|
||||
@@ -105,11 +107,13 @@ fn bench_translate_message(c: &mut Criterion) {
|
||||
id: "call_1".to_string(),
|
||||
name: "read_file".to_string(),
|
||||
input: json!({"path": "/tmp/test"}),
|
||||
thought_signature: None,
|
||||
},
|
||||
InputContentBlock::ToolUse {
|
||||
id: "call_2".to_string(),
|
||||
name: "write_file".to_string(),
|
||||
input: json!({"path": "/tmp/out", "content": "data"}),
|
||||
thought_signature: None,
|
||||
},
|
||||
],
|
||||
};
|
||||
|
||||
@@ -161,7 +161,7 @@ mod tests {
|
||||
|
||||
#[test]
|
||||
fn resolves_existing_and_grok_aliases() {
|
||||
assert_eq!(resolve_model_alias("opus"), "claude-opus-4-6");
|
||||
assert_eq!(resolve_model_alias("opus"), "claude-opus-4-7");
|
||||
assert_eq!(resolve_model_alias("grok"), "grok-3");
|
||||
assert_eq!(resolve_model_alias("grok-mini"), "grok-3-mini");
|
||||
}
|
||||
@@ -235,4 +235,22 @@ mod tests {
|
||||
other => panic!("Expected ProviderClient::OpenAi for qwen-plus, got: {other:?}"),
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn local_openai_base_url_routes_authless_ollama_models() {
|
||||
let _lock = env_lock();
|
||||
let _base_url = EnvVarGuard::set("OPENAI_BASE_URL", Some("http://127.0.0.1:11434/v1"));
|
||||
let _openai_key = EnvVarGuard::set("OPENAI_API_KEY", None);
|
||||
let _anthropic_key = EnvVarGuard::set("ANTHROPIC_API_KEY", Some("test-anthropic-key"));
|
||||
let _anthropic_token = EnvVarGuard::set("ANTHROPIC_AUTH_TOKEN", None);
|
||||
|
||||
let client = ProviderClient::from_model("qwen2.5-coder:7b")
|
||||
.expect("local model should route to OpenAI-compatible client without auth");
|
||||
match client {
|
||||
ProviderClient::OpenAi(openai_client) => {
|
||||
assert_eq!(openai_client.base_url(), "http://127.0.0.1:11434/v1")
|
||||
}
|
||||
other => panic!("Expected ProviderClient::OpenAi for local model, got: {other:?}"),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -468,8 +468,7 @@ impl AnthropicClient {
|
||||
request: &MessageRequest,
|
||||
) -> Result<reqwest::Response, ApiError> {
|
||||
let request_url = format!("{}/v1/messages", self.base_url.trim_end_matches('/'));
|
||||
let mut request_body = self.request_profile.render_json_body(request)?;
|
||||
strip_unsupported_beta_body_fields(&mut request_body);
|
||||
let request_body = render_standard_messages_body(&self.request_profile, request)?;
|
||||
let request_builder = self.build_request(&request_url).json(&request_body);
|
||||
request_builder.send().await.map_err(ApiError::from)
|
||||
}
|
||||
@@ -529,8 +528,7 @@ impl AnthropicClient {
|
||||
"{}/v1/messages/count_tokens",
|
||||
self.base_url.trim_end_matches('/')
|
||||
);
|
||||
let mut request_body = self.request_profile.render_json_body(request)?;
|
||||
strip_unsupported_beta_body_fields(&mut request_body);
|
||||
let request_body = render_standard_messages_body(&self.request_profile, request)?;
|
||||
let response = self
|
||||
.build_request(&request_url)
|
||||
.json(&request_body)
|
||||
@@ -977,6 +975,21 @@ fn enrich_bearer_auth_error(error: ApiError, auth: &AuthSource) -> ApiError {
|
||||
}
|
||||
}
|
||||
|
||||
fn anthropic_wire_model(model: &str) -> &str {
|
||||
model.strip_prefix("anthropic/").unwrap_or(model)
|
||||
}
|
||||
|
||||
fn render_standard_messages_body(
|
||||
request_profile: &AnthropicRequestProfile,
|
||||
request: &MessageRequest,
|
||||
) -> Result<Value, serde_json::Error> {
|
||||
let mut wire_request = request.clone();
|
||||
wire_request.model = anthropic_wire_model(&request.model).to_string();
|
||||
let mut body = request_profile.render_json_body(&wire_request)?;
|
||||
strip_unsupported_beta_body_fields(&mut body);
|
||||
Ok(body)
|
||||
}
|
||||
|
||||
/// Remove beta-only body fields that the standard `/v1/messages` and
|
||||
/// `/v1/messages/count_tokens` endpoints reject as `Extra inputs are not
|
||||
/// permitted`. The `betas` opt-in is communicated via the `anthropic-beta`
|
||||
@@ -1550,6 +1563,27 @@ mod tests {
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn standard_messages_body_strips_anthropic_routing_prefix() {
|
||||
let client = AnthropicClient::new("test-key");
|
||||
let request = MessageRequest {
|
||||
model: "anthropic/claude-opus-4-6".to_string(),
|
||||
max_tokens: 64,
|
||||
messages: vec![],
|
||||
system: None,
|
||||
tools: None,
|
||||
tool_choice: None,
|
||||
stream: false,
|
||||
..Default::default()
|
||||
};
|
||||
|
||||
let rendered = super::render_standard_messages_body(client.request_profile(), &request)
|
||||
.expect("body should render");
|
||||
|
||||
assert_eq!(rendered["model"], serde_json::json!("claude-opus-4-6"));
|
||||
assert!(rendered.get("betas").is_none());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn enrich_bearer_auth_error_appends_sk_ant_hint_on_401_with_pure_bearer_token() {
|
||||
// given
|
||||
|
||||
@@ -211,7 +211,7 @@ pub fn resolve_model_alias(model: &str) -> String {
|
||||
.find_map(|(alias, metadata)| {
|
||||
(*alias == lower).then_some(match metadata.provider {
|
||||
ProviderKind::Anthropic => match *alias {
|
||||
"opus" => "claude-opus-4-6",
|
||||
"opus" => "claude-opus-4-7",
|
||||
"sonnet" => "claude-sonnet-4-6",
|
||||
"haiku" => "claude-haiku-4-5-20251213",
|
||||
_ => trimmed,
|
||||
@@ -262,6 +262,14 @@ pub fn metadata_for_model(model: &str) -> Option<ProviderMetadata> {
|
||||
default_base_url: openai_compat::DEFAULT_OPENAI_BASE_URL,
|
||||
});
|
||||
}
|
||||
if canonical.starts_with("local/") {
|
||||
return Some(ProviderMetadata {
|
||||
provider: ProviderKind::OpenAi,
|
||||
auth_env: "OPENAI_API_KEY",
|
||||
base_url_env: "OPENAI_BASE_URL",
|
||||
default_base_url: openai_compat::DEFAULT_OPENAI_BASE_URL,
|
||||
});
|
||||
}
|
||||
// Alibaba DashScope compatible-mode endpoint. Routes qwen/* and bare
|
||||
// qwen-* model names (qwen-max, qwen-plus, qwen-turbo, qwen-qwq, etc.)
|
||||
// to the OpenAI-compat client pointed at DashScope's /compatible-mode/v1.
|
||||
@@ -337,17 +345,21 @@ pub fn provider_diagnostics_for_model(model: &str) -> ProviderDiagnostics {
|
||||
}
|
||||
}
|
||||
|
||||
fn looks_like_local_openai_model(model: &str) -> bool {
|
||||
model.contains(':') || model.contains('.')
|
||||
}
|
||||
|
||||
#[must_use]
|
||||
pub fn detect_provider_kind(model: &str) -> ProviderKind {
|
||||
if let Some(metadata) = metadata_for_model(model) {
|
||||
let resolved_model = resolve_model_alias(model);
|
||||
if let Some(metadata) = metadata_for_model(&resolved_model) {
|
||||
return metadata.provider;
|
||||
}
|
||||
// When OPENAI_BASE_URL is set, the user explicitly configured an
|
||||
// OpenAI-compatible endpoint. Prefer it over the Anthropic fallback
|
||||
// even when the model name has no recognized prefix — this is the
|
||||
// common case for local providers (Ollama, LM Studio, vLLM, etc.)
|
||||
// where model names like "qwen2.5-coder:7b" don't match any prefix.
|
||||
if std::env::var_os("OPENAI_BASE_URL").is_some() && openai_compat::has_api_key("OPENAI_API_KEY")
|
||||
// When OPENAI_BASE_URL is set and the unknown model name looks like a
|
||||
// local server tag (for example `llama3.2` or `qwen2.5-coder:7b`), prefer
|
||||
// the OpenAI-compatible endpoint over ambient Anthropic credentials.
|
||||
if std::env::var_os("OPENAI_BASE_URL").is_some()
|
||||
&& looks_like_local_openai_model(&resolved_model)
|
||||
{
|
||||
return ProviderKind::OpenAi;
|
||||
}
|
||||
@@ -608,7 +620,7 @@ pub fn model_token_limit(model: &str) -> Option<ModelTokenLimit> {
|
||||
let canonical = resolve_model_alias(model);
|
||||
let base_model = canonical.rsplit('/').next().unwrap_or(canonical.as_str());
|
||||
match base_model {
|
||||
"claude-opus-4-6" => Some(ModelTokenLimit {
|
||||
"claude-opus-4-7" | "claude-opus-4-6" => Some(ModelTokenLimit {
|
||||
max_output_tokens: 32_000,
|
||||
context_window_tokens: 200_000,
|
||||
}),
|
||||
@@ -1042,6 +1054,18 @@ mod tests {
|
||||
assert_eq!(kind2, ProviderKind::OpenAi);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn local_prefix_routes_to_openai_not_anthropic() {
|
||||
let meta = super::metadata_for_model("local/Qwen/Qwen3.6-27B-FP8")
|
||||
.expect("local/ prefix must resolve to OpenAI-compatible metadata");
|
||||
assert_eq!(meta.provider, ProviderKind::OpenAi);
|
||||
assert_eq!(meta.auth_env, "OPENAI_API_KEY");
|
||||
assert_eq!(meta.base_url_env, "OPENAI_BASE_URL");
|
||||
|
||||
let kind = detect_provider_kind("local/Qwen/Qwen3.6-27B-FP8");
|
||||
assert_eq!(kind, ProviderKind::OpenAi);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn qwen_prefix_routes_to_dashscope_not_anthropic() {
|
||||
// User request from Discord #clawcode-get-help: web3g wants to use
|
||||
|
||||
@@ -1,5 +1,6 @@
|
||||
use std::borrow::Cow;
|
||||
use std::collections::{BTreeMap, VecDeque};
|
||||
use std::net::Ipv4Addr;
|
||||
use std::sync::atomic::{AtomicU64, Ordering};
|
||||
use std::time::{Duration, SystemTime, UNIX_EPOCH};
|
||||
|
||||
@@ -131,13 +132,22 @@ impl OpenAiCompatClient {
|
||||
}
|
||||
|
||||
pub fn from_env(config: OpenAiCompatConfig) -> Result<Self, ApiError> {
|
||||
let Some(api_key) = read_env_non_empty(config.api_key_env)? else {
|
||||
return Err(ApiError::missing_credentials(
|
||||
config.provider_name,
|
||||
config.credential_env_vars(),
|
||||
));
|
||||
let base_url = read_base_url(config);
|
||||
let api_key = match read_env_non_empty(config.api_key_env)? {
|
||||
Some(api_key) => api_key,
|
||||
None if config.provider_name == "OpenAI"
|
||||
&& is_local_openai_compatible_base_url(&base_url) =>
|
||||
{
|
||||
"local-dev-token".to_string()
|
||||
}
|
||||
None => {
|
||||
return Err(ApiError::missing_credentials(
|
||||
config.provider_name,
|
||||
config.credential_env_vars(),
|
||||
));
|
||||
}
|
||||
};
|
||||
Ok(Self::new(api_key, config))
|
||||
Ok(Self::new(api_key, config).with_base_url(base_url))
|
||||
}
|
||||
|
||||
#[must_use]
|
||||
@@ -576,6 +586,21 @@ impl StreamState {
|
||||
}
|
||||
}
|
||||
|
||||
if let Some(delta_extra) = &choice.delta.extra_content {
|
||||
if let Some(delta_sig) = delta_extra
|
||||
.get("google")
|
||||
.and_then(|g| g.get("thought_signature"))
|
||||
.and_then(|v| v.as_str())
|
||||
.filter(|s| !s.is_empty())
|
||||
{
|
||||
for state in self.tool_calls.values_mut() {
|
||||
if state.thought_signature.is_none() {
|
||||
state.thought_signature.get_or_insert(delta_sig.to_string());
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if let Some(finish_reason) = choice.finish_reason {
|
||||
self.stop_reason = Some(normalize_finish_reason(&finish_reason));
|
||||
if finish_reason == "tool_calls" {
|
||||
@@ -683,6 +708,7 @@ struct ToolCallState {
|
||||
id: Option<String>,
|
||||
name: Option<String>,
|
||||
arguments: String,
|
||||
thought_signature: Option<String>,
|
||||
emitted_len: usize,
|
||||
started: bool,
|
||||
stopped: bool,
|
||||
@@ -700,6 +726,24 @@ impl ToolCallState {
|
||||
if let Some(arguments) = tool_call.function.arguments {
|
||||
self.arguments.push_str(&arguments);
|
||||
}
|
||||
|
||||
if let Some(sig) = tool_call.thought_signature.filter(|s| !s.is_empty()) {
|
||||
self.thought_signature.get_or_insert(sig);
|
||||
}
|
||||
|
||||
// https://ai.google.dev/gemini-api/docs/thought-signatures
|
||||
if self.thought_signature.is_none() {
|
||||
if let Some(sig) = tool_call
|
||||
.extra_content
|
||||
.as_ref()
|
||||
.and_then(|ec| ec.get("google"))
|
||||
.and_then(|g| g.get("thought_signature"))
|
||||
.and_then(|v| v.as_str())
|
||||
.filter(|s| !s.is_empty())
|
||||
{
|
||||
self.thought_signature.get_or_insert(sig.to_string());
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
const fn block_index(&self, offset: u32) -> u32 {
|
||||
@@ -721,6 +765,7 @@ impl ToolCallState {
|
||||
id,
|
||||
name,
|
||||
input: json!({}),
|
||||
thought_signature: self.thought_signature.clone(),
|
||||
},
|
||||
}))
|
||||
}
|
||||
@@ -772,6 +817,10 @@ struct ChatMessage {
|
||||
struct ResponseToolCall {
|
||||
id: String,
|
||||
function: ResponseToolFunction,
|
||||
#[serde(default)]
|
||||
thought_signature: Option<String>,
|
||||
#[serde(default)]
|
||||
extra_content: Option<serde_json::Value>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Deserialize)]
|
||||
@@ -842,6 +891,8 @@ struct ChunkDelta {
|
||||
thinking: Option<ThinkingDelta>,
|
||||
#[serde(default, deserialize_with = "deserialize_null_as_empty_vec")]
|
||||
tool_calls: Vec<DeltaToolCall>,
|
||||
#[serde(default)]
|
||||
extra_content: Option<serde_json::Value>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Default, Deserialize)]
|
||||
@@ -850,7 +901,7 @@ struct ThinkingDelta {
|
||||
content: Option<String>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Deserialize)]
|
||||
#[derive(Debug, Default, Deserialize)]
|
||||
struct DeltaToolCall {
|
||||
#[serde(default)]
|
||||
index: u32,
|
||||
@@ -858,6 +909,10 @@ struct DeltaToolCall {
|
||||
id: Option<String>,
|
||||
#[serde(default)]
|
||||
function: DeltaFunction,
|
||||
#[serde(default)]
|
||||
thought_signature: Option<String>,
|
||||
#[serde(default)]
|
||||
extra_content: Option<serde_json::Value>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Default, Deserialize)]
|
||||
@@ -913,16 +968,36 @@ pub fn model_requires_reasoning_content_in_history(model: &str) -> bool {
|
||||
canonical.starts_with("deepseek-v4")
|
||||
}
|
||||
|
||||
/// Dummy thought signature accepted by Gemini as a validation bypass for
|
||||
/// conversation history that lacks a real signature. Source:
|
||||
/// - LiteLLM: https://github.com/BerriAI/litellm/pull/16812
|
||||
/// - Google: https://ai.google.dev/gemini-api/docs/thought-signatures#faqs
|
||||
const GEMINI_DUMMY_THOUGHT_SIGNATURE: &str = "c2tpcF90aG91Z2h0X3NpZ25hdHVyZV92YWxpZGF0b3I=";
|
||||
|
||||
/// Returns true if the model is a Gemini model (Gemini 2.5+, 3+ etc) that
|
||||
/// requires `thought_signature` on function calls in conversation history.
|
||||
#[must_use]
|
||||
pub fn is_gemini_model(model: &str) -> bool {
|
||||
let lowered = model.to_ascii_lowercase();
|
||||
let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
|
||||
canonical.starts_with("gemini")
|
||||
}
|
||||
|
||||
|
||||
/// Strip routing prefix (e.g., "openai/gpt-4" → "gpt-4") for the wire.
|
||||
/// The prefix is used only to select transport; the backend expects the
|
||||
/// bare model id.
|
||||
/// bare model id. Use `local/` to force OpenAI-compatible routing while
|
||||
/// preserving any slashes that follow the prefix.
|
||||
#[allow(dead_code)]
|
||||
fn strip_routing_prefix(model: &str) -> &str {
|
||||
if let Some(pos) = model.find('/') {
|
||||
let prefix = &model[..pos];
|
||||
// Only strip if the prefix before "/" is a known routing prefix,
|
||||
// not if "/" appears in the middle of the model name for other reasons.
|
||||
if matches!(prefix, "openai" | "xai" | "grok" | "qwen" | "kimi") {
|
||||
if matches!(
|
||||
prefix,
|
||||
"openai" | "xai" | "grok" | "qwen" | "kimi" | "local"
|
||||
) {
|
||||
&model[pos + 1..]
|
||||
} else {
|
||||
model
|
||||
@@ -932,6 +1007,44 @@ fn strip_routing_prefix(model: &str) -> &str {
|
||||
}
|
||||
}
|
||||
|
||||
fn normalize_base_url_for_model_routing(url: &str) -> &str {
|
||||
let trimmed = url.trim_end_matches('/');
|
||||
trimmed
|
||||
.strip_suffix("/chat/completions")
|
||||
.map(|value| value.trim_end_matches('/'))
|
||||
.unwrap_or(trimmed)
|
||||
}
|
||||
|
||||
fn url_host(url: &str) -> &str {
|
||||
let after_scheme = url.split_once("://").map_or(url, |(_, rest)| rest);
|
||||
let authority = after_scheme.split(['/', '?', '#']).next().unwrap_or("");
|
||||
let host_port = authority
|
||||
.rsplit_once('@')
|
||||
.map_or(authority, |(_, host_port)| host_port);
|
||||
if host_port.starts_with('[') {
|
||||
return host_port
|
||||
.split(']')
|
||||
.next()
|
||||
.unwrap_or("")
|
||||
.trim_start_matches('[');
|
||||
}
|
||||
host_port.split(':').next().unwrap_or("")
|
||||
}
|
||||
|
||||
fn is_local_openai_compatible_base_url(url: &str) -> bool {
|
||||
let host = url_host(url.trim());
|
||||
if host.eq_ignore_ascii_case("localhost") || host == "::1" {
|
||||
return true;
|
||||
}
|
||||
let Ok(address) = host.parse::<Ipv4Addr>() else {
|
||||
return false;
|
||||
};
|
||||
let [first, second, ..] = address.octets();
|
||||
matches!(first, 10 | 127)
|
||||
|| first == 192 && second == 168
|
||||
|| first == 172 && (16..=31).contains(&second)
|
||||
}
|
||||
|
||||
fn wire_model_for_base_url<'a>(
|
||||
model: &'a str,
|
||||
config: OpenAiCompatConfig,
|
||||
@@ -944,26 +1057,22 @@ fn wire_model_for_base_url<'a>(
|
||||
let lowered_prefix = prefix.to_ascii_lowercase();
|
||||
|
||||
if lowered_prefix == "openai" {
|
||||
let trimmed_base_url = base_url.trim_end_matches('/');
|
||||
let default_openai = DEFAULT_OPENAI_BASE_URL.trim_end_matches('/');
|
||||
if matches!(
|
||||
lowered_prefix.as_str(),
|
||||
"xai" | "grok" | "kimi" | "gemini" | "gemma"
|
||||
) {
|
||||
let normalized_base_url = normalize_base_url_for_model_routing(base_url);
|
||||
let default_base_url = normalize_base_url_for_model_routing(config.default_base_url);
|
||||
if normalized_base_url.eq_ignore_ascii_case(default_base_url)
|
||||
|| is_local_openai_compatible_base_url(base_url)
|
||||
{
|
||||
return Cow::Borrowed(&model[pos + 1..]);
|
||||
}
|
||||
if config.provider_name == "OpenAI" && trimmed_base_url != default_openai {
|
||||
// Only preserve the full slug if it's NOT a model we want to strip
|
||||
if !model.contains("gemini") && !model.contains("gemma") {
|
||||
return Cow::Borrowed(model);
|
||||
}
|
||||
}
|
||||
return Cow::Borrowed(&model[pos + 1..]);
|
||||
return Cow::Borrowed(model);
|
||||
}
|
||||
|
||||
if matches!(lowered_prefix.as_str(), "xai" | "grok" | "qwen" | "kimi") {
|
||||
return Cow::Borrowed(&model[pos + 1..]);
|
||||
}
|
||||
if lowered_prefix == "local" {
|
||||
return Cow::Borrowed(&model[pos + 1..]);
|
||||
}
|
||||
|
||||
Cow::Borrowed(model)
|
||||
}
|
||||
@@ -1115,6 +1224,13 @@ fn build_chat_completion_request_for_base_url(
|
||||
payload[key] = value.clone();
|
||||
}
|
||||
|
||||
// DeepSeek V4 Pro/Flash thinking mode requires this provider-specific opt-in
|
||||
// and also requires assistant reasoning history to be echoed as `reasoning_content`.
|
||||
// Apply it after extra_body so callers cannot accidentally override the required shape.
|
||||
if model_requires_reasoning_content_in_history(wire_model) {
|
||||
payload["thinking"] = json!({"type": "enabled"});
|
||||
}
|
||||
|
||||
payload
|
||||
}
|
||||
|
||||
@@ -1161,27 +1277,48 @@ pub fn translate_message(message: &InputMessage, model: &str) -> Vec<Value> {
|
||||
InputContentBlock::Thinking {
|
||||
thinking: value, ..
|
||||
} => reasoning.push_str(value),
|
||||
InputContentBlock::ToolUse { id, name, input } => tool_calls.push(json!({
|
||||
"id": id,
|
||||
"type": "function",
|
||||
"function": {
|
||||
"name": name,
|
||||
"arguments": input.to_string(),
|
||||
InputContentBlock::ToolUse { id, name, input, thought_signature } => {
|
||||
let mut tc = json!({
|
||||
"id": id,
|
||||
"type": "function",
|
||||
"function": {
|
||||
"name": name,
|
||||
"arguments": input.to_string(),
|
||||
}
|
||||
});
|
||||
|
||||
let sig_for_gemini = thought_signature.clone().or_else(|| {
|
||||
if is_gemini_model(model) {
|
||||
Some(GEMINI_DUMMY_THOUGHT_SIGNATURE.to_string())
|
||||
} else {
|
||||
None
|
||||
}
|
||||
});
|
||||
if let Some(sig) = sig_for_gemini {
|
||||
tc["extra_content"] = json!({
|
||||
"google": {
|
||||
"thought_signature": sig
|
||||
}
|
||||
});
|
||||
}
|
||||
})),
|
||||
tool_calls.push(tc);
|
||||
}
|
||||
InputContentBlock::ToolResult { .. } => {}
|
||||
}
|
||||
}
|
||||
let include_reasoning =
|
||||
model_requires_reasoning_content_in_history(model) && !reasoning.is_empty();
|
||||
if text.is_empty() && tool_calls.is_empty() && !include_reasoning {
|
||||
let needs_reasoning = model_requires_reasoning_content_in_history(model);
|
||||
if text.is_empty() && tool_calls.is_empty() && reasoning.is_empty() {
|
||||
Vec::new()
|
||||
} else {
|
||||
let mut msg = serde_json::json!({
|
||||
"role": "assistant",
|
||||
"content": (!text.is_empty()).then_some(text),
|
||||
});
|
||||
if include_reasoning {
|
||||
if !text.is_empty() {
|
||||
msg["content"] = json!(text);
|
||||
} else if !needs_reasoning {
|
||||
msg["content"] = Value::Null;
|
||||
}
|
||||
if needs_reasoning {
|
||||
msg["reasoning_content"] = json!(reasoning);
|
||||
}
|
||||
// Only include tool_calls when non-empty: some providers reject
|
||||
@@ -1410,10 +1547,22 @@ fn normalize_response(
|
||||
content.push(OutputContentBlock::Text { text });
|
||||
}
|
||||
for tool_call in choice.message.tool_calls {
|
||||
let thought_signature = tool_call.thought_signature.or_else(|| {
|
||||
tool_call
|
||||
.extra_content
|
||||
.as_ref()
|
||||
.and_then(|ec| ec.get("google"))
|
||||
.and_then(|g| g.get("thought_signature"))
|
||||
.and_then(|v| v.as_str())
|
||||
.filter(|s| !s.is_empty())
|
||||
.map(String::from)
|
||||
});
|
||||
|
||||
content.push(OutputContentBlock::ToolUse {
|
||||
id: tool_call.id,
|
||||
name: tool_call.function.name,
|
||||
input: parse_tool_arguments(&tool_call.function.arguments),
|
||||
thought_signature,
|
||||
});
|
||||
}
|
||||
|
||||
@@ -1698,6 +1847,7 @@ mod tests {
|
||||
ToolChoice, ToolDefinition, ToolResultContentBlock,
|
||||
};
|
||||
use serde_json::json;
|
||||
use std::borrow::Cow;
|
||||
use std::collections::BTreeMap;
|
||||
use std::sync::{Mutex, OnceLock};
|
||||
|
||||
@@ -1796,6 +1946,32 @@ mod tests {
|
||||
assert_eq!(assistant["content"], json!("answer"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn deepseek_v4_assistant_with_only_tool_calls_omits_content_and_includes_reasoning() {
|
||||
let request = MessageRequest {
|
||||
model: "deepseek-v4-pro".to_string(),
|
||||
max_tokens: 100,
|
||||
messages: vec![InputMessage {
|
||||
role: "assistant".to_string(),
|
||||
content: vec![InputContentBlock::ToolUse {
|
||||
id: "call_1".to_string(),
|
||||
name: "get_weather".to_string(),
|
||||
input: json!({"city": "Paris"}),
|
||||
thought_signature: None,
|
||||
}],
|
||||
}],
|
||||
stream: false,
|
||||
..Default::default()
|
||||
};
|
||||
|
||||
let payload = build_chat_completion_request(&request, OpenAiCompatConfig::openai());
|
||||
let assistant = &payload["messages"][0];
|
||||
|
||||
assert!(assistant.get("content").is_none());
|
||||
assert_eq!(assistant["reasoning_content"], json!(""));
|
||||
assert_eq!(assistant["tool_calls"].as_array().map(Vec::len), Some(1));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn deepseek_v4_flash_request_includes_reasoning_content_for_assistant_history() {
|
||||
// Given an assistant history turn containing thinking.
|
||||
@@ -1859,6 +2035,7 @@ mod tests {
|
||||
reasoning_content: Some("think".to_string()),
|
||||
thinking: None,
|
||||
tool_calls: Vec::new(),
|
||||
extra_content: None,
|
||||
},
|
||||
finish_reason: None,
|
||||
}],
|
||||
@@ -1876,6 +2053,7 @@ mod tests {
|
||||
reasoning_content: None,
|
||||
thinking: None,
|
||||
tool_calls: Vec::new(),
|
||||
extra_content: None,
|
||||
},
|
||||
finish_reason: Some("stop".to_string()),
|
||||
}],
|
||||
@@ -1982,6 +2160,49 @@ mod tests {
|
||||
assert_eq!(payload["reasoning_effort"], json!("high"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn deepseek_v4_request_includes_thinking_parameter() {
|
||||
let payload = build_chat_completion_request(
|
||||
&MessageRequest {
|
||||
model: "deepseek-v4-pro".to_string(),
|
||||
max_tokens: 1024,
|
||||
messages: vec![InputMessage::user_text("hello")],
|
||||
..Default::default()
|
||||
},
|
||||
OpenAiCompatConfig::openai(),
|
||||
);
|
||||
assert_eq!(payload["thinking"], json!({"type": "enabled"}));
|
||||
assert_eq!(payload["model"], json!("deepseek-v4-pro"));
|
||||
|
||||
let mut extra_body = BTreeMap::new();
|
||||
extra_body.insert("thinking".to_string(), json!({"type": "disabled"}));
|
||||
let payload_with_override = build_chat_completion_request(
|
||||
&MessageRequest {
|
||||
model: "openai/deepseek-v4-flash".to_string(),
|
||||
max_tokens: 1024,
|
||||
messages: vec![InputMessage::user_text("hello")],
|
||||
extra_body,
|
||||
..Default::default()
|
||||
},
|
||||
OpenAiCompatConfig::openai(),
|
||||
);
|
||||
assert_eq!(
|
||||
payload_with_override["thinking"],
|
||||
json!({"type": "enabled"})
|
||||
);
|
||||
|
||||
let non_deepseek_payload = build_chat_completion_request(
|
||||
&MessageRequest {
|
||||
model: "gpt-4o".to_string(),
|
||||
max_tokens: 64,
|
||||
messages: vec![InputMessage::user_text("hello")],
|
||||
..Default::default()
|
||||
},
|
||||
OpenAiCompatConfig::openai(),
|
||||
);
|
||||
assert!(non_deepseek_payload.get("thinking").is_none());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn reasoning_effort_omitted_when_not_set() {
|
||||
let payload = build_chat_completion_request(
|
||||
@@ -2069,6 +2290,28 @@ mod tests {
|
||||
));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn local_openai_base_url_does_not_require_api_key() {
|
||||
let _lock = env_lock();
|
||||
let original_base_url = std::env::var_os("OPENAI_BASE_URL");
|
||||
let original_api_key = std::env::var_os("OPENAI_API_KEY");
|
||||
std::env::set_var("OPENAI_BASE_URL", "http://127.0.0.1:11434/v1");
|
||||
std::env::remove_var("OPENAI_API_KEY");
|
||||
|
||||
let client = OpenAiCompatClient::from_env(OpenAiCompatConfig::openai())
|
||||
.expect("local OpenAI-compatible endpoint should not require an API key");
|
||||
assert_eq!(client.base_url(), "http://127.0.0.1:11434/v1");
|
||||
|
||||
match original_base_url {
|
||||
Some(value) => std::env::set_var("OPENAI_BASE_URL", value),
|
||||
None => std::env::remove_var("OPENAI_BASE_URL"),
|
||||
}
|
||||
match original_api_key {
|
||||
Some(value) => std::env::set_var("OPENAI_API_KEY", value),
|
||||
None => std::env::remove_var("OPENAI_API_KEY"),
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn endpoint_builder_accepts_base_urls_and_full_endpoints() {
|
||||
assert_eq!(
|
||||
@@ -2355,6 +2598,7 @@ mod tests {
|
||||
id: "call_1".to_string(),
|
||||
name: "read_file".to_string(),
|
||||
input: serde_json::json!({"path": "/tmp/test"}),
|
||||
thought_signature: None,
|
||||
}],
|
||||
}],
|
||||
stream: false,
|
||||
@@ -2573,6 +2817,7 @@ mod tests {
|
||||
id: "call_1".to_string(),
|
||||
name: "read_file".to_string(),
|
||||
input: serde_json::json!({"path": "/tmp/test"}),
|
||||
thought_signature: None,
|
||||
}],
|
||||
},
|
||||
InputMessage {
|
||||
@@ -2684,6 +2929,66 @@ mod tests {
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn wire_model_strips_openai_prefix_for_default_and_local_preserves_custom_gateways() {
|
||||
assert_eq!(
|
||||
super::wire_model_for_base_url(
|
||||
"openai/gpt-4o",
|
||||
OpenAiCompatConfig::openai(),
|
||||
super::DEFAULT_OPENAI_BASE_URL,
|
||||
),
|
||||
Cow::Borrowed("gpt-4o")
|
||||
);
|
||||
assert_eq!(
|
||||
super::wire_model_for_base_url(
|
||||
"openai/qwen2.5-coder:7b",
|
||||
OpenAiCompatConfig::openai(),
|
||||
"http://127.0.0.1:11434/v1",
|
||||
),
|
||||
Cow::Borrowed("qwen2.5-coder:7b")
|
||||
);
|
||||
assert_eq!(
|
||||
super::wire_model_for_base_url(
|
||||
"openai/llama3.2",
|
||||
OpenAiCompatConfig::openai(),
|
||||
"http://localhost:11434/v1/chat/completions",
|
||||
),
|
||||
Cow::Borrowed("llama3.2")
|
||||
);
|
||||
assert_eq!(
|
||||
super::wire_model_for_base_url(
|
||||
"openai/gpt-4.1-mini",
|
||||
OpenAiCompatConfig::openai(),
|
||||
"https://openrouter.ai/api/v1",
|
||||
),
|
||||
Cow::Borrowed("openai/gpt-4.1-mini")
|
||||
);
|
||||
assert_eq!(
|
||||
super::wire_model_for_base_url(
|
||||
"openai/gpt-4.1-mini",
|
||||
OpenAiCompatConfig::openai(),
|
||||
"https://not-localhost.example.com/v1",
|
||||
),
|
||||
Cow::Borrowed("openai/gpt-4.1-mini")
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn local_routing_prefix_strips_only_escape_hatch() {
|
||||
assert_eq!(
|
||||
super::strip_routing_prefix("local/Qwen/Qwen3.6-27B-FP8"),
|
||||
"Qwen/Qwen3.6-27B-FP8"
|
||||
);
|
||||
assert_eq!(
|
||||
super::wire_model_for_base_url(
|
||||
"local/Qwen/Qwen3.6-27B-FP8",
|
||||
OpenAiCompatConfig::openai(),
|
||||
"http://127.0.0.1:8000/v1",
|
||||
),
|
||||
Cow::Borrowed("Qwen/Qwen3.6-27B-FP8")
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn check_request_body_size_allows_large_requests_for_openai() {
|
||||
// Create a request that exceeds DashScope's limit but is under OpenAI's 100MB limit
|
||||
|
||||
@@ -100,6 +100,8 @@ pub enum InputContentBlock {
|
||||
id: String,
|
||||
name: String,
|
||||
input: Value,
|
||||
#[serde(default, skip_serializing_if = "Option::is_none")]
|
||||
thought_signature: Option<String>,
|
||||
},
|
||||
ToolResult {
|
||||
tool_use_id: String,
|
||||
@@ -167,6 +169,8 @@ pub enum OutputContentBlock {
|
||||
id: String,
|
||||
name: String,
|
||||
input: Value,
|
||||
#[serde(default, skip_serializing_if = "Option::is_none")]
|
||||
thought_signature: Option<String>,
|
||||
},
|
||||
Thinking {
|
||||
#[serde(default)]
|
||||
|
||||
@@ -103,6 +103,58 @@ async fn send_message_posts_json_and_parses_response() {
|
||||
);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn send_message_strips_anthropic_routing_prefix_on_wire() {
|
||||
let state = Arc::new(Mutex::new(Vec::<CapturedRequest>::new()));
|
||||
let server = spawn_server(
|
||||
state.clone(),
|
||||
vec![
|
||||
http_response("200 OK", "application/json", "{\"input_tokens\":1}"),
|
||||
http_response(
|
||||
"200 OK",
|
||||
"application/json",
|
||||
concat!(
|
||||
"{",
|
||||
"\"id\":\"msg_prefixed\",",
|
||||
"\"type\":\"message\",",
|
||||
"\"role\":\"assistant\",",
|
||||
"\"content\":[{\"type\":\"text\",\"text\":\"ok\"}],",
|
||||
"\"model\":\"claude-opus-4-6\",",
|
||||
"\"stop_reason\":\"end_turn\",",
|
||||
"\"stop_sequence\":null,",
|
||||
"\"usage\":{\"input_tokens\":1,\"output_tokens\":1}",
|
||||
"}"
|
||||
),
|
||||
),
|
||||
],
|
||||
)
|
||||
.await;
|
||||
|
||||
let client = AnthropicClient::new("test-key").with_base_url(server.base_url());
|
||||
client
|
||||
.send_message(&MessageRequest {
|
||||
model: "anthropic/claude-opus-4-6".to_string(),
|
||||
..sample_request(false)
|
||||
})
|
||||
.await
|
||||
.expect("request should succeed");
|
||||
|
||||
let captured = state.lock().await;
|
||||
assert_eq!(
|
||||
captured.len(),
|
||||
2,
|
||||
"count_tokens and messages requests should be captured"
|
||||
);
|
||||
let count_tokens_body: serde_json::Value =
|
||||
serde_json::from_str(&captured[0].body).expect("count_tokens body should be json");
|
||||
let messages_body: serde_json::Value =
|
||||
serde_json::from_str(&captured[1].body).expect("request body should be json");
|
||||
assert_eq!(captured[0].path, "/v1/messages/count_tokens");
|
||||
assert_eq!(captured[1].path, "/v1/messages");
|
||||
assert_eq!(count_tokens_body["model"], json!("claude-opus-4-6"));
|
||||
assert_eq!(messages_body["model"], json!("claude-opus-4-6"));
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn send_message_blocks_oversized_requests_before_the_http_call() {
|
||||
let state = Arc::new(Mutex::new(Vec::<CapturedRequest>::new()));
|
||||
|
||||
@@ -159,10 +159,15 @@ async fn send_message_preserves_deepseek_reasoning_content_before_text() {
|
||||
},
|
||||
]
|
||||
);
|
||||
|
||||
let captured = state.lock().await;
|
||||
let request = captured.first().expect("server should capture request");
|
||||
let body: serde_json::Value = serde_json::from_str(&request.body).expect("json body");
|
||||
assert_eq!(body["thinking"], json!({"type": "enabled"}));
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn custom_openai_gateway_preserves_slash_model_ids_and_extra_body_params() {
|
||||
async fn local_openai_gateway_strips_routing_prefix_and_preserves_extra_body_params() {
|
||||
let state = Arc::new(Mutex::new(Vec::<CapturedRequest>::new()));
|
||||
let body = concat!(
|
||||
"{",
|
||||
@@ -206,7 +211,7 @@ async fn custom_openai_gateway_preserves_slash_model_ids_and_extra_body_params()
|
||||
let captured = state.lock().await;
|
||||
let request = captured.first().expect("captured request");
|
||||
let body: serde_json::Value = serde_json::from_str(&request.body).expect("json body");
|
||||
assert_eq!(body["model"], json!("openai/gpt-4.1-mini"));
|
||||
assert_eq!(body["model"], json!("gpt-4.1-mini"));
|
||||
assert_eq!(
|
||||
body["web_search_options"],
|
||||
json!({"search_context_size": "low"})
|
||||
|
||||
@@ -768,7 +768,7 @@ fn tool_calls_for_json(content: &[OutputContentBlock]) -> Vec<Value> {
|
||||
content
|
||||
.iter()
|
||||
.filter_map(|b| {
|
||||
if let OutputContentBlock::ToolUse { id, name, input } = b {
|
||||
if let OutputContentBlock::ToolUse { id, name, input, .. } = b {
|
||||
Some(json!({
|
||||
"id": id,
|
||||
"name": name,
|
||||
@@ -1474,7 +1474,7 @@ async fn stream_to_message_response(
|
||||
block_kind.insert(index, BlockKind::Text);
|
||||
text_buf.insert(index, text);
|
||||
}
|
||||
OutputContentBlock::ToolUse { id, name, input } => {
|
||||
OutputContentBlock::ToolUse { id, name, input, .. } => {
|
||||
let json = if input.as_object().is_some_and(|m| m.is_empty()) {
|
||||
String::new()
|
||||
} else {
|
||||
@@ -1523,7 +1523,7 @@ async fn stream_to_message_response(
|
||||
Some(BlockKind::Tool { id, name, json }) => {
|
||||
let input = serde_json::from_str::<Value>(&json)
|
||||
.unwrap_or_else(|_| json!({ "raw": json }));
|
||||
finished.insert(idx, OutputContentBlock::ToolUse { id, name, input });
|
||||
finished.insert(idx, OutputContentBlock::ToolUse { id, name, input, thought_signature: None });
|
||||
}
|
||||
None => {}
|
||||
}
|
||||
@@ -1581,7 +1581,7 @@ fn collect_tool_uses(content: &[OutputContentBlock]) -> Vec<ToolUse<'_>> {
|
||||
content
|
||||
.iter()
|
||||
.filter_map(|b| {
|
||||
if let OutputContentBlock::ToolUse { id, name, input } = b {
|
||||
if let OutputContentBlock::ToolUse { id, name, input, .. } = b {
|
||||
Some(ToolUse {
|
||||
id: id.as_str(),
|
||||
name: name.as_str(),
|
||||
@@ -1601,10 +1601,11 @@ fn output_to_input_blocks(blocks: &[OutputContentBlock]) -> Vec<InputContentBloc
|
||||
OutputContentBlock::Text { text } => {
|
||||
Some(InputContentBlock::Text { text: text.clone() })
|
||||
}
|
||||
OutputContentBlock::ToolUse { id, name, input } => Some(InputContentBlock::ToolUse {
|
||||
OutputContentBlock::ToolUse { id, name, input, .. } => Some(InputContentBlock::ToolUse {
|
||||
id: id.clone(),
|
||||
name: name.clone(),
|
||||
input: input.clone(),
|
||||
thought_signature: None,
|
||||
}),
|
||||
OutputContentBlock::Thinking { .. } | OutputContentBlock::RedactedThinking { .. } => {
|
||||
None
|
||||
|
||||
+656
-152
File diff suppressed because it is too large
Load Diff
@@ -746,6 +746,7 @@ fn tool_message_response_many(id: &str, tool_uses: &[ToolUseMessage<'_>]) -> Mes
|
||||
id: tool_use.tool_id.to_string(),
|
||||
name: tool_use.tool_name.to_string(),
|
||||
input: tool_use.input.clone(),
|
||||
thought_signature: None,
|
||||
})
|
||||
.collect(),
|
||||
model: DEFAULT_MODEL.to_string(),
|
||||
|
||||
@@ -780,6 +780,7 @@ mod tests {
|
||||
id: tool_id.to_string(),
|
||||
name: "search".to_string(),
|
||||
input: "{\"q\":\"*.rs\"}".to_string(),
|
||||
thought_signature: None,
|
||||
},
|
||||
]))
|
||||
.unwrap();
|
||||
|
||||
+1017
-133
File diff suppressed because it is too large
Load Diff
@@ -92,6 +92,7 @@ enum FieldType {
|
||||
Bool,
|
||||
Object,
|
||||
StringArray,
|
||||
HookArray,
|
||||
RulesImport,
|
||||
Number,
|
||||
}
|
||||
@@ -104,6 +105,7 @@ impl FieldType {
|
||||
Self::Object => "an object",
|
||||
Self::StringArray => "an array of strings",
|
||||
Self::RulesImport => "a string or an array of strings",
|
||||
Self::HookArray => "an array of strings or hook objects",
|
||||
Self::Number => "a number",
|
||||
}
|
||||
}
|
||||
@@ -116,6 +118,10 @@ impl FieldType {
|
||||
Self::StringArray => value
|
||||
.as_array()
|
||||
.is_some_and(|arr| arr.iter().all(|v| v.as_str().is_some())),
|
||||
Self::HookArray => value.as_array().is_some_and(|arr| {
|
||||
arr.iter()
|
||||
.all(|entry| entry.as_str().is_some() || entry.as_object().is_some())
|
||||
}),
|
||||
Self::RulesImport => {
|
||||
value.as_str().is_some()
|
||||
|| value
|
||||
@@ -218,15 +224,15 @@ const TOP_LEVEL_FIELDS: &[FieldSpec] = &[
|
||||
const HOOKS_FIELDS: &[FieldSpec] = &[
|
||||
FieldSpec {
|
||||
name: "PreToolUse",
|
||||
expected: FieldType::StringArray,
|
||||
expected: FieldType::HookArray,
|
||||
},
|
||||
FieldSpec {
|
||||
name: "PostToolUse",
|
||||
expected: FieldType::StringArray,
|
||||
expected: FieldType::HookArray,
|
||||
},
|
||||
FieldSpec {
|
||||
name: "PostToolUseFailure",
|
||||
expected: FieldType::StringArray,
|
||||
expected: FieldType::HookArray,
|
||||
},
|
||||
];
|
||||
|
||||
@@ -418,9 +424,10 @@ fn validate_object_keys(
|
||||
} else if DEPRECATED_FIELDS.iter().any(|d| d.name == key) {
|
||||
// Deprecated key — handled separately, not an unknown-key error.
|
||||
} else {
|
||||
// Unknown key.
|
||||
// Unknown key — preserve compatibility by surfacing it as a warning
|
||||
// instead of blocking otherwise valid config files.
|
||||
let suggestion = suggest_field(key, &known_names);
|
||||
result.errors.push(ConfigDiagnostic {
|
||||
result.warnings.push(ConfigDiagnostic {
|
||||
path: path_display.to_string(),
|
||||
field: field_path,
|
||||
line: find_key_line(source, key),
|
||||
@@ -599,10 +606,11 @@ mod tests {
|
||||
let result = validate_config_file(object, source, &test_path());
|
||||
|
||||
// then
|
||||
assert_eq!(result.errors.len(), 1);
|
||||
assert_eq!(result.errors[0].field, "unknownField");
|
||||
assert!(result.errors.is_empty());
|
||||
assert_eq!(result.warnings.len(), 1);
|
||||
assert_eq!(result.warnings[0].field, "unknownField");
|
||||
assert!(matches!(
|
||||
result.errors[0].kind,
|
||||
result.warnings[0].kind,
|
||||
DiagnosticKind::UnknownKey { .. }
|
||||
));
|
||||
}
|
||||
@@ -682,9 +690,10 @@ mod tests {
|
||||
let result = validate_config_file(object, source, &test_path());
|
||||
|
||||
// then
|
||||
assert_eq!(result.errors.len(), 1);
|
||||
assert_eq!(result.errors[0].line, Some(3));
|
||||
assert_eq!(result.errors[0].field, "badKey");
|
||||
assert!(result.errors.is_empty());
|
||||
assert_eq!(result.warnings.len(), 1);
|
||||
assert_eq!(result.warnings[0].line, Some(3));
|
||||
assert_eq!(result.warnings[0].field, "badKey");
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -713,8 +722,32 @@ mod tests {
|
||||
let result = validate_config_file(object, source, &test_path());
|
||||
|
||||
// then
|
||||
assert!(result.errors.is_empty());
|
||||
assert_eq!(result.warnings.len(), 1);
|
||||
assert_eq!(result.warnings[0].field, "hooks.BadHook");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn validates_object_style_hook_entries() {
|
||||
let source = r#"{"hooks":{"PreToolUse":["legacy",{"matcher":"Bash","hooks":[{"type":"command","command":"echo ok"}]}]}}"#;
|
||||
let parsed = JsonValue::parse(source).expect("valid json");
|
||||
let object = parsed.as_object().expect("object");
|
||||
|
||||
let result = validate_config_file(object, source, &test_path());
|
||||
|
||||
assert!(result.errors.is_empty(), "{:?}", result.errors);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn rejects_wrong_hook_entry_types() {
|
||||
let source = r#"{"hooks":{"PreToolUse":[42]}}"#;
|
||||
let parsed = JsonValue::parse(source).expect("valid json");
|
||||
let object = parsed.as_object().expect("object");
|
||||
|
||||
let result = validate_config_file(object, source, &test_path());
|
||||
|
||||
assert_eq!(result.errors.len(), 1);
|
||||
assert_eq!(result.errors[0].field, "hooks.BadHook");
|
||||
assert_eq!(result.errors[0].field, "hooks.PreToolUse");
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -756,8 +789,9 @@ mod tests {
|
||||
let result = validate_config_file(object, source, &test_path());
|
||||
|
||||
// then
|
||||
assert_eq!(result.errors.len(), 1);
|
||||
assert_eq!(result.errors[0].field, "permissions.denyAll");
|
||||
assert!(result.errors.is_empty());
|
||||
assert_eq!(result.warnings.len(), 1);
|
||||
assert_eq!(result.warnings[0].field, "permissions.denyAll");
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -771,8 +805,9 @@ mod tests {
|
||||
let result = validate_config_file(object, source, &test_path());
|
||||
|
||||
// then
|
||||
assert_eq!(result.errors.len(), 1);
|
||||
assert_eq!(result.errors[0].field, "sandbox.containerMode");
|
||||
assert!(result.errors.is_empty());
|
||||
assert_eq!(result.warnings.len(), 1);
|
||||
assert_eq!(result.warnings[0].field, "sandbox.containerMode");
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -786,8 +821,9 @@ mod tests {
|
||||
let result = validate_config_file(object, source, &test_path());
|
||||
|
||||
// then
|
||||
assert_eq!(result.errors.len(), 1);
|
||||
assert_eq!(result.errors[0].field, "plugins.autoUpdate");
|
||||
assert!(result.errors.is_empty());
|
||||
assert_eq!(result.warnings.len(), 1);
|
||||
assert_eq!(result.warnings[0].field, "plugins.autoUpdate");
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -801,8 +837,9 @@ mod tests {
|
||||
let result = validate_config_file(object, source, &test_path());
|
||||
|
||||
// then
|
||||
assert_eq!(result.errors.len(), 1);
|
||||
assert_eq!(result.errors[0].field, "oauth.secret");
|
||||
assert!(result.errors.is_empty());
|
||||
assert_eq!(result.warnings.len(), 1);
|
||||
assert_eq!(result.warnings[0].field, "oauth.secret");
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -837,8 +874,9 @@ mod tests {
|
||||
let result = validate_config_file(object, source, &test_path());
|
||||
|
||||
// then
|
||||
assert_eq!(result.errors.len(), 1);
|
||||
match &result.errors[0].kind {
|
||||
assert!(result.errors.is_empty());
|
||||
assert_eq!(result.warnings.len(), 1);
|
||||
match &result.warnings[0].kind {
|
||||
DiagnosticKind::UnknownKey {
|
||||
suggestion: Some(s),
|
||||
} => assert_eq!(s, "model"),
|
||||
@@ -849,7 +887,7 @@ mod tests {
|
||||
#[test]
|
||||
fn format_diagnostics_includes_all_entries() {
|
||||
// given
|
||||
let source = r#"{"permissionMode": "plan", "badKey": 1}"#;
|
||||
let source = r#"{"model": 42, "badKey": 1}"#;
|
||||
let parsed = JsonValue::parse(source).expect("valid json");
|
||||
let object = parsed.as_object().expect("object");
|
||||
let result = validate_config_file(object, source, &test_path());
|
||||
@@ -861,7 +899,7 @@ mod tests {
|
||||
assert!(output.contains("warning:"));
|
||||
assert!(output.contains("error:"));
|
||||
assert!(output.contains("badKey"));
|
||||
assert!(output.contains("permissionMode"));
|
||||
assert!(output.contains("model"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
|
||||
@@ -37,6 +37,7 @@ pub enum AssistantEvent {
|
||||
id: String,
|
||||
name: String,
|
||||
input: String,
|
||||
thought_signature: Option<String>,
|
||||
},
|
||||
Usage(TokenUsage),
|
||||
PromptCache(PromptCacheEvent),
|
||||
@@ -381,7 +382,7 @@ where
|
||||
.blocks
|
||||
.iter()
|
||||
.filter_map(|block| match block {
|
||||
ContentBlock::ToolUse { id, name, input } => {
|
||||
ContentBlock::ToolUse { id, name, input, .. } => {
|
||||
Some((id.clone(), name.clone(), input.clone()))
|
||||
}
|
||||
_ => None,
|
||||
@@ -741,9 +742,9 @@ fn build_assistant_message(
|
||||
});
|
||||
}
|
||||
AssistantEvent::TextDelta(delta) => text.push_str(&delta),
|
||||
AssistantEvent::ToolUse { id, name, input } => {
|
||||
AssistantEvent::ToolUse { id, name, input, thought_signature } => {
|
||||
flush_text_block(&mut text, &mut blocks);
|
||||
blocks.push(ContentBlock::ToolUse { id, name, input });
|
||||
blocks.push(ContentBlock::ToolUse { id, name, input, thought_signature });
|
||||
}
|
||||
AssistantEvent::Usage(value) => usage = Some(value),
|
||||
AssistantEvent::PromptCache(event) => prompt_cache_events.push(event),
|
||||
@@ -880,6 +881,7 @@ mod tests {
|
||||
id: "tool-1".to_string(),
|
||||
name: "add".to_string(),
|
||||
input: "2,2".to_string(),
|
||||
thought_signature: None,
|
||||
},
|
||||
AssistantEvent::Usage(TokenUsage {
|
||||
input_tokens: 20,
|
||||
@@ -1046,6 +1048,7 @@ mod tests {
|
||||
id: "tool-1".to_string(),
|
||||
name: "blocked".to_string(),
|
||||
input: "secret".to_string(),
|
||||
thought_signature: None,
|
||||
},
|
||||
AssistantEvent::MessageStop,
|
||||
])
|
||||
@@ -1091,6 +1094,7 @@ mod tests {
|
||||
id: "tool-1".to_string(),
|
||||
name: "blocked".to_string(),
|
||||
input: r#"{"path":"secret.txt"}"#.to_string(),
|
||||
thought_signature: None,
|
||||
},
|
||||
AssistantEvent::MessageStop,
|
||||
])
|
||||
@@ -1153,6 +1157,7 @@ mod tests {
|
||||
id: "tool-1".to_string(),
|
||||
name: "blocked".to_string(),
|
||||
input: r#"{"path":"secret.txt"}"#.to_string(),
|
||||
thought_signature: None,
|
||||
},
|
||||
AssistantEvent::MessageStop,
|
||||
])
|
||||
@@ -1213,6 +1218,7 @@ mod tests {
|
||||
id: "tool-1".to_string(),
|
||||
name: "add".to_string(),
|
||||
input: r#"{"lhs":2,"rhs":2}"#.to_string(),
|
||||
thought_signature: None,
|
||||
},
|
||||
AssistantEvent::MessageStop,
|
||||
]),
|
||||
@@ -1288,6 +1294,7 @@ mod tests {
|
||||
id: "tool-1".to_string(),
|
||||
name: "fail".to_string(),
|
||||
input: r#"{"path":"README.md"}"#.to_string(),
|
||||
thought_signature: None,
|
||||
},
|
||||
AssistantEvent::MessageStop,
|
||||
]),
|
||||
@@ -1755,6 +1762,7 @@ mod tests {
|
||||
id: "tool-1".to_string(),
|
||||
name: "echo".to_string(),
|
||||
input: "payload".to_string(),
|
||||
thought_signature: None,
|
||||
},
|
||||
AssistantEvent::MessageStop,
|
||||
];
|
||||
@@ -1778,6 +1786,7 @@ mod tests {
|
||||
id: "tool-1".to_string(),
|
||||
name: "echo".to_string(),
|
||||
input: "payload".to_string(),
|
||||
thought_signature: None,
|
||||
},
|
||||
]
|
||||
);
|
||||
@@ -1811,6 +1820,7 @@ mod tests {
|
||||
id: "tool-1".to_string(),
|
||||
name: "echo".to_string(),
|
||||
input: "payload".to_string(),
|
||||
thought_signature: None,
|
||||
},
|
||||
AssistantEvent::MessageStop,
|
||||
])
|
||||
|
||||
@@ -11,7 +11,7 @@ use std::time::Duration;
|
||||
|
||||
use serde_json::{json, Value};
|
||||
|
||||
use crate::config::{RuntimeFeatureConfig, RuntimeHookConfig};
|
||||
use crate::config::{RuntimeFeatureConfig, RuntimeHookCommand, RuntimeHookConfig};
|
||||
use crate::permissions::PermissionOverride;
|
||||
|
||||
const HOOK_PREVIEW_CHAR_LIMIT: usize = 160;
|
||||
@@ -182,7 +182,7 @@ impl HookRunner {
|
||||
) -> HookRunResult {
|
||||
Self::run_commands(
|
||||
HookEvent::PreToolUse,
|
||||
self.config.pre_tool_use(),
|
||||
self.config.pre_tool_use_entries(),
|
||||
tool_name,
|
||||
tool_input,
|
||||
None,
|
||||
@@ -232,7 +232,7 @@ impl HookRunner {
|
||||
) -> HookRunResult {
|
||||
Self::run_commands(
|
||||
HookEvent::PostToolUse,
|
||||
self.config.post_tool_use(),
|
||||
self.config.post_tool_use_entries(),
|
||||
tool_name,
|
||||
tool_input,
|
||||
Some(tool_output),
|
||||
@@ -282,7 +282,7 @@ impl HookRunner {
|
||||
) -> HookRunResult {
|
||||
Self::run_commands(
|
||||
HookEvent::PostToolUseFailure,
|
||||
self.config.post_tool_use_failure(),
|
||||
self.config.post_tool_use_failure_entries(),
|
||||
tool_name,
|
||||
tool_input,
|
||||
Some(tool_error),
|
||||
@@ -312,7 +312,7 @@ impl HookRunner {
|
||||
#[allow(clippy::too_many_arguments)]
|
||||
fn run_commands(
|
||||
event: HookEvent,
|
||||
commands: &[String],
|
||||
commands: &[RuntimeHookCommand],
|
||||
tool_name: &str,
|
||||
tool_input: &str,
|
||||
tool_output: Option<&str>,
|
||||
@@ -342,17 +342,21 @@ impl HookRunner {
|
||||
let payload = hook_payload(event, tool_name, tool_input, tool_output, is_error).to_string();
|
||||
let mut result = HookRunResult::allow(Vec::new());
|
||||
|
||||
for command in commands {
|
||||
for command in commands
|
||||
.iter()
|
||||
.filter(|command| command.matches_tool(tool_name))
|
||||
{
|
||||
let command_text = command.command();
|
||||
if let Some(reporter) = reporter.as_deref_mut() {
|
||||
reporter.on_event(&HookProgressEvent::Started {
|
||||
event,
|
||||
tool_name: tool_name.to_string(),
|
||||
command: command.clone(),
|
||||
command: command_text.to_string(),
|
||||
});
|
||||
}
|
||||
|
||||
match Self::run_command(
|
||||
command,
|
||||
command_text,
|
||||
event,
|
||||
tool_name,
|
||||
tool_input,
|
||||
@@ -366,7 +370,7 @@ impl HookRunner {
|
||||
reporter.on_event(&HookProgressEvent::Completed {
|
||||
event,
|
||||
tool_name: tool_name.to_string(),
|
||||
command: command.clone(),
|
||||
command: command_text.to_string(),
|
||||
});
|
||||
}
|
||||
merge_parsed_hook_output(&mut result, parsed);
|
||||
@@ -376,7 +380,7 @@ impl HookRunner {
|
||||
reporter.on_event(&HookProgressEvent::Completed {
|
||||
event,
|
||||
tool_name: tool_name.to_string(),
|
||||
command: command.clone(),
|
||||
command: command_text.to_string(),
|
||||
});
|
||||
}
|
||||
merge_parsed_hook_output(&mut result, parsed);
|
||||
@@ -388,7 +392,7 @@ impl HookRunner {
|
||||
reporter.on_event(&HookProgressEvent::Completed {
|
||||
event,
|
||||
tool_name: tool_name.to_string(),
|
||||
command: command.clone(),
|
||||
command: command_text.to_string(),
|
||||
});
|
||||
}
|
||||
merge_parsed_hook_output(&mut result, parsed);
|
||||
@@ -400,7 +404,7 @@ impl HookRunner {
|
||||
reporter.on_event(&HookProgressEvent::Cancelled {
|
||||
event,
|
||||
tool_name: tool_name.to_string(),
|
||||
command: command.clone(),
|
||||
command: command_text.to_string(),
|
||||
});
|
||||
}
|
||||
result.cancelled = true;
|
||||
@@ -825,7 +829,7 @@ mod tests {
|
||||
HookAbortSignal, HookEvent, HookProgressEvent, HookProgressReporter, HookRunResult,
|
||||
HookRunner,
|
||||
};
|
||||
use crate::config::{RuntimeFeatureConfig, RuntimeHookConfig};
|
||||
use crate::config::{RuntimeFeatureConfig, RuntimeHookCommand, RuntimeHookConfig};
|
||||
use crate::permissions::PermissionOverride;
|
||||
|
||||
struct RecordingReporter {
|
||||
@@ -851,6 +855,37 @@ mod tests {
|
||||
assert_eq!(result, HookRunResult::allow(vec!["pre ok".to_string()]));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn object_style_hook_matchers_filter_runtime_execution() {
|
||||
let runner = HookRunner::new(RuntimeHookConfig::from_hook_commands(
|
||||
vec![
|
||||
RuntimeHookCommand::new(shell_snippet("printf 'legacy'")),
|
||||
RuntimeHookCommand::with_matcher(
|
||||
shell_snippet("printf 'bash only'"),
|
||||
Some("Bash".to_string()),
|
||||
),
|
||||
RuntimeHookCommand::with_matcher(
|
||||
shell_snippet("printf 'read only'"),
|
||||
Some("Read*".to_string()),
|
||||
),
|
||||
],
|
||||
Vec::new(),
|
||||
Vec::new(),
|
||||
));
|
||||
|
||||
let read_result = runner.run_pre_tool_use("ReadFile", r#"{"path":"README.md"}"#);
|
||||
let bash_result = runner.run_pre_tool_use("Bash", r#"{"command":"pwd"}"#);
|
||||
|
||||
assert_eq!(
|
||||
read_result,
|
||||
HookRunResult::allow(vec!["legacy".to_string(), "read only".to_string()])
|
||||
);
|
||||
assert_eq!(
|
||||
bash_result,
|
||||
HookRunResult::allow(vec!["legacy".to_string(), "bash only".to_string()])
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn denies_exit_code_two() {
|
||||
let runner = HookRunner::new(RuntimeHookConfig::new(
|
||||
|
||||
@@ -65,11 +65,12 @@ pub use compact::{
|
||||
get_compact_continuation_message, should_compact, CompactionConfig, CompactionResult,
|
||||
};
|
||||
pub use config::{
|
||||
suppress_config_warnings_for_json_mode, ConfigEntry, ConfigError, ConfigLoader, ConfigSource,
|
||||
McpConfigCollection, McpManagedProxyServerConfig, McpOAuthConfig, McpRemoteServerConfig,
|
||||
suppress_config_warnings_for_json_mode, ConfigEntry, ConfigError, ConfigFileReport,
|
||||
ConfigFileStatus, ConfigInspection, ConfigLoader, ConfigSource, McpConfigCollection,
|
||||
McpInvalidServerConfig, McpManagedProxyServerConfig, McpOAuthConfig, McpRemoteServerConfig,
|
||||
McpSdkServerConfig, McpServerConfig, McpStdioServerConfig, McpTransport,
|
||||
McpWebSocketServerConfig, OAuthConfig, ProviderFallbackConfig, ResolvedPermissionMode,
|
||||
RulesImportConfig, RuntimeConfig, RuntimeFeatureConfig, RuntimeHookConfig,
|
||||
RulesImportConfig, RuntimeConfig, RuntimeFeatureConfig, RuntimeHookCommand, RuntimeHookConfig,
|
||||
RuntimePermissionRuleConfig, RuntimePluginConfig, ScopedMcpServerConfig,
|
||||
CLAW_SETTINGS_SCHEMA_NAME,
|
||||
};
|
||||
@@ -142,8 +143,9 @@ pub use policy_engine::{
|
||||
PolicyEvaluation, PolicyRule, ReconcileReason, ReviewStatus,
|
||||
};
|
||||
pub use prompt::{
|
||||
load_system_prompt, prepend_bullets, ContextFile, ModelFamilyIdentity, ProjectContext,
|
||||
PromptBuildError, SystemPromptBuilder, FRONTIER_MODEL_NAME, SYSTEM_PROMPT_DYNAMIC_BOUNDARY,
|
||||
load_system_prompt, load_system_prompt_with_context, prepend_bullets, ContextFile,
|
||||
ModelFamilyIdentity, ProjectContext, PromptBuildError, SystemPromptBuilder,
|
||||
FRONTIER_MODEL_NAME, SYSTEM_PROMPT_DYNAMIC_BOUNDARY,
|
||||
};
|
||||
pub use recovery_recipes::{
|
||||
attempt_recovery, recipe_for, EscalationPolicy, FailureScenario, RecoveryAttemptState,
|
||||
|
||||
@@ -69,6 +69,18 @@ pub struct ContextFile {
|
||||
pub content: String,
|
||||
}
|
||||
|
||||
impl ContextFile {
|
||||
#[must_use]
|
||||
pub fn source(&self) -> &'static str {
|
||||
instruction_file_source(&self.path)
|
||||
}
|
||||
|
||||
#[must_use]
|
||||
pub fn char_count(&self) -> usize {
|
||||
self.content.chars().count()
|
||||
}
|
||||
}
|
||||
|
||||
/// Project-local context injected into the rendered system prompt.
|
||||
#[derive(Debug, Clone, Default, PartialEq, Eq)]
|
||||
pub struct ProjectContext {
|
||||
@@ -256,22 +268,36 @@ pub fn prepend_bullets(items: Vec<String>) -> Vec<String> {
|
||||
items.into_iter().map(|item| format!(" - {item}")).collect()
|
||||
}
|
||||
|
||||
fn instruction_file_source(path: &Path) -> &'static str {
|
||||
let file_name = path.file_name().and_then(|name| name.to_str());
|
||||
let parent_name = path
|
||||
.parent()
|
||||
.and_then(|parent| parent.file_name())
|
||||
.and_then(|name| name.to_str());
|
||||
|
||||
match (parent_name, file_name) {
|
||||
(Some(".claw"), Some("CLAUDE.md")) => "claw_claude_md",
|
||||
(Some(".claude"), Some("CLAUDE.md")) => "claude_claude_md",
|
||||
(_, Some("CLAUDE.md")) => "claude_md",
|
||||
(_, Some("CLAW.md")) => "claw_md",
|
||||
(_, Some("AGENTS.md")) => "agents_md",
|
||||
(_, Some("CLAUDE.local.md")) => "claude_local_md",
|
||||
(Some(".claw"), Some("instructions.md")) => "claw_instructions",
|
||||
_ => "rule_file",
|
||||
}
|
||||
}
|
||||
fn discover_instruction_files(
|
||||
cwd: &Path,
|
||||
rules_import: &RulesImportConfig,
|
||||
) -> std::io::Result<Vec<ContextFile>> {
|
||||
let mut directories = Vec::new();
|
||||
let mut cursor = Some(cwd);
|
||||
while let Some(dir) = cursor {
|
||||
directories.push(dir.to_path_buf());
|
||||
cursor = dir.parent();
|
||||
}
|
||||
let mut directories = instruction_discovery_dirs(cwd);
|
||||
directories.reverse();
|
||||
|
||||
let mut files = Vec::new();
|
||||
for dir in directories {
|
||||
for candidate in [
|
||||
dir.join("CLAUDE.md"),
|
||||
dir.join("CLAW.md"),
|
||||
dir.join("AGENTS.md"),
|
||||
dir.join("CLAUDE.local.md"),
|
||||
dir.join(".claw").join("CLAUDE.md"),
|
||||
@@ -287,6 +313,32 @@ fn discover_instruction_files(
|
||||
Ok(dedupe_instruction_files(files))
|
||||
}
|
||||
|
||||
fn instruction_discovery_dirs(cwd: &Path) -> Vec<PathBuf> {
|
||||
let boundary = nearest_git_root(cwd).unwrap_or_else(|| cwd.to_path_buf());
|
||||
let mut directories = Vec::new();
|
||||
let mut cursor = Some(cwd);
|
||||
while let Some(dir) = cursor {
|
||||
directories.push(dir.to_path_buf());
|
||||
if dir == boundary {
|
||||
break;
|
||||
}
|
||||
cursor = dir.parent();
|
||||
}
|
||||
directories
|
||||
}
|
||||
|
||||
fn nearest_git_root(cwd: &Path) -> Option<PathBuf> {
|
||||
let mut cursor = Some(cwd);
|
||||
while let Some(dir) = cursor {
|
||||
let git_marker = dir.join(".git");
|
||||
if git_marker.is_dir() || git_marker.is_file() {
|
||||
return Some(dir.to_path_buf());
|
||||
}
|
||||
cursor = dir.parent();
|
||||
}
|
||||
None
|
||||
}
|
||||
|
||||
fn push_context_file(files: &mut Vec<ContextFile>, path: PathBuf) -> std::io::Result<()> {
|
||||
if path.is_dir() {
|
||||
return Ok(());
|
||||
@@ -430,7 +482,7 @@ fn render_project_context(project_context: &ProjectContext) -> String {
|
||||
];
|
||||
if !project_context.instruction_files.is_empty() {
|
||||
bullets.push(format!(
|
||||
"Claude instruction files discovered: {}.",
|
||||
"Project instruction files discovered: {}.",
|
||||
project_context.instruction_files.len()
|
||||
));
|
||||
}
|
||||
@@ -465,7 +517,7 @@ fn render_project_context(project_context: &ProjectContext) -> String {
|
||||
}
|
||||
|
||||
fn render_instruction_files(files: &[ContextFile]) -> String {
|
||||
let mut sections = vec!["# Claude instructions".to_string()];
|
||||
let mut sections = vec!["# Project instructions".to_string()];
|
||||
let mut remaining_chars = MAX_TOTAL_INSTRUCTION_CHARS;
|
||||
for file in files {
|
||||
if remaining_chars == 0 {
|
||||
@@ -573,16 +625,31 @@ pub fn load_system_prompt(
|
||||
os_version: impl Into<String>,
|
||||
model_family: ModelFamilyIdentity,
|
||||
) -> Result<Vec<String>, PromptBuildError> {
|
||||
let cwd = cwd.into();
|
||||
let (sections, _) =
|
||||
load_system_prompt_with_context(cwd, current_date, os_name, os_version, model_family)?;
|
||||
Ok(sections)
|
||||
}
|
||||
|
||||
/// Loads config and project context, then renders the system prompt text plus metadata.
|
||||
pub fn load_system_prompt_with_context(
|
||||
cwd: impl Into<PathBuf>,
|
||||
current_date: impl Into<String>,
|
||||
os_name: impl Into<String>,
|
||||
os_version: impl Into<String>,
|
||||
model_family: ModelFamilyIdentity,
|
||||
) -> Result<(Vec<String>, ProjectContext), PromptBuildError> {
|
||||
let cwd = cwd.into();
|
||||
let config = ConfigLoader::default_for(&cwd).load()?;
|
||||
let project_context =
|
||||
discover_with_git_and_rules_import(&cwd, current_date.into(), config.rules_import())?;
|
||||
Ok(SystemPromptBuilder::new()
|
||||
let sections = SystemPromptBuilder::new()
|
||||
.with_os(os_name, os_version)
|
||||
.with_model_family(model_family)
|
||||
.with_project_context(project_context)
|
||||
.with_project_context(project_context.clone())
|
||||
.with_runtime_config(config)
|
||||
.build())
|
||||
.build();
|
||||
Ok((sections, project_context))
|
||||
}
|
||||
|
||||
fn render_config_section(config: &RuntimeConfig) -> String {
|
||||
@@ -766,6 +833,7 @@ mod tests {
|
||||
let root = temp_dir();
|
||||
let nested = root.join("apps").join("api");
|
||||
fs::create_dir_all(nested.join(".claw")).expect("nested claw dir");
|
||||
fs::create_dir(root.join(".git")).expect("git boundary");
|
||||
fs::write(root.join("CLAUDE.md"), "root instructions").expect("write root instructions");
|
||||
fs::write(root.join("CLAUDE.local.md"), "local instructions")
|
||||
.expect("write local instructions");
|
||||
@@ -844,10 +912,11 @@ mod tests {
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn discovers_claude_agents_and_dot_claude_instruction_files_together() {
|
||||
fn discovers_claude_claw_agents_and_dot_claude_instruction_files_together() {
|
||||
let root = temp_dir();
|
||||
fs::create_dir_all(root.join(".claude")).expect("dot claude dir");
|
||||
fs::write(root.join("CLAUDE.md"), "claude instructions").expect("write CLAUDE.md");
|
||||
fs::write(root.join("CLAW.md"), "claw instructions").expect("write CLAW.md");
|
||||
fs::write(root.join("AGENTS.md"), "agents instructions").expect("write AGENTS.md");
|
||||
fs::write(
|
||||
root.join(".claude").join("CLAUDE.md"),
|
||||
@@ -857,8 +926,18 @@ mod tests {
|
||||
|
||||
let context = ProjectContext::discover(&root, "2026-03-31").expect("context should load");
|
||||
let rendered = render_instruction_files(&context.instruction_files);
|
||||
let sources = context
|
||||
.instruction_files
|
||||
.iter()
|
||||
.map(ContextFile::source)
|
||||
.collect::<Vec<_>>();
|
||||
|
||||
assert_eq!(
|
||||
sources,
|
||||
vec!["claude_md", "claw_md", "agents_md", "claude_claude_md"]
|
||||
);
|
||||
assert!(rendered.contains("claude instructions"));
|
||||
assert!(rendered.contains("claw instructions"));
|
||||
assert!(rendered.contains("agents instructions"));
|
||||
assert!(rendered.contains("dot claude instructions"));
|
||||
fs::remove_dir_all(root).expect("cleanup temp dir");
|
||||
@@ -869,6 +948,7 @@ mod tests {
|
||||
let root = temp_dir();
|
||||
let nested = root.join("apps").join("api");
|
||||
fs::create_dir_all(&nested).expect("nested dir");
|
||||
fs::create_dir(root.join(".git")).expect("git boundary");
|
||||
fs::write(root.join("CLAUDE.md"), "same rules\n\n").expect("write root");
|
||||
fs::write(nested.join("CLAUDE.md"), "same rules\n").expect("write nested");
|
||||
|
||||
@@ -881,6 +961,50 @@ mod tests {
|
||||
fs::remove_dir_all(root).expect("cleanup temp dir");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn discovery_stops_at_git_root_boundary_439() {
|
||||
let root = temp_dir();
|
||||
let repo = root.join("repo");
|
||||
let nested = repo.join("subproj").join("deep").join("nest");
|
||||
fs::create_dir_all(&nested).expect("nested dir");
|
||||
fs::create_dir(repo.join(".git")).expect("git boundary");
|
||||
fs::write(root.join("CLAUDE.md"), "PARENT_CLAUDE").expect("write parent");
|
||||
fs::write(repo.join("CLAUDE.md"), "REPO_CLAUDE").expect("write repo");
|
||||
fs::write(repo.join("subproj").join("CLAUDE.md"), "CHILD_CLAUDE").expect("write child");
|
||||
fs::write(
|
||||
repo.join("subproj").join("deep").join("CLAUDE.md"),
|
||||
"DEEP_CLAUDE",
|
||||
)
|
||||
.expect("write deep");
|
||||
|
||||
let context = ProjectContext::discover(&nested, "2026-03-31").expect("context should load");
|
||||
let rendered = render_instruction_files(&context.instruction_files);
|
||||
|
||||
assert!(!rendered.contains("PARENT_CLAUDE"));
|
||||
assert!(rendered.contains("REPO_CLAUDE"));
|
||||
assert!(rendered.contains("CHILD_CLAUDE"));
|
||||
assert!(rendered.contains("DEEP_CLAUDE"));
|
||||
assert_eq!(context.instruction_files.len(), 3);
|
||||
fs::remove_dir_all(root).expect("cleanup temp dir");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn discovery_without_git_root_stays_cwd_local_439() {
|
||||
let root = temp_dir();
|
||||
let nested = root.join("scratch");
|
||||
fs::create_dir_all(&nested).expect("nested dir");
|
||||
fs::write(root.join("CLAUDE.md"), "PARENT_CLAUDE").expect("write parent");
|
||||
fs::write(nested.join("CLAUDE.md"), "SCRATCH_CLAUDE").expect("write scratch");
|
||||
|
||||
let context = ProjectContext::discover(&nested, "2026-03-31").expect("context should load");
|
||||
let rendered = render_instruction_files(&context.instruction_files);
|
||||
|
||||
assert!(!rendered.contains("PARENT_CLAUDE"));
|
||||
assert!(rendered.contains("SCRATCH_CLAUDE"));
|
||||
assert_eq!(context.instruction_files.len(), 1);
|
||||
fs::remove_dir_all(root).expect("cleanup temp dir");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn truncates_large_instruction_content_for_rendering() {
|
||||
let rendered = render_instruction_content(&"x".repeat(4500));
|
||||
@@ -1218,7 +1342,7 @@ mod tests {
|
||||
|
||||
assert!(prompt.contains("# System"));
|
||||
assert!(prompt.contains("# Project context"));
|
||||
assert!(prompt.contains("# Claude instructions"));
|
||||
assert!(prompt.contains("# Project instructions"));
|
||||
assert!(prompt.contains("Project rules"));
|
||||
assert!(prompt.contains("permissionMode"));
|
||||
assert!(prompt.contains(SYSTEM_PROMPT_DYNAMIC_BOUNDARY));
|
||||
@@ -1263,7 +1387,7 @@ mod tests {
|
||||
path: PathBuf::from("/tmp/project/CLAUDE.md"),
|
||||
content: "Project rules".to_string(),
|
||||
}]);
|
||||
assert!(rendered.contains("# Claude instructions"));
|
||||
assert!(rendered.contains("# Project instructions"));
|
||||
assert!(rendered.contains("scope: /tmp/project"));
|
||||
assert!(rendered.contains("Project rules"));
|
||||
}
|
||||
|
||||
@@ -42,6 +42,7 @@ pub enum ContentBlock {
|
||||
id: String,
|
||||
name: String,
|
||||
input: String,
|
||||
thought_signature: Option<String>,
|
||||
},
|
||||
ToolResult {
|
||||
tool_use_id: String,
|
||||
@@ -817,7 +818,7 @@ impl ContentBlock {
|
||||
);
|
||||
}
|
||||
}
|
||||
Self::ToolUse { id, name, input } => {
|
||||
Self::ToolUse { id, name, input, thought_signature } => {
|
||||
object.insert(
|
||||
"type".to_string(),
|
||||
JsonValue::String("tool_use".to_string()),
|
||||
@@ -825,6 +826,12 @@ impl ContentBlock {
|
||||
object.insert("id".to_string(), JsonValue::String(id.clone()));
|
||||
object.insert("name".to_string(), JsonValue::String(name.clone()));
|
||||
object.insert("input".to_string(), JsonValue::String(input.clone()));
|
||||
if let Some(sig) = thought_signature {
|
||||
object.insert(
|
||||
"thought_signature".to_string(),
|
||||
JsonValue::String(sig.clone()),
|
||||
);
|
||||
}
|
||||
}
|
||||
Self::ToolResult {
|
||||
tool_use_id,
|
||||
@@ -874,6 +881,7 @@ impl ContentBlock {
|
||||
id: required_string(object, "id")?,
|
||||
name: required_string(object, "name")?,
|
||||
input: required_string(object, "input")?,
|
||||
thought_signature: object.get("thought_signature").and_then(JsonValue::as_str).map(String::from)
|
||||
}),
|
||||
"tool_result" => Ok(Self::ToolResult {
|
||||
tool_use_id: required_string(object, "tool_use_id")?,
|
||||
@@ -1069,7 +1077,7 @@ fn persisted_block_json(block: &ContentBlock) -> JsonValue {
|
||||
);
|
||||
}
|
||||
}
|
||||
ContentBlock::ToolUse { id, name, input } => {
|
||||
ContentBlock::ToolUse { id, name, input, thought_signature } => {
|
||||
object.insert(
|
||||
"type".to_string(),
|
||||
JsonValue::String("tool_use".to_string()),
|
||||
@@ -1083,6 +1091,12 @@ fn persisted_block_json(block: &ContentBlock) -> JsonValue {
|
||||
"input".to_string(),
|
||||
JsonValue::String(sanitize_jsonl_field(input)),
|
||||
);
|
||||
if let Some(sig) = thought_signature {
|
||||
object.insert(
|
||||
"thought_signature".to_string(),
|
||||
JsonValue::String(sanitize_jsonl_field(sig)),
|
||||
);
|
||||
}
|
||||
}
|
||||
ContentBlock::ToolResult {
|
||||
tool_use_id,
|
||||
@@ -1433,6 +1447,7 @@ mod tests {
|
||||
id: "tool-1".to_string(),
|
||||
name: "bash".to_string(),
|
||||
input: "echo hi".to_string(),
|
||||
thought_signature: None,
|
||||
},
|
||||
],
|
||||
Some(TokenUsage {
|
||||
@@ -1596,6 +1611,7 @@ mod tests {
|
||||
id: "tool-1".to_string(),
|
||||
name: "bash".to_string(),
|
||||
input: format!("Authorization: Bearer {secret}"),
|
||||
thought_signature: None,
|
||||
},
|
||||
]))
|
||||
.expect("tool use should append");
|
||||
|
||||
@@ -28,7 +28,8 @@ pub struct SessionStore {
|
||||
impl SessionStore {
|
||||
/// Build a store from the server's current working directory.
|
||||
///
|
||||
/// The on-disk layout becomes `<cwd>/.claw/sessions/<workspace_hash>/`.
|
||||
/// The on-disk layout is `<cwd>/.claw/sessions/<workspace_hash>/`,
|
||||
/// created lazily on first successful session save.
|
||||
pub fn from_cwd(cwd: impl AsRef<Path>) -> Result<Self, SessionControlError> {
|
||||
let cwd = cwd.as_ref();
|
||||
// #151: canonicalize so equivalent paths (symlinks, relative vs
|
||||
@@ -40,7 +41,6 @@ impl SessionStore {
|
||||
.join(".claw")
|
||||
.join("sessions")
|
||||
.join(workspace_fingerprint(&canonical_cwd));
|
||||
fs::create_dir_all(&sessions_root)?;
|
||||
Ok(Self {
|
||||
sessions_root,
|
||||
workspace_root: canonical_cwd,
|
||||
@@ -49,7 +49,8 @@ impl SessionStore {
|
||||
|
||||
/// Build a store from an explicit `--data-dir` flag.
|
||||
///
|
||||
/// The on-disk layout becomes `<data_dir>/sessions/<workspace_hash>/`
|
||||
/// The on-disk layout is `<data_dir>/sessions/<workspace_hash>/`,
|
||||
/// created lazily on first successful session save.
|
||||
/// where `<workspace_hash>` is derived from `workspace_root`.
|
||||
pub fn from_data_dir(
|
||||
data_dir: impl AsRef<Path>,
|
||||
@@ -64,7 +65,6 @@ impl SessionStore {
|
||||
.as_ref()
|
||||
.join("sessions")
|
||||
.join(workspace_fingerprint(&canonical_workspace));
|
||||
fs::create_dir_all(&sessions_root)?;
|
||||
Ok(Self {
|
||||
sessions_root,
|
||||
workspace_root: canonical_workspace,
|
||||
@@ -760,14 +760,21 @@ mod tests {
|
||||
use crate::session::Session;
|
||||
use std::fs;
|
||||
use std::path::{Path, PathBuf};
|
||||
use std::sync::atomic::{AtomicU64, Ordering};
|
||||
use std::time::{SystemTime, UNIX_EPOCH};
|
||||
|
||||
static TEMP_COUNTER: AtomicU64 = AtomicU64::new(0);
|
||||
|
||||
fn temp_dir() -> PathBuf {
|
||||
let nanos = SystemTime::now()
|
||||
.duration_since(UNIX_EPOCH)
|
||||
.expect("time should be after epoch")
|
||||
.as_nanos();
|
||||
std::env::temp_dir().join(format!("runtime-session-control-{nanos}"))
|
||||
let counter = TEMP_COUNTER.fetch_add(1, Ordering::Relaxed);
|
||||
std::env::temp_dir().join(format!(
|
||||
"runtime-session-control-{}-{nanos}-{counter}",
|
||||
std::process::id()
|
||||
))
|
||||
}
|
||||
|
||||
fn persist_session(root: &Path, text: &str) -> Session {
|
||||
@@ -981,6 +988,38 @@ mod tests {
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn session_store_from_cwd_is_side_effect_free_until_save() {
|
||||
// given
|
||||
let base = temp_dir();
|
||||
let workspace = base.join("fresh-workspace");
|
||||
fs::create_dir_all(&workspace).expect("workspace should exist");
|
||||
|
||||
// when
|
||||
let store = SessionStore::from_cwd(&workspace).expect("store should build");
|
||||
|
||||
// then — resolving the store must not create .claw/session partitions.
|
||||
assert!(
|
||||
!workspace.join(".claw").exists(),
|
||||
"session store construction must not create .claw side effects"
|
||||
);
|
||||
assert!(
|
||||
!store.sessions_dir().exists(),
|
||||
"session partition should be created lazily on save"
|
||||
);
|
||||
|
||||
let session = persist_session_via_store(&store, "first saved turn");
|
||||
assert!(
|
||||
store
|
||||
.sessions_dir()
|
||||
.join(format!("{}.jsonl", session.session_id))
|
||||
.exists(),
|
||||
"saving a managed session should create the lazy session partition"
|
||||
);
|
||||
|
||||
fs::remove_dir_all(base).expect("temp dir should clean up");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn session_store_from_cwd_isolates_sessions_by_workspace() {
|
||||
// given
|
||||
|
||||
@@ -686,6 +686,7 @@ mod tests {
|
||||
id: "1".to_string(),
|
||||
name: "read_file".to_string(),
|
||||
input: r#"{"path":"src/main.rs"}"#.to_string(),
|
||||
thought_signature: None,
|
||||
}]),
|
||||
ConversationMessage::tool_result(
|
||||
"1",
|
||||
@@ -697,6 +698,7 @@ mod tests {
|
||||
id: "2".to_string(),
|
||||
name: "edit_file".to_string(),
|
||||
input: r#"{"path":"src/main.rs","old":"old","new":"new"}"#.to_string(),
|
||||
thought_signature: None,
|
||||
}]),
|
||||
ConversationMessage::tool_result(
|
||||
"2",
|
||||
@@ -718,6 +720,7 @@ mod tests {
|
||||
id: "1".to_string(),
|
||||
name: "read_file".to_string(),
|
||||
input: r#"{"path":"src/main.rs"}"#.to_string(),
|
||||
thought_signature: None,
|
||||
}]),
|
||||
ConversationMessage::tool_result(
|
||||
"1",
|
||||
@@ -746,6 +749,7 @@ mod tests {
|
||||
id: "t".to_string(),
|
||||
name: "bash".to_string(),
|
||||
input: r#"{"command":"ls"}"#.to_string(),
|
||||
thought_signature: None,
|
||||
},
|
||||
]));
|
||||
|
||||
@@ -764,6 +768,7 @@ mod tests {
|
||||
id: format!("read_{i}"),
|
||||
name: "read_file".to_string(),
|
||||
input: format!(r#"{{"path":"src/{i}.rs"}}"#),
|
||||
thought_signature: None,
|
||||
},
|
||||
]));
|
||||
messages.push(ConversationMessage::tool_result(
|
||||
@@ -789,6 +794,7 @@ mod tests {
|
||||
id: "1".to_string(),
|
||||
name: "read_file".to_string(),
|
||||
input: r#"{"path":"src/main.rs"}"#.to_string(),
|
||||
thought_signature: None,
|
||||
}]),
|
||||
ConversationMessage::tool_result(
|
||||
"1",
|
||||
@@ -800,6 +806,7 @@ mod tests {
|
||||
id: "2".to_string(),
|
||||
name: "edit_file".to_string(),
|
||||
input: r#"{"path":"src/main.rs","old":"buggy","new":"fixed"}"#.to_string(),
|
||||
thought_signature: None,
|
||||
}]),
|
||||
ConversationMessage::tool_result(
|
||||
"2",
|
||||
|
||||
@@ -12,7 +12,6 @@ path = "src/main.rs"
|
||||
[dependencies]
|
||||
api = { path = "../api" }
|
||||
commands = { path = "../commands" }
|
||||
compat-harness = { path = "../compat-harness" }
|
||||
crossterm = "0.28"
|
||||
pulldown-cmark = "0.13"
|
||||
rustyline = "15"
|
||||
|
||||
@@ -1,10 +1,9 @@
|
||||
use std::env;
|
||||
use std::process::Command;
|
||||
|
||||
fn main() {
|
||||
// Get git SHA (short hash)
|
||||
let git_sha = Command::new("git")
|
||||
.args(["rev-parse", "--short", "HEAD"])
|
||||
fn command_output(program: &str, args: &[&str]) -> Option<String> {
|
||||
Command::new(program)
|
||||
.args(args)
|
||||
.output()
|
||||
.ok()
|
||||
.and_then(|output| {
|
||||
@@ -14,11 +13,37 @@ fn main() {
|
||||
None
|
||||
}
|
||||
})
|
||||
.map_or_else(|| "unknown".to_string(), |s| s.trim().to_string());
|
||||
.map(|value| value.trim().to_string())
|
||||
.filter(|value| !value.is_empty())
|
||||
}
|
||||
|
||||
fn main() {
|
||||
let git_sha =
|
||||
command_output("git", &["rev-parse", "HEAD"]).unwrap_or_else(|| "unknown".to_string());
|
||||
let git_sha_short = command_output("git", &["rev-parse", "--short=12", "HEAD"])
|
||||
.or_else(|| git_sha.get(..git_sha.len().min(12)).map(str::to_string))
|
||||
.unwrap_or_else(|| "unknown".to_string());
|
||||
let git_dirty = command_output("git", &["status", "--porcelain"])
|
||||
.map(|status| (!status.trim().is_empty()).to_string())
|
||||
.unwrap_or_else(|| "false".to_string());
|
||||
let git_branch = command_output("git", &["branch", "--show-current"])
|
||||
.unwrap_or_else(|| "unknown".to_string());
|
||||
let git_commit_date = command_output("git", &["show", "-s", "--format=%cI", "HEAD"])
|
||||
.unwrap_or_else(|| "unknown".to_string());
|
||||
let git_commit_timestamp = command_output("git", &["show", "-s", "--format=%ct", "HEAD"])
|
||||
.unwrap_or_else(|| "unknown".to_string());
|
||||
let rustc_version =
|
||||
command_output("rustc", &["--version"]).unwrap_or_else(|| "unknown".to_string());
|
||||
|
||||
println!("cargo:rustc-env=GIT_SHA={git_sha}");
|
||||
println!("cargo:rustc-env=GIT_SHA_SHORT={git_sha_short}");
|
||||
println!("cargo:rustc-env=GIT_DIRTY={git_dirty}");
|
||||
println!("cargo:rustc-env=GIT_BRANCH={git_branch}");
|
||||
println!("cargo:rustc-env=GIT_COMMIT_DATE={git_commit_date}");
|
||||
println!("cargo:rustc-env=GIT_COMMIT_TIMESTAMP={git_commit_timestamp}");
|
||||
println!("cargo:rustc-env=RUSTC_VERSION={rustc_version}");
|
||||
|
||||
// TARGET is always set by Cargo during build
|
||||
// TARGET is always set by Cargo during build.
|
||||
let target = env::var("TARGET").unwrap_or_else(|_| "unknown".to_string());
|
||||
println!("cargo:rustc-env=TARGET={target}");
|
||||
|
||||
@@ -35,23 +60,12 @@ fn main() {
|
||||
})
|
||||
.or_else(|| std::env::var("BUILD_DATE").ok())
|
||||
.unwrap_or_else(|| {
|
||||
// Fall back to current date via `date` command
|
||||
Command::new("date")
|
||||
.args(["+%Y-%m-%d"])
|
||||
.output()
|
||||
.ok()
|
||||
.and_then(|o| {
|
||||
if o.status.success() {
|
||||
String::from_utf8(o.stdout).ok()
|
||||
} else {
|
||||
None
|
||||
}
|
||||
})
|
||||
.map_or_else(|| "unknown".to_string(), |s| s.trim().to_string())
|
||||
command_output("date", &["+%Y-%m-%d"]).unwrap_or_else(|| "unknown".to_string())
|
||||
});
|
||||
println!("cargo:rustc-env=BUILD_DATE={build_date}");
|
||||
|
||||
// Rerun if git state changes
|
||||
println!("cargo:rerun-if-changed=.git/HEAD");
|
||||
println!("cargo:rerun-if-changed=.git/refs");
|
||||
// Rerun if git state changes. Paths are relative to this package root.
|
||||
println!("cargo:rerun-if-changed=../../../.git/HEAD");
|
||||
println!("cargo:rerun-if-changed=../../../.git/refs");
|
||||
println!("cargo:rerun-if-changed=../../../.git/index");
|
||||
}
|
||||
|
||||
@@ -4,7 +4,14 @@ use std::path::{Path, PathBuf};
|
||||
const STARTER_CLAW_JSON: &str = concat!(
|
||||
"{\n",
|
||||
" \"permissions\": {\n",
|
||||
" \"defaultMode\": \"dontAsk\"\n",
|
||||
" \"defaultMode\": \"acceptEdits\"\n",
|
||||
" }\n",
|
||||
"}\n",
|
||||
);
|
||||
const STARTER_SETTINGS_JSON: &str = concat!(
|
||||
"{\n",
|
||||
" \"permissions\": {\n",
|
||||
" \"defaultMode\": \"acceptEdits\"\n",
|
||||
" }\n",
|
||||
"}\n",
|
||||
);
|
||||
@@ -15,6 +22,8 @@ const GITIGNORE_ENTRIES: [&str; 3] = [".claw/settings.local.json", ".claw/sessio
|
||||
pub(crate) enum InitStatus {
|
||||
Created,
|
||||
Updated,
|
||||
Partial,
|
||||
Deferred,
|
||||
Skipped,
|
||||
}
|
||||
|
||||
@@ -24,6 +33,8 @@ impl InitStatus {
|
||||
match self {
|
||||
Self::Created => "created",
|
||||
Self::Updated => "updated",
|
||||
Self::Partial => "partial (created missing sub-files)",
|
||||
Self::Deferred => "deferred (created on first session save)",
|
||||
Self::Skipped => "skipped (already exists)",
|
||||
}
|
||||
}
|
||||
@@ -36,6 +47,8 @@ impl InitStatus {
|
||||
match self {
|
||||
Self::Created => "created",
|
||||
Self::Updated => "updated",
|
||||
Self::Partial => "partial",
|
||||
Self::Deferred => "deferred",
|
||||
Self::Skipped => "skipped",
|
||||
}
|
||||
}
|
||||
@@ -123,9 +136,30 @@ pub(crate) fn initialize_repo(cwd: &Path) -> Result<InitReport, Box<dyn std::err
|
||||
let mut artifacts = Vec::new();
|
||||
|
||||
let claw_dir = cwd.join(".claw");
|
||||
let claw_dir_status = ensure_dir(&claw_dir)?;
|
||||
let settings_json = claw_dir.join("settings.json");
|
||||
let settings_status = write_file_if_missing(&settings_json, STARTER_SETTINGS_JSON)?;
|
||||
let claw_dir_status =
|
||||
if claw_dir_status == InitStatus::Skipped && settings_status == InitStatus::Created {
|
||||
InitStatus::Partial
|
||||
} else {
|
||||
claw_dir_status
|
||||
};
|
||||
artifacts.push(InitArtifact {
|
||||
name: ".claw/",
|
||||
status: ensure_dir(&claw_dir)?,
|
||||
status: claw_dir_status,
|
||||
});
|
||||
artifacts.push(InitArtifact {
|
||||
name: ".claw/settings.json",
|
||||
status: settings_status,
|
||||
});
|
||||
artifacts.push(InitArtifact {
|
||||
name: ".claw/sessions/",
|
||||
status: if claw_dir.join("sessions").is_dir() {
|
||||
InitStatus::Skipped
|
||||
} else {
|
||||
InitStatus::Deferred
|
||||
},
|
||||
});
|
||||
|
||||
let claw_json = cwd.join(".claw.json");
|
||||
@@ -414,11 +448,26 @@ mod tests {
|
||||
concat!(
|
||||
"{\n",
|
||||
" \"permissions\": {\n",
|
||||
" \"defaultMode\": \"dontAsk\"\n",
|
||||
" \"defaultMode\": \"acceptEdits\"\n",
|
||||
" }\n",
|
||||
"}\n",
|
||||
)
|
||||
);
|
||||
assert_eq!(
|
||||
fs::read_to_string(root.join(".claw").join("settings.json"))
|
||||
.expect("read project settings"),
|
||||
concat!(
|
||||
"{\n",
|
||||
" \"permissions\": {\n",
|
||||
" \"defaultMode\": \"acceptEdits\"\n",
|
||||
" }\n",
|
||||
"}\n",
|
||||
)
|
||||
);
|
||||
assert!(
|
||||
!root.join(".claw").join("sessions").exists(),
|
||||
"sessions directory should be deferred until first session save"
|
||||
);
|
||||
let gitignore = fs::read_to_string(root.join(".gitignore")).expect("read gitignore");
|
||||
assert!(gitignore.contains(".claw/settings.local.json"));
|
||||
assert!(gitignore.contains(".claw/sessions/"));
|
||||
@@ -436,14 +485,24 @@ mod tests {
|
||||
fs::create_dir_all(&root).expect("create root");
|
||||
fs::write(root.join("CLAUDE.md"), "custom guidance\n").expect("write existing claude md");
|
||||
fs::write(root.join(".gitignore"), ".claw/settings.local.json\n").expect("write gitignore");
|
||||
fs::create_dir_all(root.join(".claw")).expect("create existing .claw dir");
|
||||
|
||||
let first = initialize_repo(&root).expect("first init should succeed");
|
||||
assert!(first
|
||||
.render()
|
||||
.contains("CLAUDE.md skipped (already exists)"));
|
||||
assert_eq!(
|
||||
first.artifacts_with_status(InitStatus::Partial),
|
||||
vec![".claw/".to_string()],
|
||||
"existing .claw/ should report partial when init creates missing settings.json"
|
||||
);
|
||||
assert!(root.join(".claw").join("settings.json").is_file());
|
||||
|
||||
let second = initialize_repo(&root).expect("second init should succeed");
|
||||
let second_rendered = second.render();
|
||||
assert!(second_rendered.contains(".claw/"));
|
||||
assert!(second_rendered.contains(".claw/settings.json"));
|
||||
assert!(second_rendered.contains(".claw/sessions/"));
|
||||
assert!(second_rendered.contains(".claw.json"));
|
||||
assert!(second_rendered.contains("skipped (already exists)"));
|
||||
assert!(second_rendered.contains(".gitignore skipped (already exists)"));
|
||||
@@ -474,16 +533,22 @@ mod tests {
|
||||
created_names,
|
||||
vec![
|
||||
".claw/".to_string(),
|
||||
".claw/settings.json".to_string(),
|
||||
".claw.json".to_string(),
|
||||
".gitignore".to_string(),
|
||||
"CLAUDE.md".to_string(),
|
||||
],
|
||||
"fresh init should place all four artifacts in created[]"
|
||||
"fresh init should place created artifacts in created[]"
|
||||
);
|
||||
assert!(
|
||||
fresh.artifacts_with_status(InitStatus::Skipped).is_empty(),
|
||||
"fresh init should have no skipped artifacts"
|
||||
);
|
||||
assert_eq!(
|
||||
fresh.artifacts_with_status(InitStatus::Deferred),
|
||||
vec![".claw/sessions/".to_string()],
|
||||
"fresh init should report session storage as deferred"
|
||||
);
|
||||
|
||||
let second = initialize_repo(&root).expect("second init should succeed");
|
||||
let skipped_names = second.artifacts_with_status(InitStatus::Skipped);
|
||||
@@ -491,27 +556,38 @@ mod tests {
|
||||
skipped_names,
|
||||
vec![
|
||||
".claw/".to_string(),
|
||||
".claw/settings.json".to_string(),
|
||||
".claw.json".to_string(),
|
||||
".gitignore".to_string(),
|
||||
"CLAUDE.md".to_string(),
|
||||
],
|
||||
"idempotent init should place all four artifacts in skipped[]"
|
||||
"idempotent init should place existing artifacts in skipped[]"
|
||||
);
|
||||
assert!(
|
||||
second.artifacts_with_status(InitStatus::Created).is_empty(),
|
||||
"idempotent init should have no created artifacts"
|
||||
);
|
||||
assert_eq!(
|
||||
second.artifacts_with_status(InitStatus::Deferred),
|
||||
vec![".claw/sessions/".to_string()],
|
||||
"idempotent init should keep session storage deferred until first save"
|
||||
);
|
||||
|
||||
// artifact_json_entries() uses the machine-stable `json_tag()` which
|
||||
// never changes wording (unlike `label()` which says "skipped (already exists)").
|
||||
let entries = second.artifact_json_entries();
|
||||
assert_eq!(entries.len(), 4);
|
||||
assert_eq!(entries.len(), 6);
|
||||
for entry in &entries {
|
||||
let name = entry.get("name").and_then(|v| v.as_str()).unwrap();
|
||||
let status = entry.get("status").and_then(|v| v.as_str()).unwrap();
|
||||
assert_eq!(
|
||||
status, "skipped",
|
||||
"machine status tag should be the bare word 'skipped', not label()'s 'skipped (already exists)'"
|
||||
);
|
||||
if name == ".claw/sessions/" {
|
||||
assert_eq!(status, "deferred");
|
||||
} else {
|
||||
assert_eq!(
|
||||
status, "skipped",
|
||||
"machine status tag should be the bare word 'skipped', not label()'s 'skipped (already exists)'"
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
fs::remove_dir_all(root).expect("cleanup temp dir");
|
||||
|
||||
File diff suppressed because it is too large
Load Diff
@@ -1,6 +1,7 @@
|
||||
#![allow(clippy::while_let_on_iterator)]
|
||||
|
||||
use std::fs;
|
||||
use std::io::Write;
|
||||
use std::path::PathBuf;
|
||||
use std::process::{Command, Output, Stdio};
|
||||
use std::sync::atomic::{AtomicU64, Ordering};
|
||||
@@ -246,8 +247,121 @@ stderr:
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn compact_subcommand_json_help_fails_fast_when_stdin_closed() {
|
||||
let workspace = unique_temp_dir("compact-nontty-json-help");
|
||||
fn prompt_subcommand_reads_prompt_from_stdin_when_no_positional_arg_423() {
|
||||
let runtime = tokio::runtime::Runtime::new().expect("tokio runtime should build");
|
||||
let server = runtime
|
||||
.block_on(MockAnthropicService::spawn())
|
||||
.expect("mock service should start");
|
||||
let base_url = server.base_url();
|
||||
|
||||
let workspace = unique_temp_dir("prompt-stdin-423");
|
||||
let config_home = workspace.join("config-home");
|
||||
let home = workspace.join("home");
|
||||
fs::create_dir_all(&workspace).expect("workspace should exist");
|
||||
fs::create_dir_all(&config_home).expect("config home should exist");
|
||||
fs::create_dir_all(&home).expect("home should exist");
|
||||
|
||||
let prompt = format!("{SCENARIO_PREFIX}streaming_text\n");
|
||||
let output = run_claw_with_stdin(
|
||||
&workspace,
|
||||
&config_home,
|
||||
&home,
|
||||
&base_url,
|
||||
&[
|
||||
"prompt",
|
||||
"--output-format",
|
||||
"json",
|
||||
"--compact",
|
||||
"--permission-mode",
|
||||
"read-only",
|
||||
"--model",
|
||||
"sonnet",
|
||||
],
|
||||
&prompt,
|
||||
);
|
||||
|
||||
assert!(
|
||||
output.status.success(),
|
||||
"prompt stdin run should succeed\nstdout:\n{}\n\nstderr:\n{}",
|
||||
String::from_utf8_lossy(&output.stdout),
|
||||
String::from_utf8_lossy(&output.stderr),
|
||||
);
|
||||
let parsed: Value = serde_json::from_slice(&output.stdout).expect("stdout should parse");
|
||||
assert_eq!(
|
||||
parsed["message"],
|
||||
"Mock streaming says hello from the parity harness."
|
||||
);
|
||||
let captured = runtime.block_on(server.captured_requests());
|
||||
assert!(
|
||||
captured
|
||||
.iter()
|
||||
.any(|request| request.raw_body.contains("PARITY_SCENARIO:streaming_text")),
|
||||
"stdin prompt should reach the provider request: {captured:?}"
|
||||
);
|
||||
|
||||
fs::remove_dir_all(&workspace).expect("workspace cleanup should succeed");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn prompt_subcommand_stdin_flag_appends_pipe_context_423() {
|
||||
let runtime = tokio::runtime::Runtime::new().expect("tokio runtime should build");
|
||||
let server = runtime
|
||||
.block_on(MockAnthropicService::spawn())
|
||||
.expect("mock service should start");
|
||||
let base_url = server.base_url();
|
||||
|
||||
let workspace = unique_temp_dir("prompt-stdin-flag-423");
|
||||
let config_home = workspace.join("config-home");
|
||||
let home = workspace.join("home");
|
||||
fs::create_dir_all(&workspace).expect("workspace should exist");
|
||||
fs::create_dir_all(&config_home).expect("config home should exist");
|
||||
fs::create_dir_all(&home).expect("home should exist");
|
||||
|
||||
let prompt_context = format!("{SCENARIO_PREFIX}streaming_text\n");
|
||||
let output = run_claw_with_stdin(
|
||||
&workspace,
|
||||
&config_home,
|
||||
&home,
|
||||
&base_url,
|
||||
&[
|
||||
"prompt",
|
||||
"Use stdin context",
|
||||
"--stdin",
|
||||
"--output-format",
|
||||
"json",
|
||||
"--compact",
|
||||
"--permission-mode",
|
||||
"read-only",
|
||||
"--model",
|
||||
"sonnet",
|
||||
],
|
||||
&prompt_context,
|
||||
);
|
||||
|
||||
assert!(
|
||||
output.status.success(),
|
||||
"prompt --stdin run should succeed\nstdout:\n{}\n\nstderr:\n{}",
|
||||
String::from_utf8_lossy(&output.stdout),
|
||||
String::from_utf8_lossy(&output.stderr),
|
||||
);
|
||||
let captured = runtime.block_on(server.captured_requests());
|
||||
let provider_body = captured
|
||||
.iter()
|
||||
.find(|request| request.raw_body.contains("Use stdin context"))
|
||||
.expect("merged prompt should reach provider");
|
||||
assert!(
|
||||
provider_body
|
||||
.raw_body
|
||||
.contains("PARITY_SCENARIO:streaming_text"),
|
||||
"merged prompt should include stdin context: {provider_body:?}"
|
||||
);
|
||||
|
||||
fs::remove_dir_all(&workspace).expect("workspace cleanup should succeed");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn compact_subcommand_json_fails_fast_when_stdin_closed() {
|
||||
let workspace = unique_temp_dir("compact-nontty-json");
|
||||
let config_home = workspace.join("config-home");
|
||||
let home = workspace.join("home");
|
||||
fs::create_dir_all(&workspace).expect("workspace should exist");
|
||||
@@ -258,19 +372,19 @@ fn compact_subcommand_json_help_fails_fast_when_stdin_closed() {
|
||||
&workspace,
|
||||
&config_home,
|
||||
&home,
|
||||
&["compact", "--output-format", "json", "--help"],
|
||||
&["compact", "--output-format", "json"],
|
||||
Duration::from_secs(2),
|
||||
);
|
||||
|
||||
assert!(
|
||||
!output.status.success(),
|
||||
"compact json help should fail non-zero"
|
||||
"compact json should fail non-zero"
|
||||
);
|
||||
// #819/#820/#823: JSON abort envelopes route to stdout
|
||||
let stderr = String::from_utf8(output.stderr).expect("stderr should be utf8");
|
||||
assert!(
|
||||
stderr.trim().is_empty() || !stderr.trim_start().starts_with('{'),
|
||||
"compact json help should not emit JSON envelope to stderr (#819/#820/#823): {stderr}"
|
||||
"compact json should not emit JSON envelope to stderr (#819/#820/#823): {stderr}"
|
||||
);
|
||||
let stdout = String::from_utf8(output.stdout).expect("stdout should be utf8");
|
||||
let parsed: Value =
|
||||
@@ -356,6 +470,39 @@ fn run_claw(
|
||||
command.output().expect("claw should launch")
|
||||
}
|
||||
|
||||
fn run_claw_with_stdin(
|
||||
cwd: &std::path::Path,
|
||||
config_home: &std::path::Path,
|
||||
home: &std::path::Path,
|
||||
base_url: &str,
|
||||
args: &[&str],
|
||||
stdin: &str,
|
||||
) -> Output {
|
||||
let mut child = Command::new(env!("CARGO_BIN_EXE_claw"))
|
||||
.current_dir(cwd)
|
||||
.env_clear()
|
||||
.env("ANTHROPIC_API_KEY", "test-compact-key")
|
||||
.env("ANTHROPIC_BASE_URL", base_url)
|
||||
.env("CLAW_CONFIG_HOME", config_home)
|
||||
.env("HOME", home)
|
||||
.env("NO_COLOR", "1")
|
||||
.env("PATH", "/usr/bin:/bin")
|
||||
.stdin(Stdio::piped())
|
||||
.stdout(Stdio::piped())
|
||||
.stderr(Stdio::piped())
|
||||
.args(args)
|
||||
.spawn()
|
||||
.expect("claw should launch");
|
||||
child
|
||||
.stdin
|
||||
.as_mut()
|
||||
.expect("stdin should be piped")
|
||||
.write_all(stdin.as_bytes())
|
||||
.expect("stdin should write");
|
||||
child.stdin.take();
|
||||
child.wait_with_output().expect("output should collect")
|
||||
}
|
||||
|
||||
fn run_claw_closed_stdin_with_timeout(
|
||||
cwd: &std::path::Path,
|
||||
config_home: &std::path::Path,
|
||||
|
||||
File diff suppressed because it is too large
Load Diff
@@ -222,6 +222,73 @@ fn resume_latest_restores_the_most_recent_managed_session() {
|
||||
assert!(stdout.contains(newer_path.to_str().expect("utf8 path")));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn resume_latest_missing_session_fails_without_creating_session_dirs_435() {
|
||||
// given
|
||||
let temp_dir = unique_temp_dir("resume-latest-missing-435");
|
||||
let project_dir = temp_dir.join("project");
|
||||
let config_home = temp_dir.join("config-home");
|
||||
let home = temp_dir.join("home");
|
||||
fs::create_dir_all(&project_dir).expect("project dir should exist");
|
||||
fs::create_dir_all(&config_home).expect("config home should exist");
|
||||
fs::create_dir_all(&home).expect("home should exist");
|
||||
let envs = [
|
||||
(
|
||||
"CLAW_CONFIG_HOME",
|
||||
config_home.to_str().expect("utf8 config home"),
|
||||
),
|
||||
("HOME", home.to_str().expect("utf8 home")),
|
||||
("ANTHROPIC_API_KEY", ""),
|
||||
("ANTHROPIC_AUTH_TOKEN", ""),
|
||||
("OPENAI_API_KEY", ""),
|
||||
];
|
||||
|
||||
// when — both text and JSON resume failures should be non-zero and read-only.
|
||||
let text = run_claw_with_env(&project_dir, &["--resume", "latest"], &envs);
|
||||
let json = run_claw_with_env(
|
||||
&project_dir,
|
||||
&["--output-format", "json", "--resume", "latest"],
|
||||
&envs,
|
||||
);
|
||||
|
||||
// then
|
||||
assert_eq!(
|
||||
text.status.code(),
|
||||
Some(1),
|
||||
"text resume failure must be non-zero"
|
||||
);
|
||||
assert!(
|
||||
text.stdout.is_empty(),
|
||||
"text resume failure should not claim success on stdout: {}",
|
||||
String::from_utf8_lossy(&text.stdout)
|
||||
);
|
||||
let text_stderr = String::from_utf8_lossy(&text.stderr);
|
||||
assert!(
|
||||
text_stderr.contains("no managed sessions found"),
|
||||
"text failure should explain missing sessions: {text_stderr}"
|
||||
);
|
||||
|
||||
assert_eq!(
|
||||
json.status.code(),
|
||||
Some(1),
|
||||
"JSON resume failure must be non-zero"
|
||||
);
|
||||
assert!(
|
||||
json.stderr.is_empty(),
|
||||
"JSON resume failure should keep stderr empty: {}",
|
||||
String::from_utf8_lossy(&json.stderr)
|
||||
);
|
||||
let parsed: Value = serde_json::from_slice(&json.stdout)
|
||||
.expect("JSON resume failure should emit JSON to stdout");
|
||||
assert_eq!(parsed["status"], "error");
|
||||
assert_eq!(parsed["action"], "restore");
|
||||
assert_eq!(parsed["error_kind"], "no_managed_sessions");
|
||||
assert!(
|
||||
!project_dir.join(".claw").exists(),
|
||||
"failed resume must not create .claw/session directories"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn resumed_status_command_emits_structured_json_when_requested() {
|
||||
// given
|
||||
@@ -268,7 +335,7 @@ fn resumed_status_command_emits_structured_json_when_requested() {
|
||||
assert_eq!(parsed["kind"], "status");
|
||||
// model is null in resume mode (not known without --model flag)
|
||||
assert!(parsed["model"].is_null());
|
||||
assert_eq!(parsed["permission_mode"], "danger-full-access");
|
||||
assert_eq!(parsed["permission_mode"], "workspace-write");
|
||||
assert_eq!(parsed["usage"]["messages"], 1);
|
||||
assert!(parsed["usage"]["turns"].is_number());
|
||||
assert!(parsed["workspace"]["cwd"].as_str().is_some());
|
||||
@@ -396,6 +463,9 @@ fn resumed_version_command_emits_structured_json() {
|
||||
assert!(parsed["version"].as_str().is_some());
|
||||
assert!(parsed["git_sha"].as_str().is_some());
|
||||
assert!(parsed["target"].as_str().is_some());
|
||||
assert!(parsed["git_sha_short"].as_str().is_some());
|
||||
assert!(parsed.get("message").is_none());
|
||||
assert!(parsed["human_readable"].as_str().is_some());
|
||||
}
|
||||
|
||||
#[test]
|
||||
|
||||
+221
-60
@@ -201,30 +201,20 @@ impl GlobalToolRegistry {
|
||||
return Ok(None);
|
||||
}
|
||||
|
||||
let builtin_specs = mvp_tool_specs();
|
||||
let canonical_names = builtin_specs
|
||||
.iter()
|
||||
.map(|spec| spec.name.to_string())
|
||||
.chain(
|
||||
self.plugin_tools
|
||||
.iter()
|
||||
.map(|tool| tool.definition().name.clone()),
|
||||
)
|
||||
.chain(self.runtime_tools.iter().map(|tool| tool.name.clone()))
|
||||
.collect::<Vec<_>>();
|
||||
let mut name_map = canonical_names
|
||||
.iter()
|
||||
.map(|name| (normalize_tool_name(name), name.clone()))
|
||||
.collect::<BTreeMap<_, _>>();
|
||||
let actual_names = self.actual_tool_names();
|
||||
let canonical_names = self.canonical_allowed_tool_names();
|
||||
let canonical_name_set = canonical_names.iter().cloned().collect::<BTreeSet<_>>();
|
||||
let mut name_map = BTreeMap::new();
|
||||
for actual in &actual_names {
|
||||
let canonical = canonical_allowed_tool_name(actual);
|
||||
name_map.insert(allowed_tool_lookup_key(actual), canonical.clone());
|
||||
name_map.insert(allowed_tool_lookup_key(&canonical), canonical);
|
||||
}
|
||||
|
||||
for (alias, canonical) in [
|
||||
("read", "read_file"),
|
||||
("write", "write_file"),
|
||||
("edit", "edit_file"),
|
||||
("glob", "glob_search"),
|
||||
("grep", "grep_search"),
|
||||
] {
|
||||
name_map.insert(alias.to_string(), canonical.to_string());
|
||||
for (alias, canonical) in self.allowed_tool_aliases() {
|
||||
if canonical_name_set.contains(&canonical) {
|
||||
name_map.insert(allowed_tool_lookup_key(&alias), canonical);
|
||||
}
|
||||
}
|
||||
|
||||
let mut allowed = BTreeSet::new();
|
||||
@@ -233,11 +223,11 @@ impl GlobalToolRegistry {
|
||||
.split(|ch: char| ch == ',' || ch.is_whitespace())
|
||||
.filter(|token| !token.is_empty())
|
||||
{
|
||||
let normalized = normalize_tool_name(token);
|
||||
let canonical = name_map.get(&normalized).ok_or_else(|| {
|
||||
let canonical = name_map.get(&allowed_tool_lookup_key(token)).ok_or_else(|| {
|
||||
format!(
|
||||
"unsupported tool in --allowedTools: {token} (expected one of: {})",
|
||||
canonical_names.join(", ")
|
||||
"invalid_tool_name: unsupported tool in --allowedTools: {token}\nAvailable: {}\nAliases: {}\nHint: Use canonical snake_case tool names from Available or aliases from Aliases.",
|
||||
canonical_names.join(", "),
|
||||
format_allowed_tool_aliases(&self.allowed_tool_aliases())
|
||||
)
|
||||
})?;
|
||||
allowed.insert(canonical.clone());
|
||||
@@ -258,7 +248,10 @@ impl GlobalToolRegistry {
|
||||
pub fn definitions(&self, allowed_tools: Option<&BTreeSet<String>>) -> Vec<ToolDefinition> {
|
||||
let builtin = mvp_tool_specs()
|
||||
.into_iter()
|
||||
.filter(|spec| allowed_tools.is_none_or(|allowed| allowed.contains(spec.name)))
|
||||
.filter(|spec| {
|
||||
allowed_tools
|
||||
.is_none_or(|allowed| allowed.contains(&canonical_allowed_tool_name(spec.name)))
|
||||
})
|
||||
.map(|spec| ToolDefinition {
|
||||
name: spec.name.to_string(),
|
||||
description: Some(spec.description.to_string()),
|
||||
@@ -267,7 +260,11 @@ impl GlobalToolRegistry {
|
||||
let runtime = self
|
||||
.runtime_tools
|
||||
.iter()
|
||||
.filter(|tool| allowed_tools.is_none_or(|allowed| allowed.contains(tool.name.as_str())))
|
||||
.filter(|tool| {
|
||||
allowed_tools.is_none_or(|allowed| {
|
||||
allowed.contains(&canonical_allowed_tool_name(&tool.name))
|
||||
})
|
||||
})
|
||||
.map(|tool| ToolDefinition {
|
||||
name: tool.name.clone(),
|
||||
description: tool.description.clone(),
|
||||
@@ -277,8 +274,11 @@ impl GlobalToolRegistry {
|
||||
.plugin_tools
|
||||
.iter()
|
||||
.filter(|tool| {
|
||||
allowed_tools
|
||||
.is_none_or(|allowed| allowed.contains(tool.definition().name.as_str()))
|
||||
allowed_tools.is_none_or(|allowed| {
|
||||
allowed.contains(&canonical_allowed_tool_name(
|
||||
tool.definition().name.as_str(),
|
||||
))
|
||||
})
|
||||
})
|
||||
.map(|tool| ToolDefinition {
|
||||
name: tool.definition().name.clone(),
|
||||
@@ -294,19 +294,29 @@ impl GlobalToolRegistry {
|
||||
) -> Result<Vec<(String, PermissionMode)>, String> {
|
||||
let builtin = mvp_tool_specs()
|
||||
.into_iter()
|
||||
.filter(|spec| allowed_tools.is_none_or(|allowed| allowed.contains(spec.name)))
|
||||
.filter(|spec| {
|
||||
allowed_tools
|
||||
.is_none_or(|allowed| allowed.contains(&canonical_allowed_tool_name(spec.name)))
|
||||
})
|
||||
.map(|spec| (spec.name.to_string(), spec.required_permission));
|
||||
let runtime = self
|
||||
.runtime_tools
|
||||
.iter()
|
||||
.filter(|tool| allowed_tools.is_none_or(|allowed| allowed.contains(tool.name.as_str())))
|
||||
.filter(|tool| {
|
||||
allowed_tools.is_none_or(|allowed| {
|
||||
allowed.contains(&canonical_allowed_tool_name(&tool.name))
|
||||
})
|
||||
})
|
||||
.map(|tool| (tool.name.clone(), tool.required_permission));
|
||||
let plugin = self
|
||||
.plugin_tools
|
||||
.iter()
|
||||
.filter(|tool| {
|
||||
allowed_tools
|
||||
.is_none_or(|allowed| allowed.contains(tool.definition().name.as_str()))
|
||||
allowed_tools.is_none_or(|allowed| {
|
||||
allowed.contains(&canonical_allowed_tool_name(
|
||||
tool.definition().name.as_str(),
|
||||
))
|
||||
})
|
||||
})
|
||||
.map(|tool| {
|
||||
permission_mode_from_plugin(tool.required_permission())
|
||||
@@ -316,6 +326,52 @@ impl GlobalToolRegistry {
|
||||
Ok(builtin.chain(runtime).chain(plugin).collect())
|
||||
}
|
||||
|
||||
#[must_use]
|
||||
pub fn actual_tool_names(&self) -> Vec<String> {
|
||||
mvp_tool_specs()
|
||||
.iter()
|
||||
.map(|spec| spec.name.to_string())
|
||||
.chain(
|
||||
self.plugin_tools
|
||||
.iter()
|
||||
.map(|tool| tool.definition().name.clone()),
|
||||
)
|
||||
.chain(self.runtime_tools.iter().map(|tool| tool.name.clone()))
|
||||
.collect()
|
||||
}
|
||||
|
||||
#[must_use]
|
||||
pub fn canonical_allowed_tool_names(&self) -> Vec<String> {
|
||||
self.actual_tool_names()
|
||||
.into_iter()
|
||||
.map(|name| canonical_allowed_tool_name(&name))
|
||||
.collect::<BTreeSet<_>>()
|
||||
.into_iter()
|
||||
.collect()
|
||||
}
|
||||
|
||||
#[must_use]
|
||||
pub fn allowed_tool_aliases(&self) -> BTreeMap<String, String> {
|
||||
let mut aliases = BTreeMap::from([
|
||||
("read".to_string(), "read_file".to_string()),
|
||||
("Read".to_string(), "read_file".to_string()),
|
||||
("write".to_string(), "write_file".to_string()),
|
||||
("Write".to_string(), "write_file".to_string()),
|
||||
("edit".to_string(), "edit_file".to_string()),
|
||||
("Edit".to_string(), "edit_file".to_string()),
|
||||
("glob".to_string(), "glob_search".to_string()),
|
||||
("Glob".to_string(), "glob_search".to_string()),
|
||||
("grep".to_string(), "grep_search".to_string()),
|
||||
("Grep".to_string(), "grep_search".to_string()),
|
||||
]);
|
||||
for actual in self.actual_tool_names() {
|
||||
let canonical = canonical_allowed_tool_name(&actual);
|
||||
if actual != canonical {
|
||||
aliases.insert(actual, canonical);
|
||||
}
|
||||
}
|
||||
aliases
|
||||
}
|
||||
#[must_use]
|
||||
pub fn has_runtime_tool(&self, name: &str) -> bool {
|
||||
self.runtime_tools.iter().any(|tool| tool.name == name)
|
||||
@@ -378,8 +434,40 @@ impl GlobalToolRegistry {
|
||||
}
|
||||
}
|
||||
|
||||
fn normalize_tool_name(value: &str) -> String {
|
||||
value.trim().replace('-', "_").to_ascii_lowercase()
|
||||
pub fn canonical_allowed_tool_name(value: &str) -> String {
|
||||
let trimmed = value.trim().replace('-', "_");
|
||||
let mut output = String::new();
|
||||
let chars = trimmed.chars().collect::<Vec<_>>();
|
||||
for (index, ch) in chars.iter().copied().enumerate() {
|
||||
if ch == '_' || ch.is_whitespace() {
|
||||
output.push('_');
|
||||
continue;
|
||||
}
|
||||
let previous = index.checked_sub(1).and_then(|i| chars.get(i)).copied();
|
||||
let next = chars.get(index + 1).copied();
|
||||
if ch.is_ascii_uppercase()
|
||||
&& index > 0
|
||||
&& !output.ends_with('_')
|
||||
&& (previous.is_some_and(|p| p.is_ascii_lowercase() || p.is_ascii_digit())
|
||||
|| next.is_some_and(|n| n.is_ascii_lowercase()))
|
||||
{
|
||||
output.push('_');
|
||||
}
|
||||
output.push(ch.to_ascii_lowercase());
|
||||
}
|
||||
output.trim_matches('_').to_string()
|
||||
}
|
||||
|
||||
fn allowed_tool_lookup_key(value: &str) -> String {
|
||||
canonical_allowed_tool_name(value).replace('_', "")
|
||||
}
|
||||
|
||||
fn format_allowed_tool_aliases(aliases: &BTreeMap<String, String>) -> String {
|
||||
aliases
|
||||
.iter()
|
||||
.map(|(alias, canonical)| format!("{alias}={canonical}"))
|
||||
.collect::<Vec<_>>()
|
||||
.join(", ")
|
||||
}
|
||||
|
||||
fn permission_mode_from_plugin(value: &str) -> Result<PermissionMode, String> {
|
||||
@@ -514,7 +602,7 @@ pub fn mvp_tool_specs() -> Vec<ToolSpec> {
|
||||
"required": ["url", "prompt"],
|
||||
"additionalProperties": false
|
||||
}),
|
||||
required_permission: PermissionMode::ReadOnly,
|
||||
required_permission: PermissionMode::DangerFullAccess,
|
||||
},
|
||||
ToolSpec {
|
||||
name: "WebSearch",
|
||||
@@ -535,7 +623,7 @@ pub fn mvp_tool_specs() -> Vec<ToolSpec> {
|
||||
"required": ["query"],
|
||||
"additionalProperties": false
|
||||
}),
|
||||
required_permission: PermissionMode::ReadOnly,
|
||||
required_permission: PermissionMode::DangerFullAccess,
|
||||
},
|
||||
ToolSpec {
|
||||
name: "TodoWrite",
|
||||
@@ -1321,8 +1409,26 @@ fn execute_tool_with_enforcer(
|
||||
maybe_enforce_permission_check_with_mode(enforcer, name, input, required_mode)?;
|
||||
run_grep_search(grep_input)
|
||||
}
|
||||
"WebFetch" => from_value::<WebFetchInput>(input).and_then(run_web_fetch),
|
||||
"WebSearch" => from_value::<WebSearchInput>(input).and_then(run_web_search),
|
||||
"WebFetch" => {
|
||||
let web_input = from_value::<WebFetchInput>(input)?;
|
||||
maybe_enforce_permission_check_with_mode(
|
||||
enforcer,
|
||||
name,
|
||||
input,
|
||||
PermissionMode::DangerFullAccess,
|
||||
)?;
|
||||
run_web_fetch(web_input)
|
||||
}
|
||||
"WebSearch" => {
|
||||
let web_input = from_value::<WebSearchInput>(input)?;
|
||||
maybe_enforce_permission_check_with_mode(
|
||||
enforcer,
|
||||
name,
|
||||
input,
|
||||
PermissionMode::DangerFullAccess,
|
||||
)?;
|
||||
run_web_search(web_input)
|
||||
}
|
||||
"TodoWrite" => from_value::<TodoWriteInput>(input).and_then(run_todo_write),
|
||||
"Skill" => from_value::<SkillInput>(input).and_then(run_skill),
|
||||
"Agent" => from_value::<AgentInput>(input).and_then(run_agent),
|
||||
@@ -4192,7 +4298,7 @@ fn allowed_tools_for_subagent(subagent_type: &str) -> BTreeSet<String> {
|
||||
"PowerShell",
|
||||
],
|
||||
};
|
||||
tools.into_iter().map(str::to_string).collect()
|
||||
tools.into_iter().map(canonical_allowed_tool_name).collect()
|
||||
}
|
||||
|
||||
fn agent_permission_policy() -> PermissionPolicy {
|
||||
@@ -5097,7 +5203,7 @@ async fn stream_with_provider(
|
||||
) -> Result<Vec<AssistantEvent>, ApiError> {
|
||||
let mut stream = client.stream_message(message_request).await?;
|
||||
let mut events = Vec::new();
|
||||
let mut pending_tools: BTreeMap<u32, (String, String, String)> = BTreeMap::new();
|
||||
let mut pending_tools: BTreeMap<u32, (String, String, String, Option<String>)> = BTreeMap::new();
|
||||
let mut pending_thinking: BTreeMap<u32, (String, Option<String>)> = BTreeMap::new();
|
||||
let mut saw_stop = false;
|
||||
|
||||
@@ -5132,7 +5238,7 @@ async fn stream_with_provider(
|
||||
}
|
||||
}
|
||||
ContentBlockDelta::InputJsonDelta { partial_json } => {
|
||||
if let Some((_, _, input)) = pending_tools.get_mut(&delta.index) {
|
||||
if let Some((_, _, input, _)) = pending_tools.get_mut(&delta.index) {
|
||||
input.push_str(&partial_json);
|
||||
}
|
||||
}
|
||||
@@ -5156,8 +5262,8 @@ async fn stream_with_provider(
|
||||
signature,
|
||||
});
|
||||
}
|
||||
if let Some((id, name, input)) = pending_tools.remove(&stop.index) {
|
||||
events.push(AssistantEvent::ToolUse { id, name, input });
|
||||
if let Some((id, name, input, thought_signature)) = pending_tools.remove(&stop.index) {
|
||||
events.push(AssistantEvent::ToolUse { id, name, input, thought_signature });
|
||||
}
|
||||
}
|
||||
ApiStreamEvent::MessageDelta(delta) => {
|
||||
@@ -5220,7 +5326,10 @@ impl SubagentToolExecutor {
|
||||
|
||||
impl ToolExecutor for SubagentToolExecutor {
|
||||
fn execute(&mut self, tool_name: &str, input: &str) -> Result<String, ToolError> {
|
||||
if !self.allowed_tools.contains(tool_name) {
|
||||
if !self
|
||||
.allowed_tools
|
||||
.contains(&canonical_allowed_tool_name(tool_name))
|
||||
{
|
||||
return Err(ToolError::new(format!(
|
||||
"tool `{tool_name}` is not enabled for this sub-agent"
|
||||
)));
|
||||
@@ -5235,7 +5344,10 @@ impl ToolExecutor for SubagentToolExecutor {
|
||||
fn tool_specs_for_allowed_tools(allowed_tools: Option<&BTreeSet<String>>) -> Vec<ToolSpec> {
|
||||
mvp_tool_specs()
|
||||
.into_iter()
|
||||
.filter(|spec| allowed_tools.is_none_or(|allowed| allowed.contains(spec.name)))
|
||||
.filter(|spec| {
|
||||
allowed_tools
|
||||
.is_none_or(|allowed| allowed.contains(&canonical_allowed_tool_name(spec.name)))
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
|
||||
@@ -5259,11 +5371,12 @@ fn convert_messages(messages: &[ConversationMessage]) -> Vec<InputMessage> {
|
||||
thinking: thinking.clone(),
|
||||
signature: signature.clone(),
|
||||
},
|
||||
ContentBlock::ToolUse { id, name, input } => InputContentBlock::ToolUse {
|
||||
ContentBlock::ToolUse { id, name, input, thought_signature } => InputContentBlock::ToolUse {
|
||||
id: id.clone(),
|
||||
name: name.clone(),
|
||||
input: serde_json::from_str(input)
|
||||
.unwrap_or_else(|_| serde_json::json!({ "raw": input })),
|
||||
thought_signature: thought_signature.clone(),
|
||||
},
|
||||
ContentBlock::ToolResult {
|
||||
tool_use_id,
|
||||
@@ -5294,7 +5407,7 @@ fn push_output_block(
|
||||
block: OutputContentBlock,
|
||||
block_index: u32,
|
||||
events: &mut Vec<AssistantEvent>,
|
||||
pending_tools: &mut BTreeMap<u32, (String, String, String)>,
|
||||
pending_tools: &mut BTreeMap<u32, (String, String, String, Option<String>)>,
|
||||
pending_thinking: &mut BTreeMap<u32, (String, Option<String>)>,
|
||||
streaming_tool_input: bool,
|
||||
) {
|
||||
@@ -5304,7 +5417,7 @@ fn push_output_block(
|
||||
events.push(AssistantEvent::TextDelta(text));
|
||||
}
|
||||
}
|
||||
OutputContentBlock::ToolUse { id, name, input } => {
|
||||
OutputContentBlock::ToolUse { id, name, input, thought_signature } => {
|
||||
let initial_input = if streaming_tool_input
|
||||
&& input.is_object()
|
||||
&& input.as_object().is_some_and(serde_json::Map::is_empty)
|
||||
@@ -5313,7 +5426,7 @@ fn push_output_block(
|
||||
} else {
|
||||
input.to_string()
|
||||
};
|
||||
pending_tools.insert(block_index, (id, name, initial_input));
|
||||
pending_tools.insert(block_index, (id, name, initial_input, thought_signature));
|
||||
}
|
||||
OutputContentBlock::Thinking {
|
||||
thinking,
|
||||
@@ -5347,8 +5460,8 @@ fn response_to_events(response: MessageResponse) -> Vec<AssistantEvent> {
|
||||
&mut pending_thinking,
|
||||
false,
|
||||
);
|
||||
if let Some((id, name, input)) = pending_tools.remove(&index) {
|
||||
events.push(AssistantEvent::ToolUse { id, name, input });
|
||||
if let Some((id, name, input, thought_signature)) = pending_tools.remove(&index) {
|
||||
events.push(AssistantEvent::ToolUse { id, name, input, thought_signature });
|
||||
}
|
||||
}
|
||||
|
||||
@@ -7585,6 +7698,29 @@ mod tests {
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn allowed_tools_normalize_to_canonical_snake_case_and_aliases_432() {
|
||||
let registry = GlobalToolRegistry::builtin();
|
||||
let allowed = registry
|
||||
.normalize_allowed_tools(&["Read,WebFetch,MCP".to_string()])
|
||||
.expect("aliases and legacy names should normalize")
|
||||
.expect("allow-list should be populated");
|
||||
assert!(allowed.contains("read_file"));
|
||||
assert!(allowed.contains("web_fetch"));
|
||||
assert!(allowed.contains("mcp"));
|
||||
assert!(!allowed.contains("Read"));
|
||||
assert!(!allowed.contains("WebFetch"));
|
||||
|
||||
let canonical = registry.canonical_allowed_tool_names();
|
||||
assert!(canonical.contains(&"web_fetch".to_string()));
|
||||
assert!(canonical.contains(&"todo_write".to_string()));
|
||||
assert!(!canonical.contains(&"WebFetch".to_string()));
|
||||
assert_eq!(
|
||||
registry.allowed_tool_aliases().get("WebFetch"),
|
||||
Some(&"web_fetch".to_string())
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn runtime_tools_extend_registry_definitions_permissions_and_search() {
|
||||
let registry = GlobalToolRegistry::builtin()
|
||||
@@ -7931,6 +8067,7 @@ mod tests {
|
||||
id: "tool-1".to_string(),
|
||||
name: "read_file".to_string(),
|
||||
input: json!({}),
|
||||
thought_signature: None,
|
||||
},
|
||||
1,
|
||||
&mut events,
|
||||
@@ -7943,6 +8080,7 @@ mod tests {
|
||||
id: "tool-2".to_string(),
|
||||
name: "grep_search".to_string(),
|
||||
input: json!({}),
|
||||
thought_signature: None,
|
||||
},
|
||||
2,
|
||||
&mut events,
|
||||
@@ -7968,6 +8106,7 @@ mod tests {
|
||||
"tool-1".to_string(),
|
||||
"read_file".to_string(),
|
||||
"{\"path\":\"src/main.rs\"}".to_string(),
|
||||
None,
|
||||
))
|
||||
);
|
||||
assert_eq!(
|
||||
@@ -7976,6 +8115,7 @@ mod tests {
|
||||
"tool-2".to_string(),
|
||||
"grep_search".to_string(),
|
||||
"{\"pattern\":\"TODO\"}".to_string(),
|
||||
None,
|
||||
))
|
||||
);
|
||||
}
|
||||
@@ -8566,7 +8706,7 @@ mod tests {
|
||||
.expect("spawn job should be captured");
|
||||
assert_eq!(captured_job.prompt, "Check tests and outstanding work.");
|
||||
assert!(captured_job.allowed_tools.contains("read_file"));
|
||||
assert!(!captured_job.allowed_tools.contains("Agent"));
|
||||
assert!(!captured_job.allowed_tools.contains("agent"));
|
||||
|
||||
let normalized = execute_tool(
|
||||
"Agent",
|
||||
@@ -9166,7 +9306,7 @@ mod tests {
|
||||
let general = allowed_tools_for_subagent("general-purpose");
|
||||
assert!(general.contains("bash"));
|
||||
assert!(general.contains("write_file"));
|
||||
assert!(!general.contains("Agent"));
|
||||
assert!(!general.contains("agent"));
|
||||
|
||||
let explore = allowed_tools_for_subagent("Explore");
|
||||
assert!(explore.contains("read_file"));
|
||||
@@ -9174,13 +9314,13 @@ mod tests {
|
||||
assert!(!explore.contains("bash"));
|
||||
|
||||
let plan = allowed_tools_for_subagent("Plan");
|
||||
assert!(plan.contains("TodoWrite"));
|
||||
assert!(plan.contains("StructuredOutput"));
|
||||
assert!(!plan.contains("Agent"));
|
||||
assert!(plan.contains("todo_write"));
|
||||
assert!(plan.contains("structured_output"));
|
||||
assert!(!plan.contains("agent"));
|
||||
|
||||
let verification = allowed_tools_for_subagent("Verification");
|
||||
assert!(verification.contains("bash"));
|
||||
assert!(verification.contains("PowerShell"));
|
||||
assert!(verification.contains("power_shell"));
|
||||
assert!(!verification.contains("write_file"));
|
||||
}
|
||||
|
||||
@@ -9223,6 +9363,7 @@ mod tests {
|
||||
id: "tool-1".to_string(),
|
||||
name: "read_file".to_string(),
|
||||
input: json!({ "path": self.input_path }).to_string(),
|
||||
thought_signature: None,
|
||||
},
|
||||
AssistantEvent::MessageStop,
|
||||
])
|
||||
@@ -10264,6 +10405,26 @@ printf 'pwsh:%s' "$1"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn given_workspace_write_enforcer_when_web_tools_then_denied() {
|
||||
let registry = workspace_write_registry();
|
||||
for (tool, input) in [
|
||||
(
|
||||
"WebFetch",
|
||||
json!({"url":"https://example.com", "prompt":"summarize"}),
|
||||
),
|
||||
("WebSearch", json!({"query":"rust language"})),
|
||||
] {
|
||||
let err = registry
|
||||
.execute(tool, &input)
|
||||
.expect_err("network tools should require explicit full access");
|
||||
assert!(
|
||||
err.contains("requires 'danger-full-access'"),
|
||||
"{tool} should require elevated mode: {err}"
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn given_workspace_write_enforcer_when_bash_uses_shell_expansion_then_denied() {
|
||||
let registry = workspace_write_registry();
|
||||
|
||||
Reference in New Issue
Block a user