Changelog
Autonomously-generated entries from agent runs. Most recent first.
feat(munchkins): launch agent runs inside cmux workspace when available (86ddc5a)
2026-05-10 19:35 PDT · feat-small · 604.6s · $4.0926
Goal: When cmux is on PATH, bun run munchkins <agent> ... should re-launch the same invocation inside a fresh cmux new-workspace and exit immediately; otherwise behavior is unchanged.
Outcome: Added packages/munchkins/src/cmux-launcher.ts exporting two pure helpers (shouldDelegateToCmux, buildCmuxCommand) plus a Bun-test suite covering delegation gating and POSIX-safe command construction. Wired the pre-check into packages/munchkins/src/index.ts ahead of the existing registry dispatch, stripped --no-cmux from argv before commander sees it, and extended the scenario harness audit guard to also reject real cmux invocations.
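The delegation gate described above can be pictured roughly as follows. This is a minimal sketch, not the shipped `shouldDelegateToCmux`: the function names, the input shape, and the exact set of inline-only subcommands and flags are assumptions inferred from the manual test steps in this entry.

```typescript
// Hypothetical sketch of the cmux delegation gate. All names and the exact
// inline-only lists are assumptions, not the real cmux-launcher.ts contents.
interface DelegationInput {
  cmuxOnPath: boolean;
  env: Record<string, string | undefined>;
  argv: string[]; // arguments after the script name, e.g. ["bug-fix", "--dry-run"]
}

// Subcommands and flags that are assumed to keep running inline.
const INLINE_SUBCOMMANDS = new Set(["daemon", "resume", "status", "skills"]);
const INLINE_FLAGS = new Set(["--help", "--dry-run", "--no-cmux"]);

function shouldDelegate({ cmuxOnPath, env, argv }: DelegationInput): boolean {
  if (!cmuxOnPath) return false;
  // Assumption: any non-empty value of the env var counts as an opt-out.
  if (env["MUNCHKINS_NO_CMUX"]) return false;
  const [first] = argv;
  if (first === undefined) return false; // bare invocation prints usage inline
  if (INLINE_SUBCOMMANDS.has(first)) return false;
  return !argv.some((arg) => INLINE_FLAGS.has(arg));
}

// Remove the opt-out flag so a strict argument parser never sees it.
function stripNoCmux(argv: string[]): string[] {
  return argv.filter((arg) => arg !== "--no-cmux");
}
```

Keeping the gate a pure function of its inputs (no PATH lookup or process.env access inside) is what makes it straightforward to unit-test.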
How to test manually:
- From the repo root run `bun run lint && bun run typecheck && bun test packages/munchkins/src`. All three should pass; the new `cmux-launcher.test.ts` suite must be green.
- Run `bun run scenario`. It should pass; the audit guard now also bans `cmux` invocations, but no scenario invokes one so there is no regression.
- With `cmux` NOT on PATH (verify via `which cmux` → empty), run `bun run munchkins bug-fix --user-message=./scratch/example.md` (create a tiny scratch file first). It should execute inline exactly as before: same stdout, same worktree behavior.
- With `cmux` installed and the cmux app running, run the same `bun run munchkins bug-fix --user-message=./scratch/example.md`. Expect a single stdout line of the form `Launching bug-fix in cmux workspace: bug-fix-<timestamp>` and the outer shell to return promptly. Open the cmux app and confirm a new workspace named `bug-fix-<timestamp>` exists with the agent running inside it.
- Opt-out via env: with cmux installed, run `MUNCHKINS_NO_CMUX=1 bun run munchkins bug-fix --user-message=./scratch/example.md`. It should run inline (no `Launching ...` line, no new workspace).
- Opt-out via flag: with cmux installed, run `bun run munchkins bug-fix --user-message=./scratch/example.md --no-cmux`. It should run inline, and commander must NOT error on the unknown flag (the flag is stripped before parsing). Also verify `bun run munchkins bug-fix --help` does not list `--no-cmux`.
- Meta/help paths stay inline with cmux installed. Verify each prints normally without opening a workspace: `bun run munchkins --help`, `bun run munchkins`, `bun run munchkins daemon` (cancel with Ctrl-C), `bun run munchkins resume`, `bun run munchkins status`, `bun run munchkins skills install`, `bun run munchkins bug-fix --dry-run`, `bun run munchkins bug-fix --help`.
- Edge case for shell escaping: with cmux installed, run `bun run munchkins bug-fix --user-message="can't stop"` and confirm the workspace launches successfully (the single quote in the value is POSIX-escaped via `'\''` inside the `--command` payload, covered by `cmux-launcher.test.ts` as well).
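The `'\''` escaping mentioned in the last step is the standard POSIX way to embed a single quote inside a single-quoted word. The sketch below illustrates the technique only; `shellQuote` and `buildCommandString` are hypothetical names, not the contents of `buildCmuxCommand`.

```typescript
// Hypothetical helpers illustrating POSIX single-quote escaping.
// Wrap the argument in '...' and replace every embedded ' with '\''
// (close the quote, emit an escaped quote, reopen the quote).
function shellQuote(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Join already-split arguments into one string that a POSIX shell will
// split back into exactly the same arguments.
function buildCommandString(argv: string[]): string {
  return argv.map(shellQuote).join(" ");
}

const payload = buildCommandString(["bun", "run", "munchkins", "bug-fix", "--user-message=can't stop"]);
console.log(payload);
```

Quoting every argument unconditionally is slightly noisier than quoting only when needed, but it avoids having to enumerate which characters are shell-special.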
Files changed:
- packages/munchkins/package.json
- packages/munchkins/src/cmux-launcher.ts
- packages/munchkins/src/cmux-launcher.test.ts
- packages/munchkins/src/index.ts
- scenarios/lib/mock-spawn-claude.ts
feat(munchkins-core): add Prompt.withSkill() helper (07441da)
2026-05-10 19:34 PDT · feat-small · 385.0s · $2.3356
Goal: Add a Prompt.withSkill(name) helper that reads <repoRoot>/.claude/skills/<name>/SKILL.md, strips YAML frontmatter, and contributes the body to the system prompt — same slot as withSystem(path).
Outcome: Refactored Prompt's internal systemPaths: string[] into a tagged systemSources array supporting both path and skill source kinds. withSystem(path) semantics are unchanged; the new withSkill(name) queues a skill source that resolves to .claude/skills/<name>/SKILL.md at resolve() time, with a textual frontmatter strip (including one trailing blank line). Missing-skill and malformed-frontmatter errors carry actionable messages pointing at bun run munchkins install-skills and the offending path. Eight test cases in the new prompt.test.ts cover stripping, chaining, errors, composition with withSystem, and preservation of pre-change withSystem behavior.
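A textual frontmatter strip of the kind described above could look like the following. This is a sketch under stated assumptions, not the code in prompt.ts: the function name is hypothetical, input without a leading `---` line is assumed to pass through unchanged, and only LF line endings are handled.

```typescript
// Hypothetical sketch of a textual YAML-frontmatter strip.
// Assumptions: LF line endings; text without frontmatter is returned as-is.
function stripFrontmatter(text: string, path: string): string {
  const lines = text.split("\n");
  if (lines[0] !== "---") return text;

  // The closing delimiter is the next line that is exactly '---'.
  const close = lines.indexOf("---", 1);
  if (close === -1) {
    throw new Error(`${path}: malformed frontmatter (no closing '---' delimiter)`);
  }

  // Drop one blank line directly after the closing delimiter, if present.
  let bodyStart = close + 1;
  if (lines[bodyStart] === "") bodyStart += 1;
  return lines.slice(bodyStart).join("\n");
}

const body = stripFrontmatter(
  "---\nname: demo\ndescription: A demo skill.\n---\n\n# Demo body\n",
  "/tmp/wskill-demo/.claude/skills/demo/SKILL.md",
);
console.log(JSON.stringify(body));
```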
How to test manually:
- From the repo root, run the new test file directly: `bun test packages/munchkins-core/src/builder/prompt.test.ts`. Expect all 8 cases to pass.
- Run the broader gate: `bun run lint && bun run typecheck && bun run test`. Expect green; existing tests must still pass since `withSystem(path)` behavior is preserved.
- Verify the skill happy path against a real on-disk file. Create a fixture skill at `/tmp/wskill-demo/.claude/skills/demo/SKILL.md` (the path the later steps reuse) and a one-liner script that resolves `withSkill("demo")` against `/tmp/wskill-demo`.
  Expected: `systemPrompt` equals `"# Demo body\n"` (no frontmatter, no leading blank line).
- Edge: missing skill. Run the same one-liner but with `withSkill("missing")`. Expect a thrown error whose message contains `Skill 'missing' not found` and `install-skills`.
- Edge: malformed frontmatter. Overwrite the file with `printf -- '---\nname: demo\nno close here' > /tmp/wskill-demo/.claude/skills/demo/SKILL.md` and re-run the one-liner. Expect an error containing `malformed frontmatter (no closing '---' delimiter)` and the absolute path to the file.
- Edge: composition order. Create `/tmp/wskill-demo/a.md` (`AAA`) and `/tmp/wskill-demo/b.md` (`BBB`), then chain `new Prompt().withSystem("/tmp/wskill-demo/a.md").withSkill("demo").withSystem("/tmp/wskill-demo/b.md").resolve("/tmp/wskill-demo")`. Expect `systemPrompt` to be `AAA\n\n# Demo body\n\n\nBBB` (sources joined with `\n\n` in call order).
- Sanity: confirm existing agents still work without changes. `bun run scenario` should pass, exercising `withSystem(path)` callers in `bugfix-agent.ts`, `refactor-agent.ts`, `feat-small-agent.ts`, and `bugfix-then-refactor-agent.ts`.
Files changed:
- packages/munchkins-core/src/builder/prompt.ts
- packages/munchkins-core/src/builder/prompt.test.ts
feat(munchkins): add SKILL.md discovery surface for default agents (bd12fe3)
2026-05-10 19:20 PDT · feat-small · 265.9s · $2.0450
Goal: Add Claude Code SKILL.md files for the three default munchkins agents (bug-fix, refactor, feat-small) and symlink them into .claude/skills/ so Claude Code discovers them as /<name>.
Outcome: Created three new SKILL.md files under packages/munchkins/skills/<name>/ with YAML frontmatter (name, description) followed by the body copied byte-for-byte from each agent's source prompt md. Added three relative symlinks under .claude/skills/ pointing at the corresponding skill directories. No agent .ts files were modified — content is duplicated with agents/<name>/prompts/<name>.md for this MVP; the later migration will collapse the duplication.
How to test manually:
- From the repo root, confirm the three new SKILL.md files exist.
  Expected: all three paths print with no error.
- Verify the frontmatter is well-formed and only contains `name` and `description`.
  Expected: each starts with `---`, has `name: <slug>` matching the directory, `description: ...` (one sentence), and closes with `---`.
- Verify the body of each SKILL.md is byte-identical to the source prompt md (skipping the 4-line frontmatter + blank line).
  Expected: all three diffs produce no output.
- Verify each symlink resolves to the expected relative target.
  Expected: each prints `../../packages/munchkins/skills/<name>`.
- Verify the existing agents still register and run.
  Expected: usage lists `bug-fix`, `refactor`, and `feat-small` subcommands (no regression).
- Verify the source prompt files were not modified by the change.
  Expected: no output (files unchanged in this commit).
- Out-of-band Claude Code discovery check: open this repo in Claude Code and confirm that typing `/bug-fix`, `/refactor`, and `/feat-small` each surface the new skills with their full descriptions (matches the strings in `packages/munchkins/skills/<name>/SKILL.md`).
Files changed:
- packages/munchkins/skills/bug-fix/SKILL.md
- packages/munchkins/skills/refactor/SKILL.md
- packages/munchkins/skills/feat-small/SKILL.md
- .claude/skills/bug-fix (symlink)
- .claude/skills/refactor (symlink)
- .claude/skills/feat-small (symlink)
feat(docs): re-orient onboarding around the new-munchkin skill (deb5605)
2026-05-10 14:10 PDT · feat-small · 458.8s · $3.2744
Goal: Re-orient the docs onboarding so the new-munchkin skill is the destination, with the bug-fix run demoted to a proof-of-life smoke test.
Outcome: Restructured docs/pages/getting-started.md into five sections (Prerequisites, Install, Proof of life, Scaffold your first agent, Next steps), added a fifth Claude Code prerequisite, surfaced bun run munchkins skills install, and compressed the artifact tree and failure recovery to one-liners. Updated docs/pages/index.mdx to drop the working-guide line, split the CTA row into Defaults vs. Build-your-own, and rewrite the Get-started gloss. Reordered docs/pages/agents/custom.md so the new-munchkin skill section leads (now 3 paragraphs covering trigger phrases, repo introspection, and create-mode outputs) and the manual path follows. Reordered docs/pages/agents/_meta.json so custom is first.
How to test manually:
- From the repo root, run `bun run docs:dev` and open the docs site in a browser.
- Land on `/` and confirm the lede now reads "The default agents are working examples…"; verify the CTA row shows three lines: `Get started`, `Defaults (reference): …`, and `Build your own: /new-munchkin skill · AgentBuilder API`. The headline and proof-tail should be unchanged.
- Click `Get started`. Confirm the page has five top-level `##` headings in this exact order: Prerequisites, Install, Proof of life: run the bug-fix agent, Scaffold your first agent for this repo, Next steps. (Plus the `# Getting started` title.) The old "Where the artifacts go" tree and "If it fails" subsection should be gone.
- In Prerequisites, confirm there are five items and the fifth names Claude Code as optional.
- In Install, confirm `"munchkins": "munchkins"` is the only script line shown, the `bunx` alternative is mentioned in one sentence, and `bun run munchkins skills install` appears exactly once.
- In the new "Scaffold your first agent" section, confirm `/new-munchkin` (with the slash) is shown in a fenced block and the three create-mode bullets are present.
- Click into Agents from the sidebar. Confirm the order is `Build your own`, `Bug fix`, `Small feature`, `Refactor`.
- Open `Build your own`. Confirm the first `##` after the title is "Scaffold with the `new-munchkin` skill" and contains 3 paragraphs; the second `##` is "What you're building" and starts with the bridging sentence about the manual path.
- Edge case: grep the rendered docs for the deleted line `This site is a working guide to the framework`. It should return zero hits. Also grep `getting-started.md` to confirm `/new-munchkin` appears at least once and `bun run munchkins skills install` appears exactly once.
- Run `bun run munchkins skills install --dest /tmp/munchkins-skills-test` and confirm the bundled skills land at the override path. This validates that the new Install step works as documented.
Files changed:
- docs/pages/getting-started.md
- docs/pages/index.mdx
- docs/pages/agents/custom.md
- docs/pages/agents/_meta.json
docs(pages): add user-facing agent guide organized by agent (a01c8fb)
2026-05-10 13:31 PDT · feat-small · 445.7s · $7.0202
Goal: Replace the thin Rspress site with a full user-facing guide where each default agent has its own self-contained page, plus getting-started and custom-agent pages.
Outcome: Added getting-started.md, agents/{bug-fix,feat-small,refactor,custom}.md, and agents/_meta.json; rewrote index.mdx as a real landing page; updated root _meta.json to surface the new sections. The three default-agent pages each follow the 14-section skeleton (substantive content, intentional repetition for self-containment); custom.md covers every public method on AgentBuilder and Prompt. Also bumped two agent-builder.test.ts integration tests to a 30s timeout so the real-git E2E cases don't graze the default 5s limit under parallel load.
How to test manually:
- From the repo root, run `PUBLIC_DOCS=true bun run docs:build` and confirm it exits 0. Verify the four agent pages and `getting-started` are emitted under `docs/doc_build/` (or wherever Rspress writes output) and that `internal/**` is excluded.
- Run `bun --cwd docs run dev` and open the site in a browser. Confirm the sidebar order: `Home`, `Getting started`, `Agents` (expandable to `Bug fix`, `Small feature`, `Refactor`, `Build your own`), `Changelog`. The `Internal` section should still render in dev mode.
- From the home page, click Get started and expect to land on `/getting-started`. Click Bug fix and expect `/agents/bug-fix`. Each default-agent page should scroll through all 14 sections in order (What it does → Worked example).
- On `/agents/bug-fix`, ctrl-F for `--user-message`, `--cli`, `--integrate`, `--dry-run`, `--thinking`, `--verbose`. All six must appear in the Flags table. Repeat on `feat-small.md` and `refactor.md`.
- On `/agents/custom`, ctrl-F for each `AgentBuilder` method (`option`, `add`, `addDeterministic`, `summaryWriter`, `integrate`, `setSandbox`, `rename`, `describe`, `thenRun`, `cron`, `run`, `runFromState`) and each `Prompt` method (`withSystem`, `withUserMessage`, `withUserMessageFromOption`). All must be present. Confirm `launch-munchkin` and `new-munchkin` are both referenced.
- Search across all four new pages for `MUNCHKINS_CLI`, `MUNCHKINS_RUN_LOG_DIR`, and `MUNCHKINS_CHANGELOG_PATH`. Each env var should appear at least once across the set.
- Confirm `bun run munchkins resume --list | --latest | <id>` syntax appears in each of the three default-agent pages, and `bun run munchkins daemon` + `.cron()` appear in all three default-agent pages and in `custom.md`. `bun run munchkins skills install [--dest <path>]` should appear in `custom.md`.
- Run `bun run lint`, `bun run typecheck`, and `bun run scenario` from the repo root. All should pass. (The deterministic gate in CI does this automatically.)
- Edge case: open `docs/pages/changelog.md` and `docs/pages/internal/**` in git and confirm they are unchanged by this commit (`git diff HEAD~1 -- docs/pages/changelog.md docs/pages/internal/`).
- Edge case: run `bun test packages/munchkins-core/src/builder/agent-builder.test.ts` and confirm the two integration tests in `AgentBuilder.run integration dispatch end-to-end` pass without timeout warnings.
Files changed:
- docs/pages/_meta.json
- docs/pages/index.mdx
- docs/pages/getting-started.md
- docs/pages/agents/_meta.json
- docs/pages/agents/bug-fix.md
- docs/pages/agents/feat-small.md
- docs/pages/agents/refactor.md
- docs/pages/agents/custom.md
fix(munchkins): make package bunx-executable via bin entry (b630027)
2026-05-10 13:29 PDT · bug-fix · 377.0s · $3.1778
Goal: Fix bunx @serranolabs.io/munchkins skills install failing with "could not determine executable to run for package" because the package declared no bin field.
Outcome: Added a bin entry mapping munchkins to ./src/index.ts in packages/munchkins/package.json, bumped the version from 0.1.1 to 0.1.2, and prepended a #!/usr/bin/env bun shebang to packages/munchkins/src/index.ts so the entrypoint is directly runnable. The existing if (import.meta.main) dispatch in src/index.ts remains the single source of truth — no wrapper file was introduced and no dispatch logic changed. Also extended the timeout on two integration-dispatch end-to-end tests in agent-builder.test.ts to 30s to accommodate the ~10 git subprocess invocations per run on slower/concurrent CI environments. Publishing was intentionally not performed; the user can run that as a separate manual step. Consider a follow-up to document bunx @serranolabs.io/munchkins … usage in the README.
Files changed:
- packages/munchkins/package.json
- packages/munchkins/src/index.ts
- packages/munchkins-core/src/builder/agent-builder.test.ts
feat(agent-cli): wait for Claude rate-limit reset and retry once (c7e4fa8)
2026-05-10 13:00 PDT · feat-small · 742.8s · $4.0512
Goal: When the Claude CLI exits because the user's usage limit was hit, sleep until the reported reset time and retry the spawn exactly once instead of failing the whole pipeline.
Outcome: Switched the base AgentCLI.runJsonStream from stderr: "inherit" to stderr: "pipe" while forwarding bytes to process.stderr in real time, and added captured stderr to SpawnResult. Introduced file-local isLimitHit, parseResetTimestamp, and sleepUntil helpers in agent-cli.ts. ClaudeCLI.spawn() now inspects the result, sleeps until the parsed reset (unix-seconds or HH:MM, today/tomorrow), logs a single ⏳ Claude limit hit, waiting until <HH:MM:SS local> line to stderr, and retries the same spawn exactly once. CodexCLI.spawn is untouched.
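The reset-time handling described above (unix-seconds or HH:MM, rolling over to tomorrow when the time has already passed) can be sketched as below. The function names are hypothetical, and the sketch takes an already-extracted token because the exact wording of the Claude limit message is not given in this entry; it is not the file-local helpers from agent-cli.ts.

```typescript
// Hypothetical sketch: turn a reset token into a concrete Date.
// Assumption: the token has already been pulled out of the CLI output.
function parseResetToken(token: string, now: Date): Date | null {
  const trimmed = token.trim();

  // Unix seconds, e.g. "1778460000".
  if (/^\d{9,}$/.test(trimmed)) {
    return new Date(Number(trimmed) * 1000);
  }

  // Local wall-clock HH:MM; out-of-range values are rejected.
  const match = /^(\d{1,2}):(\d{2})$/.exec(trimmed);
  if (!match) return null;
  const hours = Number(match[1]);
  const minutes = Number(match[2]);
  if (hours > 23 || minutes > 59) return null;

  const reset = new Date(now);
  reset.setHours(hours, minutes, 0, 0);
  // If that time has already passed today, the reset is tomorrow.
  if (reset.getTime() <= now.getTime()) reset.setDate(reset.getDate() + 1);
  return reset;
}

// Sleep until the target time, rejecting promptly if the signal aborts.
function waitUntil(target: Date, signal?: AbortSignal): Promise<void> {
  return new Promise((resolve, reject) => {
    if (signal?.aborted) {
      reject(signal.reason);
      return;
    }
    const delay = Math.max(0, target.getTime() - Date.now());
    const timer = setTimeout(() => {
      signal?.removeEventListener("abort", onAbort);
      resolve();
    }, delay);
    function onAbort(): void {
      clearTimeout(timer);
      reject(signal?.reason);
    }
    signal?.addEventListener("abort", onAbort, { once: true });
  });
}
```

Passing `now` in explicitly keeps the today/tomorrow decision deterministic under test; a caller would retry the spawn once after `await waitUntil(reset, signal)` resolves.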
How to test manually:
- From the repo root, run the new unit tests to exercise all branches (unix-seconds retry, HH:MM, parse failure, abort mid-wait, double-limit no-third-spawn, stderr-only detection, out-of-range HH:MM).
  Expect every test in the `ClaudeCLI rate-limit retry` describe block to pass in well under a second.
- Verify the repo-wide gates the deterministic loop runs.
  All three should be green.
- Out-of-band manual check that the unit tests don't cover: confirm stderr is still forwarded live (the change from `inherit` to `pipe` is the riskiest part). In a throwaway script, drive a Claude spawn that prints to stderr and watch it appear in your terminal in real time. You should see Claude's stderr stream appear as it's produced (not buffered until exit). If you have a real rate-limited account handy, trigger a limit and confirm you see the single `⏳ Claude limit hit, waiting until …` line followed by exactly one retry after the reset.
- Edge case: abort during wait. In a REPL, kick off a spawn with a fake limit message in stderr that points ~60s into the future via a stub of `runJsonStream`, then abort the controller after 50ms; `spawn()` should reject promptly with the abort reason and not fire a second spawn. (This is exactly what the `aborted abortSignal during the wait` test asserts; rerun that single test in watch mode if you want to poke at it: `bun test packages/munchkins-core/src/builder/agent-cli.test.ts -t "aborted abortSignal"`.)
Files changed:
- packages/munchkins-core/src/builder/agent-cli.ts
- packages/munchkins-core/src/builder/agent-cli.test.ts
feat(munchkins-core): add resume subcommand for interrupted runs
2026-05-09 20:11 PDT · feat-small · 1532.9s · $15.7377
Goal: Add bun run munchkins resume [runId] so an interrupted run (rate limit, Ctrl-C, OOM) can pick up from the last completed step, including resuming the underlying Claude/Codex session.
Outcome: Each run now writes an incremental state.json to its .munchkins/runs/<slug>-<uuid>/ directory tracking phase, per-step status, and captured CLI session ids. SandboxFactory was reshaped into an object with create() + rehydrate(); gitWorktreeSandbox implements rehydrate() against an existing worktree with hard-fail preconditions for missing worktree/branch and a logged warning for dirty/advanced state. AgentBuilder exposes runFromState() (called by both fresh runs and the new resume orchestrator); ClaudeCLI/CodexCLI capture session_id from their JSONL streams and emit --resume <id> (Claude) or codex resume <id> exec (Codex) with a continue message when given a resumeSessionId. A new runResume(argv) orchestrator wired into packages/munchkins/src/index.ts handles --list, --latest, full runId, and unique-slug resolution; RunLog.resume(dir) replays events.jsonl so token/cost totals survive across the resume boundary.
How to test manually:
- From the repo root, build/install once: `bun install`.
- Confirm the new subcommand surfaces with no resumable runs: `bun run munchkins resume --list` should print `no resumable runs` and exit 0. Same with no args: `bun run munchkins resume`.
- Kick off a real bug-fix run against a throwaway change so it produces a `state.json`: `bun run munchkins bug-fix --user-message="add a no-op comment to packages/munchkins-core/src/index.ts"`. Once you see the first agent step actually start (the banner prints `[step 1/N agent]`), interrupt with Ctrl-C.
- Inspect the preserved run dir: `ls .munchkins/runs/` then `cat .munchkins/runs/<slug>-<uuid>/state.json`. Verify `phase` is still `"steps"` (not `"done"`/`"failed"`), step 0 has `status: "in-progress"`, and (if Claude got far enough to emit its init event) `sessionId` is populated.
- List resumables: `bun run munchkins resume --list` should print a table row for that run with `runId`, `agent`, `slug`, `started-at`, `phase`, and `completed/total` step counts.
- Resume by full id: `bun run munchkins resume <slug>-<uuid>`. Confirm in output that the worktree path is reused (no new `agent/...` branch is created), and that step 0 is re-attempted via `claude --resume <id>` (look for the continue message in verbose output by re-running with `--verbose`). When it finishes, `state.json.phase` should be `"done"`.
- Slug resolution edge case, happy path: with one resumable, run `bun run munchkins resume <slug>` (no uuid). Should resolve and run.
- Slug resolution edge case, ambiguous: leave two interrupted runs with the same slug, run `bun run munchkins resume <slug>`. It should exit 1 and print both full runIds in the error.
- `--latest`: with multiple resumables, run `bun run munchkins resume --latest` and confirm it picks the most recent by `startedAt`.
- Rehydrate hard-fail: from another resumable run dir, `rm -rf` the worktree it points at, then `bun run munchkins resume <runId>`. It should exit 1 with `Worktree at <path> no longer exists`. Likewise delete its branch (`git branch -D agent/...`) for the deleted-branch failure.
- Session-not-found fallback: edit a `state.json`'s `steps[0].sessionId` to `"definitely-bogus-id"`, then resume. Claude returns a session-not-found error and the builder should log `[resume] session ... no longer available; restarting step with worktree-state hint.` and re-spawn fresh with a `git status` / `git diff --stat HEAD` preamble baked into the system prompt.
- Token accounting: after a resumed run completes, open `.munchkins/runs/<slug>-<uuid>/summary.json` and confirm tokens-in/out and cost reflect both the original run's events AND the resumed CLI calls (i.e. greater than what a fresh run alone would have produced).
- Regression check on the unchanged path: run a fresh `bun run munchkins bug-fix --user-message="..."` to completion and confirm it still works end-to-end with no prompts about resume, no extra args, and produces the same shape of summary as before.
- Automated tests cover the unit-level behavior (`bun run test packages/munchkins-core/src/resume` and `bun run test packages/munchkins-core/src/sandbox/sandbox.test.ts`), but step 11 above (real session-not-found against the live `claude` CLI) is the one out-of-band check the tests don't perform.
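The runId resolution exercised in the slug steps above (full id, unique slug, ambiguous slug) can be sketched as follows. This is an illustration rather than the `runResume` implementation: the names are hypothetical, and it assumes the id suffix after the last hyphen contains no hyphen itself, as in the run id `agent-composition-integration-df872018` quoted elsewhere in this changelog.

```typescript
// Hypothetical sketch of resolving a user-supplied id or slug to one run.
type Resolution =
  | { ok: true; runId: string }
  | { ok: false; error: string };

// Assumption: a runId is "<slug>-<suffix>" and the suffix has no hyphens.
function slugOf(runId: string): string {
  const cut = runId.lastIndexOf("-");
  return cut === -1 ? runId : runId.slice(0, cut);
}

function resolveRunId(input: string, resumable: string[]): Resolution {
  // A full runId always wins.
  if (resumable.includes(input)) return { ok: true, runId: input };

  // Otherwise treat the input as a slug and require a unique match.
  const [first, ...rest] = resumable.filter((id) => slugOf(id) === input);
  if (first === undefined) {
    return { ok: false, error: `no resumable run matches '${input}'` };
  }
  if (rest.length === 0) return { ok: true, runId: first };
  return {
    ok: false,
    error: `slug '${input}' is ambiguous: ${[first, ...rest].join(", ")}`,
  };
}
```

Listing every candidate in the ambiguity error mirrors the manual check above, where both full runIds are expected in the message.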
Files changed:
- packages/munchkins-core/src/builder/agent-builder.ts
- packages/munchkins-core/src/builder/agent-builder.test.ts
- packages/munchkins-core/src/builder/agent-cli.ts
- packages/munchkins-core/src/builder/agent-cli.test.ts
- packages/munchkins-core/src/index.ts
- packages/munchkins-core/src/resume/index.ts
- packages/munchkins-core/src/resume/run-resume.ts
- packages/munchkins-core/src/resume/run-resume.test.ts
- packages/munchkins-core/src/resume/run-state.ts
- packages/munchkins-core/src/resume/run-state.test.ts
- packages/munchkins-core/src/resume/test-fixtures.ts
- packages/munchkins-core/src/run-log.ts
- packages/munchkins-core/src/run-log.test.ts
- packages/munchkins-core/src/sandbox/index.ts
- packages/munchkins-core/src/sandbox/sandbox.ts
- packages/munchkins-core/src/sandbox/sandbox.test.ts
- packages/munchkins/src/index.ts
fix(feat-small): swap new-surface section for manual-test recipe
2026-05-09 19:49 PDT · bug-fix · 261.2s · $1.6490
Goal: Replace the feat-small summary writer's New surface catalog with a How to test manually section so changelog entries give operators a concrete smoke-test recipe.
Outcome: Rewrote packages/munchkins/agents/feat-small/prompts/summary-writer.md. Deleted the What "new surface" means section and its category bullets, retargeted the framing paragraph at giving operators a manual smoke-test recipe, swapped the template's New surface block (and the now-redundant Lines added line) for a required How to test manually block with guidance on covering happy path plus an edge case and a fallback string for non-runtime features, and kept the Files changed block. Output contract, JSON envelope, no-headings-in-body rule, and feat(<scope>): <subject> commit format are unchanged. Default and refactor summary writers were not touched.
Files changed:
- packages/munchkins/agents/feat-small/prompts/summary-writer.md
fix(builder): tolerate duplicate JSON envelope from summary writer
2026-05-09 19:17 PDT · bug-fix · 477.2s · $3.4697
Goal: Fix the summary writer JSON parser so it survives the model emitting the envelope twice in one response — a production failure mode (run agent-composition-integration-df872018) where the greedy regex spans both objects and JSON.parse chokes on the gap.
Outcome: Replaced the regex-based extraction in runSummaryWriter with a string-aware balanced-brace forward scan extracted into a new parseSummaryWriterJson helper. The scan enumerates every top-level {...} object in the response, then iterates them last-to-first and returns the first one that parses and has both commitMessage and markdown as strings. Trailing ``` fence handling and existing type checks are preserved; backward-compatible for the single-envelope case. Added 12 unit tests covering the duplicate-emit regression, prose preambles, fenced output, braces inside string literals, escaped quotes, missing/wrong-typed fields, and a realistic fixture.
Files changed:
- packages/munchkins-core/src/builder/agent-builder.ts
- packages/munchkins-core/src/builder/parse-summary-writer-json.ts
- packages/munchkins-core/src/builder/parse-summary-writer-json.test.ts
Notes for future debuggers: The brace scanner is intentionally string-aware (tracks inString + isEscaped) so a JSON string value like "see {a, b, c}" or "escaped \"quote\"" doesn't desync the depth counter. If the scanner ever encounters an unbalanced {, it stops scanning rather than emitting a partial candidate. Failure reason distinguishes "no JSON object found" from "N candidate object(s) inspected" so harness logs point at the right diagnosis.
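For readers who want to see the shape of the technique, here is a minimal sketch of a string-aware balanced-brace scan with last-to-first candidate selection. It is not the shipped `parseSummaryWriterJson`: the function names are hypothetical, fence handling and failure reasons are omitted, and quotes are only tracked once inside an object (an assumption made so stray quotes in a prose preamble do not desync the scan).

```typescript
interface Envelope {
  commitMessage: string;
  markdown: string;
}

// Collect every balanced top-level {...} span, ignoring braces that
// appear inside JSON string literals.
function findTopLevelObjects(text: string): string[] {
  const found: string[] = [];
  let depth = 0;
  let start = -1;
  let inString = false;
  let isEscaped = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (isEscaped) isEscaped = false;
      else if (ch === "\\") isEscaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') {
      if (depth > 0) inString = true; // only track strings inside an object
    } else if (ch === "{") {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === "}" && depth > 0) {
      depth--;
      if (depth === 0) found.push(text.slice(start, i + 1));
    }
  }
  // An unterminated object never returns to depth 0, so it is never emitted.
  return found;
}

// Try candidates last-to-first; return the first that parses and has both
// required string fields.
function parseEnvelope(text: string): Envelope | null {
  for (const candidate of [...findTopLevelObjects(text)].reverse()) {
    try {
      const value = JSON.parse(candidate) as Partial<Record<keyof Envelope, unknown>>;
      if (typeof value.commitMessage === "string" && typeof value.markdown === "string") {
        return { commitMessage: value.commitMessage, markdown: value.markdown };
      }
    } catch {
      // Not valid JSON; fall through to the previous candidate.
    }
  }
  return null;
}
```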
feat(core): pluggable integration strategies + agent composition
2026-05-09 16:00 PDT · feat-small · 1264.1s · $13.0129
Goal: Add pluggable integration strategies (integrateMerge / integratePR) and AgentBuilder composition (.thenRun()), wiring operator > author > default precedence and moving integration out of sandbox teardown into the run layer.
Outcome: Introduced IntegrationStrategy with integrateMerge (delegates to existing integrateBranch) and integratePR (rebase + push + open PR via gh/glab, with auto provider detection). Added .integrate(), .thenRun(), .setSandbox(), .rename(), .describe() builder methods; .thenRun() returns a new builder with steps concatenated and sandbox/summaryWriter/integration stripped per the S3 contract. Run-layer dispatch enforces precedence and surfaces a clear error for unknown modes; gitWorktreeSandbox.teardown is now cleanup-only. Added --integrate <mode> CLI flag, an example bugfix-then-refactor agent, and a composition scenario.
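The operator > author > default precedence can be illustrated with a small resolver. Everything specific here is an assumption made for the sketch: the mode strings `"merge"` and `"pr"`, the choice of `"merge"` as the default, and the function names. The entry itself only states that an operator override beats the author's choice, which beats the default, and that unknown modes produce a clear error.

```typescript
// Hypothetical sketch of mode precedence. Mode names and the default
// are assumptions, not taken from the munchkins-core source.
type IntegrationMode = "merge" | "pr";
const KNOWN_MODES: readonly string[] = ["merge", "pr"];

function isKnownMode(value: string): value is IntegrationMode {
  return KNOWN_MODES.includes(value);
}

function resolveIntegrationMode(
  operator: string | undefined, // e.g. the value passed via --integrate <mode>
  author: IntegrationMode | undefined, // e.g. what the agent author configured
  fallback: IntegrationMode = "merge",
): IntegrationMode {
  if (operator !== undefined) {
    if (!isKnownMode(operator)) {
      throw new Error(
        `Unknown integration mode '${operator}' (expected one of: ${KNOWN_MODES.join(", ")})`,
      );
    }
    return operator;
  }
  return author ?? fallback;
}

console.log(resolveIntegrationMode("pr", "merge"));
```

Validating only the operator-supplied string reflects that it arrives untyped from the command line, while the author's value is already constrained by the type system.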
Note on this entry: the run's summary writer emitted the JSON envelope twice and the harness's greedy regex parser failed to parse it, so this entry is hand-authored from the summary writer's first emitted block (verbatim). The agent's actual work — the diff, the tests, and the deterministic checks — completed cleanly before the parser tripped. A follow-up bug-fix to the harness regex will harden the parser against duplicate-emit hiccups.
New surface:
- Export: `integrateMerge()` (in `packages/munchkins-core/src/integrate.ts`)
- Export: `integratePR(opts?)` (in `packages/munchkins-core/src/integrate.ts`)
- Export: `detectProvider(repoRoot, remote)` (in `packages/munchkins-core/src/integrate.ts`)
- Export: `IntegrationStrategy`, `IntegrationContext`, `IntegrationResult`, `IntegratePROptions` types (in `packages/munchkins-core/src/integrate.ts`)
- Export: `AgentBuilder.integrate(strategy)` method (in `packages/munchkins-core/src/builder/agent-builder.ts`)
- Export: `AgentBuilder.thenRun(other)` method (in `packages/munchkins-core/src/builder/agent-builder.ts`)
- Export: `AgentBuilder.setSandbox(factory)` method (in `packages/munchkins-core/src/builder/agent-builder.ts`)
- Export: `AgentBuilder.rename(name)` method (in `packages/munchkins-core/src/builder/agent-builder.ts`)
- Export: `AgentBuilder.describe(description)` method (in `packages/munchkins-core/src/builder/agent-builder.ts`)
- Export: `RunLog.getAgentSummaryMarkdown()`, `RunLog.getAgentSummaryCommitMessage()` (in `packages/munchkins-core/src/run-log.ts`)
- Example agent: `bugfix-then-refactor` (in `packages/munchkins/agents/bugfix-then-refactor/bugfix-then-refactor-agent.ts`)
- CLI flag: `--integrate <mode>` on every registered agent (registered in `packages/munchkins-core/src/registry/registry.ts`)
- Env var: `__MUNCHKINS_OPT_integrate`
- New scenario: `scenarios/composition.ts`
- Other: `PassOpts.prUrl` field on `RunLogger.pass()` surfaces the PR URL in quiet and verbose output (in `packages/munchkins-core/src/builder/run-logger.ts`)
- Removed: `IntegrateContext` export and `TeardownContext.integrate` field (sandbox teardown is now cleanup-only)
Files changed:
- package.json
- packages/munchkins-core/src/builder/agent-builder.test.ts (new)
- packages/munchkins-core/src/builder/agent-builder.ts
- packages/munchkins-core/src/builder/run-logger.test.ts (new)
- packages/munchkins-core/src/builder/run-logger.ts
- packages/munchkins-core/src/index.ts
- packages/munchkins-core/src/integrate.test.ts
- packages/munchkins-core/src/integrate.ts
- packages/munchkins-core/src/registry/registry.ts
- packages/munchkins-core/src/run-log.test.ts
- packages/munchkins-core/src/run-log.ts
- packages/munchkins-core/src/sandbox/index.ts
- packages/munchkins-core/src/sandbox/sandbox.test.ts
- packages/munchkins-core/src/sandbox/sandbox.ts
- packages/munchkins/agents/bugfix-then-refactor/bugfix-then-refactor-agent.ts (new)
- scenarios/composition.ts (new)
feat(scheduler): add cron support for AgentBuilder + daemon subcommand
2026-05-09 15:25 PDT · feat-small · 642.2s · $6.0585
Goal: Add per-agent cron scheduling via a new .cron() builder method and a bun run munchkins daemon entrypoint that arms timers per cronned builder and fires builder.run() at each tick.
Outcome: AgentBuilder now carries an optional CronConfig set via .cron(spec, { userMessage, verbosity }) and exposed via getCron(); calling .cron() twice throws naming the agent. A new scheduler/daemon.ts module collects cronned builders from a registry, renders a column-aligned startup table with ISO + humanized next-tick offsets via cron-parser, and arms one setTimeout per builder that resets per-tick env (__MUNCHKINS_OPT_verbose / __MUNCHKINS_OPT_thinking / __MUNCHKINS_OPT_userMessage) before each run, re-arming in finally. The munchkins entrypoint branches on process.argv[2] === "daemon" to invoke runDaemon() ahead of registry.cli(). No default agent is cronned; the API ships dormant.
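The per-tick env reset and the re-arm-in-finally loop described above can be sketched like this. It is an illustration under assumptions, not the contents of scheduler/daemon.ts: the function names are hypothetical, the `"true"` value written to the flag variables and the `"quiet"` verbosity are guesses, and the next-tick delay is injected as a plain function instead of being computed with cron-parser.

```typescript
// Hypothetical sketch of per-tick env handling and timer re-arming.
type Verbosity = "quiet" | "verbose" | "thinking"; // "quiet" is an assumption

interface TickConfig {
  userMessage?: string;
  verbosity?: Verbosity;
}

const TICK_KEYS = [
  "__MUNCHKINS_OPT_verbose",
  "__MUNCHKINS_OPT_thinking",
  "__MUNCHKINS_OPT_userMessage",
] as const;

// Clear anything a previous tick left behind, then set this tick's values.
function applyTickEnvSketch(cfg: TickConfig, env: Record<string, string | undefined>): void {
  for (const key of TICK_KEYS) delete env[key];
  if (cfg.userMessage !== undefined) env["__MUNCHKINS_OPT_userMessage"] = cfg.userMessage;
  // Assumption: the flag variables carry the literal string "true".
  if (cfg.verbosity === "verbose") env["__MUNCHKINS_OPT_verbose"] = "true";
  if (cfg.verbosity === "thinking") env["__MUNCHKINS_OPT_thinking"] = "true";
}

// Arm one timer; after each tick (success or failure) arm the next one.
// Returns a function that stops the loop.
function armTimer(nextDelayMs: () => number, tick: () => Promise<void>): () => void {
  let stopped = false;
  let handle: ReturnType<typeof setTimeout> | undefined;
  const arm = (): void => {
    if (stopped) return;
    handle = setTimeout(async () => {
      try {
        await tick();
      } catch (err) {
        console.error("tick failed:", err);
      } finally {
        arm(); // re-arm even when the run throws
      }
    }, nextDelayMs());
  };
  arm();
  return () => {
    stopped = true;
    if (handle !== undefined) clearTimeout(handle);
  };
}
```

Re-arming in `finally` means one failed run does not silently end the schedule, and because the next timer is only set after the current tick settles, runs of the same agent never overlap.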
New surface:
- Export: `AgentBuilder.prototype.cron(spec, opts)` (in `packages/munchkins-core/src/builder/agent-builder.ts`)
- Export: `AgentBuilder.prototype.getCron()` (in `packages/munchkins-core/src/builder/agent-builder.ts`)
- Export: type `Verbosity` (in `packages/munchkins-core/src/builder/agent-builder.ts`, re-exported from `builder/index.ts` and `src/index.ts`)
- Export: interface `CronConfig` (in `packages/munchkins-core/src/builder/agent-builder.ts`, re-exported from `builder/index.ts` and `src/index.ts`)
- Export: `runDaemon(opts?)` (in `packages/munchkins-core/src/scheduler/daemon.ts`, re-exported from `scheduler/index.ts` and `src/index.ts`)
- Export: `applyTickEnv(cfg)` (in `packages/munchkins-core/src/scheduler/daemon.ts`)
- Export: `collectCronnedBuilders(registry)` (in `packages/munchkins-core/src/scheduler/daemon.ts`)
- Export: interface `CronnedBuilder` (in `packages/munchkins-core/src/scheduler/daemon.ts`)
- Export: interface `RunDaemonOptions` (in `packages/munchkins-core/src/scheduler/daemon.ts`)
- CLI flag: `daemon` subcommand on the `munchkins` entrypoint (registered in `packages/munchkins/src/index.ts` as a pre-`registry.cli()` argv branch)
- Env var: `__MUNCHKINS_OPT_userMessage` (set per tick by `applyTickEnv`)
- Env var: `__MUNCHKINS_OPT_verbose` (set per tick when verbosity is `"verbose"`)
- Env var: `__MUNCHKINS_OPT_thinking` (set per tick when verbosity is `"thinking"`)
- New file: `packages/munchkins-core/src/scheduler/daemon.ts`
- New file: `packages/munchkins-core/src/scheduler/index.ts`
- New file: `packages/munchkins-core/src/scheduler/daemon.test.ts`
- New package export path: `@serranolabs.io/munchkins-core/scheduler` (added in `packages/munchkins-core/package.json`)
- Other: `cron-parser@^5.5.0` added as a dependency of `@serranolabs.io/munchkins-core`
Lines added: +407 (across 8 files)
Files changed:
- bun.lock
- packages/munchkins-core/package.json
- packages/munchkins-core/src/builder/agent-builder.ts
- packages/munchkins-core/src/builder/index.ts
- packages/munchkins-core/src/index.ts
- packages/munchkins-core/src/scheduler/daemon.ts
- packages/munchkins-core/src/scheduler/index.ts
- packages/munchkins-core/src/scheduler/daemon.test.ts
- packages/munchkins/src/index.ts
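The per-tick env reset and the re-arm step described in this entry can be pictured with the sketch below. It is an illustration under stated assumptions, not the code in scheduler/daemon.ts: the CronConfig and Verbosity shapes are guessed from the .cron(spec, { userMessage, verbosity }) call, the "default" verbosity name is invented, applyTickEnv takes an explicit env object here so the example runs without touching process.env, and the arm helper with its msUntilNextTick callback is hypothetical.

```typescript
// Sketch only: these shapes are assumed, not copied from agent-builder.ts.
type Verbosity = "default" | "thinking" | "verbose"; // "default" is a guessed name for the terse level
interface CronConfig {
  spec: string;
  userMessage: string;
  verbosity?: Verbosity;
}
type Env = Record<string, string | undefined>;

// Clear the per-tick flags first, then set only what this builder's cron config asks for,
// so one builder's tick cannot leak its verbosity into the next.
function applyTickEnv(cfg: CronConfig, env: Env): void {
  delete env.__MUNCHKINS_OPT_verbose;
  delete env.__MUNCHKINS_OPT_thinking;
  env.__MUNCHKINS_OPT_userMessage = cfg.userMessage;
  if (cfg.verbosity === "verbose") env.__MUNCHKINS_OPT_verbose = "true";
  if (cfg.verbosity === "thinking") env.__MUNCHKINS_OPT_thinking = "true";
}

// Arm one timer for one builder (hypothetical helper). The finally callback re-arms
// the timer whether the run resolved or rejected.
function arm(cfg: CronConfig, env: Env, msUntilNextTick: () => number, run: () => Promise<void>): void {
  setTimeout(() => {
    applyTickEnv(cfg, env);
    run()
      .catch((err: unknown) => console.error("cron run failed:", err)) // logging here is an assumption
      .finally(() => arm(cfg, env, msUntilNextTick, run));
  }, msUntilNextTick());
}
```

In the real daemon the delay would presumably come from cron-parser's next-tick computation; the callback stands in for that here.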
docs(summary-writer): forbid markdown headings in changelog body
2026-05-09 15:15 PDT · bug-fix · 321.1s · $2.0794
Goal: Stop the default summary-writer prompt from producing ## headings inside changelog entry bodies, which collide with the harness-emitted entry title and break the document hierarchy.
Outcome: Updated the Output contract section of packages/munchkins/agents/_shared/prompts/summary-writer.md to explicitly prohibit any Markdown headings (#, ##, ###, etc.) inside the markdown field. Promoted the bold inline labels (**Goal:**, **Outcome:**, **Files changed:**) from a suggested skeleton to a required body shape, and added a side-by-side correct/wrong example so the contrast is visual. The JSON output contract (commitMessage + markdown keys, no code fences) is unchanged, and the harness-side assembly in RunLog.prependChangelogIn was not touched. Per-agent summary-writer prompts under feat-small and refactor were left alone per the task constraints (they were out of scope unless they shared the defect).
Files changed:
- packages/munchkins/agents/_shared/prompts/summary-writer.md
Future changelog entries produced by agents using this default prompt should contain zero #-prefixed lines in the body while still carrying the bold inline labels for Goal / Outcome / Files changed.
fix(munchkins-core): detect merge-fixer progress via working-tree content
2026-05-09 15:04 PDT · bug-fix · 451.4s · $2.6168
Goal
Fix the merge-fixer harness in integrate.ts so it stops misclassifying every real conflict resolution as "no progress."
Outcome
Replaced the index-based post-fixer progress check with a content-based check using git diff --check. The new logic detects leftover conflict markers in the working tree, bails out if the fixer wrote markers to files outside the conflict set, fails with a no-progress reason only when every conflicted file still has markers, and stages only files verified clean (per-file git add, never git add -A). Added a small abortAndFail helper to deduplicate the bail-out paths in rebaseAndResolve. Introduced packages/munchkins-core/src/integrate.test.ts with five real-integrateBranch tests covering the happy path, no-progress, partial-progress (regression test that proves the outer loop re-prompts the fixer), CLI failure, and the clean-rebase no-spawn case.
Files changed
- packages/munchkins-core/src/integrate.ts: swap the index-based stillConflicted check for filesWithLeftoverMarkers (new helper using git diff --check); add stray-file guard; switch staging to per-file git add; extract the repeated abortRebase + return into a local abortAndFail helper.
- packages/munchkins-core/src/integrate.test.ts: new file. mkdtemp + git init -b main repos, gitEnv() helper with TEST_GIT_IDENTITY, StubFixerCLI with constructor-injected handler and FailIfSpawnedCLI, plus shared setupSingleFileConflict / setupTwoFileConflict setup helpers driving the five required scenarios.
Notes for future readers
- The partial-progress test (#3) is the load-bearing regression: it asserts cli.invocations === 2 and fixerIters === 2, which is only achievable if the harness correctly recognizes per-file forward progress and re-enters the fixer for the remaining unresolved file.
- listConflictedFiles is intentionally retained for the initial per-iteration enumeration before the fixer runs; only the post-fixer check moved to a content-based detector.
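The decision rules in the Outcome above (stray markers, no progress, partial progress) can be sketched as a pure function. This is a hypothetical reconstruction: classifyFixerProgress and the FixerProgress type are invented names, and the function assumes the caller has already collected the list of files with leftover markers (for example from git diff --check).

```typescript
// Hypothetical result type; the real harness may model these outcomes differently.
type FixerProgress =
  | { kind: "stray-markers"; files: string[] }
  | { kind: "no-progress" }
  | { kind: "progress"; toStage: string[]; remaining: string[] };

// conflicted: files that were in conflict before the fixer ran.
// withMarkers: files that still contain conflict markers after the fixer ran.
function classifyFixerProgress(conflicted: string[], withMarkers: string[]): FixerProgress {
  const conflictSet = new Set(conflicted);
  const stray = withMarkers.filter((file) => !conflictSet.has(file));
  // Markers outside the conflict set mean the fixer damaged an unrelated file: bail out.
  if (stray.length > 0) return { kind: "stray-markers", files: stray };

  const markerSet = new Set(withMarkers);
  const remaining = conflicted.filter((file) => markerSet.has(file));
  // No progress only when every conflicted file still has markers.
  if (conflicted.length > 0 && remaining.length === conflicted.length) return { kind: "no-progress" };

  // Stage only the files verified clean; the rest go back to the fixer.
  const toStage = conflicted.filter((file) => !markerSet.has(file));
  return { kind: "progress", toStage, remaining };
}
```

With two conflicted files and one resolved, the result is progress with one file to stage and one remaining, which is the shape the partial-progress regression test relies on.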
refactor(agent-builder): extract RunLogger to centralize verbose/quiet formatting
2026-05-09 14:25 PDT · refactor · 590.3s · $3.3619
Goal: Extract every verbose/quiet branch site in AgentBuilder.run() (and its runAgent / runDeterministic / runSummaryWriter helpers) into a single RunLogger class so run() orchestrates and RunLogger formats — byte-identical output in both modes.
Outcome: Created packages/munchkins-core/src/builder/run-logger.ts housing the new RunLogger class plus the C color table, banner(), and printInvocation() helpers (moved out of agent-builder.ts so there's a single home for terminal formatting). AgentBuilder imports them back, drops the inline if (verbose) … else … blocks at all twelve call sites, and threads a single RunLogger instance into the helpers in place of the bare verbose: boolean parameter. streamOutput stays where it was — it's a separate concern (Claude streaming) — and the env reads at the top of run() are untouched.
Refactor type: extraction
Lines changed:
Total: 688 → 771 (Δ +83)
Files changed:
- packages/munchkins-core/src/builder/agent-builder.ts
- packages/munchkins-core/src/builder/run-logger.ts
- packages/munchkins-core/src/index.ts
Call sites that now share the extracted helpers:
- RunLogger.stepResultOk(): called by both runAgent and runSummaryWriter, replacing the duplicated quiet-mode " ok (Xs, in→out)\n" line that previously existed inline in each.
- RunLogger.starting() / stepBanner() / pass() / fail() / logDir() / integrationLine(): collapse the six verbose-vs-quiet branch pairs in AgentBuilder.run() into single calls.
- RunLogger.summaryWriterEmptyDiff() / summaryWriterStart(): collapse the two branch pairs in runSummaryWriter.
- RunLogger.agentStepStart(): collapses the single branch pair in runAgent.
- RunLogger.deterministicIterationHeader() / deterministicCommand() / deterministicCommandOutput() / deterministicQuietSummary() / fixerStart() / fixerResult(): collapse the six branch pairs in runDeterministic.
- banner() and printInvocation(): now defined once in run-logger.ts and re-imported by agent-builder.ts instead of being module-local helpers.
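To illustrate the shape of this extraction, here is a minimal sketch of one logger method that owns the verbose/quiet branch so callers do not. It is not the contents of run-logger.ts: the constructor signature, the injected write sink, the exact number formatting, and the choice to print nothing in verbose mode are all assumptions. Only the method name stepResultOk and the general quiet-mode line shape come from the entry above.

```typescript
// Sketch only: a logger that hides the verbose/quiet decision from its callers.
class RunLogger {
  private readonly verbose: boolean;
  private readonly write: (text: string) => void;

  constructor(verbose: boolean, write: (text: string) => void) {
    this.verbose = verbose;
    this.write = write;
  }

  // Quiet mode prints a one-line result; verbose mode is assumed to print nothing here
  // because the streamed output already covers it.
  stepResultOk(seconds: number, tokensIn: number, tokensOut: number): void {
    if (this.verbose) return;
    this.write(` ok (${seconds.toFixed(1)}s, ${tokensIn}→${tokensOut})\n`);
  }
}

// Callers such as runAgent and runSummaryWriter then make one call with no branching.
const captured: string[] = [];
const quietLogger = new RunLogger(false, (text) => captured.push(text));
quietLogger.stepResultOk(1.5, 10, 20);
```

Injecting the write sink is a design choice made for this sketch so the output can be captured in a test; the real class may write to stdout directly.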
refactor(munchkins/agents): extract getAgentPromptsDir helper
2026-05-09 14:12 PDT · refactor · 243.6s · $1.6386
Goal: Eliminate the 4× duplicated dirname(fileURLToPath(import.meta.url)) + join("prompts") incantation across bugfix-agent.ts, refactor-agent.ts, feat-small-agent.ts, and presets.ts by adding a single getAgentPromptsDir(importUrl) helper to presets.ts.
Outcome: Added getAgentPromptsDir(importUrl: string) to packages/munchkins/agents/_shared/presets.ts and re-exported it. The shared SHARED_PROMPTS constant and all three agent files (bugfix, feat-small, refactor) now call the helper instead of recomputing dirname(fileURLToPath(import.meta.url)) inline. The node:url import was dropped from the three agent files; only join is still pulled from node:path where needed alongside the helper. The trailing import-order delta in packages/munchkins-core/src/index.ts is incidental cleanup from the biome lint pass that ran with this refactor.
Refactor type: extraction
Lines changed:
Total: 182 → 190 (Δ +8)
Files changed:
- packages/munchkins-core/src/index.ts
- packages/munchkins/agents/_shared/presets.ts
- packages/munchkins/agents/bugfix/bugfix-agent.ts
- packages/munchkins/agents/feat-small/feat-small-agent.ts
- packages/munchkins/agents/refactor/refactor-agent.ts
Call sites now sharing getAgentPromptsDir:
- packages/munchkins/agents/_shared/presets.ts: defines and consumes it for SHARED_PROMPTS
- packages/munchkins/agents/bugfix/bugfix-agent.ts: const PROMPTS = getAgentPromptsDir(import.meta.url)
- packages/munchkins/agents/feat-small/feat-small-agent.ts: const PROMPTS = getAgentPromptsDir(import.meta.url)
- packages/munchkins/agents/refactor/refactor-agent.ts: const PROMPTS = getAgentPromptsDir(import.meta.url)
Net line count grew by 8 because the helper is defined once and each agent still needs an import line for it; the win is that the dirname(fileURLToPath(...)) pattern is no longer repeated and there is now a single place to change how agent prompt directories are resolved.
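For reference, a helper matching the description above could look like the following. The body is inferred from the dirname(fileURLToPath(import.meta.url)) + join("prompts") pattern named in the Goal, so treat it as a likely shape and not a verbatim copy of presets.ts.

```typescript
import { dirname, join } from "node:path";
import { fileURLToPath } from "node:url";

// Resolve the "prompts" directory that sits next to the module identified by importUrl.
// Each agent file passes its own import.meta.url.
export function getAgentPromptsDir(importUrl: string): string {
  return join(dirname(fileURLToPath(importUrl)), "prompts");
}
```

A call site then reduces to const PROMPTS = getAgentPromptsDir(import.meta.url), as listed in the call sites above.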
docs: add README with concise repo intro and usage
2026-05-08 21:21 PDT · feat-small · 300.9s · $1.7630
Goal: Add a concise, compelling README with Usage upfront, then push.
Outcome: Created a new top-level README.md that opens with a one-paragraph pitch, then leads with Usage (install + invocation + examples), followed by default agents, backend selection, per-run lifecycle, repo layout, a build-your-own-agent snippet, requirements, and a pointer to AGENTS.md. Also tightened docs/tsconfig.json to use @types/bun instead of node.
New surface:
- New file: README.md
Lines added: +87 (across 2 files)
Files changed:
- README.md
- docs/tsconfig.json
feat(core): add switchable agent CLI backend (claude/codex)
2026-05-08 20:59 PDT · feat-small · 671.0s · $6.1088
Goal: Add a process-wide backend switch so operators can run any agent against either the claude CLI (default) or the codex CLI, selected by --cli flag or MUNCHKINS_CLI env var.
Outcome: Introduced an abstract AgentCLI base class with ClaudeCLI and CodexCLI subclasses behind a single AgentCLI.fromEnv() seam. The existing spawnClaude export collapses to a one-liner that delegates to AgentCLI.fromEnv().spawn(), preserving the harness mock seam and all 3 internal call sites unchanged. Cost tracking becomes optional end-to-end (AgentUsage.costUsd?, RunSummary.costUsd?, RunLog.getCostUsd() returns number | undefined) so Codex's missing cost data renders as — honestly instead of being faked as $0.0000. Codex's missing system-prompt flag is handled by prepending ## System\n…\n\n## Task\n… as a single positional argument.
New surface:
- Export: AgentCLI abstract class (in packages/munchkins-core/src/builder/agent-cli.ts)
- Export: ClaudeCLI class with buildArgs() and spawn() (in packages/munchkins-core/src/builder/agent-cli.ts)
- Export: CodexCLI class with buildArgs() and spawn() (in packages/munchkins-core/src/builder/agent-cli.ts)
- Export: AgentCLI.fromEnv() static factory (in packages/munchkins-core/src/builder/agent-cli.ts)
- Export: types SpawnOptions, SpawnResult, AgentUsage, AgentCLIName (in packages/munchkins-core/src/builder/agent-cli.ts, re-exported from packages/munchkins-core/src/builder/index.ts)
- CLI flag: --cli <cli> on every registered agent subcommand (registered in packages/munchkins-core/src/registry/registry.ts)
- Env var: MUNCHKINS_CLI (public; read by AgentCLI.fromEnv())
- Env var: __MUNCHKINS_OPT_cli (internal flag-bridge set by the registry; takes priority over MUNCHKINS_CLI)
- New file: packages/munchkins-core/src/builder/agent-cli.ts
- New file: packages/munchkins-core/src/builder/agent-cli.test.ts
Lines added: +484 (across 8 files)
Files changed:
- AGENTS.md
- packages/munchkins-core/src/builder/agent-builder.ts
- packages/munchkins-core/src/builder/agent-cli.ts (new)
- packages/munchkins-core/src/builder/agent-cli.test.ts (new)
- packages/munchkins-core/src/builder/index.ts
- packages/munchkins-core/src/builder/spawn-claude.ts
- packages/munchkins-core/src/registry/registry.ts
- packages/munchkins-core/src/registry/registry.test.ts
- packages/munchkins-core/src/run-log.ts
- packages/munchkins-core/src/run-log.test.ts
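The backend selection order and the Codex prompt workaround from this entry can be sketched as two small pure functions. Both function names are hypothetical, the env object is passed in explicitly so the example does not read process.env, and throwing on an unknown value is an assumption; the entry only states the priority order, the claude default, and the "## System / ## Task" layout.

```typescript
type AgentCLIName = "claude" | "codex";
type Env = Record<string, string | undefined>;

// Resolve the backend: the internal flag bridge wins over the public env var,
// and claude is the default when neither is set.
function resolveCliName(env: Env): AgentCLIName {
  const raw = env.__MUNCHKINS_OPT_cli ?? env.MUNCHKINS_CLI ?? "claude";
  if (raw === "claude" || raw === "codex") return raw;
  throw new Error(`Unknown agent CLI: ${raw}`); // rejecting unknown names is an assumption
}

// Codex has no system-prompt flag, so the system prompt is prepended to the task
// and the whole string is passed as a single positional argument.
function buildCodexPrompt(systemPrompt: string, task: string): string {
  return `## System\n${systemPrompt}\n\n## Task\n${task}`;
}
```

In the real code this logic presumably sits behind AgentCLI.fromEnv() and CodexCLI.buildArgs() respectively.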
feat(core): human-readable run names from LLM slug
2026-05-08 20:29 PDT · feat-small · 1077.4s · $9.3383
Goal: Replace timestamp-based run-log directory names and worktree branch names with LLM-generated, human-readable slugs derived from the user's task description.
Outcome: Added a slug-derivation pipeline (Haiku LLM call with retry + deterministic kebab fallback) that feeds both RunLog directory names and worktree branch names. Run dirs are now <slug>-<uuid8> and branches are agent/<slug>-<uuid8>, with the slug capped at 30 chars. Slug derivation runs in parallel with sandbox creation and the worktree branch is renamed once the slug resolves; fallbacks are recorded as slug-fallback events in events.jsonl. The scenario harness recognizes the new naming via a Haiku-aware mock branch.
New surface:
- Export: sanitize(raw) (in packages/munchkins-core/src/builder/slug.ts)
- Export: deriveSlugDeterministic(text) (in packages/munchkins-core/src/builder/slug.ts)
- Export: getSlugWithRetry(userMessage, opts?) (in packages/munchkins-core/src/builder/slug.ts)
- Export: SLUG_MAX constant (in packages/munchkins-core/src/builder/slug.ts)
- Export: SlugResult type (in packages/munchkins-core/src/builder/slug.ts)
- Export: SlugFallback type (in packages/munchkins-core/src/builder/slug.ts)
- Export: renameBranch(oldBranch, newBranch, repoRoot) (in packages/munchkins-core/src/worktree.ts)
- Export: RunLog.recordEvent(event) public method (in packages/munchkins-core/src/run-log.ts)
- Export: getSlugOutput() (in scenarios/lib/mock-spawn-claude.ts)
- Export: New slug?: string option on RunLog constructor (in packages/munchkins-core/src/run-log.ts)
- Export: New optional branchName parameter on createWorktree (in packages/munchkins-core/src/worktree.ts)
- Export: New model, disallowedTools, abortSignal fields on SpawnClaudeOptions (in packages/munchkins-core/src/builder/spawn-claude.ts)
- New file: packages/munchkins-core/src/builder/slug.ts
- New file: packages/munchkins-core/src/builder/slug.test.ts
- New file: packages/munchkins-core/src/run-log.test.ts
- New file: packages/munchkins-core/src/worktree.test.ts
- Other: New slug-fallback event type written to events.jsonl (recorded from agent-builder.ts)
- Other: claude CLI now invoked with --model and --disallowedTools flags when those options are set (in packages/munchkins-core/src/builder/spawn-claude.ts)
Lines added: +362 (across 12 files)
Files changed:
- packages/munchkins-core/src/builder/agent-builder.ts
- packages/munchkins-core/src/builder/index.ts
- packages/munchkins-core/src/builder/slug.ts
- packages/munchkins-core/src/builder/slug.test.ts
- packages/munchkins-core/src/builder/spawn-claude.ts
- packages/munchkins-core/src/index.ts
- packages/munchkins-core/src/run-log.ts
- packages/munchkins-core/src/run-log.test.ts
- packages/munchkins-core/src/sandbox/sandbox.ts
- packages/munchkins-core/src/sandbox/sandbox.test.ts
- packages/munchkins-core/src/worktree.ts
- packages/munchkins-core/src/worktree.test.ts
- scenarios/index.ts
- scenarios/lib/mock-spawn-claude.ts
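A deterministic kebab-case fallback and the <slug>-<uuid8> naming could be sketched as below. The 30-character cap and the names sanitize and SLUG_MAX come from this entry; the function bodies, the runDirName helper, and the reading of uuid8 as the first eight hex characters of a UUID are assumptions made for illustration.

```typescript
const SLUG_MAX = 30;

// Lowercase, collapse every run of non-alphanumerics into one hyphen, trim hyphens,
// cap the length, then trim again in case the cut landed on a hyphen.
function sanitize(raw: string): string {
  return raw
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "")
    .slice(0, SLUG_MAX)
    .replace(/-+$/g, "");
}

// Hypothetical helper: build the run directory name from a slug and a UUID.
function runDirName(slug: string, uuid: string): string {
  const uuid8 = uuid.replace(/-/g, "").slice(0, 8);
  return `${slug}-${uuid8}`;
}
```

Under the same assumption, the worktree branch would be the run directory name prefixed with agent/.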
refactor: extract duplicated helpers across builder, run-log, scenarios
2026-05-08 20:11 PDT · refactor · 408.3s · $3.3636
Goal: Fix any refactoring opportunities found across the codebase.
Outcome: Pulled four sets of duplicated inline logic into named helpers/constants. In agent-builder.ts, the user-message resolution and the summary-writer prompt construction were each duplicated inline at two call sites; these are now resolveOriginalGoal() and buildSummaryWriterUserPrompt(). In run-log.ts, the command-entry log formatter and env-path resolution logic were each duplicated; extracted as formatCommandEntries() and resolveEnvPath(). In sandbox.test.ts, the git identity env block was inlined twice; extracted to a TEST_GIT_IDENTITY constant and a gitEnv() helper. In scenarios/index.ts, six near-identical fail-result objects were collapsed into a single failResult() factory closure. Net line count drops, but the primary value is a single source of truth for each pattern.
Refactor type: extraction
Lines changed:
Total: 1527 → 1461 (Δ −66)
Files changed:
- packages/munchkins-core/src/builder/agent-builder.ts
- packages/munchkins-core/src/run-log.ts
- packages/munchkins-core/src/sandbox/sandbox.test.ts
- scenarios/index.ts
Extracted helpers and their call sites:
- resolveOriginalGoal(repoRoot) in agent-builder.ts: called from the runtime summary-writer phase (~line 317) and the resolved-prompt preview phase (~line 442).
- buildSummaryWriterUserPrompt(originalGoal, diffSection) in agent-builder.ts: called from the same two phases as above.
- formatCommandEntries(entries) in run-log.ts: called from recordDeterministic (~line 206) and recordFinalize (~line 244).
- resolveEnvPath(envValue, fallback, repoRoot) in run-log.ts: called from the RunLog constructor (run-log dir) and _prependChangelog (changelog path).
- TEST_GIT_IDENTITY constant + gitEnv() helper in sandbox.test.ts: used by createRepo() and commit().
- failResult(phase, message, opts?) closure in scenarios/index.ts: replaces six inline fail-shaped returns covering registry-miss, agent failure, mock-call audit, claude-spawn audit, cleanup assertion, artifact assertion, and the catch-all setup error.
feat(cli): add --thinking flag for mid verbosity level
2026-05-08 19:58 PDT · feat-small · 582.6s · $4.0524
Goal: Add a middle verbosity level called --thinking that sits between the default (terse) and --verbose (full), streaming Claude output without the boxed prompt prefaces.
Outcome: Registered a new --thinking Commander option on every agent subcommand that sets __MUNCHKINS_OPT_thinking=true. AgentBuilder.run() now reads both verbose and thinking env vars and computes a streamOutput flag (verbose || thinking) that is threaded through runAgent, runDeterministic, and the writer phase to control whether spawnClaude streams. Boxed printInvocation output, phase banners, and per-command stdout remain gated on the existing verbose flag, so --thinking only unlocks streaming. Updated --verbose help text to mention the three levels and added registry tests covering option registration and env-var wiring.
New surface:
- CLI flag: --thinking on every registered agent subcommand (registered in packages/munchkins-core/src/registry/registry.ts)
- Env var: __MUNCHKINS_OPT_thinking
- New file: packages/munchkins-core/src/registry/registry.test.ts
Lines added: +124 (across 4 files)
Files changed:
- packages/munchkins-core/src/builder/agent-builder.ts
- packages/munchkins-core/src/registry/registry.ts
- packages/munchkins-core/src/registry/registry.test.ts
- packages/munchkins-core/src/sandbox/sandbox.ts
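The three verbosity levels reduce to two env flags and one derived value. The sketch below shows that derivation under assumptions: readOutputFlags and OutputFlags are hypothetical names, the env object is passed in explicitly, and the verbose flag is assumed to use the same "true" value that this entry documents for __MUNCHKINS_OPT_thinking.

```typescript
type Env = Record<string, string | undefined>;

interface OutputFlags {
  verbose: boolean;      // gates boxed invocations, phase banners, per-command stdout
  thinking: boolean;     // the middle level
  streamOutput: boolean; // whether the spawned CLI streams its output
}

// Streaming is on for both --thinking and --verbose; everything else stays verbose-only.
function readOutputFlags(env: Env): OutputFlags {
  const verbose = env.__MUNCHKINS_OPT_verbose === "true";
  const thinking = env.__MUNCHKINS_OPT_thinking === "true";
  return { verbose, thinking, streamOutput: verbose || thinking };
}
```

With only the thinking flag set, streamOutput is true while verbose stays false, which is why --thinking streams output without the boxed prompt prefaces.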
refactor(run-log): extract _writeClaudeCall helper
2026-05-08 19:02 PDT · refactor · 267.4s · $1.6061
Goal: Deduplicate three near-identical Claude-call writers in RunLog (agentStep, summaryStep, fixerInvocation) by extracting a private helper.
Outcome: Added _writeClaudeCall(prefix, kind, systemPrompt, userPrompt, response, exitCode, durationMs, extra) that writes the *.system.md / *.user.md / *.response.txt artifacts, increments claudeCallCount, and emits the corresponding event with systemBytes / userBytes / responseBytes. Each public method now computes only its own prefix and per-kind extra fields (stepIndex, iteration) and delegates to the helper, while preserving its own counter bumps (agentStepCount, fixerInvocationCount). Behavior, file names, and event payload shapes are unchanged.
Files changed:
- packages/munchkins-core/src/run-log.ts: introduced the _writeClaudeCall helper; rewrote agentStep, summaryStep, and fixerInvocation to delegate to it; consolidated the claudeCallCount increment into the helper.
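The delegation pattern can be sketched with an in-memory stand-in for the run-log directory. This is not the code in run-log.ts: RunLogSketch, its files map, its events array, the event field layout beyond the three byte counts, and the example prefix and kind strings are all hypothetical. The parameter order follows the signature quoted in the Outcome.

```typescript
interface ClaudeCallEvent {
  kind: string;
  exitCode: number;
  durationMs: number;
  systemBytes: number;
  userBytes: number;
  responseBytes: number;
  [extra: string]: unknown;
}

// UTF-8 byte length of a string.
const byteLength = (text: string): number => new TextEncoder().encode(text).length;

class RunLogSketch {
  claudeCallCount = 0;
  agentStepCount = 0;
  readonly files = new Map<string, string>(); // stands in for files on disk
  readonly events: ClaudeCallEvent[] = [];    // stands in for events.jsonl

  // Shared writer: three artifacts, one counter bump, one event.
  private writeClaudeCall(prefix: string, kind: string, systemPrompt: string, userPrompt: string,
    response: string, exitCode: number, durationMs: number, extra: Record<string, unknown>): void {
    this.files.set(`${prefix}.system.md`, systemPrompt);
    this.files.set(`${prefix}.user.md`, userPrompt);
    this.files.set(`${prefix}.response.txt`, response);
    this.claudeCallCount += 1;
    this.events.push({ ...extra, kind, exitCode, durationMs,
      systemBytes: byteLength(systemPrompt), userBytes: byteLength(userPrompt), responseBytes: byteLength(response) });
  }

  // A public method keeps its own counter and computes only its prefix and extra fields.
  agentStep(systemPrompt: string, userPrompt: string, response: string, exitCode: number, durationMs: number): void {
    this.agentStepCount += 1;
    const stepIndex = this.agentStepCount;
    this.writeClaudeCall(`step-${stepIndex}`, "agent-step", systemPrompt, userPrompt, response, exitCode, durationMs, { stepIndex });
  }
}
```

summaryStep and fixerInvocation would follow the same pattern with their own prefix and extra fields.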