Building with AI Agents
Methodology version: 2026-04-17 — if you're using this method as a standard in your organisation, pin this version.
Quick Start
Copy-paste one of these into Claude Code or Codex. The agent reads this page and walks you through everything.
Start from scratch — the agent interviews you and creates the plan:
I want to build with AI agents just like Mikko does, read the instructions from https://mikkosniemela.com/build-with-agents
Resume as the builder — plan already exists:
I want to build with AI agents just like Mikko does, read the instructions from https://mikkosniemela.com/build-with-agents — I am the builder, my plan file is ______.md (also read handoff.md if it exists)
Resume as the auditor — plan already exists:
I want to build with AI agents just like Mikko does, read the instructions from https://mikkosniemela.com/build-with-agents — I am the auditor, my plan file is ______.md (also read handoff.md if it exists)
Handoff prompt — use with /compact, at session end (save the output as handoff.md), or as the first prompt of a new session:
Summarize this project for a fresh agent joining now. Cover: goal and current status; what's done, in progress, and next by step state; files involved and their purpose; decisions already made that should not be relitigated; open questions to the human; and any access or permission blockers. Max 3 paragraphs.
This page works two ways.
If you're a human reading this — read on. Understand the methodology, then start a conversation with your agent and build the plan together.
If you're an AI agent and your human sent you here without reading this themselves — you run the show. Here's your playbook:
- Interview your human. Ask them what they want to build, who it's for, what the end state looks like. Help them think in capability questions, not feature lists. Keep asking until you have at least 20 strong questions. Write the goal document from the conversation.
- Figure out access. Based on the goal, work out what repos, services, APIs, databases, and credentials the builder and auditor will need. Ask your human to confirm what's available and what's missing. List blockers before writing a single line of code.
- Write the plan. Create plan.md with phases, acceptance criteria, and space for auditor comments. Put the access requirements at the top.
- Set up the loop. Present the available loop options to your human (for Claude Code: the /loop command, polling hooks, or file-change hooks; for Codex: polling hooks). Explain the tradeoffs. Recommend /loop for getting started and hooks for serious multi-day work. Set up whichever they choose.
- Start the loop. Tell your human to open two terminals — one for the builder, one for the auditor. Give them the exact commands to run. Then get to work.
Either way, the rest of this page explains the methodology and has all the code you need.
1. The Idea
A single AI agent asked to build and review its own code will rationalize its own output. This is the same cognitive failure that makes developer self-review unreliable. The fix is structural: separate the builder from the reviewer.
You need two things to make this work:
- A plan file — the shared understanding of what needs to be built and where things stand.
- A comments file — a one-way channel from the reviewer to the builder.
Each agent has strict write permissions. The builder writes the plan and the code. The auditor writes comments. Neither touches the other's file. This is the entire coordination mechanism. No orchestration framework, no message queue, no shared database. Two files.
Important: treat write boundaries as policy + guardrails. Role prompts define intent, but hooks should enforce it (for example: block builder writes to comments.md, block auditor writes to plan.md and code paths).
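A minimal sketch of such a guardrail, assuming a PreToolUse-style hook that can see the tool name and the target path (the exact hook input schema varies by Claude Code version, so the stdin wire-up is shown only as comments):

```shell
#!/bin/bash
# Sketch of a builder write-guard (assumption: the hook runner delivers
# the tool name and target path; a real PreToolUse hook would parse them
# from the JSON on stdin, e.g. with jq -r '.tool_name').
guard_builder_write() {
  local tool="$1" target="$2"
  case "$tool" in
    Write|Edit|MultiEdit)
      if [ "$(basename "$target")" = "comments.md" ]; then
        echo "[guard] Builder may not write to comments.md" >&2
        return 2   # a PreToolUse hook exiting 2 blocks the tool call
      fi
      ;;
  esac
  return 0
}

guard_builder_write "Write" "/project/plan.md" && echo "plan write allowed"
```

The auditor's mirror image blocks writes to plan.md and code paths instead.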
Hooks don't replace great thinking. Hooks are for execution. Do the plan first — brainstorm with Claude or Codex about the plan. When it's ready, then let the agents do their work.
2. Two Roles, Two Files
| Role | Reads | Writes | Responsibility |
|---|---|---|---|
| Builder | Plan, code, comments | Plan, code | Executes scoped tasks from comments. Updates the plan with progress. Does not investigate or research unless explicitly instructed. |
| Auditor | Plan, code | Comments only | Reviews completed work. Investigates upcoming steps — maps risks, dependencies, and fragile points before the builder gets there. Writes concrete instructions. Escalates to human when needed. |
The builder is an execution agent, not a discovery agent. It receives scoped tasks from comments with exact file paths, commands, and acceptance criteria. It builds, updates the plan with progress, and can write questions or notes in the plan. It does not do open-ended research or architecture investigation unless the auditor explicitly asks for it.
The auditor is always ahead. While the builder works on step A, the auditor is investigating steps B and C — mapping which repos and services are affected, identifying fragile integration points, listing what could break, and figuring out rollback options. By the time the builder finishes A, the auditor has already translated B into concrete, deterministic instructions in the comments. The builder never walks into unknown territory.
The auditor escalates to the human supervisor when it encounters:
- Business decisions — anything where the right answer depends on business context the auditor can't know.
- High-impact forks — decisions that change the shape of future steps, where getting it wrong is expensive to undo.
- Access and permissions — anything the agents can't resolve themselves.
- Cross-team coordination — anything that involves people or systems outside the project.
The human decides, the auditor translates the decision into builder instructions.
The auditor also tells the builder how to test and what evidence to produce. The builder executes the tests and reports results in the plan.
Why this split makes you faster
The role separation isn't just about integrity — it's about using the right model for the right job. The auditor does the thinking: investigation, risk analysis, dependency mapping, sequencing. That's work for a capable, thorough model (Claude Opus 4.7 or GPT-5.4). The builder does the building: scoped tasks with clear instructions. That's work for a fast, precise model (Claude Sonnet 4.6, Haiku 4.5, or a Codex fast tier) that doesn't need to reason about architecture because the auditor already did.
The auditor investigates the next steps, does the thinking, and makes sure the builder just has to build — no research required. This means the builder runs faster, uses fewer tokens, and produces more predictable output. The thinking happened upstream.
Go further: pair across model families
The strongest version of this pattern runs the builder on one model family and the auditor on another — Claude Opus 4.7 as auditor, GPT-5.4 as builder, or the reverse. Two different families reviewing each other can't share the same training blind spots or the same failure modes. The builder and auditor are structurally independent, not just procedurally separated. If one family rationalises a bad pattern, the other family has no reason to.
A good starting lineup today. Auditor: Claude Opus 4.7 (1M context, heavy reasoning) or GPT-5.4. Builder: Claude Sonnet 4.6, Haiku 4.5, or Codex fast tier. Pair across families when you can. Refresh this choice whenever a new flagship ships — the rest of this primer stays the same.
Who does the thinking: follow the smarter model
The default split — auditor thinks, builder executes — assumes the auditor is the smarter model. If it isn't, flip the thinking to where the capability lives.
Fast builder, heavy auditor. Sonnet 4.6 or Haiku 4.5 as builder, Opus 4.7 or GPT-5.4 as auditor. The auditor does all investigation, risk analysis, and sequencing. It translates each upcoming step into precise, executable instructions in comments.md with exact file paths, commands, and acceptance criteria. The builder just executes. This is the fastest path for well-understood work.
Auditor prompt for this split: You are the auditor and the thinker. The builder is a fast execution model — it does not investigate, research, or design. Do all investigation, risk analysis, dependency mapping, and rollback planning yourself. Before the builder reaches a step, translate it into precise, executable instructions in comments.md: exact file paths, commands, acceptance criteria, and the evidence you require. The builder should never need to reason about architecture or strategy — only execute.
Heavy builder, pragmatic auditor. Opus 4.7 as builder. Don't waste it on execution alone. The auditor delegates investigation to the builder between steps — "before you start step N, investigate X and report observations in plan.md" — and the builder returns risks, alternatives, and commentary there. The auditor reads those observations alongside the code; the human makes higher-impact decisions with both perspectives in view.
Auditor prompt for this split: You are the auditor working with a heavy-reasoning builder. When you need investigation or research for upcoming steps, delegate it via comments.md: ask the builder to investigate specific open questions between build steps and report observations, risks, and commentary in plan.md. Read the builder's observations alongside the code when reviewing. For high-impact forks, escalate to the human with both your assessment and the builder's observations attached.
The rule. plan.md is the builder's voice. comments.md is the auditor's voice. If the builder has something to say beyond "step N finished" — observations, commentary, risks, alternatives — that belongs in plan.md. The plan file is a two-way analysis channel, not just a progress log.
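As a concrete illustration (file paths and step content are invented), a comments.md entry and the builder's reply in plan.md might look like:

```markdown
<!-- comments.md (auditor's voice) -->
## Step 4: add retry logic to the sync job
- Edit src/sync/runner.py: wrap the fetch call in a retry with backoff.
- Acceptance: 3 attempts, exponential backoff, logged error on final failure.
- Evidence required: test output plus a log excerpt showing one retried failure.

<!-- plan.md (builder's voice) -->
## Step 4: done, evidence attached
Tests pass (output below). Note: the same fetch call is used by the
nightly export, so I applied the retry there too. Flagging this in case
it was out of scope.
```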
This is also a security pattern
This structure addresses three of the most dangerous failure modes in autonomous AI systems: excessive agency (one agent with unchecked write access everywhere), cascading failures (an agent whose output feeds its own next step with no independent check), and accountability gaps (you can't tell which agent decided what went wrong). File-level write boundaries, independent review, and evidence gating before step completion are the same controls you'd apply to any high-stakes autonomous system. See the agent scenario in the AI Security Primer for the full threat model.
3. Plan First, Execute Second
The most important part of this workflow happens before any agent starts building. You create the goal document — but you don't have to write it alone. The agent interviews you.
You sit down with Claude or Codex and talk about what you want to build. The agent asks questions, challenges your assumptions, and helps you think deeper. Together you produce a goal document — a vision written from the future, describing what life looks like after the software is delivered. No implementation details. No stack decisions. Just: what can a user do, and what questions can the system answer?
Write questions, not requirements
"Which contract renewals are coming in the next 6 months, and which are at risk?" implicitly demands a far richer system than "The system shall have a contract renewal dashboard with alerting."
A question tells the builder what the user needs to know and leaves the implementation open. Questions also set the quality bar implicitly: if the system cannot answer the question, it has failed. No interpretation required.
Write at least 20 questions. The depth of your questions drives the depth of the software. If a question only requires one data source to answer, it's too shallow. The best questions require three or more.
The process
- Interview — The agent interviews you. What are you building? Who is it for? What does the user's day look like after this exists? The agent pushes you to think in capability questions (at least 20) and writes the goal document from the conversation.
- Access check — The agent figures out what repos, services, APIs, databases, and credentials the builder and auditor will need. It asks you what's available and flags anything missing as a blocker before any code is written.
- Plan file — The agent reads the goal and writes a structured plan with phases, acceptance criteria, access requirements at the top, and space for auditor comments.
- Hook setup — The agent creates the hook scripts and config files in your project directory.
- Start the loop — Two terminals. Builder builds, auditor reviews. Both loop on a regular interval or react to file changes. You intervene only when the auditor escalates.
Why access goes first
An agent that hits a permissions wall mid-build will either stall silently or invent a workaround you didn't ask for. Both are expensive. The agent should figure out what access is required and ask you about it before writing a single line of code.
- Repos and branches — which repositories, which branches, does the builder need to create new ones?
- Services and APIs — does the builder need running services, API keys, database access, deployment credentials?
- External tools — does the auditor need browser access for end-to-end testing? Does the builder need package registries, CI/CD pipelines?
- Permissions mode — is the builder running with --dangerously-skip-permissions, or will it need approval for each shell command?
If anything is missing, it goes in the plan as a blocker at step zero. Don't discover it at step five.
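For instance, a plan.md that follows this rule might open like the sketch below (all names and items are placeholders):

```markdown
# Plan: <project name>

## Access requirements (step zero)
- [ ] BLOCKER: write access to the deployment repo (not yet granted)
- [x] API key for the staging environment

## Phase 1: <phase name>
### Step 1: <scoped task>
- Acceptance criteria: ...
- Status: ...
```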
The plan is the memory
A 1M-context model like Claude Opus 4.7 can hold the entire goal document, plan history, and comments file in working context for multi-week projects. No compaction, no summarisation, no lost decisions. The plan file stops being just a coordination artefact and becomes a shared, durable memory that survives across sessions, context resets, and agent handoffs.
4. The Loop
Both agents run concurrently, but they look in different directions. The builder looks at the current step and executes it. The auditor looks backward at the builder's work and forward at what's coming next. Every time an agent finishes its current work, the loop brings it back to the shared files to check for updates.
The loop is the mechanism that brings both agents back to the shared files. It can run on an interval — checking every few minutes whether the other role has written something new — or it can react to file changes, firing only when the plan or comments file is actually modified.
The implementation depends on your tool. Claude Code offers three approaches (interval command, polling hooks, or file-change hooks). Codex uses polling hooks. The sections below have the code for each.
No orchestration framework needed. The shared files are the coordination layer. The loop just makes sure both agents keep coming back to them.
Step states
Each step in the plan moves through these states. A step is not done just because the code compiles and tests pass — the auditor must accept the evidence.
| State | Who acts | What happens |
|---|---|---|
| planned | Auditor | Step is scoped with instructions, file paths, acceptance criteria, and evidence requirements. |
| in_progress | Builder | Builder is executing the step. |
| evidence_pending | Builder | Code is done. Builder provides evidence: test results, runtime output, negative-path checks, screenshots — whatever the auditor asked for. |
| auditor_accepted | Auditor | Auditor reviews the evidence, confirms it meets acceptance criteria and proves real-world behavior. Next step is unlocked. |
This prevents the common failure where things move fast on "PASS" without proving actual behavior. The auditor defines what evidence is required before the builder starts, and the builder cannot move on until the auditor accepts it.
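As a sketch (step content and evidence are invented), a plan.md step entry carrying its state might look like:

```markdown
### Step 3: wire login rate limiting
state: evidence_pending
evidence:
  - `pytest tests/test_ratelimit.py` output attached in plan appendix
  - curl transcript showing a 429 on the 6th attempt within 60s
auditor: (pending review)
```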
Handoffs and compaction
Even with 1M context, sessions end — terminals close, roles switch, a human takes over. You need one handoff artefact that produces the same shape of summary whether you are compacting mid-session or starting fresh.
Write a single handoff prompt and use it three ways: as the instruction for your tool's context-compaction command (in Claude Code: /compact <instruction>), as the first prompt of every new session, and as the last prompt before you close a terminal — save the output next to plan.md as handoff.md so the next agent can read it on cold start. One prompt, three uses. The next agent always lands in the same informational posture.
Here's a good default. Copy it, adapt it if your project needs more, and keep it stable for the life of the project — consistent wording is what keeps every summary comparable.
Summarize this project for a fresh agent joining now. Cover: goal and current status; what's done, in progress, and next by step state; files involved and their purpose; decisions already made that should not be relitigated; open questions to the human; and any access or permission blockers. Max 3 paragraphs.
5. The Goal Document
The goal document is the highest-leverage investment in the entire process. For a significant project, the interview that produces it might take hours spread over a day or two. It is worth every minute.
The agent interviews you and writes the goal document from the conversation. You don't need to be a writer. You need to know what you want. The agent's job is to pull that out of you and structure it.
Structure
- Walkthrough — A step-by-step narrative of the user's experience. Written as if the product already exists and is working. The agent drafts this from what you describe.
- Capability questions — At least 20 questions the finished software will answer. These are the specification. The agent helps you go deeper — the first 10 are easy, the next 10 are where the real value lives.
- Scope boundaries — Explicit list of out-of-scope capabilities. Without hard boundaries, an autonomous agent will keep expanding scope. The agent should ask you: "What should this NOT do?"
Good vs. weak questions
Weak: "Can I see a list of my customers?" — produces a database and a list view.
Strong: "Which of my customers are close to their license capacity, have a renewal coming in the next 90 days, and have had no contact from our team in the last 6 weeks?" — requires usage data, contract data, activity tracking, time-based filtering, cross-referencing, and risk surfacing. It will produce all of those things because it has to.
The depth of questions drives the depth of the software. A useful self-check: read each question and count how many independent data sources, processes, or judgements are needed to answer it. If the answer is one, the question is too shallow.
6. Code: Claude Code
Claude Code offers three ways to run the loop. Choose based on your project's needs.
Quick start: /loop command
The fastest way to get started. No scripts, no config files. Good for short projects and getting the feel of the workflow. Session-scoped — if the terminal closes, the loop stops. No built-in file-write enforcement — relies on role prompts.
Builder:

/loop 5m Read comments.md. If there are new instructions from the auditor, execute them and update plan.md. If nothing new, report status briefly and wait.
Auditor:

/loop 5m Read plan.md and review any code changes since your last check. If the builder completed work, review it and write feedback to comments.md. If nothing changed, wait.
That's it. The agents loop on a 5-minute interval and coordinate through the files. For persistent loops with file-write enforcement, use the hook approach below.
Controlled path: Stop hook + polling
The serious path for multi-day projects. Shell scripts fire on the Stop lifecycle event, enforce role boundaries, check for changes, and sleep 5 minutes between checks.
Directory structure
your-project/
.claude/
settings.local.json # hook configuration
hooks/
build-a-plan.sh # builder hook
audit-a-plan.sh # auditor hook
plan.md # the plan (builder writes, auditor reads)
comments.md # comments (auditor writes, builder reads)
GOAL.md # your goal document
settings.local.json
This file configures which hooks fire and when. Use settings.local.json (gitignored) so each developer can run their own role without conflicts. The timeout is set to 420 seconds (7 minutes) to allow the 5-minute sleep between checks.
Builder:
{
"hooks": {
"Stop": [
{
"matcher": "",
"hooks": [
{
"type": "command",
"command": ".claude/hooks/build-a-plan.sh",
"timeout": 420
}
]
}
]
}
}
Auditor:
{
"hooks": {
"Stop": [
{
"matcher": "",
"hooks": [
{
"type": "command",
"command": ".claude/hooks/audit-a-plan.sh",
"timeout": 420
}
]
}
]
}
}
build-a-plan.sh (Builder hook)
#!/bin/bash
# Build-a-Plan hook — loops on Stop event
# First check is immediate. Subsequent checks wait 5 minutes.
INPUT=$(cat)
SESSION_ID=$(echo "$INPUT" | jq -r '.session_id // "default"' 2>/dev/null)
STOP_ACTIVE=$(echo "$INPUT" | jq -r '.stop_hook_active // false' 2>/dev/null)
MARKER="/tmp/build-a-plan-${SESSION_ID}"
LOOP_GUARD="/tmp/build-a-plan-loops-${SESSION_ID}"
if [ "$STOP_ACTIVE" = "true" ]; then
exit 0
fi
LOOPS=$(cat "$LOOP_GUARD" 2>/dev/null || echo 0)
LOOPS=$((LOOPS + 1))
echo "$LOOPS" > "$LOOP_GUARD"
if [ "$LOOPS" -gt 200 ]; then
echo "[Build-a-Plan] Loop guard reached 200 stop-hook cycles; allowing stop." >&2
rm -f "$LOOP_GUARD" "$MARKER"
exit 0
fi
if [ -f "$MARKER" ]; then
sleep 300
fi
touch "$MARKER"
cat >&2 <<'MSG'
[Build-a-Plan] You are the builder. Do NOT write to the comments
file — that belongs to the auditor. You can write to the plan file
(updates, questions, notes) and to code. Check:
1. Is the plan file up to date? Update it if needed. You can also
add questions or notes in the plan for the auditor.
2. Are there new comments in the comments file? If yes, take action
based on them in the plan and code. If no new comments, no
worries — you'll check again in 5 minutes.
MSG
exit 2
audit-a-plan.sh (Auditor hook)
#!/bin/bash
# Audit-a-Plan hook — loops on Stop event
# First check is immediate. Subsequent checks wait 5 minutes.
INPUT=$(cat)
SESSION_ID=$(echo "$INPUT" | jq -r '.session_id // "default"' 2>/dev/null)
STOP_ACTIVE=$(echo "$INPUT" | jq -r '.stop_hook_active // false' 2>/dev/null)
MARKER="/tmp/audit-a-plan-${SESSION_ID}"
LOOP_GUARD="/tmp/audit-a-plan-loops-${SESSION_ID}"
if [ "$STOP_ACTIVE" = "true" ]; then
exit 0
fi
LOOPS=$(cat "$LOOP_GUARD" 2>/dev/null || echo 0)
LOOPS=$((LOOPS + 1))
echo "$LOOPS" > "$LOOP_GUARD"
if [ "$LOOPS" -gt 200 ]; then
echo "[Audit-a-Plan] Loop guard reached 200 stop-hook cycles; allowing stop." >&2
rm -f "$LOOP_GUARD" "$MARKER"
exit 0
fi
if [ -f "$MARKER" ]; then
sleep 300
fi
touch "$MARKER"
cat >&2 <<'MSG'
[Audit-a-Plan] You are the auditor. Your ONLY writable file is the
comments file. Do NOT write to the plan file, code, or anything else.
1. Check if the plan file or the code has changed since you last
looked. If work is still in progress and it's not the right time
to audit yet, no worries — you'll check again in 5 minutes.
2. If things look stable, review the plan and the relevant code,
then write your feedback in the comments file. Be instructive —
tell the builder clearly what to do.
3. If things are stalled (no progress from the builder, or a
blocker), decide: should you both wait, or does something need
to happen? If the decision requires human supervisor approval,
say so in the comments and wait — do not proceed without it.
MSG
exit 2
Make both scripts executable: chmod +x .claude/hooks/*.sh
Loop safety: keep the stop guard in both scripts (the stop_hook_active check + max-cycle counter) so the Stop hook cannot run forever.
5 minutes is a good default. For fast-moving work, try 2 minutes. For overnight or low-priority runs, 15 minutes saves tokens. Adjust the sleep value in the hook scripts to match your project's pace.
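A small, optional tweak to the scripts above: read the interval from an environment variable instead of hard-coding it. The variable name LOOP_INTERVAL_SECONDS is our own convention, not a Claude Code feature.

```shell
#!/bin/bash
# Configurable polling interval for the hook scripts (sketch).
# Falls back to 300 seconds (5 minutes) when the variable is unset.
LOOP_INTERVAL="${LOOP_INTERVAL_SECONDS:-300}"
echo "polling every ${LOOP_INTERVAL}s"
# in the hooks, replace `sleep 300` with:  sleep "$LOOP_INTERVAL"
```

Export it before launching each session (for example, LOOP_INTERVAL_SECONDS=120 for fast-moving work) and the hook processes inherit it.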
Advanced: FileChanged hook (event-driven)
This is an event-driven alternative that reduces wasted polling when the shared files change frequently. It is not the strict controlled default path for role enforcement — use the Stop hook approach above if you need the agent to enforce file-write boundaries on every cycle.
Instead of polling on a timer, react instantly when the file changes:
Builder:
{
"hooks": {
"FileChanged": [
{
"matcher": "comments.md",
"hooks": [
{
"type": "command",
"command": "echo '[Build-a-Plan] Comments file changed. Read comments.md and take action on new instructions.'"
}
]
}
]
}
}
Auditor:
{
"hooks": {
"FileChanged": [
{
"matcher": "plan.md",
"hooks": [
{
"type": "command",
"command": "echo '[Audit-a-Plan] Plan file changed. Read plan.md and review the builder progress.'"
}
]
}
]
}
}
Combine with the Stop hook for the initial check when the agent starts. FileChanged fires only on subsequent modifications.
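Merged into a single builder-side settings.local.json, the two fragments look like this (one hooks object, both event types):

```json
{
  "hooks": {
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          { "type": "command", "command": ".claude/hooks/build-a-plan.sh", "timeout": 420 }
        ]
      }
    ],
    "FileChanged": [
      {
        "matcher": "comments.md",
        "hooks": [
          { "type": "command", "command": "echo '[Build-a-Plan] Comments file changed. Read comments.md and take action on new instructions.'" }
        ]
      }
    ]
  }
}
```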
7. Code: OpenAI Codex
Codex CLI supports hooks and persistent instructions through AGENTS.md. The approach is the same: two roles, two files, a polling loop. The configuration is slightly different.
Important: Codex hooks are off by default. Enable them first:
[features]
codex_hooks = true
Directory structure
Run builder and auditor in separate worktrees or clones of the same repo — not two sessions in the same directory. Each gets its own AGENTS.md with the correct role. They share state through the plan and comments files via git.
your-project/
.codex/
hooks.json # hook configuration
hooks/
build-a-plan.sh # or audit-a-plan.sh
AGENTS.md # role instructions for THIS instance
plan.md
comments.md
GOAL.md
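The setup can be sketched as follows. The demo builds a throwaway repo so the commands run as-is; in a real project you would run only the two git worktree add lines from your repo root, with paths and branch names of your choosing:

```shell
#!/bin/bash
# Sketch: one worktree per role, each with its own AGENTS.md.
set -e
DEMO=$(mktemp -d)
git init -q "$DEMO/project"
cd "$DEMO/project"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

# One checkout per role, on separate branches:
git worktree add ../project-builder -b builder
git worktree add ../project-auditor -b auditor

# Each role gets its own role file; plan.md and comments.md travel
# between the worktrees via ordinary commits, pushes, and pulls.
echo "# Builder Role" > ../project-builder/AGENTS.md
echo "# Auditor Role" > ../project-auditor/AGENTS.md
```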
AGENTS.md — Builder worktree
Codex uses AGENTS.md (equivalent to Claude Code's CLAUDE.md) for persistent role instructions. Each worktree gets its own AGENTS.md matching its role.
# Builder Role

You are the builder. Your job is to build according to the plan.

## Write permissions
- plan.md — update progress, add questions and notes
- All source code files

## Read permissions
- comments.md — the auditor writes feedback here
- GOAL.md — the original goal document

## Rules
- NEVER write to comments.md — that belongs to the auditor
- Update the plan after completing each task
- Check comments.md before starting new work
- Run tests when the auditor specifies how to test
AGENTS.md — Auditor worktree
# Auditor Role

You are the auditor. You review the builder's work.

## Write permissions
- comments.md — this is your ONLY writable file

## Read permissions
- plan.md — track builder progress and status
- All source code files — review implementation quality
- GOAL.md — compare against the original goal

## Rules
- NEVER write to plan.md, code, or any other file
- Be instructive — tell the builder clearly what to do
- Include testing instructions in your comments
- If things stall, decide: wait, instruct, or escalate
- If human approval is needed, say so and wait
hooks.json
Codex hooks use the same shell scripts. The JSON structure mirrors the Claude Code configuration, just without the matcher field.
Builder worktree:
{
"hooks": {
"Stop": [
{
"hooks": [
{
"type": "command",
"command": ".codex/hooks/build-a-plan.sh",
"timeout": 420
}
]
}
]
}
}
Auditor worktree:
{
"hooks": {
"Stop": [
{
"hooks": [
{
"type": "command",
"command": ".codex/hooks/audit-a-plan.sh",
"timeout": 420
}
]
}
]
}
}
The hook scripts are identical to the Claude Code versions — they're standard bash. Just copy them into .codex/hooks/ and make them executable.
8. Running It
If you used the one-liner from the top of this page, your agent already knows all of this. It will walk you through the interview, create the files, set up the hooks, and tell you what to run. You just answer its questions.
If you're setting things up manually, here's the sequence:
Step by step
- Start a conversation with Claude or Codex. Tell it what you want to build. Let it interview you and create the goal document.
- Confirm access. The agent will ask you what repos, services, and credentials are available. Answer honestly — missing access discovered later is expensive.
- Review the plan. The agent writes plan.md. Read it. Push back on anything that doesn't match your vision. This is your last easy chance to course-correct.
- The agent sets up the loop. For Claude Code, it either starts a /loop command (quick path) or creates the hook scripts and config files (controlled path). For Codex, it creates the hook scripts. The code examples are in Sections 6 and 7.
- Open two terminals:
If using /loop (Claude Code only):
Builder terminal:

claude
> /loop 5m Read comments.md. If there are new instructions from the auditor, execute them and update plan.md. If nothing new, report status briefly and wait.
Auditor terminal:

claude
> /loop 5m Read plan.md and review any code changes since your last check. If the builder completed work, review it and write feedback to comments.md. If nothing changed, wait.
If using hooks (Claude Code or Codex):
Builder terminal:

# Claude Code
claude "Read GOAL.md and plan.md. Start building."

# Codex
codex "Read GOAL.md and plan.md. Start building."
Auditor terminal:

# Claude Code
claude "Read GOAL.md and plan.md. You are the auditor. Begin reviewing."

# Codex
codex "Read GOAL.md and plan.md. You are the auditor. Begin reviewing."
Both agents will work, finish, get intercepted by the hook, check the files, and loop. The builder waits for comments. The auditor waits for progress. They coordinate through the files. You intervene only when the auditor escalates.
Tips
- --dangerously-skip-permissions makes the builder fully autonomous (Claude Code). It's powerful but means no confirmation before destructive commands. Use it — but control access at the OS user level so the agent can't touch things it shouldn't.
- Run the auditor in a read-heavy mode — it mostly reads and only writes to one file.
- If you're running both on the same machine, they'll share the filesystem naturally. If on different machines, use a shared git repo and have both agents pull/push.
- The 5-minute interval is a starting point. For fast-moving work, try 2 minutes. For overnight runs, 15 minutes saves tokens. If using /loop, change the interval directly in the command. If using hooks, adjust the sleep value in the scripts.
- For large projects, you can run multiple builders on independent modules. Each gets its own plan section.
- If you use git: the builder commits after each completed task with a short message prefixed builder:. Put detailed reasoning in the plan file, not the commit message. The auditor never commits — it only writes to the comments file, which the builder commits as part of its normal flow.
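The git convention from the last tip, sketched in a throwaway repo (the message text is invented):

```shell
#!/bin/bash
# Sketch of the builder's commit habit: short "builder:" messages,
# with the detailed reasoning living in plan.md, not the commit.
set -e
WORK=$(mktemp -d)
cd "$WORK"
git init -q .
echo "step 3 done: see plan for reasoning" >> plan.md
git add plan.md
git -c user.name=builder -c user.email=builder@example.com \
    commit -q -m "builder: step 3 complete"
git log --oneline -1
```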
Credits
The core insight: structure replaces supervision. Invest in the plan, enforce role separation through file permissions, and let the agents loop.
Dr. Mikko S. Niemelä — 2026
Last updated: April 17, 2026