BMB — Be My Butler | Multi-Agent Orchestration for Claude Code

01 — Foundation

Core Design Principles

Every architectural decision traces back to these non-negotiable principles.

Worktree Isolation

Every write-capable parallel agent gets its own git worktree. No shared state. No index.lock conflicts. True filesystem-level isolation.

Blind Divergent Framing

Cross-model tracks read different context documents. Not just blind on results — divergent on problem framing itself.

Lead as Thin Orchestrator

Lead reads only .bmb/ and CLAUDE.md. Never touches code. Context protection is paramount — Lead is the bottleneck.

Bidirectional Consultant

SendMessage enables Consultant → Lead feedback. Business rules from user conversations actually reach the pipeline.

Auto-Learning

Mistakes, corrections, and successes are automatically recorded. Past pitfalls are injected into future sessions. 3-tier: local → global → CLAUDE.md.

Session Continuity

session-prep.md captures state for next session. Cross-session context preserved. No work is ever truly lost.

3-Layer Compression

Read-time summaries, write-time caching, and reference-time FTS5 search. Lead's context usage stays below 50% through the full pipeline.

Profile-Based Permissions

Cross-model invocations use role-specific profiles: council and verify are read-only. No more --full-auto everywhere.

Graceful Degradation

Cross-model unavailable? Pipeline continues Claude-only. Simplifier breaks tests? Revert and proceed. Never blocks.

02 — Team

Agent Roster

10 specialized agents, each with a clear role and strict boundaries. From brainstorming to production-ready code.

graph TB
  User["👤 User"]
  Lead["🎯 Lead<br/>Opus · Orchestrator"]
  Consultant["💬 Consultant<br/>Sonnet · Persistent"]
  Architect["📐 Architect<br/>Opus · Council"]
  Executor["⚙️ Executor<br/>Opus · Backend"]
  Frontend["🎨 Frontend<br/>Opus · UI"]
  Tester["🧪 Tester<br/>Opus · Blind"]
  Verifier["✅ Verifier<br/>Opus · Blind"]
  Simplifier["🧹 Simplifier<br/>Opus · Cleanup"]
  Writer["📝 Writer<br/>Sonnet · Docs"]
  BMB[".bmb/<br/>Handoffs · Artifacts"]
  CrossModel["🌐 Cross-Model<br/>Codex / Gemini"]

  User <-->|"brainstorm<br/>approve"| Lead
  Lead <-->|"SendMessage<br/>bidirectional"| Consultant
  Lead -->|"briefing"| Architect
  Architect <-->|"council<br/>debate"| CrossModel
  Lead -->|"plan-to-exec"| Executor
  Lead -->|"plan-to-exec"| Frontend
  Lead -->|"test request"| Tester
  Tester -.->|"blind wall"| CrossModel
  Lead -->|"verify request"| Verifier
  Verifier -.->|"blind wall"| CrossModel
  Lead -->|"simplify"| Simplifier
  Lead -->|"write docs"| Writer
  Architect -->|"writes"| BMB
  Executor -->|"writes"| BMB
  Frontend -->|"writes"| BMB
  Tester -->|"writes"| BMB
  Verifier -->|"writes"| BMB
  Lead -->|"reads only"| BMB

  classDef opus fill:#1a1030,stroke:#7c3aed,stroke-width:2px,color:#a78bfa
  classDef sonnet fill:#0a2015,stroke:#16a34a,stroke-width:2px,color:#22c55e
  classDef cross fill:#1a1500,stroke:#d97706,stroke-width:2px,color:#f59e0b
  classDef user fill:#0a1628,stroke:#3b82f6,stroke-width:2px,color:#60a5fa
  classDef storage fill:#111827,stroke:#1e3a5f,stroke-width:2px,color:#8494a7

  class Lead,Architect,Executor,Frontend,Tester,Verifier,Simplifier opus
  class Consultant,Writer sonnet
  class CrossModel cross
  class User user
  class BMB storage

Lead

Claude Opus • Full Pipeline

Orchestration, decisions, relay, brainstorming. Reads only .bmb/ and CLAUDE.md. Never writes code. The thin conductor.

orchestrator · brainstorms in-process

Consultant

Claude Sonnet • Steps 2–11

Coordinator identity: full situational awareness, zero command authority. Dual-channel (feed file + SendMessage). Receives lifecycle events during blind phase; gets full post-briefing after reconciliation.

dual-channel · persistent · post-briefing

Architect

Claude Opus • Step 4

Design + cross-model council debate. 2–4 rounds of Claude vs Codex/Gemini. Writes plan-to-exec.md. Queries Context7 for live library docs before designing.

council debate · cross-model · Context7

Executor

Claude Opus • Step 5

Backend implementation. Works in isolated git worktree. Queries Context7 for current library docs before writing. Commits only within worktree scope.

worktree-isolated · Context7

Frontend

Claude Opus • Step 5

React/Next.js + shadcn/Tailwind specialist. Separate worktree from Executor. Queries Context7 for current framework docs. Spawned only if frontend scope detected.

worktree-isolated · conditional · Context7

Tester

Claude Opus • Step 6

Unit, integration, edge-case tests. Part of blind cross-model testing with divergent framing.

blind · worktree-isolated

Verifier

Claude Opus • Step 7

Evidence-based verification + code review in one agent. Blind cross-model verification.

blind · review + verify

Simplifier

Claude Opus • Step 9

Post-work code cleanup. Must re-verify (build + tests) after changes. Failure triggers auto-revert.

re-verify

Writer

Claude Sonnet • Step 10

Documentation update + cross-validation. Sonnet is sufficient for docs.

Analyst

Claude Sonnet • Step 10.5

Retrospective analysis: queries analytics.db, classifies events by Bird’s Law severity (critical/warn/info), surfaces pattern_counts promotion candidates.

bypassPermissions · read-only

03 — Flow

The 11.5-Step Pipeline

From user intent to production-ready code.

1

Setup

Lead

tmux guard, session ID, directory structure, source bmb-learn.sh, load past MISTAKE entries (local + global), config, session-prep check, conversation logger start.

auto-learning · session continuity

2

Brainstorm + Consultant

Lead · Consultant

Lead brainstorms directly with user (no separate brainstormer agent). Spawn persistent Consultant pane with bidirectional SendMessage. Minimum 2 rounds. Write briefing with Known Pitfalls section.

in-process · bidirectional

3

User Approval

Lead

Present compressed briefing. YES → bmb_learn PRAISE. MODIFY → bmb_learn CORRECTION + update. NO → cancel.

auto-learning · 3-way branch

4

Architecture Council

Architect · Cross-Model

Create git worktrees for execution. Spawn Architect for Claude-Codex/Gemini council debate (2–4 rounds). Write plan-to-exec.md. Skip for bugfix/infra recipes.

council debate · cross-model · skip: bugfix, infra

5

Execution

Executor · Frontend

Spawn Executor + Frontend (conditional) in separate worktrees. Parallel execution with zero git conflicts. Frontend only if scope detected (React, Vue, Svelte, etc.).

worktree-isolated · parallel · conditional frontend

5.5

Merge Worktrees

Lead

Commit in worktree → merge to main → remove worktree. Conflict? bmb_learn MISTAKE + escalate to user.

auto-learning

6

Cross-Model Testing (Blind)

Tester · Cross-Model

Claude Tester reads plan-to-exec.md. Cross-Model Tester reads briefing.md. Different framing, separate worktrees, separate timeouts. Neither reads the other's results.

blind wall · divergent framing · worktree-isolated
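
The blind wall in this step amounts to a per-track read policy. The sketch below is a hypothetical Python illustration, not BMB's own code: the file paths follow the .bmb/ directory layout documented on this page, while `TRACKS`, `may_read`, and the enforcement logic itself are assumptions made for the example.

```python
# Hypothetical read policy for the two Step 6 tester tracks.
# Paths mirror the documented .bmb/ layout; the policy code is an assumption.
TRACKS = {
    "tester-claude": {
        "context": ".bmb/handoffs/plan-to-exec.md",  # Claude track framing
        "worktree": ".bmb/worktrees/tester-claude",
        "results": ".bmb/handoffs/test-claude.md",
    },
    "tester-cross": {
        "context": ".bmb/handoffs/briefing.md",  # cross-model track framing
        "worktree": ".bmb/worktrees/tester-cross",
        "results": ".bmb/handoffs/test-cross.md",
    },
}


def may_read(track: str, path: str) -> bool:
    """Allow a track to read only its own context doc, results, and worktree."""
    own = TRACKS[track]
    if path in (own["context"], own["results"]):
        return True
    return path.startswith(own["worktree"] + "/")
```

Under this policy each tester sees a different framing document, and a request for the other track's results file is refused until reconciliation.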

7

Cross-Model Verification (Blind)

Verifier · Cross-Model

Same blind divergent pattern as Step 6. Consultant isolation: no results shared until reconciliation.

blind wall · divergent framing

8

Reconciliation

Lead

Read structured summaries. 5-category failure classifier: IMPL→Step 5, ARCH→Step 4, REQ→Step 2, ENV→Step 1, TEST→Step 6. FAIL triggers bmb_learn MISTAKE + classified loop-back.

auto-learning · failure classification
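
The classified loop-back can be pictured as a lookup table. This is a minimal Python sketch rather than BMB's actual implementation: only the five category names and their target steps come from the description above, and `LOOP_BACK_STEP`, `next_step`, and the error handling are hypothetical.

```python
# Hypothetical sketch of the Step 8 routing.
# Categories and target steps are quoted from the step description;
# the names and the unknown-category behavior are assumptions.
LOOP_BACK_STEP = {
    "IMPL": 5,  # back to Step 5 (Execution)
    "ARCH": 4,  # back to Step 4 (Architecture Council)
    "REQ": 2,   # back to Step 2 (Brainstorm + Consultant)
    "ENV": 1,   # back to Step 1 (Setup)
    "TEST": 6,  # back to Step 6 (Cross-Model Testing)
}


def next_step(verdict: str) -> int:
    """Return the step the pipeline runs after reconciliation."""
    if verdict == "PASS":
        return 9  # continue to Simplification + Re-verify
    try:
        return LOOP_BACK_STEP[verdict]
    except KeyError:
        raise ValueError(f"unknown failure category: {verdict!r}") from None
```

For example, `next_step("ARCH")` sends the run back to the council, while `next_step("PASS")` moves on to Step 9.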

9

Simplification + Re-verify

Simplifier

Minimal safe improvements. Build + tests must pass (re-verification). Failure → bmb_learn MISTAKE + revert + proceed with original.

re-verify · auto-learning
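
The re-verify-or-revert rule can be sketched as a small guard function. The Python below is illustrative only: the four callables are hypothetical stand-ins (for applying the cleanup, running build + tests, restoring the original code, and calling bmb_learn), and how BMB actually wires them together is not shown on this page.

```python
from typing import Callable


def simplify_with_reverify(
    apply_changes: Callable[[], None],
    verify: Callable[[], bool],
    revert: Callable[[], None],
    learn: Callable[[str, str], None],
) -> bool:
    """Return True if the simplified code is kept, False if it was reverted."""
    apply_changes()
    if verify():  # build + tests passed, keep the cleanup
        return True
    # Record the failure, then restore the original and carry on with it.
    learn("MISTAKE", "Simplifier change failed re-verification")
    revert()
    return False
```

Either way the function returns instead of raising, which matches the rule that a failed simplification never blocks the pipeline.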

10

Docs Update

Writer

Writer updates documentation, removes dead references, cross-validates consistency across all modified files.

10.5

Retrospective Analysis

Analyst

Analyst queries analytics.db for the current session. Classifies events by Bird’s Law severity (critical / warn / info). Cross-references pattern_counts to find recurring failures (≥2 occurrences) eligible for CLAUDE.md promotion. Writes analyst-report.md (3–5 min timeout; pipeline continues on timeout).

Bird’s Law · pattern_counts · read-only agent

11

Cleanup + Session Prep

Lead

bmb_learn PRAISE on success. Check recurrence ≥2 → propose CLAUDE.md promotion. Git commit/push. FTS5 indexing. Generate session-prep.md for next session.

auto-learning · session continuity · CLAUDE.md promotion

Pipeline Flow Diagram

flowchart TD
  S1["1. Setup<br/>tmux, session, learnings"]
  S2["2. Brainstorm + Consultant<br/>min 2 rounds"]
  S3{"3. User Approval"}
  S4["4. Architecture Council<br/>2-4 debate rounds"]
  S5["5. Execution<br/>worktree-isolated"]
  S55["5.5 Merge Worktrees"]
  S6["6. Blind Testing<br/>divergent framing"]
  S7["7. Blind Verification<br/>divergent framing"]
  S8{"8. Reconciliation"}
  S9["9. Simplify + Re-verify"]
  S10["10. Docs Update"]
  S105["10.5 Retrospective Analysis<br/>Bird's Law severity"]
  S11["11. Cleanup + Session Prep"]

  S1 --> S2
  S2 --> S3
  S3 -->|"YES"| S4
  S3 -->|"MODIFY"| S2
  S3 -->|"NO"| CANCEL["Cancel"]
  S4 -->|"skip: bugfix/infra"| S5
  S4 --> S5

  subgraph parallel ["Parallel Worktrees"]
    direction LR
    EX["Executor"]
    FE["Frontend<br/>if detected"]
  end
  S5 --> parallel
  parallel --> S55
  S55 --> S6
  S6 --> S7
  S7 --> S8
  S8 -->|"PASS"| S9
  S8 -->|"IMPL fail"| S5
  S8 -->|"ARCH fail"| S4
  S8 -->|"REQ fail"| S2
  S8 -->|"ENV fail"| S1
  S8 -->|"TEST fail"| S6
  S9 --> S10
  S10 --> S105
  S105 --> S11

  classDef decision fill:#1a1500,stroke:#d97706,color:#f59e0b
  classDef cancel fill:#2a0a0a,stroke:#ef4444,color:#ef4444
  classDef step fill:#111827,stroke:#1e3a5f,color:#e8edf5
  classDef parallel fill:#0a1628,stroke:#3b82f6,color:#60a5fa
  classDef analyst fill:#0a2010,stroke:#22c55e,color:#4ade80

  class S3,S8 decision
  class CANCEL cancel
  class S1,S2,S4,S5,S55,S6,S7,S9,S10,S11 step
  class EX,FE parallel
  class S105 analyst

04 — Data

Handoff Data Flow

How artifacts flow between agents through the .bmb/ directory. Lead never touches code — only reads summaries.

flowchart LR
  User["👤 User"]
  Lead["🎯 Lead"]
  Brief["📋 briefing.md"]
  Arch["📐 Architect"]
  Plan["📄 plan-to-exec.md"]
  ExFe["⚙️ Executor<br/>🎨 Frontend"]
  Merge["🔀 Merge"]
  Test["🧪 Tester<br/>blind"]
  Verify["✅ Verifier<br/>blind"]
  Recon["⚖️ Reconcile"]
  Simp["🧹 Simplifier"]
  Write["📝 Writer"]
  Output["✨ Output"]

  User -->|"intent"| Lead
  Lead -->|"brainstorm"| Brief
  Brief -->|"briefing"| Arch
  Arch -->|"council"| Plan
  Plan -->|"instructions"| ExFe
  ExFe -->|"worktrees"| Merge
  Merge -->|"merged code"| Test
  Merge -->|"merged code"| Verify
  Test -->|"test-summary"| Recon
  Verify -->|"verify-summary"| Recon
  Recon -->|"PASS"| Simp
  Simp -->|"cleaned"| Write
  Write --> Output

  classDef artifact fill:#1a2234,stroke:#3b82f6,color:#60a5fa
  classDef agent fill:#111827,stroke:#1e3a5f,color:#e8edf5

  class Brief,Plan artifact
  class User,Lead,Arch,ExFe,Merge,Test,Verify,Recon,Simp,Write,Output agent

Consultant Feed Timeline

gantt
  title Consultant Monitoring (Steps 2–11)
  dateFormat X
  axisFormat %s

  section Brainstorm
    Bidirectional with Lead          :active, 0, 2

  section Approval
    Monitor user decision            :1, 3

  section Council
    Observe debate rounds            :2, 4

  section Execution
    Track progress via feed file     :3, 6

  section Testing
    Monitor blind test results       :5, 7

  section Verification
    Monitor blind verify results     :6, 8

  section Reconciliation
    Observe failure classification   :7, 9

  section Simplify
    Track re-verify outcome          :8, 10

  section Docs
    Validate doc consistency         :9, 11

  section Cleanup
    Final session summary            :10, 12

06 — Isolation

Worktree Lifecycle

Every parallel agent gets its own git worktree. No shared state, no index.lock conflicts, true filesystem isolation.

gantt
  title Worktree Lifecycle per Pipeline Run
  dateFormat X
  axisFormat Step %s

  section Executor
    Create worktree     :e1, 4, 5
    Work in worktree    :e2, 5, 6
    Merge to main       :crit, e3, 6, 7

  section Frontend
    Create worktree     :f1, 4, 5
    Work in worktree    :f2, 5, 6
    Merge to main       :crit, f3, 6, 7

  section Tester-Claude
    Create worktree     :tc1, 6, 7
    Run tests           :tc2, 7, 8
    Cleanup             :tc3, 8, 9

  section Tester-Cross
    Create worktree     :tx1, 6, 7
    Run tests           :tx2, 7, 8
    Cleanup             :tx3, 8, 9

  section Verifier-Claude
    Create worktree     :vc1, 7, 8
    Verify              :vc2, 8, 9
    Cleanup             :vc3, 9, 10

  section Verifier-Cross
    Create worktree     :vx1, 7, 8
    Verify              :vx2, 8, 9
    Cleanup             :vx3, 9, 10

Step 4  Create executor + frontend worktrees from HEAD (-b creates the session branch)
        git worktree add -b bmb-executor-{SESSION} .bmb/worktrees/executor HEAD
        git worktree add -b bmb-frontend-{SESSION} .bmb/worktrees/frontend HEAD

Step 5  Agents work in isolated worktrees (zero conflict possible)

Step 5.5 Merge worktrees → main
        commit in worktree → merge --no-edit → remove worktree
        Conflict? → bmb_learn MISTAKE + escalate to user

Step 6  Create 2 tester worktrees from merged HEAD
        tester-claude + tester-cross

Step 7  Create 2 verifier worktrees from merged HEAD
        verifier-claude + verifier-cross

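Step 8+ Remove all remaining worktrees (--porcelain prints one "worktree <path>" line per worktree)
        git worktree list --porcelain | sed -n 's/^worktree //p' | grep '/\.bmb/worktrees/' | xargs -I{} git worktree remove {}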

07 — Multi-Model

Cross-Model Protocols

Council debate uses Claude ↔ Codex/Gemini file exchange. Profile-based permissions keep read-only where needed.

sequenceDiagram
  participant A as Architect<br/>(Claude Opus)
  participant F as Council Files<br/>(.bmb/council/)
  participant X as Cross-Model<br/>(Codex / Gemini)

  Note over A,X: Round 1 — Initial Proposals
  A->>F: Write claude-proposal.md
  X->>F: Write cross-proposal.md

  Note over A,X: Round 2 — Critique
  A->>F: Read cross-proposal.md
  A->>F: Write claude-critique.md
  X->>F: Read claude-proposal.md
  X->>F: Write cross-critique.md

  Note over A,X: Round 3 — Synthesis (optional)
  A->>F: Read cross-critique.md
  A->>F: Write claude-synthesis.md
  X->>F: Read claude-critique.md
  X->>F: Write cross-synthesis.md

  Note over A,X: Round 4 — Final Decision
  A->>F: Read all files
  A->>F: Write plan-to-exec.md ✅

  Note right of X: Cross-Model uses<br/>--profile read-only

Read-Only Profiles

council and verify profiles: cross-model can read code and write to .bmb/ only. No production writes.

Write Profiles

test and exec-assist profiles: cross-model can write tests and helper code within worktree scope.

Per-Track Timeouts

Claude: 1200s default. Cross-model: 3600s default. Configurable via bmb-config.sh. Independent deadline tracking.
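
Independent deadline tracking means each track carries its own clock, so one track expiring does not cut the other short. The Python sketch below illustrates the idea under stated assumptions: the two defaults are quoted from the text above, but `TrackDeadline`, its fields, and the use of a monotonic clock are hypothetical, and reading overrides from bmb-config.sh is not modeled.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

# Defaults quoted from the text; everything else here is an assumption.
DEFAULT_TIMEOUTS = {"claude": 1200.0, "cross": 3600.0}


@dataclass
class TrackDeadline:
    name: str
    timeout_s: float
    started_at: float = field(default_factory=time.monotonic)

    def remaining(self, now: Optional[float] = None) -> float:
        """Seconds left before this track's own deadline, never negative."""
        now = time.monotonic() if now is None else now
        return max(0.0, self.started_at + self.timeout_s - now)

    def expired(self, now: Optional[float] = None) -> bool:
        return self.remaining(now) == 0.0
```

With both tracks started together, the Claude track is past its deadline 1500 seconds in, while the cross-model track still has 2100 seconds left and keeps running.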

09 — Intelligence

Auto-Learning System

Mistakes, corrections, and successes automatically recorded. Past pitfalls injected into future sessions across all projects.

Project-Local

.bmb/learnings.md — One line per learning, chronological append. Loaded at Step 1 for this project.

↓

Global Cross-Project

~/.claude/bmb-system/learnings-global.md — Same format + [project_name] tag. Shared across all BMB projects.

↓

CLAUDE.md Promotion

Recurrence ≥2 → propose to user → permanent rule. Never auto-edits. User always approves.

# Example learnings.md entries
[2026-03-10 14:32] MISTAKE (step 8): Missing input validation → Always validate at API boundary
[2026-03-10 15:01] CORRECTION (step 3): User changed auth to OAuth → Confirm auth strategy in brainstorm
[2026-03-10 16:45] PRAISE (step 11): Pipeline completed successfully → Current approach works

# Context cost: ~150 tokens (5 lines × ~30 tokens). Negligible.
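
The one-line entry format shown above is simple enough to parse with a regular expression. The following Python sketch is hypothetical: the pattern is inferred from the three example entries, and treating "recurrence" as the same lesson text appearing in two or more MISTAKE entries is an assumption, since the page does not say how BMB matches recurring learnings.

```python
import re
from collections import Counter

# Pattern inferred from the example entries; not an official grammar.
ENTRY = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<kind>[A-Z]+) \(step (?P<step>[\d.]+)\): "
    r"(?P<what>.*?) → (?P<lesson>.*)$"
)


def parse_entry(line):
    """Return the entry's fields as a dict, or None for non-entry lines."""
    match = ENTRY.match(line.strip())
    return match.groupdict() if match else None


def promotion_candidates(lines, threshold=2):
    """Lessons from MISTAKE entries that recur at least `threshold` times."""
    counts = Counter()
    for line in lines:
        entry = parse_entry(line)
        if entry and entry["kind"] == "MISTAKE":
            counts[entry["lesson"]] += 1
    return [lesson for lesson, n in counts.items() if n >= threshold]
```

The returned list is only a set of proposals. As described above, promotion to CLAUDE.md still requires user approval.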

11 — Commands

Skill Commands

4 slash commands expose BMB capabilities at different scales — from full pipeline to focused brainstorming.

/BMB

Full Pipeline

The complete 11.5-step A-to-Z pipeline. Cross-model council, blind verification, analytics, simplification, and session continuity. Use for any non-trivial feature or bug fix.

11.5 steps cross-model worktree blind verify

/Bb

/BMB-brainstorm

Ideation

Lead + Consultant bidirectional brainstorming with conversation logging. Explores intent, requirements, and design before any code is written.

5 phases consultant no code

/Br

/BMB-refactoring

Code Quality

Parallel analysis with cross-model review, worktree-isolated execution, review cycle, and merge. Focused on improving existing code without feature changes.

6 phases cross-model worktree

/Bs

/BMB-setup

Configuration

One-time project setup: prerequisites check, config generation, gitignore rules, and confirmation. Run once per project before first pipeline use.

5 steps prerequisite

Skill Relationships

flowchart LR
    Setup["/BMB-setup\n⚙️ Config"]
    BMB["/BMB\n🔧 Full Pipeline"]
    Brainstorm["/BMB-brainstorm\n💡 Ideation"]
    Refactoring["/BMB-refactoring\n🔄 Code Quality"]

    Setup -->|"prerequisite"| BMB
    Setup -->|"prerequisite"| Brainstorm
    Setup -->|"prerequisite"| Refactoring
    Brainstorm -.->|"feeds into"| BMB
    Refactoring -.->|"standalone"| BMB

    style Setup fill:#111827,stroke:#22c55e,color:#e8edf5
    style BMB fill:#111827,stroke:#3b82f6,color:#e8edf5,stroke-width:3px
    style Brainstorm fill:#111827,stroke:#22d3ee,color:#e8edf5
    style Refactoring fill:#111827,stroke:#a78bfa,color:#e8edf5

Step Coverage Comparison

Phase	/BMB	/BMB-brainstorm	/BMB-refactoring
Setup / Config	✓	✓	—
Consultant Session	✓	✓	—
Brainstorm / Analysis	✓	✓	✓ Parallel
Council Debate	✓	—	✓ Synthesis
Architecture Plan	✓	—	—
Execution (Worktree)	✓	—	✓
Testing	✓	—	—
Verification (Blind)	✓	—	✓ Review
Fix Cycle	✓	—	✓
Simplification	✓	—	—
Merge / Cleanup	✓	✓ Summary	✓

Graceful Degradation

Scenario	Behavior
Cross-model unavailable (council)	Solo design (Claude only), noted in session log
Cross-model unavailable (testing)	Claude-only test results, noted in reconciliation
Cross-model unavailable (verification)	Claude-only verification, noted in reconciliation
Claude tester timeout	Log timeout, continue with cross-model results
Cross-model timeout	Proceed with Claude-only results
Merge conflict	`bmb_learn MISTAKE` + escalate to user
Simplifier breaks tests	`bmb_learn MISTAKE` + revert + proceed with original
Telegram env unset	Skip notifications silently
knowledge.db missing	Skip indexing/search
Frontend not detected	Skip Frontend agent, Executor only
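
The track-availability rows of the table above reduce to one rule: reconcile with whatever results exist and note what was missing. The Python sketch below illustrates that rule only. `select_results` is a hypothetical name, `None` is an assumed marker for an unavailable or timed-out track, and the case where both tracks are missing is not covered by the table and is added here as an assumption.

```python
def select_results(claude, cross):
    """Pick the results to reconcile plus an optional note for the log.

    A value of None marks a track that was unavailable or timed out.
    """
    if claude is not None and cross is not None:
        return [claude, cross], None
    if claude is not None:
        return [claude], "cross-model unavailable or timed out; Claude-only results"
    if cross is not None:
        return [cross], "Claude track timed out; cross-model results only"
    # Not in the table: assumed behavior when neither track reports.
    return [], "no results from either track"
```

Because the function always returns something to reconcile (or an explicit empty list with a note), a missing track degrades the run instead of blocking it.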

13 — Structure

Directory Layout

The .bmb/ directory is the single source of truth for all pipeline artifacts.

.bmb/
├── handoffs/                  ── Agent-to-agent artifacts
│   ├── briefing.md            ── Lead's brainstorm output
│   ├── plan-to-exec.md        ── Architect's execution plan
│   ├── test-claude.md         ── Claude tester results
│   ├── test-cross.md          ── Cross-model tester results
│   ├── verify-claude.md       ── Claude verifier results
│   ├── verify-cross.md        ── Cross-model verifier results
│   └── .compressed/           ── L1 summaries for Lead
│       ├── briefing.summary.md
│       ├── test-result-claude.summary.md
│       └── verify-result-claude.summary.md
├── councils/                  ── Cross-model debate files
│   ├── LEGEND.md              ── Index of all past debates
│   └── {topic}/
│       ├── round-01-claude.md
│       ├── round-01-cross.md
│       └── CONSENSUS.md
├── worktrees/                 ── Git worktree mount points
│   ├── executor/
│   ├── frontend/
│   ├── tester-claude/
│   ├── tester-cross/
│   ├── verifier-claude/
│   └── verifier-cross/
├── sessions/                  ── Per-session state
│   └── {session_id}/
│       ├── session-prep.md    ── Next-session continuity
│       └── conversation.log   ── Consultant feed
├── .tool-cache/               ── L2 write-time cache
├── config.json                ── Project configuration
├── learnings.md               ── T1 project-local learnings
├── session-log.md             ── Current session event log
├── consultant-feed.md         ── Lead → Consultant feed
└── knowledge.db               ── L3 FTS5 reference index

~/.claude/bmb-system/          ── Global BMB installation
├── scripts/
│   ├── cross-model-run.sh     ── Cross-model invocation
│   ├── bmb-learn.sh           ── Shared learning functions
│   ├── knowledge-index.sh     ── FTS5 indexer
│   └── knowledge-search.sh    ── FTS5 search
├── config/
│   └── defaults.json          ── Default configuration
└── learnings-global.md        ── T2 cross-project learnings
