
AI Infrastructure · Personal Project · 2026

Project
Architect.

Seven-phase planning skill that turns a fuzzy idea into a frozen set of stage contracts — with parallel research agents, 400K token budgeting, and HTML overlays at every human-review gate.

7 planning phases · 3 parallel research agents · 5 review-gate overlays · 7 deliverable templates

01 · System overview

Seven phases, one frozen plan.

The architect walks a project from a blank slate to a frozen set of stage contracts. Every phase ends at a gate the human approves before the architect proceeds. Every gate writes a markdown checkpoint to disk so the next session can resume from the last approved phase, and five of the gates also render an HTML overlay so the human actually reads what they're approving.
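The resume step can be sketched as a scan for the highest-numbered checkpoint. This is a minimal sketch, not the skill's own code: it assumes checkpoints are named phase-N.md in one outputs directory and that a file exists only once its gate has been approved. The function names are hypothetical.

```python
from __future__ import annotations

import re
import tempfile
from pathlib import Path


def last_approved_phase(outputs_dir: Path) -> int | None:
    """Return the highest N for which phase-N.md exists, or None on a fresh project."""
    phases = []
    for path in outputs_dir.glob("phase-*.md"):
        match = re.fullmatch(r"phase-(\d+)\.md", path.name)
        if match:
            phases.append(int(match.group(1)))
    return max(phases, default=None)


def next_phase(outputs_dir: Path) -> int:
    """Phase to resume at: 0 on a fresh project, otherwise one past the last checkpoint."""
    last = last_approved_phase(outputs_dir)
    return 0 if last is None else last + 1


# Demo in a throwaway directory: phases 0 and 1 were approved before the crash.
with tempfile.TemporaryDirectory() as tmp:
    outputs = Path(tmp)
    (outputs / "phase-0.md").write_text("# Phase 0\n", encoding="utf-8")
    (outputs / "phase-1.md").write_text("# Phase 1\n", encoding="utf-8")
    resume_at = next_phase(outputs)
```

Keying recovery off file presence rather than session state means a new session needs nothing from the old one except the directory.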

Seven phases, five HTML review gates, seven markdown checkpoints — all under one 400K context budget

Ask only the three to five questions whose answers would fork the architecture. Save the rest for Phase 3.

02 · Phase 0–1 — Classify, Disambiguate

Start narrow, ask only what forks.

A planning skill that asks 15 generic questions earns nothing but user fatigue. The architect's first move is to classify the project size (Small / Medium / Large → 1, 3, or 5 stages), then ask only the 3–5 questions whose answers would fork the technology stack, the deployment model, the data architecture, or the core API. Everything else is deferred to Phase 3.

01 Classify + Disambig

The fork test, applied to every candidate question.

Each Phase 1 question must pass a single test: if the answer were different, would the architecture be different? If the answer is no, the question is logged for Phase 3 and skipped now. Each question must also explain its architectural impact in parentheses, so the user knows what they're committing to.

The example the skill carries in its own documentation: "Is this deployed to cloud or on-premise? (Forks: containerization strategy, database choice, auth model)." The parenthetical makes the question feel less like a survey and more like a decision with consequences.
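The fork test can be sketched as a filter over candidate questions. The Question type, the cap constant, and the overflow rule (keep the questions that fork the most axes) are assumptions made for illustration; the skill itself is a prompt workflow, not this code.

```python
from __future__ import annotations

from dataclasses import dataclass, field

MAX_PHASE1_QUESTIONS = 5  # upper end of the 3-5 range


@dataclass
class Question:
    text: str
    forks: list[str] = field(default_factory=list)  # architectural axes the answer would change

    def render(self) -> str:
        # Phase 1 questions state their architectural impact in parentheses.
        return f"{self.text} (Forks: {', '.join(self.forks)})"


def apply_fork_test(candidates: list[Question]) -> tuple[list[Question], list[Question]]:
    """Split candidates into (ask now in Phase 1, log for Phase 3)."""
    forking = [q for q in candidates if q.forks]
    deferred = [q for q in candidates if not q.forks]
    # Hypothetical overflow rule: if too many questions fork, keep the widest ones.
    forking.sort(key=lambda q: len(q.forks), reverse=True)
    return forking[:MAX_PHASE1_QUESTIONS], deferred + forking[MAX_PHASE1_QUESTIONS:]


candidates = [
    Question(
        "Is this deployed to cloud or on-premise?",
        ["containerization strategy", "database choice", "auth model"],
    ),
    Question("What should the app be called?"),  # forks nothing, so it waits
]
phase1, phase3 = apply_fork_test(candidates)
print(phase1[0].render())
```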

Phase 0

Size classification

Small / Medium / Large map to 1, 3, and 5 stages. The classification drives the entire downstream batch count and team composition.

Phase 1

Critical-path questions only

Three to five questions at most. Each must list the architectural axes it forks. Anything that doesn't fork the architecture is deferred to Phase 3.

Discipline

Single message

All Phase 1 questions arrive in one message. No drip-feed. The user can answer them at their own cadence; the architect waits.

Insurance

Crash checkpoint

docs/architect-outputs/phase-1.md captures every Q+A verbatim. Session-recovery rereads it to resume without restarting the conversation.
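A minimal sketch of the Phase 1 checkpoint write. Only the docs/architect-outputs/phase-1.md path comes from the skill; the heading layout inside the file and the function name are assumptions.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def write_phase1_checkpoint(project_root: Path, qa_pairs: list[tuple[str, str]]) -> Path:
    """Write every question and answer verbatim to docs/architect-outputs/phase-1.md."""
    out = project_root / "docs" / "architect-outputs" / "phase-1.md"
    out.parent.mkdir(parents=True, exist_ok=True)
    lines = ["# Phase 1 checkpoint", ""]
    for number, (question, answer) in enumerate(qa_pairs, start=1):
        # Hypothetical layout: one numbered heading per question, answer underneath.
        lines += [f"## Q{number}", question, "", f"### A{number}", answer, ""]
    out.write_text("\n".join(lines), encoding="utf-8")
    return out


# Demo against a throwaway project root.
with tempfile.TemporaryDirectory() as tmp:
    path = write_phase1_checkpoint(Path(tmp), [("Cloud or on-premise?", "Cloud")])
    saved = path.read_text(encoding="utf-8")
```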

03 · Phase 2 — Parallel research

Three Explore agents, one synthesised report.

Technology selection is parallelised. The architect dispatches up to three Explore subagents — one per technology domain (backend, frontend, data/infra) — each instructed to rank candidates by performance first, capability second, power ceiling third, maturity fourth. Team familiarity is not a selection criterion: unfamiliar tools can be learned, but performance ceilings cannot be worked around.

Three agents fan out, return ranked tables, architect synthesises into one comparable matrix

02 Research

Performance first, familiarity never.

Each Explore agent receives the same context — project description, Phase 1 answers, hard constraints — and an explicit instruction: find the most efficient and powerful technology for the use case; team familiarity is NOT a factor. The output per agent is a comparison table ranked by performance, with benchmarks and capability scores cited.

The architect synthesises the three reports into a single Technology Mapping Report with the same columns across all domains. The Blueprint-aesthetic HTML overlay (/architect-html-renderer:tech-mapping) renders the comparison side-by-side so the user can pick decisively at the Phase 2 gate.
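The fan-out shape can be sketched with a thread pool. The research function here is a stand-in that returns an empty report; in the skill the work is done by Explore subagents dispatched through the Agent tool, which this sketch does not call.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor

DOMAINS = ["backend", "frontend", "data/infra"]


def research(domain: str, context: dict) -> dict:
    """Stand-in for one Explore subagent (hypothetical; returns an empty ranked table)."""
    return {"domain": domain, "ranked_candidates": [], "context_keys": sorted(context)}


def fan_out(context: dict) -> list[dict]:
    # Every agent receives the same context but is scoped to a single domain.
    with ThreadPoolExecutor(max_workers=len(DOMAINS)) as pool:
        # pool.map returns results in input order, so reports line up with DOMAINS.
        return list(pool.map(lambda domain: research(domain, context), DOMAINS))


reports = fan_out({
    "description": "example project",
    "phase1_answers": {},
    "hard_constraints": [],
})
```

Because each call sees only its own domain, the three reports can be merged column by column without untangling shared state.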

Dispatch

Agent tool · subagent_type Explore

Three parallel research subagents, one per domain. Each scoped to its category — no cross-pollination, no shared context bloat.

Criteria

Performance > capability > ceiling > maturity

Strict priority order. Familiarity appears only as a tiebreaker when all four upper criteria are equal across candidates.

Synthesis

Mapping Report

Cross-domain table with category / choice / performance edge / maturity / capability score / justification. One row per chosen tech.

Review

Blueprint comparison HTML

The tech-mapping renderer turns the MD into a per-domain comparison page with the recommended choice highlighted. User picks at the gate.
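The strict priority order from the Criteria card maps naturally onto tuple comparison. This is a sketch under assumptions: the numeric scores and the Candidate fields are invented for illustration, and the skill's agents produce tables rather than Python objects.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    performance: float
    capability: float
    ceiling: float
    maturity: float
    familiar: bool = False  # consulted only when the four scores tie


def rank(candidates: list[Candidate]) -> list[Candidate]:
    # Tuples compare left to right, so a later field matters only if every
    # earlier field is equal. Familiarity comes last, making it a pure tiebreaker.
    return sorted(
        candidates,
        key=lambda c: (c.performance, c.capability, c.ceiling, c.maturity, c.familiar),
        reverse=True,
    )


ranked = rank([
    Candidate("A", performance=9, capability=5, ceiling=5, maturity=5),
    Candidate("B", performance=8, capability=10, ceiling=10, maturity=10, familiar=True),
    Candidate("C", performance=9, capability=5, ceiling=5, maturity=5, familiar=True),
])
print([c.name for c in ranked])
```

B loses to A despite better scores everywhere else because performance is compared first; C edges out A only because every other field is identical.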

04 · Phase 3 — Refine scope

Numbered ambiguity, three-tier scope.

After the user provides domain context, the architect enumerates every remaining ambiguity as a numbered list with its architectural impact, then classifies every feature into one of three tiers: Essential v1 (system doesn't work without it), Valuable v2 (significant value, defer with extension point), Possible Future (note in timeline only).

03 Scope

Stop asking when no remaining answer changes the architecture.

The skill defines explicit merge-point detection: stop asking ambiguity questions when (a) no remaining answer would change the architecture, (b) the user signals readiness ("just build it"), (c) the last three answers were "your call," or (d) all core entities and business rules are documented. Beyond that, asking more is just stalling.
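The four stop conditions reduce to one boolean check. A minimal sketch, assuming the state is tracked in a small record; the field names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ScopeState:
    open_forking_questions: int  # unanswered questions that could still change the architecture
    user_signalled_ready: bool = False  # e.g. "just build it"
    recent_answers: list[str] = field(default_factory=list)
    entities_and_rules_documented: bool = False


def should_stop_asking(state: ScopeState) -> bool:
    last_three = [answer.strip().lower() for answer in state.recent_answers[-3:]]
    three_deferrals = len(last_three) == 3 and all(a == "your call" for a in last_three)
    return (
        state.open_forking_questions == 0      # (a)
        or state.user_signalled_ready          # (b)
        or three_deferrals                     # (c)
        or state.entities_and_rules_documented  # (d)
    )


state = ScopeState(open_forking_questions=2, recent_answers=["Your call", "your call", "your call "])
print(should_stop_asking(state))
```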

The tier classification is shown to the user in an Editorial-aesthetic HTML overlay (/architect-html-renderer:scope-tiers) — three columns, each feature with its architectural-impact line, future-tier items annotated with their promotion condition.

Ambiguity resolution

Numbered impact list

Every open question gets a number and an "Impact:" line. The numbering makes it easy for the user to answer by reference.

Discovery tracks

Backend · frontend · data

Parallel sub-discoveries when needed — entities + relationships + business rules on the backend, user roles + workflows on the frontend, sources + transformations on the data side.

Tier classification

Essential / Valuable / Future

Every feature mentioned is sorted. Valuable items get an extension point documented; Future items get a promotion condition.

Review

Editorial 3-column HTML

Scope decisions feel like editorial choices about what the product is — the warm cream + serif aesthetic reflects that.
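The tier rules (Valuable items carry an extension point, Future items carry a promotion condition) can be checked mechanically before the scope gate. A sketch with hypothetical type and function names:

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    ESSENTIAL_V1 = "Essential v1"
    VALUABLE_V2 = "Valuable v2"
    POSSIBLE_FUTURE = "Possible Future"


@dataclass
class Feature:
    name: str
    tier: Tier
    extension_point: Optional[str] = None      # required for Valuable v2
    promotion_condition: Optional[str] = None  # required for Possible Future


def missing_annotations(features: list[Feature]) -> list[str]:
    """List every feature whose tier requires an annotation it does not have."""
    problems = []
    for feature in features:
        if feature.tier is Tier.VALUABLE_V2 and not feature.extension_point:
            problems.append(f"{feature.name}: needs an extension point")
        if feature.tier is Tier.POSSIBLE_FUTURE and not feature.promotion_condition:
            problems.append(f"{feature.name}: needs a promotion condition")
    return problems


features = [
    Feature("login", Tier.ESSENTIAL_V1),
    Feature("audit export", Tier.VALUABLE_V2, extension_point="ExportSink interface"),
    Feature("mobile app", Tier.POSSIBLE_FUTURE),
]
problems = missing_annotations(features)
```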

05 · Phase 4 — Staged architecture

A 400K token budget, drawn at design time.

The most consequential phase. The architect designs the staged architecture (1 / 3 / 5 stages per project size), computes the token budget against the 400K context window, draws the dependency graph between stages, and produces stage docs in dual resolution (a 400K-sized compact version for tight loads, a 1M-sized full version with ASCII diagrams). For projects with both backend and frontend, a mandatory observability layer is part of the architecture — telemetry isn't retrofittable.

Fixed + variable ≤ 40% of context window · target ≥ 60% headroom for code per task
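The budget arithmetic, sketched with the 60% headroom target as the binding rule: fixed plus variable load may use at most 40% of the 400K window. The token counts below are invented for illustration, and the function name is hypothetical.

```python
from __future__ import annotations

CONTEXT_WINDOW = 400_000
MAX_USED_PERCENT = 40  # leaves at least 60% of the window free for code


def budget_report(fixed: dict[str, int], variable: dict[str, int],
                  window: int = CONTEXT_WINDOW) -> dict:
    used = sum(fixed.values()) + sum(variable.values())
    return {
        "used_tokens": used,
        "headroom_percent": 100 - used * 100 / window,
        # Integer comparison avoids float rounding right at the 40% boundary.
        "ok": used * 100 <= window * MAX_USED_PERCENT,
    }


# Illustrative numbers only; real counts come from measuring the actual files.
report = budget_report(
    fixed={"CLAUDE.md": 4_000, "domain context": 12_000, ".context.md": 24_000},
    variable={"stage doc": 60_000, "task prompt": 20_000},
)
print(report)
```

Here the load is 120,000 tokens, which is 30% of the window and leaves 70% headroom, so the check passes.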

04 Architect

Dual-resolution stage docs and a Mermaid dependency graph.

Each stage has two docs: the 400K version (~150 lines, bullets and tables only) for compact context loads, and the 1M version (~400 lines, ASCII diagrams + code patterns + interface definitions) for the full-context model. The architect produces both. Workers and auditors choose which to load based on the model they're running on.

The architecture review HTML overlay (/architect-html-renderer:architecture-review) is load-bearing — every downstream contract depends on this approval. The page shows the cross-stage Mermaid dependency graph, the directory tree, per-stage drill-down panels, and the token budget visualisation in a single Blueprint-aesthetic view.
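A sketch of how a cross-stage dependency graph could be validated and emitted as Mermaid source. The stage names follow the Medium layout (backend / frontend / integration); the edges between them and the function name are assumptions.

```python
from __future__ import annotations

from graphlib import TopologicalSorter


def to_mermaid(deps: dict[str, list[str]]) -> str:
    """Render stage dependencies as a Mermaid flowchart.

    deps maps each stage to the stages it depends on.
    """
    # static_order() raises graphlib.CycleError if stages depend on each other
    # in a loop, so a broken plan fails here instead of at render time.
    order = list(TopologicalSorter(deps).static_order())
    lines = ["graph TD"]
    for stage in order:
        for dependency in deps.get(stage, []):
            lines.append(f"    {dependency} --> {stage}")
    return "\n".join(lines)


deps = {
    "backend": [],
    "frontend": [],
    "integration": ["backend", "frontend"],  # assumed edges for illustration
}
diagram = to_mermaid(deps)
print(diagram)
```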

Stages

1 / 3 / 5 by project size

Small = 1 stage, Medium = 3 (backend / frontend / integration), Large = 5. Each stage independently testable.

Dual resolution

400K compact + 1M full

Workers on the 400K model load the compact doc. Auditors and architects on the 1M model load the full doc with ASCII diagrams.

Observability

Mandatory for backend + frontend

If the project has both layers, the architect MUST include a dual-destination event bus (JSONL + WebSocket) in the architecture. Retrofitting is 10× harder.

Review

Blueprint plan-review HTML

The highest-stakes gate in the project — Mermaid dep graph + directory tree + per-stage panels + token budget. Must be reviewed as HTML, not chat text.
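The dual-destination event bus from the Observability card can be sketched as one emit call that writes a JSONL line and pushes the same line to live subscribers. This is an assumption-heavy sketch: the class name and event fields are hypothetical, and the WebSocket side is represented by plain callables rather than a real socket library.

```python
from __future__ import annotations

import io
import json
from typing import Callable, TextIO


class EventBus:
    def __init__(self, jsonl_sink: TextIO) -> None:
        self._sink = jsonl_sink
        self._subscribers: list[Callable[[str], None]] = []

    def subscribe(self, send: Callable[[str], None]) -> None:
        """Register a live destination, e.g. a WebSocket connection's send method."""
        self._subscribers.append(send)

    def emit(self, kind: str, **payload: object) -> None:
        line = json.dumps({"kind": kind, **payload}, sort_keys=True)
        self._sink.write(line + "\n")   # destination 1: durable JSONL log
        for send in self._subscribers:  # destination 2: live subscribers
            send(line)


# Demo with an in-memory log and a list standing in for a socket.
log = io.StringIO()
received: list[str] = []
bus = EventBus(log)
bus.subscribe(received.append)
bus.emit("task_started", stage="backend")
```

Serialising once and sending the same line to both destinations keeps the log and the live view from drifting apart.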

06 · Phase 5–6 — Contracts, handover

Iterative contract drafting, then a clean handover.

Phase 5 drafts the stage contracts in tight human-in-the-loop (HIL) collaboration. Each contract has a deliverables table, interface contracts with exact signatures, verification commands with expected output, and a completion checklist. The architect iterates with the human until each contract passes a single quality test: can a worker with only this contract and the project rules determine, with zero ambiguity, whether their work is complete?
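The four contract parts can be sketched as a data structure, with the verification commands made checkable. Everything named here (the types, the script path, the "ok" output) is hypothetical; the real contracts are markdown files, and this only illustrates why pairing each command with its expected output leaves no room for interpretation.

```python
from __future__ import annotations

import subprocess
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Verification:
    command: str
    expected_output: str


@dataclass
class StageContract:
    stage: str
    deliverables: list[str]
    interfaces: list[str]  # exact signatures
    verifications: list[Verification]
    checklist: list[str] = field(default_factory=list)


def run_shell(command: str) -> str:
    """Run a command and return its trimmed stdout."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout.strip()


def failed_verifications(contract: StageContract,
                         run: Callable[[str], str] = run_shell) -> list[str]:
    """Commands whose actual output differs from the output the contract expects."""
    return [v.command for v in contract.verifications if run(v.command) != v.expected_output]


contract = StageContract(
    stage="backend",
    deliverables=["src/api/health.py"],
    interfaces=["def health() -> dict[str, str]"],
    verifications=[Verification("./scripts/check-health.sh", "ok")],  # hypothetical script
)
# A stubbed runner keeps the example deterministic; drop it to run real commands.
failures = failed_verifications(contract, run=lambda command: "ok")
```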

05 Contracts + Handover

Per-stage contracts, then CLAUDE.md as the directory.

The contract review HTML overlay (/architect-html-renderer:contract-review) is invoked once per stage. The Blueprint page renders the deliverables table, every interface signature in its own block with a parameter table and return type, each verification command paired with its expected output, and the batch dependency chain as a Mermaid diagram.

Phase 6 hands over three layers of documentation. Stage contracts are immutable after planning — workers and auditors grade against them. .context.md files are living docs updated by workers as they implement, one per architectural boundary, capped at 8–10 total. project_summary.md is the append-only session log written by the scribe. CLAUDE.md routes to all of them — it is a directory, not a warehouse.
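The "directory, not a warehouse" rule can be sketched as a generator that emits only labels and paths. The output format and the example .context.md location are assumptions; contracts/ and project_summary.md are the names used by the skill.

```python
from __future__ import annotations


def render_claude_md(routes: dict[str, str]) -> str:
    """Build a routing-only CLAUDE.md: one line per destination, nothing inlined."""
    lines = [
        "# CLAUDE.md",
        "",
        "Routing only. Open the file you need; nothing is inlined here.",
        "",
    ]
    lines += [f"- {label}: `{path}`" for label, path in routes.items()]
    return "\n".join(lines) + "\n"


claude_md = render_claude_md({
    "Stage contracts (immutable)": "contracts/",
    "Session log (append-only)": "project_summary.md",
    "Backend module context (living)": "src/backend/.context.md",  # hypothetical location
})
print(claude_md)
```

The file grows by one line per destination, so the worker boot context stays small however large the contracts become.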

Contract drafting

HIL iteration

Suggest breakdown → draft → human reviews HTML overlay → iterate until quality test passes → freeze in contracts/.

Three documentation layers

Contracts · context · summary

"What to deliver" (immutable) + "what was built" (living, per-module) + "what happened" (append-only session log).

CLAUDE.md

Directory, not warehouse

Routes to contracts and rules by file path. Never inlines them. Keeps the worker boot context small.

Decision timeline

Editorial filterable table

Every decision with its trigger-for-change. Read across the project's lifetime — the Editorial aesthetic gives it editorial weight.

07 · Five review-gate overlays

The architect's read-once review surface.

Markdown specs past ~100 lines stop being read — both by humans and by future-you. The HTML overlay at each gate is the intervention. Five purpose-built renderer commands in architect-html-renderer turn every architect output into a scannable page in the right aesthetic for that artefact. The markdown stays canonical and git-tracked; the HTML overlay is ephemeral, regenerable, and never enters the repo.

07 Overlays

Two aesthetics. Five renderers. One design system.

The shared design system lives as a rule file at ~/.claude/rules/cto-orchestration/design-system.md. It explicitly assigns one aesthetic per artefact: Blueprint (deep slate + accent blue, IBM Plex Sans + IBM Plex Mono, subtle grid background) for engineering review, Editorial (warm cream + deep navy, Instrument Serif + JetBrains Mono) for decisions read across the project's lifetime. The discipline is what makes five different renderers feel like one system.

Phase 2

tech-mapping · Blueprint

Per-domain comparison table with performance metrics, capability scores, and the recommended choice highlighted.

Phase 3

scope-tiers · Editorial

Three-column tier breakdown (Essential v1 / Valuable v2 / Possible Future) with architectural-impact lines.

Phase 4

architecture-review · Blueprint

Mermaid dependency graph + directory tree + per-stage drill-down panels + token budget visualisation. The load-bearing gate.

Phase 5b · per stage

contract-review · Blueprint

Deliverables + interface contracts + verification commands + batch dependency chain. One invocation per contract.

Phase 6e

decision-timeline · Editorial

Filterable decision log grouped by phase with trigger-for-change column. Read across the project's lifetime.

Discipline

Anti-slop ruleset

Forbidden fonts (Inter, Roboto), forbidden palettes (indigo/violet, cyan-magenta-pink), forbidden patterns (emoji headers, gradient text). Two violations = regenerate.
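Part of the anti-slop ruleset lends itself to a mechanical check. The sketch below is a heuristic under assumptions, not the skill's rule file: it looks for the two forbidden fonts in font-family declarations, an emoji at the start of a heading, and the common CSS gradient-text trick (background-clip: text). Palette checks are left out because they need colour parsing.

```python
from __future__ import annotations

import re

FORBIDDEN_FONTS = ("Inter", "Roboto")
# Heading whose text begins with a character from common emoji ranges.
EMOJI_HEADER = re.compile(r"<h[1-6][^>]*>\s*[\U0001F300-\U0001FAFF\u2600-\u27BF]")
# Gradient text is usually built with (-webkit-)background-clip: text.
GRADIENT_TEXT = re.compile(r"background-clip\s*:\s*text")


def violations(html: str) -> list[str]:
    found = []
    for font in FORBIDDEN_FONTS:
        if re.search(rf"font-family\s*:[^;}}]*\b{font}\b", html):
            found.append(f"forbidden font: {font}")
    if EMOJI_HEADER.search(html):
        found.append("forbidden pattern: emoji header")
    if GRADIENT_TEXT.search(html):
        found.append("forbidden pattern: gradient text")
    return found


def must_regenerate(html: str) -> bool:
    # Two violations = regenerate.
    return len(violations(html)) >= 2


sloppy = "<style>body{font-family: Inter, sans-serif}</style><h1>\U0001F680 Plan</h1>"
clean = '<style>body{font-family: "IBM Plex Sans", sans-serif}</style><h1>Plan</h1>'
print(violations(sloppy), must_regenerate(sloppy), must_regenerate(clean))
```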

08 · Full stack

Everything in version control.

Four layers, all editable as markdown / Astro / inline HTML, all in ~/.claude/.

Skill layer

SKILL.md · Seven-phase workflow
references/ · Deep guides per phase
templates/ · Seven deliverable templates
scripts/init-project-docs.sh

Templates

stage-doc-400k.template.md · Compact stage reference
stage-doc-1m.template.md · Full stage reference
stage-contract.template.md · HIL contract structure
domain-context.template.md · Entities + business rules
context-md.template.md · Per-module living doc
project-summary.template.md · Append-only session log
batch-plan.template.md · Execution batches

Renderers

tech-mapping · Blueprint comparison
scope-tiers · Editorial 3-column
architecture-review · Blueprint + Mermaid
contract-review · Blueprint per stage
decision-timeline · Editorial table

Agents

Explore × 3 · Parallel research subagents
Plan agent · Architecture design (Phase 4)
Skill tool · Renderer invocation per gate
Crash recovery · Re-reads checkpoints on resume

09 · Design constraints

Every rule encodes a past failure mode.

The non-negotiable rules in the architect's behaviour are not preferences — each one is a fence around a specific way the skill has failed at least once.

Critical-path discipline

3–5 fork-the-architecture questions max

Generic surveys produce user fatigue and inferior answers. The fork test makes each question worth answering.

Performance first

Familiarity is not a selection criterion

Unfamiliar tools can be learned; performance ceilings cannot be worked around. Familiarity returns only as a tiebreaker.

400K token budget

≥60% headroom for code per task

Fixed (CLAUDE.md + domain + .context.md) plus variable (stage doc + task prompt) must leave at least 60% of the context window free.

Crash checkpoints

Every phase gate writes phase-N.md

Re-running the skill on a half-finished project reads the checkpoints and resumes — no recap, no restart, no lost decisions.

Contract quality test

Worker + rules → "done?" with zero ambiguity

If a contract leaves room for interpretation, the auditor can't grade against it. The HIL drafting loop catches ambiguity before freeze.

Observability mandatory

Backend + frontend → telemetry layer in architecture

Retrofitting telemetry is 10× harder than building it in. Even simple projects grow; debugging without it is blind.

MD canonical

HTML overlays never enter git

HTML diffs are noisy. Every overlay is regenerable from its source MD, so commits stay focused on the spec, not the rendering.

CLAUDE.md is a directory

Routes to contracts + rules · never inlines

Inlining bloats every worker spawn. The CLAUDE.md sends them to the right file; their context loads only what they need.

10 · Lessons learned

What this build taught me.

Lesson 01

Questions cost user time. Ask only the 3–5 that fork the architecture; defer everything else to Phase 3 where it can be enumerated as numbered ambiguity. A planning skill that asks 15 generic questions earns nothing but fatigue and inferior answers.

Lesson 02

For ambitious projects, performance-first selection beats team-familiarity. Unfamiliar tools can be learned in a weekend; a fundamental performance ceiling can't be worked around at any cost. Familiarity returns only as a tiebreaker.

Lesson 03

Token budgeting at design time prevents context starvation later. The 400K rule — fixed + variable ≤ 40% — gives every task at least 60% of the window free for code. Designing the budget AFTER the architecture is too late.

Lesson 04

HIL contract drafting catches ambiguity before it propagates. Once a contract is frozen and the executor team is running, ambiguity becomes a RED cycle. The quality test — "can a worker determine 'done?' with zero ambiguity" — is the cheapest filter.

Lesson 05

Crash checkpoints are insurance for the future-you who returns after the session crashed. The architect writes phase-N.md after every gate; recovery rereads them rather than restarting. The cost is small, and the first recovered session pays for it.

Lesson 06

HTML overlays at the review gate dramatically raise the chance the spec actually gets read. A 400-line markdown stage doc gets skimmed; a Blueprint HTML page with Mermaid, directory tree, and token budget viz gets reviewed. The markdown stays canonical; the HTML is the read-once review surface.