Architecture Critic
A separate architecture-critic reviews a design adversarially before any code is written, returning a verdict that feeds the same escalation gate as the rest of the model.
The Scenario
An architect agent has produced a design for the customer-portal SSO feature from example 13.5: an OIDC integration, an auth-callback service, a session store, and a role-based authorization layer. Before any code is written, a separate architecture-critic reviews the design adversarially. The critic does not produce architecture. It interrogates it.
What the Critic Probes
- Unstated assumptions — what must be true for this design to hold that no one wrote down?
- Missing failure modes — what happens when the IdP is slow, down, or returns a malformed token?
- Untested NFR claims — the design says 'scales to 10k concurrent logins'; on what basis?
- Trust-boundary gaps — where does untrusted input cross into trusted execution?
- Over- / under-engineering — is the complexity proportional to the declared depth level?
A Sample Critique
## Architecture Critique — IDEA-014-portal-sso VERDICT: revise (2 blocking, 1 advisory) [BLOCKING] Trust boundary — the auth-callback service treats the IdP redirect parameters as trusted. They are untrusted content (10.1). State and nonce validation are not shown. A forged callback could mint a session. -> Require explicit state/nonce validation before session creation. [BLOCKING] Failure mode — no behavior is specified for IdP timeout. The design implies an indefinite wait, which becomes a portal-wide outage when the IdP degrades. -> Specify a timeout, a fallback, and a user-facing failure state. [ADVISORY] Over-engineering — a bespoke session store is proposed where the existing platform session service already meets the stated SLO. -> Justify the bespoke store against the declared depth level, or reuse. confidence: 0.81
The critique is itself a typed artifact, carries a confidence value, and is recorded in the decision trace. Its verdict feeds the same four-way escalation gate the rest of the model uses.
What This Illustrates
A producer that reviews its own design has a blind spot exactly where it is most confident. Splitting the roles — and optionally the model classes — turns review into an adversarial second opinion rather than a self-check. Section 10.6 honestly admits functional verification is unsolved; a structured second opinion before code exists does not close that gap, but it is among the cheapest practical levers against it. Catching the missing nonce validation at design time costs a paragraph; catching it in production costs an incident.