KCC 06.1

The Token Guard

The Token Guard enforces cost envelopes and blast-radius limits for every invocation — structurally simple but operationally critical.

Meta Agents · Cost envelope · Blast radius · Escalation · Cost transparency
Created 2026-06-08 · v0.4.0

Cost and Blast-Radius Enforcement

Before Invocation

  • Reads the agent's declared cost envelope (Surface 5).
  • Estimates the cost of the invocation based on input size, historical performance, and prompt structure.
  • If the estimate exceeds max_tokens_in, max_latency_ms, or max_cost_usd, refuses the invocation.
  • If the estimate approaches any max_* limit (typically above 80% of it), escalates to HITL via the Butler.
  • Inspects declared tools for lethal trifecta status; forces HITL if detected.
  • Requires kernel maintainer approval for organizational blast radius in HOTL mode.
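
The pre-invocation checks above can be sketched as a small decision function. This is a minimal illustration, not the Token Guard's actual implementation: the names Decision, CostEnvelope, Estimate, and pre_invocation_gate are hypothetical, the 80% threshold is taken from the "typically over 80%" wording and is assumed to be configurable, and lethal-trifecta detection is reduced to a boolean computed elsewhere. Cost estimation itself is not shown.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ESCALATE_HITL = "escalate_hitl"
    REFUSE = "refuse"


@dataclass(frozen=True)
class CostEnvelope:
    # Field names mirror the limits named in the text; all are assumed > 0.
    max_tokens_in: int
    max_latency_ms: int
    max_cost_usd: float


@dataclass(frozen=True)
class Estimate:
    # Hypothetical shape for the pre-invocation estimate.
    tokens_in: int
    latency_ms: int
    cost_usd: float


ESCALATION_RATIO = 0.8  # assumption: "typically over 80%" of any limit


def pre_invocation_gate(
    envelope: CostEnvelope,
    estimate: Estimate,
    has_lethal_trifecta: bool = False,
) -> Decision:
    """Refuse, escalate to a human, or allow, based on the worst-case ratio."""
    worst = max(
        estimate.tokens_in / envelope.max_tokens_in,
        estimate.latency_ms / envelope.max_latency_ms,
        estimate.cost_usd / envelope.max_cost_usd,
    )
    if worst > 1.0:
        return Decision.REFUSE  # estimate exceeds a declared limit
    if has_lethal_trifecta or worst > ESCALATION_RATIO:
        return Decision.ESCALATE_HITL  # close to a limit, or risky tool set
    return Decision.ALLOW
```

Taking the maximum ratio across the three limits means a single dimension near its ceiling is enough to trigger escalation, which matches the "any of" reading of the bullets above.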

During and After Invocation

  • Monitors actual resource consumption; cancels invocations that breach hard ceilings mid-flight.
  • Updates the agent's running cost profile (used for future estimates).
  • Flags agents whose actual costs systematically exceed their declared envelopes.
  • Emits telemetry to the Inspector Pipeline; updates organizational cost dashboards.
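
A rough sketch of the runtime side, under stated assumptions: UsageMeter, CeilingBreached, and CostProfile are hypothetical names, only the cost dimension is tracked, and the running profile is kept as an exponential moving average. The source does not say how the profile is maintained or what "systematically" means, so the smoothing factor and the overrun thresholds below are illustrative choices.

```python
from dataclasses import dataclass
from typing import Optional


class CeilingBreached(Exception):
    """Raised to cancel an invocation that crosses a hard ceiling mid-flight."""


@dataclass
class UsageMeter:
    max_cost_usd: float
    spent_usd: float = 0.0

    def record(self, cost_usd: float) -> None:
        # Called as consumption is observed during the invocation.
        self.spent_usd += cost_usd
        if self.spent_usd > self.max_cost_usd:
            raise CeilingBreached(
                f"spent {self.spent_usd:.4f} USD, ceiling {self.max_cost_usd:.4f} USD"
            )


@dataclass
class CostProfile:
    declared_max_usd: float
    alpha: float = 0.2  # assumption: smoothing factor for the moving average
    avg_cost_usd: Optional[float] = None
    runs: int = 0
    overruns: int = 0

    def update(self, actual_usd: float) -> None:
        # Called once per finished invocation with its actual cost.
        self.runs += 1
        if actual_usd > self.declared_max_usd:
            self.overruns += 1
        if self.avg_cost_usd is None:
            self.avg_cost_usd = actual_usd
        else:
            self.avg_cost_usd = (
                self.alpha * actual_usd + (1 - self.alpha) * self.avg_cost_usd
            )

    def systematically_over(self, min_runs: int = 5, threshold: float = 0.5) -> bool:
        # Assumption: "systematic" means over half of at least min_runs runs.
        return self.runs >= min_runs and self.overruns / self.runs > threshold
```

In this sketch the meter enforces the hard ceiling during a run, while the profile accumulates across runs and would feed both future estimates and the flagging step.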

[Diagram: the Token Guard's gate, before and during an invocation]

Blast Radius Enforcement

Level          | Definition                                  | HOTL Allowed?
read           | Read data, cannot modify state              | Yes
local          | Modify state within a single resource       | Yes
domain         | Modify state within a single service/system | Yes, with declaration
organizational | Affects multiple systems                    | Only with kernel maintainer approval
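
The table can be read as a simple permission check. The sketch below is one possible encoding: BlastRadius and hotl_allowed are hypothetical names, and "with declaration" is interpreted here as the agent having explicitly declared the domain-level radius, which is an assumption about the source's meaning.

```python
from enum import Enum


class BlastRadius(Enum):
    READ = "read"
    LOCAL = "local"
    DOMAIN = "domain"
    ORGANIZATIONAL = "organizational"


def hotl_allowed(
    radius: BlastRadius,
    declared: bool = False,
    maintainer_approved: bool = False,
) -> bool:
    """Return True if the invocation may run in HOTL mode at this radius."""
    if radius in (BlastRadius.READ, BlastRadius.LOCAL):
        return True
    if radius is BlastRadius.DOMAIN:
        return declared  # "Yes, with declaration"
    # Organizational radius: only with kernel maintainer approval.
    return maintainer_approved
```

For example, `hotl_allowed(BlastRadius.DOMAIN)` is false until the radius is declared, and `hotl_allowed(BlastRadius.ORGANIZATIONAL, declared=True)` stays false without maintainer approval.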

Cost Transparency

The Token Guard's telemetry feeds a cost dashboard visible to the entire organization. Every cell sees its own consumption, kernel maintainers see aggregate consumption, and costs are attributed to capabilities and cells so that poorly designed or runaway usage can be flagged. Cost transparency is not surveillance; it is the same visibility that financial planning provides for any organizational resource.
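
As an illustration of the attribution step, the sketch below rolls telemetry events up by cell and by capability. The CostEvent record and the attribute_costs function are hypothetical; the actual telemetry schema is not described in the source.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class CostEvent(NamedTuple):
    # Hypothetical telemetry record emitted per invocation.
    cell: str
    capability: str
    cost_usd: float


def attribute_costs(
    events: Iterable[CostEvent],
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Sum cost per cell and per capability from a stream of events."""
    by_cell: Dict[str, float] = defaultdict(float)
    by_capability: Dict[str, float] = defaultdict(float)
    for event in events:
        by_cell[event.cell] += event.cost_usd
        by_capability[event.capability] += event.cost_usd
    return dict(by_cell), dict(by_capability)
```

A per-cell view would show each cell only its own entry from the first mapping, while kernel maintainers would see both mappings in full.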

Why It Matters

Most AI adoption failures in 2026 are cost failures, not capability failures: a team enables agents, agents start running in loops, costs spike, leadership panics, the program is paused. The Token Guard prevents this by making cost a first-class declarative property of each agent, enforced at runtime, with visibility into who is consuming what.

Cost is governed the same way memory and latency are governed in distributed systems: declared, monitored, enforced.