KCC 10.5

Context Economics

Two agents with the same token budget can differ sharply in effectiveness because system prompt and tool-definition bloat erode the working space.

Special Concerns · Context efficiency · System prompt bloat · Cost envelope · Drift
Created 2026-06-08 · v0.4.0

Description

Two agents may have the same token budget but very different effectiveness depending on context window utilization. A window 80% filled with system prompt and tool definitions leaves only 20% for working space; effectiveness degrades sharply even though token consumption appears acceptable.

The failure pattern: an agent is designed with a reasonable token envelope, then the system prompt grows by accretion (more instructions, more examples), tool definitions accumulate, and few-shot examples expand, until 14K tokens of overhead leave only 6K of a 20K window for actual work. Quality degrades, but cost looks fine, and the maintainer blames the model.
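The arithmetic behind that example can be sketched in a few lines. This assumes context efficiency means the share of the window left as working space, which is how the interpretation table in this section reads. The function name and signature are hypothetical, not part of KCC.

```python
def context_efficiency(window_tokens: int, overhead_tokens: int) -> float:
    """Share of the context window left as working space (assumed definition)."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    if not 0 <= overhead_tokens <= window_tokens:
        raise ValueError("overhead_tokens must be between 0 and window_tokens")
    return (window_tokens - overhead_tokens) / window_tokens


# Figures from the failure pattern above: 14K of overhead plus 6K of working space.
overhead = 14_000
working = 6_000
efficiency = context_efficiency(overhead + working, overhead)
print(f"{efficiency:.2f}")  # 0.30
```

At 0.30 the capability sits on the boundary between the lowest two bands, even though its total token spend has not changed.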

How KCC Handles This

Surface 5 (Cost Envelope) includes context_efficiency as a declared field. The value is the share of the context window left as working space, so higher is leaner. Capabilities declare their expected context efficiency at design time; the Token Guard monitors the actual value.

Value       Interpretation
over 0.8    Lean: most context is working space
0.6-0.8     Healthy: typical for well-designed capabilities
0.4-0.6     Acceptable, but worth reviewing for bloat
0.3-0.4     Concerning: system prompt may be too elaborate
under 0.3   Rejected: system prompt is consuming the context window
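The table can be read as a simple lookup. The sketch below is one way to encode it, not the kernel's actual logic. The bands share their endpoints (0.6 appears in two rows), so the sketch assumes a boundary value falls into the better band, except 0.8, which the table's "over 0.8" wording puts in Healthy. The function name is hypothetical.

```python
def interpret_context_efficiency(value: float) -> str:
    """Map a context_efficiency value to the band named in the table."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("context_efficiency must be between 0 and 1")
    if value > 0.8:
        return "Lean"
    if value >= 0.6:  # assumption: shared endpoints go to the better band
        return "Healthy"
    if value >= 0.4:
        return "Acceptable"
    if value >= 0.3:
        return "Concerning"
    return "Rejected"


for sample in (0.85, 0.7, 0.5, 0.35, 0.2):
    print(sample, interpret_context_efficiency(sample))
```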

Patterns That Cause Bloat

The Inspector Pipeline detects bloat patterns: over-elaborate system prompts (fix: state principles, not exceptions), redundant tool definitions (fix: consolidate, remove unused), excessive few-shot examples (fix: trust the structured input schema), and declarations that re-describe kernel-provided context (fix: only specify what is capability-specific).
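As an illustration of one of these fixes (removing unused tool definitions), the check below compares the tools a capability defines against the tools it actually called. It is a minimal sketch under stated assumptions and does not describe how the Inspector Pipeline works internally. The tool names, token counts, and call log are all hypothetical.

```python
def unused_tools(defined: dict[str, int], call_log: list[str]) -> dict[str, int]:
    """Return tools that were defined but never called, with their token cost."""
    called = set(call_log)
    return {name: tokens for name, tokens in defined.items() if name not in called}


# Hypothetical tool definitions mapped to the tokens each one occupies.
defined_tools = {"search": 900, "fetch": 700, "legacy_export": 1_200}
# Hypothetical record of tool calls across recent runs.
recent_calls = ["search", "fetch", "search"]

candidates = unused_tools(defined_tools, recent_calls)
print(candidates)                # tools that could be removed
print(sum(candidates.values()))  # tokens that removal would return to working space
```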

Why It Matters

Making context efficiency a declared field forces the conversation at design time. A capability that declares 0.8 and consistently runs at 0.5 has a problem the kernel will surface. The drift signal is more valuable than the absolute number. A service with a 512 MB memory budget that drifts to 4.4 GB is in trouble not because 4.4 GB is inherently wrong, but because the drift went unnoticed.
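A declared-versus-actual comparison of this kind might look like the following. This is a sketch only: the function names, the 0.1 tolerance, and the sample run values are assumptions chosen for illustration, not Token Guard behavior.

```python
from statistics import mean


def efficiency_drift(declared: float, observed: list[float]) -> float:
    """Declared value minus the mean observed value.

    A positive result means the capability has less working space than declared.
    """
    if not observed:
        raise ValueError("need at least one observed run")
    return declared - mean(observed)


def has_drifted(declared: float, observed: list[float], tolerance: float = 0.1) -> bool:
    """True when declared and observed efficiency diverge by more than tolerance."""
    return abs(efficiency_drift(declared, observed)) > tolerance


declared = 0.8                          # value from the example above
recent_runs = [0.52, 0.49, 0.50, 0.49]  # hypothetical observed values near 0.5
print(has_drifted(declared, recent_runs))  # True
```

The check flags divergence in either direction, on the view that a capability running much leaner than declared also has a stale declaration.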

Boring engineering applied to AI: declare your budgets, monitor your actuals, refactor when they diverge.