Context Economics
Two agents with the same token budget can differ sharply in effectiveness because system prompt and tool-definition bloat erode the working space.
Description
Two agents may have the same token budget but very different effectiveness depending on context window utilization. A window 80% filled with system prompt and tool definitions leaves only 20% for working space; effectiveness degrades sharply even though token consumption appears acceptable.
The failure pattern: an agent is designed with a reasonable token envelope, then overhead grows by accretion. The system prompt gains instructions, tool definitions accumulate, and few-shot examples expand until 14K tokens of overhead leave only 6K for actual work, which is 30% of a 20K window. Quality degrades while cost looks fine, and the maintainer blames the model.
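The arithmetic behind that example can be sketched as follows. This is a minimal illustration, not KCC code: the function name `context_efficiency` and the per-component breakdown are assumptions made for the example, and the 20K window is inferred from the 14K + 6K figures above.

```python
def context_efficiency(window_tokens: int, overhead_tokens: dict[str, int]) -> float:
    """Return the fraction of the window left as working space after fixed overhead."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    overhead = sum(overhead_tokens.values())
    if overhead > window_tokens:
        raise ValueError("overhead exceeds the context window")
    return (window_tokens - overhead) / window_tokens


# Hypothetical split of the 14K overhead; only the 14K total comes from the text.
overhead = {
    "system_prompt": 7_000,
    "tool_definitions": 4_500,
    "few_shot_examples": 2_500,
}
efficiency = context_efficiency(20_000, overhead)
print(f"{efficiency:.2f}")  # 0.30
```

Keeping the overhead as a per-component breakdown rather than a single number makes it easier to see which component grew when the ratio falls.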
How KCC Handles This
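Surface 5 (Cost Envelope) includes context_efficiency as a declared field. Read against the table below, the value is the fraction of the context window left as working space once the system prompt, tool definitions, and examples are loaded. Capabilities declare the expected value at design time; the Token Guard monitors the actual value.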
| context_efficiency | Interpretation |
|---|---|
| over 0.8 | Lean — most context is working space |
| 0.6-0.8 | Healthy — typical for well-designed capabilities |
| 0.4-0.6 | Acceptable — but worth reviewing for bloat |
| 0.3-0.4 | Concerning — system prompt may be too elaborate |
| under 0.3 | Rejected — system prompt is consuming the context window |
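The bands in the table can be expressed as a small lookup. This is a sketch under one stated assumption: the table's ranges share their endpoints (0.4 and 0.6 each appear in two rows), so the code assigns a shared boundary value to the better band. The function name `interpret_context_efficiency` is hypothetical.

```python
def interpret_context_efficiency(value: float) -> str:
    """Map a context_efficiency value to the band names used in the table."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("context_efficiency must be between 0 and 1")
    if value > 0.8:
        return "Lean"
    # Assumption: boundary values (0.6, 0.4) fall into the better band.
    if value >= 0.6:
        return "Healthy"
    if value >= 0.4:
        return "Acceptable"
    if value >= 0.3:
        return "Concerning"
    return "Rejected"


for sample in (0.85, 0.7, 0.5, 0.35, 0.2):
    print(sample, interpret_context_efficiency(sample))
```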
Patterns That Cause Bloat
The Inspector Pipeline detects the following bloat patterns:
| Pattern | Fix |
|---|---|
| Over-elaborate system prompts | State principles, not exceptions |
| Redundant tool definitions | Consolidate; remove unused definitions |
| Excessive few-shot examples | Trust the structured input schema |
| Declarations that re-describe kernel-provided context | Specify only what is capability-specific |
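As an illustration of one of these checks, the sketch below flags tool definitions that never appear in a call log and totals the tokens they occupy. It does not describe how the Inspector Pipeline is implemented; the function `unused_tools`, the tool names, and the token costs are all hypothetical.

```python
from collections import Counter


def unused_tools(tool_token_costs: dict[str, int], call_log: list[str]) -> dict[str, int]:
    """Return the tools that never appear in the call log, with their token cost."""
    calls = Counter(call_log)
    return {name: cost for name, cost in tool_token_costs.items() if calls[name] == 0}


# Hypothetical tool definitions (token cost each) and observed calls.
tool_token_costs = {"search_docs": 420, "fetch_url": 380, "legacy_export": 910}
call_log = ["search_docs", "fetch_url", "search_docs"]

candidates = unused_tools(tool_token_costs, call_log)
reclaimable = sum(candidates.values())
print(candidates, reclaimable)  # {'legacy_export': 910} 910
```

A tool that is never called over a representative log is only a candidate for removal; rare but essential tools would need a human decision.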
Why It Matters
Making context efficiency a declared field forces the conversation about context overhead to happen at design time. A capability that declares 0.8 and consistently runs at 0.5 has a problem the kernel will surface. The drift signal is more valuable than the absolute number, in the same way that a service with a 512MB memory budget that drifts to 4.4GB is in trouble not because 4.4GB is wrong in itself but because the drift went unnoticed.
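The declared-versus-actual comparison can be sketched as a simple drift check. This is an assumption-laden illustration rather than the Token Guard's actual logic: the function `efficiency_drift`, the use of a mean over recent runs, and the 0.1 tolerance are all hypothetical choices.

```python
from statistics import mean


def efficiency_drift(
    declared: float, observed: list[float], tolerance: float = 0.1
) -> tuple[float, bool]:
    """Return (drift, flagged), where drift is declared minus mean observed efficiency."""
    if not observed:
        raise ValueError("need at least one observed value")
    drift = declared - mean(observed)
    # Only a shortfall is flagged; running leaner than declared is not a problem.
    return drift, drift > tolerance


# The capability from the text: declares 0.8, consistently runs near 0.5.
drift, flagged = efficiency_drift(0.8, [0.52, 0.49, 0.50, 0.48, 0.51])
print(f"{drift:.2f}", flagged)  # 0.30 True
```

Averaging over several runs reflects the "consistently runs at 0.5" wording: a single low reading is noise, while a sustained gap is the signal worth surfacing.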
Boring engineering applied to AI: declare your budgets, monitor your actuals, refactor when they diverge.