KCC 11

Metrics and Measurements

KCC is meaningful only insofar as it is measurable: what to measure, at what layer and phase, with what cadence, and how to interpret the result.

Metrics · Leading indicators · Lagging indicators · Calibration · Cadence
Created 2026-06-08 · v0.4.0

KCC is meaningful only insofar as it is measurable. This section specifies what to measure, at what level, with what frequency, and how to interpret results. It distinguishes leading indicators (signals that predict future outcomes) from lagging indicators (signals that confirm what already happened).

11.1 Metrics by Layer

Layer | Representative metrics
Per-agent | success_rate, latency percentiles, cost_per_invocation, confidence_calibration, hitl_escalation_rate, downstream_rejection_rate, workslop_signature_count
Per-capability | cells_using_capability, maturity level, promotion_readiness, breaking_change_rate, issue resolution times
Per-cell | total_cost, cost_per_feature, success_rate, reviewer_load_p95, kernel_contract_violations, bypass_signature_count
Per-organization | capability_reuse_ratio, cross_cell_learning_lag, inspector promotions per month, roi_on_kcc_investment, knowledge_survey_score
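
As an illustration of how a few of the per-agent metrics above could be derived, the following is a minimal sketch, not a prescribed implementation. The `Invocation` record and its field names are hypothetical; only the metric names `success_rate`, `cost_per_invocation`, and `hitl_escalation_rate` come from the table, and the choice of p50 and p95 as the reported latency percentiles is an assumption.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class Invocation:
    """Hypothetical record of one agent invocation (field names are assumptions)."""
    succeeded: bool
    latency_ms: float
    cost_usd: float
    escalated_to_human: bool


def per_agent_metrics(invocations):
    """Summarize a batch of invocations into a few per-agent metrics."""
    n = len(invocations)
    if n < 2:
        raise ValueError("need at least two invocations to estimate percentiles")
    latencies = [i.latency_ms for i in invocations]
    # quantiles(..., n=100) returns 99 cut points; index 49 is p50, index 94 is p95.
    cuts = quantiles(latencies, n=100, method="inclusive")
    return {
        "success_rate": sum(i.succeeded for i in invocations) / n,
        "latency_p50_ms": cuts[49],
        "latency_p95_ms": cuts[94],
        "cost_per_invocation": sum(i.cost_usd for i in invocations) / n,
        "hitl_escalation_rate": sum(i.escalated_to_human for i in invocations) / n,
    }
```

Reporting percentiles alongside the rates keeps the tail of the latency distribution visible rather than folding it into a single average.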

11.2 Metrics by Phase

Different metrics matter at different phases; applying a metric that belongs to another phase produces misleading signals. Phase 1 tracks lead-time delta and per-individual productivity. These gains will plateau, and the plateau is the signal to invest in Phase 2. Phase 2 tracks specs per sprint and token burn rate. Phase 3 tracks cross-cell learning lag, capability reuse ratio, ROI, and knowledge-survey scores: organizational intelligence over time, not per-person productivity. See Phase Model.
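
One way to guard against tracking a metric from the wrong phase is a simple lookup, sketched below. The phase-to-metric mapping follows the paragraph above, but the snake_case identifiers for Phase 1 and Phase 2 (`lead_time_delta`, `per_individual_productivity`, `specs_per_sprint`, `token_burn_rate`) and the function `mismatched_metrics` are illustrative assumptions, not names defined by KCC.

```python
# Metrics each phase is expected to track, per the paragraph above.
# Phase 1 and Phase 2 identifiers are hypothetical spellings.
PHASE_METRICS = {
    1: {"lead_time_delta", "per_individual_productivity"},
    2: {"specs_per_sprint", "token_burn_rate"},
    3: {
        "cross_cell_learning_lag",
        "capability_reuse_ratio",
        "roi_on_kcc_investment",
        "knowledge_survey_score",
    },
}


def mismatched_metrics(phase, tracked):
    """Return the tracked metrics that do not belong to the given phase."""
    if phase not in PHASE_METRICS:
        raise ValueError(f"unknown phase: {phase}")
    return sorted(set(tracked) - PHASE_METRICS[phase])


# A Phase 3 dashboard that still reports individual productivity gets flagged.
print(mismatched_metrics(3, ["per_individual_productivity", "capability_reuse_ratio"]))
```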

11.3 Leading vs. Lagging Indicators

Leading indicator | What it predicts
Reviewer load trends | Review quality degradation
Confidence calibration drift | Trust failures
Bypass signature count | Platform abandonment
Inspector proposal queue length | Pipeline backlog
Context efficiency degradation | Agent quality decline

Leading indicators allow intervention before harm and should be reviewed weekly or monthly. Lagging indicators (success rates, cost per feature, ROI, knowledge-survey scores) confirm whether the system is working and are reviewed monthly or quarterly.
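
To make one leading indicator concrete, confidence calibration drift could be tracked as the change in expected calibration error (ECE) between a baseline window and the current window. This is a sketch under assumptions: the section does not say how calibration is computed, ECE is a common choice swapped in here for illustration, and the function names and the ten-bin default are hypothetical.

```python
def expected_calibration_error(confidences, outcomes, bins=10):
    """ECE: weighted average gap between stated confidence and observed accuracy.

    confidences: floats in [0, 1]; outcomes: booleans (True = the agent was right).
    """
    if not confidences or len(confidences) != len(outcomes):
        raise ValueError("confidences and outcomes must be non-empty and equal length")
    totals = [0] * bins
    conf_sums = [0.0] * bins
    hits = [0] * bins
    for conf, ok in zip(confidences, outcomes):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        idx = min(int(conf * bins), bins - 1)  # confidence 1.0 goes in the last bin
        totals[idx] += 1
        conf_sums[idx] += conf
        hits[idx] += 1 if ok else 0
    n = len(confidences)
    ece = 0.0
    for total, conf_sum, hit in zip(totals, conf_sums, hits):
        if total:
            ece += (total / n) * abs(hit / total - conf_sum / total)
    return ece


def calibration_drift(baseline, current, bins=10):
    """Positive result means calibration worsened relative to the baseline.

    Each argument is a (confidences, outcomes) pair for one time window.
    """
    return expected_calibration_error(*current, bins=bins) - expected_calibration_error(
        *baseline, bins=bins
    )
```

A drift that stays positive across several review periods would be the kind of signal that prompts intervention before trust failures appear in lagging metrics.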

11.5 Measurement Cadences

Cadence | What's measured
Real-time | Token Guard signals, Butler decisions, decision traces
Daily | Per-cell cost summaries, reviewer load, failure-rate snapshots
Weekly | Inspector detection report, adoption changes, bypass and workslop review
Monthly | Per-capability health, pipeline velocity, confidence calibration
Quarterly | Phase progression, kernel evolution, L3 review, ROI and compounding metrics
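
The cadence table above could drive a simple scheduler. The sketch below decides which periodic reviews fall due on a given date. The anchoring rules (weekly on Mondays, monthly on the first of the month, quarterly on the first day of January, April, July, and October) are assumptions chosen for illustration; the section does not specify them. Real-time signals are omitted because they are continuous rather than scheduled.

```python
from datetime import date

# What each periodic cadence covers, copied from the table above.
CADENCE_ITEMS = {
    "daily": ["Per-cell cost summaries", "reviewer load", "failure-rate snapshots"],
    "weekly": ["Inspector detection report", "adoption changes", "bypass and workslop review"],
    "monthly": ["Per-capability health", "pipeline velocity", "confidence calibration"],
    "quarterly": ["Phase progression", "kernel evolution", "L3 review",
                  "ROI and compounding metrics"],
}


def cadences_due(day: date):
    """Return the cadences whose review falls on `day` (assumed anchoring rules)."""
    due = ["daily"]
    if day.weekday() == 0:  # Monday
        due.append("weekly")
    if day.day == 1:
        due.append("monthly")
        if day.month in (1, 4, 7, 10):
            due.append("quarterly")
    return due


def items_due(day: date):
    """Flatten the measurements to review on `day`."""
    return [item for cadence in cadences_due(day) for item in CADENCE_ITEMS[cadence]]
```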

11.6 Metric Anti-Patterns

  • Measuring activity instead of outcome — invocations per day says nothing about value.
  • Using individual productivity in Phase 3 — it hides cross-cell learning effects.
  • Reporting averages without distributions — averages hide tail problems.
  • Optimizing for cost without quality — a cheap workslop generator is worse than an expensive value generator.
  • Mistaking acceptance for verification — humans approving things does not mean they are correct.

Numbers without judgment are noise.
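
The point about averages hiding tail problems can be shown with a small sketch. The sample below is made-up illustrative data, not a measurement from any KCC deployment, and `summarize` is a hypothetical helper.

```python
from statistics import mean, quantiles


def summarize(samples):
    """Report the mean next to distribution markers so the tail stays visible."""
    cuts = quantiles(samples, n=100, method="inclusive")  # 99 cut points
    return {
        "mean": mean(samples),
        "p50": cuts[49],
        "p95": cuts[94],
        "max": max(samples),
    }


# Made-up review times in minutes: 90 quick reviews and 10 very slow ones.
review_minutes = [2] * 90 + [40] * 10
print(summarize(review_minutes))
```

On this sample the mean is 5.8 minutes while the median is 2 and the 95th percentile is 40, so a report carrying only the average would miss that one review in ten takes twenty times longer than the typical case.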

Section 11 specifies the measurements; Intelligence Economics addresses why they matter.