KCC 05.8-05.9

Confidence and Decision Trace

Confidence expresses uncertainty; the decision trace records the choices that shaped an outcome.

Agent Contract · Confidence · Decision trace · Calibration · Cognitive layer · Surfaces 8-9 · Confidence / Trace
Created 2026-06-08 · v0.4.0
The cognitive layer begins here. Surfaces 1-7 are operational. Surfaces 8 and 9 are cognitive. Together they make agent decisions observable, replayable, and trustable over time. This is the most distinctive technical contribution of KCC.

Surface 8 - Confidence: Purpose

Provide a structured, computed confidence score for the agent's output. This enables the Butler to escalate low-confidence invocations and the Inspector Pipeline to detect confidence calibration patterns.

confidence:
  overall_confidence: float          # 0.0-1.0
  computation:
    method: string                   # How the confidence was computed
    inputs: [string]                 # What signals contributed
    formula_reference: string        # Reference to documented algorithm
  confidence_sources:
    - source: string                 # e.g., 'input_familiarity', 'tool_reliability'
      value: float
      weight: float
  known_limitations:
    - description: string
      severity: low|medium|high
      impact_on_confidence: float    # How much this discounts overall_confidence

The Critical Rule

Confidence values are computed, not generated.

The following requirements make the rule enforceable:

  • computation.method cannot be the literal string 'model_introspection' or 'model_says_so'; confidence must be computed from observable signals, not asked of the model
  • The agent must publish its formula_reference so the computation can be audited
  • Confidence values must be reproducible: given the same input hash and the same versions, the same confidence should result
  • Confidence below the cell's threshold triggers HITL escalation regardless of HOTL eligibility

When an agent says 'I am 0.85 confident,' that number must be defensible by replaying the computation; it cannot be a hallucination. This is the single most important rule of the cognitive layer.
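To make "computed, not generated" concrete, here is a minimal Python sketch. It assumes a weighted mean over confidence_sources with each known limitation's impact_on_confidence subtracted afterwards. The spec does not prescribe this formula; the names compute_confidence, Source, Limitation, and the method string used below are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence

# Method names the spec forbids: confidence may not be asked of the model.
FORBIDDEN_METHODS = {"model_introspection", "model_says_so"}


@dataclass(frozen=True)
class Source:
    name: str      # e.g. 'input_familiarity'
    value: float   # observed signal, 0.0-1.0
    weight: float  # relative weight, must sum to a positive total


@dataclass(frozen=True)
class Limitation:
    description: str
    severity: str                # low | medium | high
    impact_on_confidence: float  # discount applied to the overall score


def compute_confidence(
    method: str,
    sources: Sequence[Source],
    limitations: Sequence[Limitation] = (),
) -> float:
    """Assumed formula: weighted mean of sources minus limitation discounts."""
    if method in FORBIDDEN_METHODS:
        raise ValueError(f"computation.method {method!r} is not allowed")
    total_weight = sum(s.weight for s in sources)
    if total_weight <= 0:
        raise ValueError("at least one source with positive weight is required")
    base = sum(s.value * s.weight for s in sources) / total_weight
    discount = sum(lim.impact_on_confidence for lim in limitations)
    # Clamp to [0.0, 1.0]; rounding keeps replays bit-for-bit comparable.
    return round(min(1.0, max(0.0, base - discount)), 4)


if __name__ == "__main__":
    score = compute_confidence(
        "weighted_mean_minus_limitations",  # hypothetical method name
        [Source("input_familiarity", 0.9, 2.0), Source("tool_reliability", 0.6, 1.0)],
        [Limitation("sparse domain data", "low", 0.05)],
    )
    print(score)
```

The function is pure, so the same sources and limitations always yield the same score. That is the property the reproducibility requirement asks for.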

Common Confidence Sources

Source                     What It Measures
input_familiarity          How well the input matches the agent's training distribution
tool_reliability           Whether the tools the agent invoked returned successful results
output_validation          Whether the agent's output passes structural validation
domain_coverage            Whether the input domain is one the agent handles well
intermediate_consistency   Whether the agent's reasoning steps are mutually consistent
historical_accuracy        The agent's historical accuracy on similar inputs

The Confidence Gate

Made concrete, the escalation rule is a three-layer separation:

  • Contract - every agent turn emits a confidence value in [0.0, 1.0] together with a short justification naming its top uncertainties
  • Threshold - a per-cell threshold (a common default is 0.7) decides what counts as low
  • Gate - on a trip, the agent invokes the universal escalation gate (pause, four-way decision, record, resume) rather than inventing a local escape hatch

Keeping the three layers separate is what lets the spec mandate that confidence acts as a gate without mandating how confidence is scored.
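The contract and threshold layers can be sketched in a few lines of Python. This is an illustrative sketch only: TurnReport, GateAction, and confidence_gate are hypothetical names, and the escalation gate itself (pause, four-way decision, record, resume) is left to the caller because this section does not define its internals.

```python
from dataclasses import dataclass
from enum import Enum

DEFAULT_THRESHOLD = 0.7  # the "common default" per-cell threshold


class GateAction(Enum):
    PROCEED = "proceed"
    ESCALATE = "escalate"  # hand off to the universal escalation gate


@dataclass(frozen=True)
class TurnReport:
    confidence: float   # contract: value in [0.0, 1.0]
    justification: str  # contract: names the top uncertainties


def confidence_gate(report: TurnReport, threshold: float = DEFAULT_THRESHOLD) -> GateAction:
    """Decide whether a turn trips the gate; scoring happens elsewhere."""
    if not 0.0 <= report.confidence <= 1.0:
        raise ValueError("confidence must be within [0.0, 1.0]")
    if not report.justification.strip():
        raise ValueError("a justification naming top uncertainties is required")
    # Strictly below the threshold counts as low confidence.
    if report.confidence < threshold:
        return GateAction.ESCALATE
    return GateAction.PROCEED


if __name__ == "__main__":
    report = TurnReport(0.62, "unfamiliar input domain; one tool call timed out")
    print(confidence_gate(report))
```

The function never inspects how the score was produced, which mirrors the separation between scoring and gating.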

The Calibration Loop answers the calibration question (does claimed confidence match observed outcomes?) operationally: for every turn, record (agent, claimed_confidence, observed_outcome), where the outcome is one of approved | revised | escalated | aborted. Drift is then measured over a rolling window per agent (a common default is the last ten turns). Negative drift (claims run higher than outcomes) means the agent is over-confident and its effective threshold should be raised; positive drift (outcomes run better than claims) means it is sandbagging. The gate consumes the loop by biasing its per-agent effective threshold by the observed drift.
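A rolling-window version of this loop might look like the Python sketch below. Two parts are assumptions rather than anything the spec defines: the numeric score given to each outcome (OUTCOME_SCORE) and the rule that the effective threshold is the base threshold minus the drift. CalibrationLoop and its method names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Tuple

# Assumed mapping from outcome to a numeric score; the spec does not define one.
OUTCOME_SCORE = {"approved": 1.0, "revised": 0.5, "escalated": 0.0, "aborted": 0.0}


class CalibrationLoop:
    def __init__(self, window: int = 10) -> None:
        self._window = window
        self._turns: Dict[str, Deque[Tuple[float, float]]] = {}

    def record(self, agent: str, claimed_confidence: float, observed_outcome: str) -> None:
        score = OUTCOME_SCORE[observed_outcome]  # KeyError on unknown outcomes
        turns = self._turns.setdefault(agent, deque(maxlen=self._window))
        turns.append((claimed_confidence, score))  # oldest turn drops out when full

    def drift(self, agent: str) -> float:
        """Mean of (outcome score - claimed confidence); negative = over-confident."""
        turns = self._turns.get(agent)
        if not turns:
            return 0.0
        return sum(score - claimed for claimed, score in turns) / len(turns)

    def effective_threshold(self, agent: str, base_threshold: float = 0.7) -> float:
        # Negative drift raises the threshold; positive drift lowers it.
        return min(1.0, max(0.0, base_threshold - self.drift(agent)))


if __name__ == "__main__":
    loop = CalibrationLoop()
    loop.record("summarizer", 0.9, "approved")
    loop.record("summarizer", 0.9, "revised")
    print(loop.drift("summarizer"), loop.effective_threshold("summarizer"))
```

An agent with no recorded turns has zero drift, so a new agent starts at the unmodified per-cell threshold.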

Surface 9 - Decision Trace: Purpose

Record a structured trace of how the agent arrived at its output. This enables replay, audit, and pattern detection. A KCC organization without decision traces is a KCC organization that cannot learn from itself.

decision_trace:
  trace_id: uuid
  invocation_id: uuid
  input_hash: hash                   # Hash of the input for replay matching
  versions:
    kernel: semver
    capability: semver
    prompt: hash
  reasoning:
    method: chain_of_thought|tree_of_thought|none|model_native
    available: boolean               # Some models do not expose reasoning
    content: string|reference        # Inline or stored reference
  tool_invocations:
    - tool_name: string
      invocation_order: int
      input: <serialized>
      output: <serialized>
      duration_ms: int
      cost: number
  output:
    schema_ref: string
    value: <reference>
    confidence: <reference to Surface 8 value>
  resource_consumption:
    tokens_in: int
    tokens_out: int
    latency_ms: int
    cost_usd: float
  outcome:
    succeeded: boolean
    error: string|null
    human_intervention: boolean
    final_state: <serialized>

A conforming trace meets the following requirements:

  • trace_id is unique across all traces
  • input_hash is computed deterministically
  • tool_invocations includes every external interaction, in order
  • The trace is recorded durably (typically append-only storage)
  • Replay-by-input-hash must produce either the same output (deterministic agents) or measurably similar output within a defined tolerance (stochastic agents)
  • Traces cannot be sampled below L2: every invocation produces a trace
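Two of these requirements lend themselves to a short Python illustration: a deterministic input_hash and a replay comparison. The sketch assumes SHA-256 over canonical JSON for the hash and uses difflib's similarity ratio as a stand-in for "measurably similar". The spec names neither, and compute_input_hash and replay_matches are hypothetical helpers.

```python
import hashlib
import json
from difflib import SequenceMatcher
from typing import Any


def compute_input_hash(payload: Any) -> str:
    """Hash a JSON-serializable input so that key order does not matter."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def replay_matches(original: str, replayed: str, deterministic: bool, tolerance: float = 0.1) -> bool:
    """Exact match for deterministic agents; bounded distance for stochastic ones."""
    if deterministic:
        return original == replayed
    # Assumed similarity measure: 1.0 means identical, 0.0 means nothing shared.
    similarity = SequenceMatcher(None, original, replayed).ratio()
    return (1.0 - similarity) <= tolerance


if __name__ == "__main__":
    first = compute_input_hash({"query": "refund status", "order_id": 1234})
    second = compute_input_hash({"order_id": 1234, "query": "refund status"})
    print(first == second)
```

Sorting keys and fixing separators before hashing is what makes the hash stable across serializers that emit keys in different orders.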

Three Operations Traces Enable

  • Replay - given the same input hash and versions, re-running produces the same (deterministic) or measurably similar (stochastic) output; questions about a decision made three months ago are answered with evidence, not speculation
  • Audit - when a decision is contested, the trace shows the exact basis: every tool invocation, reasoning step, and resource consumption; the engineering equivalent of ADRs, produced per invocation, automatically
  • Pattern Detection - the Inspector Pipeline reads traces across cells to detect emergent practices worth promoting; sparse, sampled, or missing traces produce sparse, biased, or wrong patterns
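
Replay and audit both presuppose that traces are appended once and can be found again by input hash and versions. The in-memory Python sketch below illustrates that lookup; a real deployment would use durable append-only storage. TraceRecord, TraceLog, and find_for_replay are hypothetical names, and the record holds only a subset of the decision_trace fields.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class TraceRecord:
    trace_id: str
    input_hash: str
    kernel: str      # versions.kernel
    capability: str  # versions.capability
    prompt: str      # versions.prompt (a hash)
    output: str


class TraceLog:
    """Append-only: records can be added and read, never updated or removed."""

    def __init__(self) -> None:
        self._records: List[TraceRecord] = []
        self._ids: Set[str] = set()

    def append(self, record: TraceRecord) -> None:
        if record.trace_id in self._ids:
            raise ValueError(f"duplicate trace_id {record.trace_id!r}")
        self._ids.add(record.trace_id)
        self._records.append(record)

    def find_for_replay(self, input_hash: str, kernel: str, capability: str, prompt: str) -> List[TraceRecord]:
        """Return traces with the same input hash and versions, oldest first."""
        key = (input_hash, kernel, capability, prompt)
        return [
            r for r in self._records
            if (r.input_hash, r.kernel, r.capability, r.prompt) == key
        ]


if __name__ == "__main__":
    log = TraceLog()
    log.append(TraceRecord("t-1", "abc123", "1.0.0", "2.1.0", "p-hash", "approved refund"))
    print(log.find_for_replay("abc123", "1.0.0", "2.1.0", "p-hash"))
```

Matching on versions as well as the input hash keeps a replay from being compared against a trace produced by a different kernel, capability, or prompt.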
Decision traces are the engineering record. Without them, "AI decisions" is a category that cannot be audited, learned from, or improved. With them, it becomes engineering.