Confidence and Decision Trace
Confidence expresses uncertainty while decision trace records the choices that shaped an outcome.
The cognitive layer begins here. Surfaces 1-7 are operational. Surfaces 8 and 9 are cognitive. Together they make agent decisions observable, replayable, and trustable over time. This is the most distinctive technical contribution of KCC.
Surface 8 - Confidence: Purpose
Provide a structured, computed confidence score for the agent's output. This enables the Butler to escalate low-confidence invocations and the Inspector Pipeline to detect confidence calibration patterns.
confidence:
  overall_confidence: float # 0.0-1.0
  computation:
    method: string # How the confidence was computed
    inputs: [string] # What signals contributed
    formula_reference: string # Reference to documented algorithm
  confidence_sources:
    - source: string # e.g., 'input_familiarity', 'tool_reliability'
      value: float
      weight: float
  known_limitations:
    - description: string
      severity: low|medium|high
      impact_on_confidence: float # How much this discounts overall_confidence
The Critical Rule
Confidence values are computed, not generated.
The computation.method field cannot be the literal string 'model_introspection' or 'model_says_so': confidence must be computed from observable signals, not asked of the model. The agent must publish its formula_reference so the computation can be audited. Confidence values must be reproducible: given the same input hash and the same versions, the same confidence should result. Confidence below the cell's threshold triggers HITL escalation regardless of HOTL eligibility. When an agent says 'I am 0.85 confident,' that number must be defensible by replaying the computation; it cannot be a hallucination. This is the single most important rule of the cognitive layer.
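The rule above can be checked mechanically before a confidence record is accepted. The following is a minimal sketch, not part of the specification: the names `Computation`, `FORBIDDEN_METHODS`, and `validate_confidence` are hypothetical, and the requirement of at least one input signal is an assumption drawn from "computed from observable signals".

```python
from dataclasses import dataclass

# The two literal strings the rule forbids as a computation method.
FORBIDDEN_METHODS = {"model_introspection", "model_says_so"}


@dataclass(frozen=True)
class Computation:
    """Mirrors the `computation` block of the Surface 8 schema."""
    method: str
    inputs: tuple[str, ...]
    formula_reference: str


def validate_confidence(overall_confidence: float, computation: Computation) -> list[str]:
    """Return a list of rule violations; an empty list means the record passes."""
    problems = []
    if not 0.0 <= overall_confidence <= 1.0:
        problems.append("overall_confidence must be within 0.0-1.0")
    if computation.method in FORBIDDEN_METHODS:
        problems.append(f"method '{computation.method}' is not a computation")
    if not computation.formula_reference.strip():
        problems.append("formula_reference is required so the computation can be audited")
    # Assumption: a computed value needs at least one observable input signal.
    if not computation.inputs:
        problems.append("at least one observable input signal is required")
    return problems
```

A record with `method="weighted_sources"`, a non-empty `inputs` tuple, and a published `formula_reference` returns an empty list; a record that names `model_says_so` is rejected before it reaches the gate.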
Common Confidence Sources
| Source | What It Measures |
|---|---|
| input_familiarity | How well the input matches the agent's training distribution |
| tool_reliability | Whether the tools the agent invoked returned successful results |
| output_validation | Whether the agent's output passes structural validation |
| domain_coverage | Whether the input domain is one the agent handles well |
| intermediate_consistency | Whether the agent's reasoning steps are mutually consistent |
| historical_accuracy | The agent's historical accuracy on similar inputs |
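One way to turn the sources in the table into an overall_confidence is a weighted mean, discounted by each known limitation's impact_on_confidence. The text does not prescribe a formula, so this is only an illustrative assumption; the function name `combine_sources` and the sample values are hypothetical. Whatever formula a cell adopts would be the one its formula_reference documents.

```python
def combine_sources(sources, limitation_impacts=()):
    """Combine confidence sources into one score in [0.0, 1.0].

    sources: iterable of (source_name, value, weight) tuples.
    limitation_impacts: iterable of impact_on_confidence floats to subtract.
    Assumed formula: weighted mean of values minus the summed impacts, clamped.
    """
    sources = list(sources)
    total_weight = sum(weight for _, _, weight in sources)
    if total_weight <= 0:
        raise ValueError("source weights must sum to a positive number")
    base = sum(value * weight for _, value, weight in sources) / total_weight
    discounted = base - sum(limitation_impacts)
    # Clamp to the schema's range and round so replays compare cleanly.
    return round(min(1.0, max(0.0, discounted)), 4)


score = combine_sources(
    [
        ("input_familiarity", 0.9, 2.0),
        ("tool_reliability", 1.0, 1.0),
        ("output_validation", 0.8, 1.0),
    ],
    limitation_impacts=[0.05],
)
```

Because the function depends only on its arguments, replaying it with the same recorded sources and limitations yields the same score, which is the reproducibility property the critical rule asks for.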
The Confidence Gate
Made concrete, the escalation rule separates into three layers. Contract: every agent turn emits a confidence value in [0.0, 1.0] together with a short justification naming its top uncertainties. Threshold: a per-cell threshold (a common default is 0.7) decides what counts as low. Gate: on a trip, the agent invokes the universal escalation gate (pause, four-way decision, record, resume) rather than inventing a local escape hatch. Keeping the three layers separate is what lets the spec mandate that confidence acts as a gate without mandating how confidence is scored.
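The three layers can be kept separate in code as well. The sketch below is a hedged illustration: `TurnConfidence`, `gate_trips`, and `run_turn` are hypothetical names, and the escalation gate is passed in as a callable because its internals (pause, four-way decision, record, resume) are defined elsewhere.

```python
from dataclasses import dataclass
from typing import Callable

DEFAULT_THRESHOLD = 0.7  # the common default mentioned in the text


@dataclass(frozen=True)
class TurnConfidence:
    """Contract layer: a value in [0.0, 1.0] plus a short justification."""
    value: float
    justification: str  # names the turn's top uncertainties


def gate_trips(turn: TurnConfidence, cell_threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Threshold layer: confidence strictly below the cell threshold counts as low."""
    if not 0.0 <= turn.value <= 1.0:
        raise ValueError("confidence must be within [0.0, 1.0]")
    return turn.value < cell_threshold


def run_turn(
    turn: TurnConfidence,
    escalate: Callable[[TurnConfidence], str],
    cell_threshold: float = DEFAULT_THRESHOLD,
) -> str:
    """Gate layer: hand off to the shared escalation gate instead of a local escape hatch."""
    if gate_trips(turn, cell_threshold):
        return escalate(turn)
    return "proceed"
```

Nothing in `run_turn` knows how the value was scored, which is the point of the separation: the scoring method can change without touching the threshold or the gate.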
The Calibration Loop answers the calibration question operationally: for every turn, record (agent, claimed_confidence, observed_outcome), where the outcome is one of approved | revised | escalated | aborted. Over a rolling window per agent (a common default is the last ten turns), negative drift (claims run higher than outcomes) means the agent is over-confident and its effective threshold should be raised; positive drift (outcomes run better than claims) means it is sandbagging. The gate consumes the loop by biasing its per-agent effective threshold by the observed drift.
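A rolling-window version of this loop might look like the following sketch. Two things here are assumptions, not statements from the text: the numeric score assigned to each outcome (`OUTCOME_SCORE`), and the choice to bias the threshold by subtracting the drift directly. The class name `CalibrationLoop` is hypothetical.

```python
from collections import deque

# Assumption: one possible numeric reading of each outcome; the text does not define it.
OUTCOME_SCORE = {"approved": 1.0, "revised": 0.5, "escalated": 0.0, "aborted": 0.0}


class CalibrationLoop:
    def __init__(self, window: int = 10):
        self.window = window  # "last ten turns" is the common default
        self._turns = {}      # agent -> deque of (claimed_confidence, outcome_score)

    def record(self, agent: str, claimed_confidence: float, observed_outcome: str) -> None:
        score = OUTCOME_SCORE[observed_outcome]  # KeyError on an unknown outcome
        turns = self._turns.setdefault(agent, deque(maxlen=self.window))
        turns.append((claimed_confidence, score))

    def drift(self, agent: str) -> float:
        """Mean of (outcome - claim); negative means claims run higher than outcomes."""
        turns = self._turns.get(agent)
        if not turns:
            return 0.0
        return sum(score - claimed for claimed, score in turns) / len(turns)

    def effective_threshold(self, agent: str, base_threshold: float = 0.7) -> float:
        """Negative drift raises the threshold; positive drift lowers it."""
        return min(1.0, max(0.0, base_threshold - self.drift(agent)))
```

With these assumed scores, an agent that claims 0.9 on one approved turn and one revised turn has a drift of about -0.15, so a base threshold of 0.7 becomes roughly 0.85 for that agent.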
Surface 9 - Decision Trace: Purpose
Record a structured trace of how the agent arrived at its output. This enables replay, audit, and pattern detection. A KCC organization without decision traces is a KCC organization that cannot learn from itself.
decision_trace:
  trace_id: uuid
  invocation_id: uuid
  input_hash: hash # Hash of the input for replay matching
  versions:
    kernel: semver
    capability: semver
    prompt: hash
  reasoning:
    method: chain_of_thought|tree_of_thought|none|model_native
    available: boolean # Some models do not expose reasoning
    content: string|reference # Inline or stored reference
  tool_invocations:
    - tool_name: string
      invocation_order: int
      input: <serialized>
      output: <serialized>
      duration_ms: int
      cost: number
  output:
    schema_ref: string
    value: <reference>
    confidence: <reference to Surface 8 value>
  resource_consumption:
    tokens_in: int
    tokens_out: int
    latency_ms: int
    cost_usd: float
  outcome:
    succeeded: boolean
    error: string|null
    human_intervention: boolean
    final_state: <serialized>
trace_id is unique across all traces; input_hash is computed deterministically; tool_invocations includes every external interaction in order; the trace is recorded durably (typically append-only storage). Replay-by-input-hash must produce either the same output (deterministic agents) or measurably similar output within a defined tolerance (stochastic agents). Traces cannot be sampled below L2 - every invocation produces a trace.
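The deterministic hash and the replay comparison can be sketched as below. The choices here are assumptions for illustration: SHA-256 over canonical JSON for input_hash, and a character-level similarity ratio from the standard library's `difflib` as the "defined tolerance" for stochastic agents. The function names `input_hash` and `replay_matches` are hypothetical, and a real cell would likely pick a similarity measure suited to its output type.

```python
import hashlib
import json
from difflib import SequenceMatcher


def input_hash(payload: dict) -> str:
    """Hash a JSON-serializable input deterministically.

    Sorting keys and fixing separators makes the serialization canonical,
    so the same logical input gives the same hash regardless of key order.
    """
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def replay_matches(original: str, replayed: str, deterministic: bool, tolerance: float = 0.9) -> bool:
    """Compare a replayed output against the recorded one.

    Deterministic agents must match exactly; stochastic agents must reach
    a similarity ratio of at least `tolerance` (an assumed measure).
    """
    if deterministic:
        return original == replayed
    return SequenceMatcher(None, original, replayed).ratio() >= tolerance
```

Matching a replay to its trace then comes down to looking up the stored trace by `input_hash` plus the recorded versions, re-running the agent, and passing both outputs to `replay_matches`.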
Three Operations Traces Enable
- Replay - given the same input hash and versions, re-running produces the same (deterministic) or measurably similar (stochastic) output; questions about a decision made three months ago are answered with evidence, not speculation
- Audit - when a decision is contested, the trace shows the exact basis: every tool invocation, reasoning step, and resource consumption; the engineering equivalent of ADRs, produced per invocation, automatically
- Pattern Detection - the Inspector Pipeline reads traces across cells to detect emergent practices worth promoting; sparse, sampled, or missing traces produce sparse, biased, or wrong patterns
Decision traces are the engineering record. Without them, "AI decisions" is a category that cannot be audited, learned from, or improved. With them, it becomes engineering.