KCC 10.6-10.7

Verification and Environments

Two honest boundaries: functional verification is an unsolved problem that KCC does not pretend to close, and host-mutating actions pass through a gate that can never be silenced.

Special Concerns · Functional verification · Environment mutation · Autonomy envelope
Created 2026-06-08 · v0.4.0

10.6 — Functional Verification

Decision Trace (Surface 9) and Confidence (Surface 8) make agent behavior auditable and replayable. They do not, by themselves, prove that the output is functionally correct. A green test suite is not strong evidence of correctness when the AI wrote the tests.

Functional verification of agent output remains an unsolved problem.

Honest framing: KCC enables verification infrastructure (replay-by-input-hash, confidence calibration, trace audit) but does not specify the verification methodology. Cells are responsible for their own functional verification.
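As a rough illustration of what replay-by-input-hash could look like, the sketch below hashes a canonical form of the inputs and uses that hash as the lookup key for a recorded output. The names `input_hash` and `TraceStore` are hypothetical and not part of KCC, and the canonical-JSON-plus-SHA-256 scheme is an assumption made for the example.

```python
import hashlib
import json


def input_hash(inputs: dict) -> str:
    """Hash inputs in canonical form so that equal inputs always hash equally."""
    # Sorting keys and fixing separators removes incidental formatting differences.
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TraceStore:
    """Hypothetical in-memory store mapping input hashes to recorded outputs."""

    def __init__(self) -> None:
        self._records = {}

    def record(self, inputs: dict, output: object) -> str:
        """Store the output under the hash of its inputs and return that hash."""
        key = input_hash(inputs)
        self._records[key] = output
        return key

    def replay(self, inputs: dict) -> object:
        """Return the recorded output for these inputs; KeyError if never recorded."""
        return self._records[input_hash(inputs)]
```

Replay of this kind shows what the agent produced for a given input. It says nothing about whether that output was correct, which is why the methodology stays with the cell.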

Method | What It Provides
Golden datasets | Curated examples with known-correct outputs; regression detection
Property tests | Assertions that must hold across many inputs; behavioral invariants
Differential testing | Comparing new agent versions against previous on the same inputs
Sampled human-in-loop checks | Random verification at defined rates
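The four methods above can each be sketched in a few lines. This is a minimal illustration only: `agent_v1` and `agent_v2` are toy stand-ins for two versions of an agent, and every helper name is hypothetical. KCC does not prescribe any of this.

```python
import random


def agent_v1(text: str) -> str:
    """Toy stand-in for an older agent version: trim and lowercase."""
    return text.strip().lower()


def agent_v2(text: str) -> str:
    """Toy stand-in for a newer agent version: also collapses inner whitespace."""
    return " ".join(text.split()).lower()


def golden_failures(agent, golden: dict) -> list:
    """Golden dataset: inputs whose output differs from the known-correct output."""
    return [x for x, expected in golden.items() if agent(x) != expected]


def property_violations(agent, inputs, prop) -> list:
    """Property test: inputs for which the invariant prop(input, output) fails."""
    return [x for x in inputs if not prop(x, agent(x))]


def differential(old, new, inputs) -> list:
    """Differential testing: inputs on which two agent versions disagree."""
    return [x for x in inputs if old(x) != new(x)]


def sample_for_review(items, rate: float, seed: int) -> list:
    """Sampled human check: keep each item with probability `rate`, reproducibly."""
    rng = random.Random(seed)
    return [item for item in items if rng.random() < rate]
```

For example, with `golden = {"  Hello ": "hello", "A  B": "a b"}`, `golden_failures(agent_v1, golden)` flags `"A  B"` while `agent_v2` passes, and `differential(agent_v1, agent_v2, ["x", "A  B"])` surfaces the same input as a behavior change. A disagreement found by differential testing is only a signal: deciding which version is right still needs a golden answer or a human.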

Why functional verification is hard: the test itself is suspect when the AI wrote both code and tests; correctness is domain-specific; verification at scale requires verification of verification; and some things can only be discovered, not verified. One practical lever is the Architecture Critic: a producer/critic separation that applies an adversarial second opinion at design time, when the cost of a wrong decision is lowest.

Auditability is not correctness. KCC provides the first. The second is your job, in your domain, with methods appropriate to your risks.

10.7 — The Environment Mutation Boundary

Reading the environment is safe and can be unsupervised. Changing the environment — installing a compiler, package manager, or runtime — is a system-changing act that must not ride along silently inside an autonomous run. An agent may freely inspect its environment and suggest install commands, but it may not install anything on its own authority.

Outcome | Meaning
install | The human authorizes the install to run now
human-install | The human will install out of band; the agent waits
defer | The step is postponed; the build/verify step is marked deferred, not silently re-stacked
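One way to encode the three outcomes is a closed enumeration, so that an unrecognized answer fails loudly instead of falling through to a default. The sketch below assumes this encoding. The `StepStatus` names and both helper functions are hypothetical and do not come from KCC.

```python
from enum import Enum


class GateOutcome(Enum):
    INSTALL = "install"              # the human authorizes the install to run now
    HUMAN_INSTALL = "human-install"  # the human installs out of band; the agent waits
    DEFER = "defer"                  # the step is postponed and marked deferred


class StepStatus(Enum):
    # Hypothetical status names for the dependent build/verify step.
    RUNNABLE_AFTER_INSTALL = "runnable-after-install"
    WAITING_ON_HUMAN = "waiting-on-human"
    DEFERRED = "deferred"


def parse_outcome(raw: str) -> GateOutcome:
    """Parse a human answer; anything outside the three outcomes raises ValueError."""
    return GateOutcome(raw.strip().lower())


def dependent_step_status(outcome: GateOutcome) -> StepStatus:
    """Map a gate outcome to the status recorded for the build/verify step."""
    if outcome is GateOutcome.INSTALL:
        return StepStatus.RUNNABLE_AFTER_INSTALL
    if outcome is GateOutcome.HUMAN_INSTALL:
        return StepStatus.WAITING_ON_HUMAN
    # DEFER: the step is explicitly marked deferred rather than quietly dropped.
    return StepStatus.DEFERRED
```

Recording an explicit `DEFERRED` status, instead of quietly dropping or re-queuing the step, is what keeps a postponed install visible later.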

Diagram: observe freely; mutate only through a gate that never goes silent.

Two rules make the boundary hold: the gate is never silent (it fires even under the most permissive autonomy envelope), and every decision carries provenance in the decision trace. Detection is reversible; installation is not. At scale, silent environment mutation is how an 'autonomous coding' program becomes an 'unexplained machine state' program.
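The two rules can be sketched as a gate object that always asks a human and always appends the answer to a trace. Everything here is hypothetical: `MutationGate`, `GateDecision`, the `ask_human` hook, and the `autonomy_level` parameter are names invented for the example, and the real decision trace format is not specified in this section.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List, Tuple

ALLOWED_OUTCOMES = {"install", "human-install", "defer"}


@dataclass(frozen=True)
class GateDecision:
    """One provenance record: what was asked, what was decided, by whom, and when."""
    command: str
    outcome: str
    decided_by: str
    decided_at: str


@dataclass
class MutationGate:
    # Hypothetical hook that prompts a human and returns (outcome, who decided).
    ask_human: Callable[[str], Tuple[str, str]]
    trace: List[GateDecision] = field(default_factory=list)

    def request(self, command: str, autonomy_level: str) -> GateDecision:
        # Rule 1: autonomy_level is accepted but deliberately never consulted,
        # so even the most permissive envelope cannot skip the prompt.
        outcome, decided_by = self.ask_human(command)
        if outcome not in ALLOWED_OUTCOMES:
            raise ValueError(f"unknown gate outcome: {outcome!r}")
        # Rule 2: every decision is appended to the trace with its provenance.
        decision = GateDecision(
            command=command,
            outcome=outcome,
            decided_by=decided_by,
            decided_at=datetime.now(timezone.utc).isoformat(),
        )
        self.trace.append(decision)
        return decision
```

In this sketch the agent only ever proposes a command string. Nothing in the gate executes it, which keeps the install itself on the human side of the boundary.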

An agent may read its world freely. Changing it is a decision a human signs.