Correctness
Core Idea
Examples and diagrams on this page follow the shared Hypothetical Scenario.
Correctness is the degree to which software behavior satisfies a defined specification under expected operating conditions. Testing is one of the primary mechanisms for validating correctness, but correctness itself is broader than testing activity. It covers behavior, data integrity, contract semantics, failure handling, and operational guarantees.
In the scenario platform, correctness is not only "returning recommendations." A response can be fast and still wrong if it violates budget constraints, uses stale inventory assumptions, or breaks ownership invariants. Correctness must therefore be engineered as a system property across modules, services, and delivery pipelines.
Conceptual Overview
Correctness Dimensions
A practical correctness model separates concerns into explicit dimensions:
- functional correctness: outputs match expected business behavior
- contract correctness: API and event semantics remain stable and valid
- data correctness: state transitions preserve invariants and ownership rules
- temporal correctness: behavior remains valid under concurrency, retries, and timing variation
- fault-path correctness: degradation and recovery behavior preserve guarantees
Without these dimensions, teams over-index on "happy-path correctness" and miss failure-path defects.
Specification and Oracles
Correctness is only testable when expected behavior is explicit. Teams need high-quality oracles:
- acceptance criteria with concrete input/output expectations
- domain invariants with formal or semi-formal statements
- interface contracts with stable request, response, and error models
- operational constraints such as idempotency and timeout semantics
An oracle should be specific enough that a failure means one of two things: the system is wrong or the specification is wrong. Ambiguous oracles create test suites that pass while defects survive.
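As an illustration of an explicit oracle, the sketch below encodes two invariants from the scenario (recommendations stay within budget and never include out-of-stock items) as a check that returns the violated rules by name. The `Recommendation` type, its fields, and `check_recommendations` are hypothetical names introduced for this example; they are not part of the scenario platform.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Recommendation:
    # Hypothetical shape of one recommended item.
    item_id: str
    price: Decimal
    in_stock: bool


def check_recommendations(recs: list[Recommendation], budget: Decimal) -> list[str]:
    """Return the violated invariants; an empty list means the oracle passes."""
    violations = []
    total = sum((r.price for r in recs), Decimal("0"))
    # Invariant 1: the combined price must not exceed the budget.
    if total > budget:
        violations.append(f"budget exceeded: total {total} > budget {budget}")
    # Invariant 2: every recommended item must be in stock.
    for r in recs:
        if not r.in_stock:
            violations.append(f"out-of-stock item recommended: {r.item_id}")
    return violations
```

Because each violation names the rule that failed, a failing check points at either the system or the specification rather than at an ambiguous mismatch.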
Correctness in Distributed Systems
Distributed systems add failure modes that do not appear in local code units:
- partial success across service boundaries
- duplicated messages and out-of-order delivery
- stale reads during eventual-consistency windows
- retried commands that mutate state multiple times
A correctness strategy must include distributed guarantees such as idempotency keys, compensating flows, and explicit consistency expectations. This links directly to State and Data Modeling, Resilience and Recovery, and Correlation IDs.
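A minimal sketch of the idempotency-key idea, assuming an in-memory store and a hypothetical `ReservationService`: a retried command that carries a known key returns the stored outcome instead of mutating state again. A production version would need durable storage and an atomic check-and-record step, and would have to decide what to do when a key is reused with a different payload; none of that is modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class ReservationService:
    """In-memory stand-in for a service that reserves inventory (hypothetical)."""
    stock: dict[str, int]
    # Maps idempotency key -> outcome of the first attempt with that key.
    _results: dict[str, str] = field(default_factory=dict, init=False)

    def reserve(self, idempotency_key: str, item_id: str, quantity: int) -> str:
        # A retry with a known key replays the stored outcome
        # and leaves the stock untouched.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        available = self.stock.get(item_id, 0)
        if quantity <= available:
            self.stock[item_id] = available - quantity
            outcome = "reserved"
        else:
            outcome = "rejected"
        self._results[idempotency_key] = outcome
        return outcome
```

Calling `reserve("k1", "sku-1", 3)` twice against a stock of 5 leaves 2 units, not -1, which is the property a retry test for this workflow would assert.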
Correctness and Test Layers
No single test type can validate all correctness dimensions. A layered model is required:
- Unit Testing validates local behavior deterministically
- Smoke Testing validates deployment-level critical-path viability
- Integration and Functional Testing validates cross-boundary behavior and user-facing workflows
Test depth should follow risk, not habit. A simple pure function rarely needs broad integration scenarios. A distributed payment or reservation workflow almost always does.
Determinism and Signal Quality
A correctness claim is weak when tests are flaky. Deterministic tests require controlled inputs, stable clocks, explicit randomness, and isolated dependencies. Signal quality also matters:
- one behavior claim per test whenever practical
- descriptive names that encode state and expectation
- failure output that points to violated behavior, not framework internals
This reduces diagnostic time and increases trust in the suite.
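The sketch below shows one way to apply these points with Python's standard `unittest` module: the clock and the random source are passed in as parameters, so each test controls them, and each test name states the condition and the expected result. The functions `is_quote_expired` and `pick_candidates` and the 15-minute time-to-live are assumptions made up for the example.

```python
import random
import unittest
from datetime import datetime, timedelta, timezone
from typing import Callable


def is_quote_expired(issued_at: datetime, ttl: timedelta,
                     now: Callable[[], datetime]) -> bool:
    """Return True once the full time-to-live has elapsed since issue."""
    return now() - issued_at >= ttl


def pick_candidates(items: list[str], k: int, rng: random.Random) -> list[str]:
    """Sample k items from a caller-supplied random source."""
    return rng.sample(items, k)


ISSUED_AT = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
TTL = timedelta(minutes=15)


class QuoteExpiryTest(unittest.TestCase):
    def test_quote_at_exact_ttl_boundary_is_expired(self):
        now = lambda: ISSUED_AT + TTL  # fixed clock, no wall-clock dependency
        self.assertTrue(is_quote_expired(ISSUED_AT, TTL, now))

    def test_quote_one_second_before_ttl_is_not_expired(self):
        now = lambda: ISSUED_AT + TTL - timedelta(seconds=1)
        self.assertFalse(is_quote_expired(ISSUED_AT, TTL, now))

    def test_same_seed_yields_same_candidates(self):
        items = ["a", "b", "c", "d", "e"]
        first = pick_candidates(items, 2, random.Random(42))
        second = pick_candidates(items, 2, random.Random(42))
        self.assertEqual(first, second)
```

Saved as a module, the suite can be run with `python -m unittest`. Injecting `now` and `rng` keeps production code free of test-only branches while still making time and randomness explicit inputs.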
Correctness Under Change
Many correctness incidents are introduced during change rather than during initial implementation. A robust strategy includes:
- regression protection on historical defect classes
- compatibility checks for contract evolution
- risk-based test selection in CI
- post-incident test additions that prevent recurrence
Correctness is therefore not a one-time quality gate. It is a continuous engineering discipline tied to architecture and delivery.
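A compatibility check for contract evolution can start very small. The sketch below assumes a response schema can be summarized as a mapping from field name to type name, and it treats removed fields and changed types as breaking for existing consumers while allowing added fields. The function name and the schema representation are hypothetical; real contracts (OpenAPI documents, Protobuf or Avro schemas) have richer compatibility rules and their own tooling.

```python
def find_breaking_changes(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Compare two response schemas given as {field name: type name}.

    Returns one message per change that could break an existing consumer.
    Added fields are treated as compatible under this simplified model.
    """
    problems = []
    for name, old_type in sorted(old.items()):
        if name not in new:
            problems.append(f"removed field: {name}")
        elif new[name] != old_type:
            problems.append(f"type changed: {name} ({old_type} -> {new[name]})")
    return problems
```

Run in CI against the last published schema, a non-empty result would fail the build before an incompatible contract reaches consumers.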
Computing History
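The Ariane 5 Flight 501 failure on 4 June 1996 is a classic correctness lesson. Inertial reference system software reused from Ariane 4 converted a 64-bit floating-point value to a 16-bit signed integer, and under the new flight profile the value was too large, so the conversion raised an unhandled overflow exception. The assumption that this conversion needed no protection had held for Ariane 4 trajectories but was no longer valid. The incident showed that correctness cannot be inherited from previous systems without revalidation against current operating conditions.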
Sources: European Space Agency (1996)
Quote
"Program testing can be used to show the presence of bugs, but never to show their absence!"
Source: Edsger W. Dijkstra, 1972
Practice Checklist
- Define correctness dimensions before implementation begins.
- Write behavior specifications with explicit input, output, and failure semantics.
- Link every critical invariant to at least one automated verification path.
- Treat flaky tests as correctness defects, not tooling noise.
- Validate idempotency, retries, and timeout behavior in distributed workflows.
- Add compatibility checks for every externally consumed contract.
- Keep test names descriptive and behavior-oriented.
- Review test strategy after incidents and architecture changes.
- Track correctness debt explicitly (missing tests, unstable oracles, weak failure-path coverage).
- Ensure release decisions include correctness signal review, not only throughput and latency.
Written by: Pedro Guzmán
See References for complete APA-style bibliographic entries used on this page.