The Database Dilemma
Core Idea
Examples and diagrams on this page follow the shared Hypothetical Scenario.
Teams can decouple business rules from storage tooling. That design move is real and useful. Ports, adapters, repositories, and clean use case boundaries reduce direct coupling to a database product. Still, one hard constraint remains. Every database engine defines transaction behavior, replication behavior, and consistency behavior in its own way.
This page frames one architecture claim. Database selection must start from the transactional nature of the problem. The question is not only data shape. The question is operation semantics under load, faults, and scale.
A domain can be modeled in relational, document, or graph form. Yet the same domain operation can produce very different risk profiles across those engines. A cart checkout flow and a recommendation feed do not carry the same consistency needs. A fraud graph traversal and a monthly billing close do not carry the same transaction boundaries. The architecture decision must reflect that reality.
Historical Context
Codd published the relational model in 1970 with a strong focus on data integrity and declarative access. Gray and Reuter documented transaction processing models in 1992 and strengthened ACID vocabulary in practice. Brewer framed CAP tradeoffs in 2000 and drove a new debate about distributed databases. Systems of the Dynamo and Bigtable era then popularized key-value and wide-column models, alongside document stores, for large-scale replicated workloads.
Sources: Codd (1970), Gray and Reuter (1992), Brewer (2000), DeCandia et al. (2007)
The Problem It Solves
Architecture teams often compare databases through benchmark charts. Those charts can hide operational semantics. A fast read benchmark says little about cross-entity write invariants. A high write throughput chart says little about stale read tolerance in user-facing risk decisions.
This gap creates costly mistakes. Teams may pick a database for its popularity or feature list. Late project phases then expose the semantic mismatch. A few common failure paths appear in real programs:
- critical operations need strict atomic updates, yet engine defaults favor eventual consistency
- domain invariants span many aggregates, yet transaction scope is narrow or expensive
- replication lag breaks user trust in flows that need read-your-writes guarantees
- global scalability goals push partitioning, then cross-partition transactions become bottlenecks
- graph-like analysis arrives late, then relationship traversal in document or relational shape becomes complex and slow
The database dilemma is not a tooling fashion debate. It is a semantic alignment problem. If the engine's transaction semantics do not match the problem's semantics, no abstraction layer will rescue the outcome.
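The replication-lag failure path listed above can be made concrete with a toy model. The sketch below is illustrative only: `LaggingReplicaStore` and the `order:42:status` key are hypothetical names, and replication is modeled as a manual `replicate()` call rather than any real engine's behavior.

```python
from dataclasses import dataclass, field


@dataclass
class LaggingReplicaStore:
    """Toy primary/replica pair; the replica only catches up when replicate() runs."""

    primary: dict = field(default_factory=dict)
    replica: dict = field(default_factory=dict)

    def write(self, key, value):
        # The write is acknowledged by the primary only.
        self.primary[key] = value

    def read(self, key, *, from_primary=False):
        # Default reads go to the replica, which may be behind.
        source = self.primary if from_primary else self.replica
        return source.get(key)

    def replicate(self):
        # Replication lag ends when this runs.
        self.replica = dict(self.primary)


store = LaggingReplicaStore()
store.write("order:42:status", "paid")
stale = store.read("order:42:status")                     # replica has not caught up
fresh = store.read("order:42:status", from_primary=True)  # read-your-writes path
store.replicate()
caught_up = store.read("order:42:status")
```

Here `stale` is `None` while `fresh` is `"paid"`. A user who has just paid and then reads from the replica sees no payment, which is the trust break described in the list.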
Main Concept
The main concept is transactional nature mapping. Before selecting technology, map each domain operation through five architecture lenses.
- Atomicity scope
- Consistency requirement
- Contention profile
- Latency and scale target
- Persistence necessity
Atomicity scope asks: what set of state changes must commit as one unit? A user profile update may touch one aggregate. A marketplace purchase may touch inventory, payment intent, and order state. These operations carry different failure costs.
Consistency requirement asks: what read visibility is required after a write? Recommendation refresh can tolerate a brief delay in many contexts. Payment confirmation cannot tolerate stale state in the same way.
Contention profile asks: how many concurrent actors update the same logical records? Inventory counters and auction bids face high write contention. Historical analytics tables face low contention.
Latency and scale target asks: what throughput and response-time goals define acceptable service behavior? A feed ranking job can run in asynchronous batch windows. A checkout API must return quickly and remain correct under peak spikes.
Persistence necessity asks: does this operation need durable state after a process restart? Some operations only transform input into output in real time. Some operations keep short-lived memory that expires in seconds. For these classes, a persistent database can add cost with no business gain.
Common no-database candidates in this platform:
- pure recommendation scoring from upstream immutable feeds
- request validation and normalization services
- stateless pricing simulation endpoints
- transient orchestration workers with idempotent upstream sources
The map separates operation classes by semantic pressure. Then architecture can align storage choices to each class.
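One way to record the five lenses is a small typed profile per operation. This is a minimal sketch: the `OperationProfile` type, the enum values, and the latency numbers are assumptions made for illustration, not part of the shared scenario.

```python
from dataclasses import dataclass
from enum import Enum


class Consistency(Enum):
    EVENTUAL = "eventual"
    READ_YOUR_WRITES = "read_your_writes"
    STRICT = "strict"


class Contention(Enum):
    LOW = "low"
    HIGH = "high"


@dataclass(frozen=True)
class OperationProfile:
    name: str
    atomicity_scope: tuple[str, ...]  # aggregates that must commit as one unit
    consistency: Consistency          # read visibility required after a write
    contention: Contention            # concurrent writers on the same records
    p99_latency_ms: int               # hypothetical latency target
    needs_durable_state: bool         # persistence necessity


catalog = [
    OperationProfile(
        name="create_order",
        atomicity_scope=("inventory", "payment_intent", "order"),
        consistency=Consistency.STRICT,
        contention=Contention.HIGH,
        p99_latency_ms=300,
        needs_durable_state=True,
    ),
    OperationProfile(
        name="score_match_candidates",
        atomicity_scope=(),
        consistency=Consistency.EVENTUAL,
        contention=Contention.LOW,
        p99_latency_ms=150,
        needs_durable_state=False,
    ),
]

# Separate operations by semantic pressure.
stateless = [op.name for op in catalog if not op.needs_durable_state]
multi_aggregate = [op.name for op in catalog if len(op.atomicity_scope) > 1]
```

Filtering the catalog this way surfaces no-database candidates and multi-aggregate operations before any engine is discussed.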
How It Works
A practical architecture workflow can reduce risk. Run this workflow before final database commitment.
Step 0. Decide if persistent storage is required. Classify operations by durability need. If the operation has no durable state requirement, keep the component stateless. Use in-memory processing and external event inputs. Do not introduce a database only for future possibility.
Step 1. Build an operation catalog. List core domain operations in concrete terms. Use names such as create_order, reserve_listing, publish_listing, score_match_candidates, run_owner_cost_projection, detect_seller_ring.
Step 2. Define semantic grade per operation. Assign required transaction guarantees, read visibility guarantees, and tolerated anomaly classes. Document if dirty reads, non-repeatable reads, lost updates, or stale replicas are acceptable.
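The tolerated anomaly classes from Step 2 could be captured as flags so that they can be checked in reviews and tests. The grades assigned below are hypothetical examples, not recommendations for these operations.

```python
from enum import Flag, auto


class Anomaly(Flag):
    NONE = 0
    DIRTY_READ = auto()
    NON_REPEATABLE_READ = auto()
    LOST_UPDATE = auto()
    STALE_REPLICA = auto()


# Hypothetical grades: which anomalies each operation may tolerate.
TOLERATED = {
    "create_order": Anomaly.NONE,
    "publish_listing": Anomaly.STALE_REPLICA,
    "score_match_candidates": Anomaly.STALE_REPLICA | Anomaly.NON_REPEATABLE_READ,
}


def tolerates(operation: str, anomaly: Anomaly) -> bool:
    """Return True if the operation's documented grade accepts the anomaly."""
    return bool(TOLERATED[operation] & anomaly)
```

For example, `tolerates("create_order", Anomaly.LOST_UPDATE)` is `False`, so any candidate engine configuration that permits lost updates on that path is ruled out early.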
Step 3. Group operations by semantic class. You will often see three broad classes.
- strict transactional core
- scalable flexible content core
- deep relationship analysis core
Step 4. Match classes to candidate options. If durable state is not required, select a stateless path with no local database. If durable state is required, select a database model. Relational engines usually fit strict multi-row ACID invariants. Document engines often fit high-scale aggregate reads and flexible shape evolution. Graph engines often fit path-heavy correlation and relationship-centric analysis.
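The matching rule in Step 4 can be sketched as a small function. The rule order (durability first, then multi-record invariants, then traversal depth) and the `SemanticNeeds` type are assumptions of this sketch. The result is a starting candidate for discussion, not a final decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticNeeds:
    needs_durable_state: bool
    multi_record_invariants: bool
    deep_relationship_traversal: bool


def candidate_model(needs: SemanticNeeds) -> str:
    """Return a first candidate storage option for an operation class."""
    if not needs.needs_durable_state:
        return "stateless"    # no local database
    if needs.multi_record_invariants:
        return "relational"   # strict multi-row ACID invariants
    if needs.deep_relationship_traversal:
        return "graph"        # path-heavy, relationship-centric analysis
    return "document"         # aggregate reads, flexible shape evolution


candidates = {
    "create_order": candidate_model(SemanticNeeds(True, True, False)),
    "publish_listing": candidate_model(SemanticNeeds(True, False, False)),
    "detect_seller_ring": candidate_model(SemanticNeeds(True, False, True)),
    "score_match_candidates": candidate_model(SemanticNeeds(False, False, False)),
}
```

A real decision would also weigh contention, latency targets, and engine-specific partition behavior, which this rule deliberately leaves out.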
Step 5. Design storage ports at architecture boundary. Keep business use cases behind ports and interfaces. Then adapters implement database specifics. This decoupling protects core logic from migration cost. It does not erase semantic constraints. Port contracts must encode semantic assumptions clearly.
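A port can state its semantic assumptions in the contract itself, so that every adapter is held to them. The sketch below uses hypothetical names (`ListingReservationPort`, `ReservationConflict`, `InMemoryReservationAdapter`). The in-memory adapter is single-threaded and exists only to show the shape of the boundary.

```python
from typing import Optional, Protocol


class ReservationConflict(Exception):
    """Raised when the listing is already reserved."""


class ListingReservationPort(Protocol):
    def reserve(self, listing_id: str, buyer_id: str) -> None:
        """Atomically reserve a listing.

        Semantic contract: at most one buyer holds a listing, and a
        successful reserve is visible to every later call to holder().
        Raises ReservationConflict if the listing is already held.
        """
        ...

    def holder(self, listing_id: str) -> Optional[str]:
        ...


class InMemoryReservationAdapter:
    """Illustrative adapter; not safe for concurrent use."""

    def __init__(self) -> None:
        self._holders: dict[str, str] = {}

    def reserve(self, listing_id: str, buyer_id: str) -> None:
        if listing_id in self._holders:
            raise ReservationConflict(listing_id)
        self._holders[listing_id] = buyer_id

    def holder(self, listing_id: str) -> Optional[str]:
        return self._holders.get(listing_id)


def reserve_listing(port: ListingReservationPort, listing_id: str, buyer_id: str) -> bool:
    """Use case: depends on the port contract, not on any engine."""
    try:
        port.reserve(listing_id, buyer_id)
        return True
    except ReservationConflict:
        return False
```

A relational or document adapter would replace the dictionary with engine calls. It would still have to honor the atomicity and visibility clauses in the docstring, which is where the semantic constraint survives the decoupling.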
Step 6. Validate with architecture tests and load tests. Run failure injection for replica lag, partial region loss, and concurrent write races. Run semantic tests for invariants under peak load.
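A semantic test for a concurrent write race might look like the following sketch. `InventoryCounter` is a hypothetical stand-in: its lock plays the role of whatever atomic conditional update the real engine provides, and a real test would run against the actual adapter instead.

```python
import threading


class InventoryCounter:
    """Toy store; the lock stands in for an engine's atomic conditional update."""

    def __init__(self, stock: int) -> None:
        self._stock = stock
        self._lock = threading.Lock()

    def try_reserve(self) -> bool:
        with self._lock:
            if self._stock > 0:
                self._stock -= 1
                return True
            return False

    @property
    def stock(self) -> int:
        return self._stock


def race(counter: InventoryCounter, buyers: int) -> int:
    """Release all buyers at once and count successful reservations."""
    results = []
    results_lock = threading.Lock()
    start = threading.Barrier(buyers)

    def buyer() -> None:
        start.wait()  # maximize overlap between threads
        ok = counter.try_reserve()
        with results_lock:
            results.append(ok)

    threads = [threading.Thread(target=buyer) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)


counter = InventoryCounter(stock=5)
successes = race(counter, buyers=50)
```

The invariant under test is that successful reservations never exceed the initial stock and the counter never goes negative, regardless of thread interleaving.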
Step 7. Plan the evolution path. A stateless component can remain stateless for long periods if domain rules allow it. A single engine may fit the early stage when persistence is required. Growth can justify polyglot persistence later. Plan migration seams and event contracts from day one.
A decision matrix helps teams keep focus.
| Decision Lens | No Persistent Database | Relational Model | Document Model | Graph Model |
|---|---|---|---|---|
| Durable state requirement | Not required | Required | Required | Required |
| Multi-record ACID invariants | Not applicable | Strong fit | Mixed fit by engine and partition scope | Mixed fit |
| Flexible schema evolution | High at code level | Moderate | Strong | Moderate |
| Relationship traversal depth | Low to moderate | Moderate with joins | Moderate with embedding and references | Strong |
| Horizontal partition scale path | Strong through stateless scale-out | Moderate to strong by engine and design | Strong | Mixed by workload |
| Operational complexity for graph analytics | Not a fit | High | High | Moderate |
The matrix guides discussion. The final decision still depends on the concrete operation map and the team's risk tolerance. A no-database choice is a valid architecture decision when durability and transaction scope do not require persistence.
The flow shows how transactional class drives storage selection, then adapter boundaries preserve business decoupling.
Challenges and Shortcomings
No database model solves every operation class with equal quality. That statement is simple and important. In some operation classes, no database is the right solution.
A single relational engine can carry strong integrity. Yet very high-scale document payload workloads may push up cost and latency. A single document engine can scale quickly. Yet strict multi-aggregate invariants can become complex at high-contention points. A graph engine can unlock rich correlation. Yet it may add operational overhead for workflows that are mostly transactional rows.
Polyglot persistence can reduce semantic mismatch. It then introduces new costs.
- data duplication and synchronization design
- stronger need for event contract governance
- more observability and ops skill demand
- cross-store consistency strategy decisions
A database can be harmful in low-state services. It can create migration overhead, backup overhead, and operational burden. It can push teams toward premature schema commitments. It can increase incident surface with little business value.
A no-database strategy still needs discipline:
- explicit idempotency keys at integration edges
- clear source-of-truth boundaries in upstream systems
- replay safe event processing rules
- bounded in-memory state with failure recovery rules
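Two of these disciplines, idempotency keys and bounded in-memory state, can be combined in a small wrapper. This is a minimal sketch with hypothetical names (`IdempotentHandler`, `price_quote`). It assumes that losing the cache on restart is acceptable because upstream sources can replay safely.

```python
from collections import OrderedDict


class IdempotentHandler:
    """Deduplicate requests by idempotency key with a bounded in-memory cache."""

    def __init__(self, handler, max_entries: int = 1000) -> None:
        self._handler = handler
        self._max_entries = max_entries
        self._seen = OrderedDict()  # key -> cached result, oldest first

    def handle(self, idempotency_key: str, payload):
        if idempotency_key in self._seen:
            # Replay: return the cached result without re-running the handler.
            self._seen.move_to_end(idempotency_key)
            return self._seen[idempotency_key]
        result = self._handler(payload)
        self._seen[idempotency_key] = result
        if len(self._seen) > self._max_entries:
            self._seen.popitem(last=False)  # evict the least recently used key
        return result


calls = []


def price_quote(payload):
    calls.append(payload)
    return payload["base"] * payload["quantity"]


handler = IdempotentHandler(price_quote, max_entries=2)
first = handler.handle("req-1", {"base": 10, "quantity": 3})
replay = handler.handle("req-1", {"base": 10, "quantity": 3})
```

The replay returns the cached result and `price_quote` runs once. After eviction or a process restart the handler runs again for the same key, so the wrapped operation must itself be safe to repeat.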
Teams must choose explicit tradeoffs. Hidden tradeoffs create the hardest incidents. Architecture records should capture these decisions in clear language with dates and assumptions.
Link to Existing Handbook Concepts
| Concept | Why? |
|---|---|
| Abstraction and Boundaries | Ports and adapters hide vendor APIs but preserve semantic contracts. |
| State and Data Modeling | State transitions and invariants define transaction boundaries. They define where no durable state is acceptable. |
| Onion Architecture, Hexagonal Architecture | Both models support storage decoupling through inward dependencies and explicit ports. |
| Dependency Inversion, Interface Segregation Principle | Decoupling works best with narrow, semantically precise contracts. |
| Correctness and Testing | Semantic contract tests must validate transaction behavior under contention and failure. |
Written by: Pedro Guzmán
See References for complete APA-style bibliographic entries used on this page.