Skip to content

Repository Pattern

Repository Pattern is a pattern that implements Dependency Inversion. It works with Liskov Substitution Principle. It supports Single Responsibility Principle.

A use case calls a repository contract. A repository adapter talks to one database driver contract. Use case code stays free of driver details.

Core Idea

A repository is a boundary object for persistence operations. Domain and use case code call domain language methods such as find_matching and save_profile. The repository contract should stay agnostic from database technology. Method names, parameters, and return types should express domain intent. They should not expose SQL strings, collection names, or driver session objects.

Conceptual Overview

All examples in this page follow the shared context: Hypothetical Scenario.

In the car intelligence platform, recommendation flows read listings, complaints, ownership cost, and user preference data. A recommendation use case should ask for cars that match a profile. It should not know if data comes from PostgreSQL, MongoDB, Neo4j, or a read model cache. Repository contracts hold that boundary.

A correct implementation creates three effects.

  • domain services stay focused on policy logic
  • persistence adapters change with lower blast radius
  • tests run with fast fakes that follow the same contracts

Computing History

In 2002, Martin Fowler described Repository in Patterns of Enterprise Application Architecture. The pattern gave domain code a collection-like interface for persistent objects. Eric Evans expanded this style in domain-driven design with clear aggregate boundaries.

Sources: Fowler (2002) and Evans (2003)

Generic Example: Contract First Design in Python

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class CarListing:
    listing_id: str
    make: str
    model: str
    year: int
    market_value_usd: int
    reliability_score: float


@dataclass(frozen=True)
class MatchQuery:
    max_price_usd: int
    min_reliability_score: float
    fuel_type: str


class CarListingRepository(Protocol):
    def find_by_id(self, listing_id: str) -> CarListing | None:
        raise NotImplementedError

    def find_matching(self, query: MatchQuery) -> list[CarListing]:
        raise NotImplementedError

The contract speaks in domain terms. No method leaks tables, joins, collection APIs, or driver types.

Generic Example: Repository Adapter with Injected Driver in Python

from typing import Any, Protocol


class DbDriver(Protocol):
    def fetch_all(self, statement: str, params: dict[str, Any]) -> list[dict[str, Any]]:
        raise NotImplementedError


class SqlCarListingRepository(CarListingRepository):
    def __init__(self, driver: DbDriver):
        self.driver = driver

    def find_by_id(self, listing_id: str) -> CarListing | None:
        rows = self.driver.fetch_all(
            statement=(
                "SELECT listing_id, make, model, year, market_value_usd, reliability_score "
                "FROM car_listings WHERE listing_id = %(listing_id)s"
            ),
            params={"listing_id": listing_id},
        )
        if not rows:
            return None
        row = rows[0]
        return CarListing(
            listing_id=row["listing_id"],
            make=row["make"],
            model=row["model"],
            year=row["year"],
            market_value_usd=row["market_value_usd"],
            reliability_score=row["reliability_score"],
        )

    def find_matching(self, query: MatchQuery) -> list[CarListing]:
        rows = self.driver.fetch_all(
            statement=(
                "SELECT listing_id, make, model, year, market_value_usd, reliability_score "
                "FROM car_listings "
                "WHERE market_value_usd <= %(max_price)s "
                "AND reliability_score >= %(min_rel)s "
                "AND fuel_type = %(fuel)s"
            ),
            params={
                "max_price": query.max_price_usd,
                "min_rel": query.min_reliability_score,
                "fuel": query.fuel_type,
            },
        )
        return [
            CarListing(
                listing_id=row["listing_id"],
                make=row["make"],
                model=row["model"],
                year=row["year"],
                market_value_usd=row["market_value_usd"],
                reliability_score=row["reliability_score"],
            )
            for row in rows
        ]

The repository gets a driver dependency from the outside. Dependency injection keeps object wiring in the composition root.

class RecommendCarsUseCase:
    def __init__(self, repository: CarListingRepository):
        self.repository = repository

    def run(self, query: MatchQuery) -> list[CarListing]:
        matches = self.repository.find_matching(query)
        return sorted(matches, key=lambda item: item.reliability_score, reverse=True)


def build_recommend_cars_use_case(sql_connection_pool) -> RecommendCarsUseCase:
    driver = PsycopgDriver(connection_pool=sql_connection_pool)
    repository = SqlCarListingRepository(driver=driver)
    return RecommendCarsUseCase(repository=repository)

Liskov Substitution in Driver and Repository Dependencies

RecommendCarsUseCase depends on CarListingRepository. SqlCarListingRepository depends on DbDriver. Each dependency is typed as a contract. A production driver and a fake driver can replace each other with no use case changes. This is Liskov substitution in practice.

class FakeDriver(DbDriver):
    def __init__(self, rows: list[dict[str, Any]]):
        self.rows = rows

    def fetch_all(self, statement: str, params: dict[str, Any]) -> list[dict[str, Any]]:
        return self.rows


def test_recommend_cars_orders_by_reliability() -> None:
    fake_rows = [
        {
            "listing_id": "car-10",
            "make": "Aster",
            "model": "City",
            "year": 2021,
            "market_value_usd": 19000,
            "reliability_score": 84.0,
        },
        {
            "listing_id": "car-20",
            "make": "Aster",
            "model": "Tour",
            "year": 2022,
            "market_value_usd": 21000,
            "reliability_score": 91.0,
        },
    ]
    repository = SqlCarListingRepository(driver=FakeDriver(rows=fake_rows))
    use_case = RecommendCarsUseCase(repository=repository)

    result = use_case.run(
        MatchQuery(max_price_usd=25000, min_reliability_score=80.0, fuel_type="hybrid")
    )

    assert [item.listing_id for item in result] == ["car-20", "car-10"]

Generic Example: TypeScript with Driver Interface

type CarListing = {
  listingId: string
  make: string
  model: string
  year: number
  marketValueUsd: number
  reliabilityScore: number
}

type MatchQuery = {
  maxPriceUsd: number
  minReliabilityScore: number
  fuelType: string
}

interface CarListingRepository {
  findMatching(query: MatchQuery): Promise<CarListing[]>
}

interface DbDriver {
  query(statement: string, params: Record<string, unknown>): Promise<Array<Record<string, unknown>>>
}

class PostgresCarListingRepository implements CarListingRepository {
  constructor(private readonly driver: DbDriver) {}

  async findMatching(query: MatchQuery): Promise<CarListing[]> {
    const rows = await this.driver.query(
      `
      SELECT listing_id, make, model, year, market_value_usd, reliability_score
      FROM car_listings
      WHERE market_value_usd <= :maxPriceUsd
      AND reliability_score >= :minReliabilityScore
      AND fuel_type = :fuelType
      `,
      {
        maxPriceUsd: query.maxPriceUsd,
        minReliabilityScore: query.minReliabilityScore,
        fuelType: query.fuelType,
      },
    )

    return rows.map((row) => ({
      listingId: String(row.listing_id),
      make: String(row.make),
      model: String(row.model),
      year: Number(row.year),
      marketValueUsd: Number(row.market_value_usd),
      reliabilityScore: Number(row.reliability_score),
    }))
  }
}

Non Compliant and Compliant Shapes

Non compliant example:

class RecommendCarsUseCase:
    def run(self, query: MatchQuery) -> list[dict]:
        import psycopg

        with psycopg.connect("postgresql://...") as conn:
            rows = conn.execute("SELECT * FROM car_listings").fetchall()
            return rows

Main issues:

  • use case imports a concrete driver
  • data access policy leaks into domain flow
  • tests need heavy patching of low-level calls

Compliant direction:

  • define a repository contract with domain methods
  • inject repository implementation into use case constructor
  • inject database driver into repository constructor
  • replace driver or repository with fakes in tests

This pattern links to Abstraction and Boundaries. The repository contract creates a clean persistence boundary.

This pattern links to Modularity and Composition. Domain policy, repository contract, and adapter each keep a focused role.

This pattern links to Correctness and Testing. Contract-based substitution keeps tests deterministic.

This pattern links to State and Data Modeling. Repositories protect invariants by controlling write and read semantics at one boundary.

Practice Checklist

  • define repository methods in domain language
  • keep repository contracts free of engine-specific types
  • inject repository implementations into use case classes
  • inject driver contracts into repository adapters
  • use fakes in unit tests through the same contracts
  • keep transaction policy close to repository adapter boundaries
  • review repository interfaces in design reviews for contract drift

Written by: Pedro Guzmán

See References for complete APA-style bibliographic entries used on this page.