Repository Pattern vs Active Record: Choosing Your Data Access Strategy

The Repository pattern and Active Record pattern are two approaches to data access in application code. They represent different trade-offs between simplicity and separation of concerns. The choice af

Introduction#

The Repository pattern and Active Record pattern are two approaches to data access in application code. They represent different trade-offs between simplicity and separation of concerns. The choice affects testability, code organization, and how tightly your domain logic couples to the database.

Active Record#

In Active Record, each model object knows how to save and retrieve itself. The object represents both the domain entity and its database persistence.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
# Active Record style (Django ORM)
from django.db import models

class Order(models.Model):
    user_id = models.IntegerField()
    total = models.DecimalField(max_digits=10, decimal_places=2)
    status = models.CharField(max_length=20, default="pending")
    created_at = models.DateTimeField(auto_now_add=True)

    def complete(self):
        self.status = "completed"
        self.save()  # database operation on the domain object

    def cancel(self):
        if self.status == "completed":
            raise ValueError("Cannot cancel a completed order")
        self.status = "cancelled"
        self.save()

# Usage: querying and saving on the model itself
order = Order.objects.get(id=1)
order.complete()

pending_orders = Order.objects.filter(status="pending").select_related("user")

Pros: Simple, less boilerplate, tightly integrated with the ORM.

Cons: Domain objects are coupled to the database. Hard to test without a database. Difficult to implement complex domain logic without mixing persistence concerns.

Repository Pattern#

The Repository mediates between the domain and the data mapping layer. Domain objects are pure Python (or C#) classes with no knowledge of persistence.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
# Domain entity: pure Python, no database knowledge
from dataclasses import dataclass, field
from datetime import datetime
from decimal import Decimal

@dataclass
class Order:
    id: int | None
    user_id: int
    total: Decimal
    status: str = "pending"
    created_at: datetime = field(default_factory=datetime.utcnow)

    def complete(self) -> None:
        if self.status != "processing":
            raise ValueError(f"Cannot complete order in status '{self.status}'")
        self.status = "completed"

    def cancel(self) -> None:
        if self.status == "completed":
            raise ValueError("Cannot cancel a completed order")
        self.status = "cancelled"

# Repository interface: abstracts persistence
from abc import ABC, abstractmethod

class OrderRepository(ABC):
    @abstractmethod
    async def get_by_id(self, order_id: int) -> Order | None: ...

    @abstractmethod
    async def get_by_user(self, user_id: int) -> list[Order]: ...

    @abstractmethod
    async def save(self, order: Order) -> Order: ...

    @abstractmethod
    async def delete(self, order_id: int) -> None: ...

# SQLAlchemy implementation
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
from app.db.models import OrderModel  # SQLAlchemy model (separate from domain)

class SQLAlchemyOrderRepository(OrderRepository):
    def __init__(self, session: AsyncSession):
        self._session = session

    async def get_by_id(self, order_id: int) -> Order | None:
        result = await self._session.get(OrderModel, order_id)
        return self._to_domain(result) if result else None

    async def get_by_user(self, user_id: int) -> list[Order]:
        result = await self._session.execute(
            select(OrderModel).where(OrderModel.user_id == user_id)
        )
        return [self._to_domain(row) for row in result.scalars()]

    async def save(self, order: Order) -> Order:
        if order.id is None:
            db_order = OrderModel(
                user_id=order.user_id,
                total=order.total,
                status=order.status,
            )
            self._session.add(db_order)
            await self._session.flush()
            order.id = db_order.id
        else:
            db_order = await self._session.get(OrderModel, order.id)
            db_order.status = order.status
            db_order.total = order.total
        return order

    def _to_domain(self, model: OrderModel) -> Order:
        return Order(
            id=model.id,
            user_id=model.user_id,
            total=model.total,
            status=model.status,
            created_at=model.created_at,
        )

Testing: The Key Difference#

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
# With Active Record: need database for unit tests
import pytest
from django.test import TestCase

class TestOrderCompletion(TestCase):
    def test_complete_order(self):
        order = Order.objects.create(user_id=1, total=99.99, status="processing")
        order.complete()
        # Hit the test database — slow, requires DB setup

# With Repository: test domain logic without database
import pytest
from unittest.mock import AsyncMock

@pytest.mark.asyncio
async def test_complete_order():
    order = Order(id=1, user_id=1, total=Decimal("99.99"), status="processing")
    mock_repo = AsyncMock(spec=OrderRepository)
    mock_repo.get_by_id.return_value = order

    service = OrderService(mock_repo)
    await service.complete_order(1)

    assert order.status == "completed"
    mock_repo.save.assert_called_once_with(order)

@pytest.mark.asyncio
async def test_complete_order_wrong_status():
    order = Order(id=1, user_id=1, total=Decimal("99.99"), status="pending")
    mock_repo = AsyncMock(spec=OrderRepository)
    mock_repo.get_by_id.return_value = order

    service = OrderService(mock_repo)
    with pytest.raises(ValueError, match="Cannot complete"):
        await service.complete_order(1)

C#: Repository Pattern#

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
// Domain entity
public record Order(int Id, int UserId, decimal Total, string Status);

// Repository interface
public interface IOrderRepository
{
    Task<Order?> GetByIdAsync(int id, CancellationToken ct = default);
    Task<IReadOnlyList<Order>> GetByUserAsync(int userId, CancellationToken ct = default);
    Task<Order> SaveAsync(Order order, CancellationToken ct = default);
}

// Service uses the abstraction — not the implementation
public class OrderService
{
    private readonly IOrderRepository _orders;

    public OrderService(IOrderRepository orders) => _orders = orders;

    public async Task<Order> CompleteOrderAsync(int orderId, CancellationToken ct)
    {
        var order = await _orders.GetByIdAsync(orderId, ct)
            ?? throw new NotFoundException($"Order {orderId} not found");

        if (order.Status != "processing")
            throw new InvalidOperationException($"Cannot complete order in status '{order.Status}'");

        var completed = order with { Status = "completed" };
        return await _orders.SaveAsync(completed, ct);
    }
}

When to Use Each#

Active Record is appropriate for:

  • CRUD-heavy applications with simple domain logic
  • Rapid prototyping
  • Django applications (the ORM is designed for Active Record)

Repository Pattern is appropriate for:

  • Complex domain logic that should be tested independently
  • Applications that may switch databases
  • Large teams where separation of concerns improves maintainability
  • Domain-Driven Design (DDD) where the domain model must be free of infrastructure concerns

Conclusion#

Active Record is simpler and faster to implement. Repository pattern provides better separation of concerns and testability. The deciding factor is complexity: if your domain logic is simple CRUD, Active Record is fine. If your domain objects have meaningful behavior that you want to test without a database, the Repository pattern pays for its boilerplate.

Contents