Introduction#
The Repository pattern and Active Record pattern are two approaches to data access in application code. They represent different trade-offs between simplicity and separation of concerns. The choice affects testability, code organization, and how tightly your domain logic couples to the database.
Active Record#
In Active Record, each model object knows how to save and retrieve itself. The object represents both the domain entity and its database persistence.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
# Active Record style (Django ORM)
from django.db import models
class Order(models.Model):
user_id = models.IntegerField()
total = models.DecimalField(max_digits=10, decimal_places=2)
status = models.CharField(max_length=20, default="pending")
created_at = models.DateTimeField(auto_now_add=True)
def complete(self):
self.status = "completed"
self.save() # database operation on the domain object
def cancel(self):
if self.status == "completed":
raise ValueError("Cannot cancel a completed order")
self.status = "cancelled"
self.save()
# Usage: querying and saving on the model itself
order = Order.objects.get(id=1)
order.complete()
pending_orders = Order.objects.filter(status="pending").select_related("user")
Pros: Simple, less boilerplate, tightly integrated with the ORM.
Cons: Domain objects are coupled to the database. Hard to test without a database. Difficult to implement complex domain logic without mixing persistence concerns.
Repository Pattern#
The Repository mediates between the domain and the data mapping layer. Domain objects are pure Python (or C#) classes with no knowledge of persistence.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
# Domain entity: pure Python, no database knowledge
from dataclasses import dataclass, field
from datetime import datetime
from decimal import Decimal
@dataclass
class Order:
id: int | None
user_id: int
total: Decimal
status: str = "pending"
created_at: datetime = field(default_factory=datetime.utcnow)
def complete(self) -> None:
if self.status != "processing":
raise ValueError(f"Cannot complete order in status '{self.status}'")
self.status = "completed"
def cancel(self) -> None:
if self.status == "completed":
raise ValueError("Cannot cancel a completed order")
self.status = "cancelled"
# Repository interface: abstracts persistence
from abc import ABC, abstractmethod
class OrderRepository(ABC):
@abstractmethod
async def get_by_id(self, order_id: int) -> Order | None: ...
@abstractmethod
async def get_by_user(self, user_id: int) -> list[Order]: ...
@abstractmethod
async def save(self, order: Order) -> Order: ...
@abstractmethod
async def delete(self, order_id: int) -> None: ...
# SQLAlchemy implementation
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
from app.db.models import OrderModel # SQLAlchemy model (separate from domain)
class SQLAlchemyOrderRepository(OrderRepository):
def __init__(self, session: AsyncSession):
self._session = session
async def get_by_id(self, order_id: int) -> Order | None:
result = await self._session.get(OrderModel, order_id)
return self._to_domain(result) if result else None
async def get_by_user(self, user_id: int) -> list[Order]:
result = await self._session.execute(
select(OrderModel).where(OrderModel.user_id == user_id)
)
return [self._to_domain(row) for row in result.scalars()]
async def save(self, order: Order) -> Order:
if order.id is None:
db_order = OrderModel(
user_id=order.user_id,
total=order.total,
status=order.status,
)
self._session.add(db_order)
await self._session.flush()
order.id = db_order.id
else:
db_order = await self._session.get(OrderModel, order.id)
db_order.status = order.status
db_order.total = order.total
return order
def _to_domain(self, model: OrderModel) -> Order:
return Order(
id=model.id,
user_id=model.user_id,
total=model.total,
status=model.status,
created_at=model.created_at,
)
Testing: The Key Difference#
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
# With Active Record: need database for unit tests
import pytest
from django.test import TestCase
class TestOrderCompletion(TestCase):
def test_complete_order(self):
order = Order.objects.create(user_id=1, total=99.99, status="processing")
order.complete()
# Hit the test database — slow, requires DB setup
# With Repository: test domain logic without database
import pytest
from unittest.mock import AsyncMock
@pytest.mark.asyncio
async def test_complete_order():
order = Order(id=1, user_id=1, total=Decimal("99.99"), status="processing")
mock_repo = AsyncMock(spec=OrderRepository)
mock_repo.get_by_id.return_value = order
service = OrderService(mock_repo)
await service.complete_order(1)
assert order.status == "completed"
mock_repo.save.assert_called_once_with(order)
@pytest.mark.asyncio
async def test_complete_order_wrong_status():
order = Order(id=1, user_id=1, total=Decimal("99.99"), status="pending")
mock_repo = AsyncMock(spec=OrderRepository)
mock_repo.get_by_id.return_value = order
service = OrderService(mock_repo)
with pytest.raises(ValueError, match="Cannot complete"):
await service.complete_order(1)
C#: Repository Pattern#
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
// Domain entity
public record Order(int Id, int UserId, decimal Total, string Status);
// Repository interface
public interface IOrderRepository
{
Task<Order?> GetByIdAsync(int id, CancellationToken ct = default);
Task<IReadOnlyList<Order>> GetByUserAsync(int userId, CancellationToken ct = default);
Task<Order> SaveAsync(Order order, CancellationToken ct = default);
}
// Service uses the abstraction — not the implementation
public class OrderService
{
private readonly IOrderRepository _orders;
public OrderService(IOrderRepository orders) => _orders = orders;
public async Task<Order> CompleteOrderAsync(int orderId, CancellationToken ct)
{
var order = await _orders.GetByIdAsync(orderId, ct)
?? throw new NotFoundException($"Order {orderId} not found");
if (order.Status != "processing")
throw new InvalidOperationException($"Cannot complete order in status '{order.Status}'");
var completed = order with { Status = "completed" };
return await _orders.SaveAsync(completed, ct);
}
}
When to Use Each#
Active Record is appropriate for:
- CRUD-heavy applications with simple domain logic
- Rapid prototyping
- Django applications (the ORM is designed for Active Record)
Repository Pattern is appropriate for:
- Complex domain logic that should be tested independently
- Applications that may switch databases
- Large teams where separation of concerns improves maintainability
- Domain-Driven Design (DDD) where the domain model must be free of infrastructure concerns
Conclusion#
Active Record is simpler and faster to implement. Repository pattern provides better separation of concerns and testability. The deciding factor is complexity: if your domain logic is simple CRUD, Active Record is fine. If your domain objects have meaningful behavior that you want to test without a database, the Repository pattern pays for its boilerplate.