Introduction#
DynamoDB’s pricing and performance model rewards a design that is radically different from relational databases. The single-table design pattern stores all entity types in one table, using carefully constructed partition and sort keys to enable efficient access patterns. This post covers the methodology and common patterns.
Why Single-Table Design#
DynamoDB charges per request and scales based on partition throughput. Joins do not exist — to relate entities from different tables requires multiple round-trips. Single-table design solves this by co-locating related items.
1
2
Traditional RDBMS: many tables, JOIN at query time
DynamoDB: one table, JOIN at write time via key structure
Table Structure#
1
2
3
4
5
6
7
8
9
10
11
Table: MyApp
PK (Partition Key) | SK (Sort Key) | Attributes
--------------------|------------------------|-------------------
USER#42 | PROFILE | name, email, plan
USER#42 | ORDER#2024-01-15#1001 | status, total
USER#42 | ORDER#2024-01-20#1002 | status, total
USER#42 | ADDRESS#home | street, city, zip
PRODUCT#SKU-123 | METADATA | name, price, stock
ORDER#1001 | METADATA | user_id, total
ORDER#1001 | ITEM#1 | product_id, qty
ORDER#1001 | ITEM#2 | product_id, qty
Key Design Principles#
Single entity: PK = ENTITY_TYPE#id, SK = METADATA
One-to-many relationship: PK = PARENT_TYPE#parent_id, SK = CHILD_TYPE#child_id
Hierarchical data: PK = ROOT#id, SK = CHILD#grandchild#leaf
Python: DynamoDB Resource#
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
import boto3
from boto3.dynamodb.conditions import Key, Attr
from datetime import datetime
dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
table = dynamodb.Table("MyApp")
# Write a user profile
def create_user(user_id: int, name: str, email: str) -> None:
table.put_item(Item={
"PK": f"USER#{user_id}",
"SK": "PROFILE",
"name": name,
"email": email,
"created_at": datetime.utcnow().isoformat(),
"GSI1PK": f"EMAIL#{email}", # for email lookup via GSI
"GSI1SK": "PROFILE",
})
# Write an order (co-located with user data)
def create_order(user_id: int, order: dict) -> None:
order_id = order["id"]
created_at = order["created_at"]
table.put_item(Item={
"PK": f"USER#{user_id}",
# Pad order_id for lexicographic sorting by date
"SK": f"ORDER#{created_at}#{order_id:08d}",
"order_id": order_id,
"total": order["total"],
"status": order["status"],
# Duplicate data for direct order lookup
"GSI1PK": f"ORDER#{order_id}",
"GSI1SK": "METADATA",
})
Efficient Access Patterns#
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
# Get user profile (direct lookup)
def get_user(user_id: int) -> dict:
response = table.get_item(Key={
"PK": f"USER#{user_id}",
"SK": "PROFILE",
})
return response.get("Item")
# Get all orders for a user (range query)
def get_user_orders(user_id: int, limit: int = 20) -> list:
response = table.query(
KeyConditionExpression=Key("PK").eq(f"USER#{user_id}")
& Key("SK").begins_with("ORDER#"),
ScanIndexForward=False, # newest first (descending SK)
Limit=limit,
)
return response["Items"]
# Get everything about a user (profile + orders + addresses) in one query
def get_user_with_all(user_id: int) -> dict:
response = table.query(
KeyConditionExpression=Key("PK").eq(f"USER#{user_id}"),
)
items = response["Items"]
return {
"profile": next((i for i in items if i["SK"] == "PROFILE"), None),
"orders": [i for i in items if i["SK"].startswith("ORDER#")],
"addresses": [i for i in items if i["SK"].startswith("ADDRESS#")],
}
Global Secondary Indexes (GSI)#
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
# GSI: look up user by email
def get_user_by_email(email: str) -> dict | None:
response = table.query(
IndexName="GSI1",
KeyConditionExpression=Key("GSI1PK").eq(f"EMAIL#{email}")
& Key("GSI1SK").eq("PROFILE"),
)
items = response["Items"]
return items[0] if items else None
# GSI for global order status queries
def get_pending_orders(limit: int = 100) -> list:
response = table.query(
IndexName="GSI2",
KeyConditionExpression=Key("GSI2PK").eq("STATUS#pending"),
ScanIndexForward=False,
Limit=limit,
)
return response["Items"]
Transactions#
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
# Atomic operation: create order + decrement inventory
def place_order(user_id: int, product_id: str, quantity: int) -> None:
dynamodb_client = boto3.client("dynamodb")
dynamodb_client.transact_write_items(
TransactItems=[
{
"Put": {
"TableName": "MyApp",
"Item": {
"PK": {"S": f"USER#{user_id}"},
"SK": {"S": f"ORDER#{datetime.utcnow().isoformat()}"},
"product_id": {"S": product_id},
"quantity": {"N": str(quantity)},
},
"ConditionExpression": "attribute_not_exists(PK)",
}
},
{
"Update": {
"TableName": "MyApp",
"Key": {
"PK": {"S": f"PRODUCT#{product_id}"},
"SK": {"S": "METADATA"},
},
"UpdateExpression": "SET stock = stock - :q",
"ConditionExpression": "stock >= :q", # optimistic lock
"ExpressionAttributeValues": {":q": {"N": str(quantity)}},
}
}
]
)
Access Pattern First Design Process#
- List all access patterns before writing any code
- Get user by ID
- Get user by email
- Get all orders for user (newest first)
- Get order by ID
- Get pending orders (admin view)
-
Design keys to serve each access pattern directly without secondary queries.
-
Add GSIs for access patterns that cannot use the primary key.
- Denormalize freely — DynamoDB reads are cheap; normalization costs extra round-trips.
Conclusion#
DynamoDB single-table design requires thinking through access patterns before writing any code. Co-locate related entities by prefixing partition keys consistently. Use GSIs for alternate access patterns. Embrace denormalization — duplicate data across entities to avoid multiple round-trips. The investment in upfront design pays off in predictable performance at any scale.