Redis Beyond Caching: Data Structures and Patterns

Introduction#

Most engineers encounter Redis as a cache in front of a database. That is a legitimate use, but Redis is a data structure server with several built-in types that can replace entire service dependencies when used well. This post walks through the core data structures and practical patterns for each.

Core Data Structures#

Strings#

The simplest type. Strings hold bytes up to 512 MB and support atomic increment/decrement operations.

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Simple key-value with TTL
r.setex("session:user:42", 3600, "active")

# Atomic counter — rate limiting, page view counts
r.incr("page:views:/home")
r.incrby("api:calls:user:42", 1)

# Compare-and-swap pattern using SET NX
acquired = r.set("lock:resource:1", "worker-1", nx=True, ex=30)
if acquired:
    try:
        # critical section
        pass
    finally:
        # only release if we own the lock
        current = r.get("lock:resource:1")
        if current == "worker-1":
            r.delete("lock:resource:1")

Hashes#

A hash is a map of field-value pairs stored under one key. Useful for objects where you want to read or update individual fields without fetching the whole object.

# Store user profile — update fields independently
r.hset("user:42", mapping={
    "name": "Alice",
    "email": "alice@example.com",
    "plan": "pro",
    "login_count": 0,
})

# Read one field without deserializing the whole object
plan = r.hget("user:42", "plan")

# Atomic field increment
r.hincrby("user:42", "login_count", 1)

# Read all fields
profile = r.hgetall("user:42")

Hashes are memory-efficient for small maps. Redis keeps a hash in a compact listpack encoding (ziplist before Redis 7.0) as long as it has at most 128 fields and no value exceeds 64 bytes, the defaults for hash-max-listpack-entries and hash-max-listpack-value; crossing either threshold converts it to a regular hashtable.
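
Under those default thresholds, the cutoff can be expressed as a small predicate. This helper is purely illustrative, not part of redis-py; in a live instance you can confirm the actual encoding with OBJECT ENCODING.

```python
# Default thresholds: hash-max-listpack-entries and hash-max-listpack-value
MAX_LISTPACK_ENTRIES = 128
MAX_LISTPACK_VALUE_BYTES = 64

def uses_compact_encoding(num_fields: int, longest_value_bytes: int) -> bool:
    # A hash stays in the compact listpack encoding only while both
    # limits hold; crossing either converts it to a hashtable.
    return (num_fields <= MAX_LISTPACK_ENTRIES
            and longest_value_bytes <= MAX_LISTPACK_VALUE_BYTES)
```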

Lists#

Ordered sequences with O(1) push/pop at both ends. The natural fit for queues and stacks.

# Producer: push tasks to the right
r.rpush("queue:emails", '{"to": "alice@example.com", "subject": "Welcome"}')

# Consumer: blocking pop — waits up to 30s for a new item
task = r.blpop("queue:emails", timeout=30)
if task:
    _, payload = task
    print("Processing:", payload)

BLPOP blocks until a message arrives, making it an efficient alternative to polling. For reliable processing where you need to acknowledge completion, use BLMOVE (which replaced the deprecated BRPOPLPUSH in Redis 6.2) or the Streams type.

Sets#

Unordered collections of unique strings with O(1) membership tests and set operations (union, intersection, difference).

# Track unique visitors per day
r.sadd("visitors:2026-04-01", "user:42", "user:99", "user:101")
r.sadd("visitors:2026-04-02", "user:42", "user:200")

# Returning visitors: intersection
returning = r.sinter("visitors:2026-04-01", "visitors:2026-04-02")

# New visitors today: difference
new_today = r.sdiff("visitors:2026-04-02", "visitors:2026-04-01")

# Random sampling — A/B test assignment
r.sadd("experiment:treatment", *range(1, 501))
r.sadd("experiment:control", *range(501, 1001))
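
Set-based assignment stores every member explicitly, which is what you want when you must enumerate a group. When you only need a stable split, a common alternative is deterministic hashing, which needs no storage at all; this helper is a hypothetical sketch, not a Redis feature:

```python
import hashlib

def assign_bucket(user_id: str, experiment: str, treatment_pct: int = 50) -> str:
    # Hash (experiment, user) so the assignment is stable across calls
    # and independent between experiments; no membership set needed.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    slot = int.from_bytes(digest[:8], "big") % 100
    return "treatment" if slot < treatment_pct else "control"
```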

Sorted Sets#

Like sets but each member has a floating-point score. Members are ordered by score. Provides O(log N) insertions and range queries by score or rank.

import time

# Leaderboard: score is the user's points
r.zadd("leaderboard:global", {"user:42": 4200, "user:99": 3800, "user:7": 5100})

# Top 10 players (highest score first)
top10 = r.zrevrange("leaderboard:global", 0, 9, withscores=True)

# User's rank by descending score (0 = top player)
rank = r.zrevrank("leaderboard:global", "user:42")

# Rate limiting: sliding window using score as timestamp
now = time.time()
window = 60  # seconds
max_requests = 100
key = "ratelimit:user:42"

pipe = r.pipeline()
pipe.zadd(key, {str(now): now})          # add current request; use a unique member suffix in production to avoid timestamp collisions
pipe.zremrangebyscore(key, 0, now - window)  # remove old entries
pipe.zcard(key)                          # count requests in window
pipe.expire(key, window)
results = pipe.execute()

request_count = results[2]
if request_count > max_requests:
    raise Exception("Rate limit exceeded")
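
The window arithmetic the pipeline performs can be sanity-checked in plain Python. This toy equivalent is a hypothetical helper, not part of redis-py; it mirrors ZREMRANGEBYSCORE followed by ZADD and ZCARD:

```python
def within_limit(timestamps, now, window=60, max_requests=100):
    # Drop entries older than the window (ZREMRANGEBYSCORE), add the
    # current request (ZADD), then count what remains (ZCARD).
    recent = [t for t in timestamps if t > now - window]
    recent.append(now)
    return len(recent) <= max_requests
```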

Streams#

Redis Streams (added in Redis 5.0) are an append-only log with consumer groups. They provide at-least-once delivery guarantees and persistent history.

# Producer
r.xadd("events:orders", {
    "order_id": "ord-123",
    "user_id": "user:42",
    "amount": "99.99",
})

# Consumer group setup (run once)
try:
    r.xgroup_create("events:orders", "billing-service", id="0", mkstream=True)
except redis.ResponseError:
    pass  # group already exists

# Consumer: read new messages
messages = r.xreadgroup(
    groupname="billing-service",
    consumername="worker-1",
    streams={"events:orders": ">"},  # ">" = undelivered messages
    count=10,
    block=5000,
)

for stream, entries in (messages or []):
    for msg_id, data in entries:
        print("Processing:", data)
        # Acknowledge after successful processing
        r.xack("events:orders", "billing-service", msg_id)

Practical Patterns#

Distributed Lock#

The Redlock algorithm uses multiple Redis nodes for fault-tolerant locking. For single-node deployments, the SET NX PX pattern is sufficient:

import uuid

def acquire_lock(r, resource: str, ttl_ms: int) -> str | None:
    token = str(uuid.uuid4())
    acquired = r.set(f"lock:{resource}", token, nx=True, px=ttl_ms)
    return token if acquired else None

def release_lock(r, resource: str, token: str) -> bool:
    script = """
    if redis.call('get', KEYS[1]) == ARGV[1] then
        return redis.call('del', KEYS[1])
    else
        return 0
    end
    """
    result = r.eval(script, 1, f"lock:{resource}", token)
    return bool(result)

The Lua script ensures the get-and-delete is atomic. Without it, a race condition between checking ownership and deleting is possible.
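
For ergonomic use, the acquire/release pair can be wrapped in a context manager. A minimal sketch, assuming redis-py's set and eval interfaces; the lock key format and error handling are illustrative:

```python
import uuid
from contextlib import contextmanager

@contextmanager
def redis_lock(r, resource: str, ttl_ms: int = 30_000):
    # Acquire with a unique token, or fail fast if another worker holds it
    token = str(uuid.uuid4())
    key = f"lock:{resource}"
    if not r.set(key, token, nx=True, px=ttl_ms):
        raise RuntimeError(f"could not acquire {key}")
    try:
        yield token
    finally:
        # Delete only if we still own the lock (atomic check via Lua)
        release = """
        if redis.call('get', KEYS[1]) == ARGV[1] then
            return redis.call('del', KEYS[1])
        else
            return 0
        end
        """
        r.eval(release, 1, key, token)
```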

Pub/Sub for Fan-out#

# Publisher
r.publish("channel:notifications", '{"type": "order_shipped", "order_id": "ord-123"}')

# Subscriber
pubsub = r.pubsub()
pubsub.subscribe("channel:notifications")

for message in pubsub.listen():
    if message["type"] == "message":
        print("Received:", message["data"])

Pub/Sub is fire-and-forget: messages are not persisted. If a subscriber is offline, messages are lost. Use Streams when you need persistence and consumer groups.

Best Practices#

  • Set TTLs on any key you do not explicitly expire or delete. Unbounded key growth will exhaust memory.
  • Use pipelines to batch multiple commands and reduce round-trip latency.
  • Use Lua scripts for compound operations that must be atomic.
  • Avoid KEYS * in production. Use SCAN with a cursor for iterating keys.
  • Separate Redis instances for cache and durable data. A cache can tolerate eviction (allkeys-lru); durable stores cannot.
  • Monitor used_memory, evicted_keys, and keyspace_hits to understand memory pressure and hit rates.

Conclusion#

Redis is more than a cache. Its data structures — hashes, sorted sets, lists, and streams — solve a range of problems that would otherwise require separate services. Understanding each type and its performance characteristics lets you replace complex infrastructure with a few focused Redis operations.
