Guide
Redis fundamentals explained
Your API dashboard shows Postgres CPU at 94%: every product page fires six identical catalog queries. You add an in-process cache, but deploys wipe it, and three app instances each hold their own stale copy. Redis is an in-memory data structure server that sits beside your primary database: sub-millisecond reads, shared across every service instance, with rich primitives for caching, sessions, rate limits, leaderboards, and pub/sub. Twitter, GitHub, and Stack Overflow lean on Redis daily. It is not a replacement for PostgreSQL or MongoDB. It is a fast, ephemeral layer that loses data on restart when you treat it as durable storage without configuring persistence.
This guide covers Redis data structures, key expiry and eviction policies, caching patterns (cache-aside, write-through), RDB snapshots vs AOF journaling, replication and Sentinel failover, Redis Cluster hash slots, common production use cases, an API gateway worked example, when Redis beats Memcached or an edge HTTP cache, pitfalls, and a production checklist.
What Redis is
Redis (Remote Dictionary Server) stores data in RAM as key-value
pairs where values are typed data structures, not
opaque blobs. A single redis-server process handles commands
over TCP (default port 6379) using the Redis Serialization Protocol (RESP).
Clients exist for every major language; redis-cli is the built-in
shell for debugging.
Single-threaded command execution
The core command loop is single-threaded: each instance (or each shard in a cluster) executes one command at a time, which eliminates lock contention and keeps latency predictable. Optional I/O threads (Redis 6+) offload socket reads and writes, and forked child processes write RDB snapshots and AOF rewrites, so that work stays off the hot path. Throughput scales by enabling I/O threads on spare cores or by sharding with Redis Cluster, not by expecting multi-core parallelism inside one instance for arbitrary commands.
Memory-first, optional durability
All hot data lives in RAM. When memory fills, Redis applies an eviction policy (if configured) or rejects writes. Durability is optional: RDB point-in-time snapshots, AOF append-only logs, or both. Treat Redis as a cache with recovery options, not as your sole source of financial truth unless you have tested failover and backup restore drills.
Core data structures
Choosing the right structure avoids serializing JSON on every read and unlocks atomic operations Redis performs in O(1) or O(log N) time.
Strings
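The simplest type: counters (INCR), feature flags, serialized
JSON blobs, and distributed locks via SET key value NX EX ttl.
MGET/MSET batch several keys into one round trip; GETSET (deprecated since Redis 6.2 in favor of SET with the GET option) atomically swaps in a new value and returns the old one.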
Hashes
Field-value maps inside one key — ideal for user session objects or product
attribute caches. HGETALL returns the full hash; HINCRBY
atomically bumps a counter field without read-modify-write races in application code.
Lists
Doubly-linked lists supporting LPUSH/RPOP queues.
Use for lightweight job queues when you do not yet need
Kafka-level durability.
BRPOP, the blocking form of RPOP, lets consumers wait for work without polling.
Sets and sorted sets
Sets store unique members — tags, online-user ID sets,
deduplication. Sorted sets (ZSET) pair members with scores;
ZADD + ZRANGEBYSCORE power leaderboards, time-windowed
rate-limit buckets, and priority queues. ZREVRANK returns a player’s
rank in O(log N).
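The leaderboard use of sorted sets can be sketched as follows. This is a minimal illustration, assuming a redis-py style client is passed in (any object exposing zincrby and zrevrank with redis-py's argument order); the key name leaderboard:global and both helper names are hypothetical.

```python
from typing import Any, Optional

LEADERBOARD_KEY = "leaderboard:global"  # hypothetical key name


def record_points(client: Any, player: str, points: float) -> float:
    """Add points to a player's score and return the new total (ZINCRBY)."""
    return float(client.zincrby(LEADERBOARD_KEY, points, player))


def player_rank(client: Any, player: str) -> Optional[int]:
    """Return the 1-based rank by descending score, or None if unranked."""
    rank = client.zrevrank(LEADERBOARD_KEY, player)  # 0-based, None if missing
    return None if rank is None else rank + 1
```

Because ZINCRBY is atomic on the server, concurrent score events do not need application-level locking.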
Streams, bitmaps, HyperLogLog
- Streams: append-only log with consumer groups; Redis-native alternative to lightweight event buses.
- Bitmaps: bit-level operations for daily active-user sets and bloom-filter-style membership at tiny memory cost.
- HyperLogLog: approximate unique counts (cardinality) in ~12 KB regardless of set size.
- Geospatial: GEOADD/GEORADIUS for nearby-store queries.
Keys, TTL, and eviction
Key naming and TTL
Use a consistent namespace: session:{userId},
cache:product:{sku}, ratelimit:{ip}:{minute}.
Set time-to-live (TTL) on cache keys with EXPIRE or
SET ... EX seconds. Sessions without TTL leak memory forever; caches
without TTL serve stale data until manual invalidation.
Eviction policies
When maxmemory is reached, maxmemory-policy decides
what to remove:
- allkeys-lru: evict least-recently-used keys across the whole keyspace (a common choice for pure caches).
- volatile-lru: evict LRU among keys with a TTL set.
- allkeys-lfu / volatile-lfu: evict least-frequently-used (Redis 4+); better for skewed access patterns.
- noeviction: return errors on write when full (the server default); use when data must not disappear silently.
Monitor evicted_keys and used_memory. Rising evictions
under steady traffic mean your working set exceeds RAM — scale up, shard, or
shorten TTLs.
Caching patterns
Cache-aside (lazy loading)
The application checks Redis first. On miss, it reads Postgres, writes the result to Redis with a TTL, and returns. Invalidation happens on writes: delete the cache key or publish an invalidation event. Simple and resilient — if Redis dies, traffic falls through to the database (watch for thundering herds on cold start).
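The read and invalidation steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes a redis-py style client (get, set with ex=, delete) is passed in, and load_from_db is a hypothetical callable standing in for the Postgres query. The 300-second TTL is likewise an assumption.

```python
import json
from typing import Any, Callable, Optional

PRODUCT_TTL_SECONDS = 300  # assumed TTL; tune to your freshness needs


def get_product(
    client: Any,
    sku: str,
    load_from_db: Callable[[str], Optional[dict]],
) -> Optional[dict]:
    """Cache-aside read: try Redis first, fall back to the database on a miss."""
    key = f"cache:product:{sku}"
    cached = client.get(key)
    if cached is not None:
        return json.loads(cached)
    product = load_from_db(sku)  # hypothetical database loader
    if product is not None:
        # Populate the cache so the next read is served from Redis.
        client.set(key, json.dumps(product), ex=PRODUCT_TTL_SECONDS)
    return product


def invalidate_product(client: Any, sku: str) -> None:
    """Call after a write to the database so readers refetch fresh data."""
    client.delete(f"cache:product:{sku}")
```

Misses for products that do not exist are not cached here, so repeated lookups of unknown SKUs still reach the database; caching a short-lived negative marker is one way to address that.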
Write-through and write-behind
Write-through updates Redis and the database synchronously on every write — consistent but slower. Write-behind writes to Redis immediately and batches database persistence asynchronously — higher throughput, risk of data loss on crash. Reserve write-behind for analytics counters, not payment ledgers.
Stampede protection
When a hot key expires, thousands of requests may hit the database at once.
Mitigations: probabilistic early expiration, per-key mutex locks
(SET lock:{key} token NX EX 5), or serving slightly stale data while one
worker rebuilds the cache.
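Probabilistic early expiration can be sketched with one common formulation (sometimes called XFetch): each reader refreshes the value with a probability that rises as expiry approaches, so rebuilds are spread out instead of piling up at the expiry instant. The function name and parameters below are hypothetical; rebuild_seconds is an assumed estimate of how long recomputing the value takes.

```python
import math
import random
import time
from typing import Optional


def should_refresh_early(
    expires_at: float,
    rebuild_seconds: float,
    beta: float = 1.0,
    now: Optional[float] = None,
    rand: Optional[float] = None,
) -> bool:
    """Decide whether this reader should rebuild the cached value now.

    beta > 1 favors earlier refreshes, beta < 1 later ones.
    now and rand are injectable so the decision can be tested.
    """
    now = time.time() if now is None else now
    # 1 - random() lies in (0, 1], so log() is defined and <= 0.
    rand = 1.0 - random.random() if rand is None else rand
    # Subtracting a negative log pushes "now" forward by a random margin.
    return now - rebuild_seconds * beta * math.log(rand) >= expires_at
```

A reader that gets True rebuilds the value and resets the TTL; everyone else keeps serving the cached copy. The expiry timestamp has to be stored alongside the value (or derived from the key's remaining TTL) for this check to work.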
Persistence: RDB vs AOF
RDB snapshots
Redis forks a child process that writes a compact binary snapshot of memory at
intervals (save 900 1 = snapshot if at least 1 key changed in the last 900 seconds).
Fast restarts, small files, but you may lose writes since the last snapshot.
Good for caches where warm-up from Postgres is acceptable.
AOF (Append Only File)
Every write appends to a log. appendfsync always is safest;
everysec (default) trades at most one second of loss for better
throughput. AOF rewrite compacts the log periodically. Use AOF when Redis holds
data you cannot easily reconstruct — session stores, rate-limit state across
restarts.
Hybrid
Redis 4+ can combine RDB preamble with AOF for faster restarts. Test restore time on production-sized datasets — a 20 GB AOF replay can exceed your RTO.
High availability and scaling
Replication
A primary accepts writes; replicas asynchronously
mirror the replication stream. Reads can scale out to replicas if you accept
replication lag. A replica may not yet reflect a write you just made, so for session checks
read from the primary or use WAIT to confirm the write reached the replicas.
Redis Sentinel
Sentinel processes monitor primaries, perform automatic failover when a primary is unreachable, and publish the new primary address to clients. Run at least three Sentinel instances for quorum. Client libraries must support Sentinel discovery — hard-coding primary IPs defeats the purpose.
Redis Cluster
Data is partitioned across 16,384 hash slots. Each key maps
to a slot via CRC16(key) mod 16384. Multi-key operations require
keys in the same slot — use hash tags
({userId}:profile and {userId}:cart share a slot).
Cluster adds horizontal write scaling; operational complexity rises — prefer a
single primary with replicas until memory or CPU saturates.
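The slot mapping and hash-tag rule can be reproduced in a few lines, which is handy for checking ahead of time whether two keys will land in the same slot. This sketch assumes the CRC16 variant used by Redis Cluster (XMODEM: polynomial 0x1021, initial value 0); the function names are hypothetical.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16/XMODEM: polynomial 0x1021, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_hash_slot(key: str) -> int:
    """Return the cluster slot (0..16383) for a key, honoring hash tags."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty {...} section counts as a hash tag.
        if end != -1 and end > start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384
```

For example, key_hash_slot("{user1001}:profile") and key_hash_slot("{user1001}:cart") return the same slot because only user1001 is hashed, while a key with an empty tag such as foo{}{bar} is hashed in full.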
Common production use cases
- Session store: hash per session ID, TTL aligned with idle timeout, primary read for auth checks.
- Rate limiting: sliding window counters in sorted sets or fixed windows with INCR+EXPIRE; pair with API rate-limiting at the gateway.
- Distributed locks: SET lock token NX EX ttl with token verification on release; use Redlock only when you understand its limits.
- Pub/Sub: fire-and-forget broadcast (not durable); for persisted events prefer Streams or Kafka.
- Leaderboards and rankings: sorted sets with ZINCRBY on score events.
- Feature flags: string keys toggled without redeploying app servers.
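The distributed-lock item deserves a closer look, because releasing a lock with a plain DEL can remove a lock that has already expired and been taken by another worker. A minimal single-instance sketch, assuming a redis-py style client (set with nx= and ex=, and eval); the lock: key prefix and helper names are hypothetical, and this is not Redlock.

```python
import secrets
from typing import Any, Optional

# Lua runs atomically on the server: delete only if we still own the lock.
RELEASE_SCRIPT = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
end
return 0
"""


def acquire_lock(client: Any, name: str, ttl_seconds: int = 5) -> Optional[str]:
    """Try to take the lock; return the owner token, or None if it is held."""
    token = secrets.token_hex(16)
    if client.set(f"lock:{name}", token, nx=True, ex=ttl_seconds):
        return token
    return None


def release_lock(client: Any, name: str, token: str) -> bool:
    """Release the lock only if the stored token matches ours."""
    return client.eval(RELEASE_SCRIPT, 1, f"lock:{name}", token) == 1
```

The TTL bounds how long a crashed holder can block others, so the protected work should finish well within ttl_seconds.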
Worked example: API gateway cache and rate limits
A public REST API serves product details from Postgres. Requirements: p95 latency under 50 ms, 100 requests/minute per API key, session tokens valid for 24 hours.
Product cache
SET cache:product:SKU-42 '{"name":"Widget","price":19.99}' EX 300
GET cache:product:SKU-42
# On miss: SELECT from Postgres, then SET with EX 300
Five-minute TTL balances freshness with load. On product update webhook, run
DEL cache:product:SKU-42.
Per-key rate limit (fixed window)
INCR ratelimit:apikey:abc123:202606081305
EXPIRE ratelimit:apikey:abc123:202606081305 60
# Key suffix is the UTC minute (YYYYMMDDHHMM); reject if count > 100
For smoother limits, store request timestamps in a sorted set and trim entries older than 60 seconds before counting — a sliding window at the cost of more memory per key.
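The fixed-window commands can be wrapped in application code roughly as follows. This sketch assumes a redis-py style client (incr, expire) is passed in; allow_request is a hypothetical name, and the minute-granularity UTC key suffix is an assumption chosen to match the 100 requests/minute requirement.

```python
import time
from typing import Any, Optional

LIMIT_PER_MINUTE = 100
WINDOW_SECONDS = 60


def allow_request(client: Any, api_key: str, now: Optional[float] = None) -> bool:
    """Fixed-window limiter: count requests per API key per UTC minute."""
    now = time.time() if now is None else now
    window = time.strftime("%Y%m%d%H%M", time.gmtime(now))
    key = f"ratelimit:apikey:{api_key}:{window}"
    count = client.incr(key)
    if count == 1:
        # First hit in this window: schedule cleanup of the counter.
        client.expire(key, WINDOW_SECONDS)
    return count <= LIMIT_PER_MINUTE
```

INCR and EXPIRE are two separate commands here, so a crash between them could leave a counter without a TTL; wrapping both in a MULTI/EXEC transaction or a Lua script closes that gap.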
Session hash
HSET session:sess_xyz user_id 1001 plan pro last_seen 1717852800
EXPIRE session:sess_xyz 86400
Auth middleware reads HGET session:{token} user_id from the
primary. Deploy three app instances behind
nginx — all
share one Redis primary with one replica for read scaling of non-auth paths.
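The session commands above map onto application code along these lines. This is a sketch under stated assumptions: a redis-py style client (hset with mapping=, expire, hget) connected to the primary is passed in, and create_session and lookup_user_id are hypothetical helper names.

```python
import time
from typing import Any, Optional

SESSION_TTL_SECONDS = 86400  # 24 hours, per the requirements above


def create_session(
    client: Any, token: str, user_id: int, plan: str, now: Optional[int] = None
) -> None:
    """Store session fields in one hash and give the whole key a TTL."""
    now = int(time.time()) if now is None else now
    key = f"session:{token}"
    client.hset(key, mapping={"user_id": user_id, "plan": plan, "last_seen": now})
    client.expire(key, SESSION_TTL_SECONDS)


def lookup_user_id(client: Any, token: str) -> Optional[int]:
    """Return the user ID for a session token, or None if missing or expired."""
    value = client.hget(f"session:{token}", "user_id")
    return int(value) if value is not None else None
```

Auth middleware would call lookup_user_id on every request and treat None as unauthenticated. To get an idle timeout instead of a fixed 24-hour lifetime, call expire again on each successful lookup.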
Redis vs alternatives
| Need | Redis | Alternative |
|---|---|---|
| Simple key-value cache, sub-ms reads | Strong fit | Memcached (simpler, no persistence/structures) |
| Leaderboards, rate limits, pub/sub | Native structures | App-level logic in SQL (slower, more load) |
| Durable event log, replay | Streams (limited retention) | Kafka, NATS JetStream |
| Static asset caching at edge | Wrong layer | CDN / nginx proxy_cache |
| Primary transactional data | Not recommended alone | PostgreSQL, MongoDB |
| Single-process dev cache | Overkill | In-memory LRU in app (lost on restart) |
Redis complements your database stack — pair it with connection pooling (PgBouncer for Postgres) so cache misses do not exhaust DB connections during spikes.
Common pitfalls
- No TTL on cache keys: memory grows until OOM or eviction surprises production.
- Storing large values: multi-megabyte JSON blobs block the single thread; keep values small.
- KEYS * in production: scans the entire keyspace; use SCAN iteratively.
- Thundering herd on expiry: hot keys expiring simultaneously hammer the database.
- Multi-key ops across Cluster slots: MGET fails with a CROSSSLOT error if keys hash to different slots.
- Treating Redis as the only database: persistence gaps and memory limits bite during incidents.
- Unbounded pub/sub subscribers: slow consumers do not get backlog; messages are dropped.
- No connection pooling: one TCP connection per HTTP request exhausts file descriptors.
Production checklist
- maxmemory set below available RAM (leave headroom for fork copy-on-write during RDB).
- Eviction policy matches workload (allkeys-lru for caches, noeviction for mandatory data).
- TTL on every ephemeral key; monitoring alerts on used_memory and evicted_keys.
- Persistence configured and restore tested (RDB, AOF, or managed backup).
- Replication with at least one replica; Sentinel or managed failover for production.
- TLS and AUTH (or ACL users) enabled; not exposed on public interfaces.
- Client connection pooling sized per app instance.
- Slow-log threshold set; latency monitoring via LATENCY DOCTOR.
- Key naming convention documented; forbidden commands blocked via ACL (FLUSHALL, KEYS).
- Runbook for failover, memory spike, and cache warm-up after cold start.
Key takeaways
- Redis is an in-memory data structure server — not a drop-in SQL replacement.
- Strings, hashes, sorted sets, and streams map directly to cache, session, leaderboard, and queue patterns.
- TTL and eviction policies define cache behavior; configure them before launch.
- Cache-aside with stampede protection is the default pattern for read-heavy APIs.
- Replication, Sentinel, and Cluster address HA and scale — choose the simplest option that fits.