
Redis fundamentals explained

Your API dashboard shows Postgres CPU at 94%: every product page fires six identical catalog queries. You add an in-process cache, but deploys wipe it and three app instances each hold stale copies.

Redis is an in-memory data structure server that sits beside your primary database: sub-millisecond reads, shared across every service instance, with rich primitives for caching, sessions, rate limits, leaderboards, and pub/sub. Twitter, GitHub, and Stack Overflow lean on Redis daily. It is not a replacement for PostgreSQL or MongoDB. It is a fast, ephemeral layer that fails loudly when you treat it as durable storage without configuring persistence.

This guide covers Redis data structures, key expiry and eviction policies, caching patterns (cache-aside, write-through), RDB snapshots vs AOF journaling, replication and Sentinel failover, Redis Cluster hash slots, common production use cases, an API gateway worked example, when Redis beats Memcached or an edge HTTP cache, pitfalls, and a production checklist.

What Redis is

Redis (Remote Dictionary Server) stores data in RAM as key-value pairs where values are typed data structures, not opaque blobs. A single redis-server process handles commands over TCP (default port 6379) using the Redis Serialization Protocol (RESP). Clients exist for every major language; redis-cli is the built-in shell for debugging.

Single-threaded command execution

The core command loop is single-threaded: one command at a time per instance, which eliminates lock contention and keeps latency predictable. I/O threads (Redis 6+) offload socket reads and writes, and forked child processes handle RDB snapshots and AOF rewrites, so neither blocks the hot path. Throughput scales by giving I/O threads more cores or by sharding with Redis Cluster; commands inside one instance still execute one at a time, so do not expect multi-core parallelism for arbitrary commands.

Memory-first, optional durability

All hot data lives in RAM. When memory fills, Redis applies an eviction policy (if configured) or rejects writes. Durability is optional: RDB point-in-time snapshots, AOF append-only logs, or both. Treat Redis as a cache with recovery options, not as your sole source of financial truth unless you have tested failover and backup restore drills.

Core data structures

Choosing the right structure avoids serializing JSON on every read and unlocks atomic operations Redis performs in O(1) or O(log N) time.

Strings

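The simplest type: counters (INCR), feature flags, serialized JSON blobs, and distributed locks via SET key value NX EX ttl. MGET and MSET batch hot paths into a single round trip; GETSET atomically swaps in a new value and returns the old one (SET ... GET supersedes it from Redis 6.2).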

Hashes

Field-value maps inside one key — ideal for user session objects or product attribute caches. HGETALL returns the full hash; HINCRBY atomically bumps a counter field without read-modify-write races in application code.

Lists

Doubly-linked lists supporting LPUSH/RPOP queues. Use them for lightweight job queues when you do not yet need Kafka-level durability. BRPOP, the blocking form of RPOP, lets consumers wait for work without polling.

Sets and sorted sets

Sets store unique members — tags, online-user ID sets, deduplication. Sorted sets (ZSET) pair members with scores; ZADD + ZRANGEBYSCORE power leaderboards, time-windowed rate-limit buckets, and priority queues. ZREVRANK returns a player’s rank in O(log N).
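The leaderboard pattern can be sketched roughly as follows. This assumes `r` is a redis-py client (`redis.Redis`) created with `decode_responses=True`; the key name `leaderboard:global` and the function names are illustrative only.

```python
LEADERBOARD_KEY = "leaderboard:global"  # hypothetical key name


def record_score(r, player_id, points):
    """Atomically add points to a player's score (ZINCRBY)."""
    return r.zincrby(LEADERBOARD_KEY, points, player_id)


def top_players(r, n=10):
    """Return the n highest-scoring (player, score) pairs (ZREVRANGE WITHSCORES)."""
    return r.zrevrange(LEADERBOARD_KEY, 0, n - 1, withscores=True)


def player_rank(r, player_id):
    """Return the 1-based rank, or None if the player has no score (ZREVRANK)."""
    rank = r.zrevrank(LEADERBOARD_KEY, player_id)
    return None if rank is None else rank + 1
```

Because ZINCRBY is atomic, concurrent score events from several app instances do not need any locking in application code.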

Streams, bitmaps, HyperLogLog

  • Streams — append-only log with consumer groups; Redis-native alternative to lightweight event buses.
  • Bitmaps — bit-level operations for daily active-user sets and bloom-filter-style membership at tiny memory cost.
  • HyperLogLog — approximate unique counts (cardinality) in ~12 KB regardless of set size.
  • Geospatial: GEOADD/GEORADIUS for nearby-store queries.

Keys, TTL, and eviction

Key naming and TTL

Use a consistent namespace: session:{userId}, cache:product:{sku}, ratelimit:{ip}:{minute}. Set time-to-live (TTL) on cache keys with EXPIRE or SET ... EX seconds. Sessions without TTL leak memory forever; caches without TTL serve stale data until manual invalidation.
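One way to keep the namespace consistent is to build every key through small helpers. The sketch below is illustrative: the function names are hypothetical, and the rate-limit key is bucketed by UTC minute as one reading of the `{minute}` placeholder.

```python
from datetime import datetime, timezone


def session_key(user_id):
    return f"session:{user_id}"


def product_cache_key(sku):
    return f"cache:product:{sku}"


def ratelimit_key(ip, now=None):
    """Bucket by UTC minute so each key covers exactly one fixed window."""
    now = now or datetime.now(timezone.utc)
    return f"ratelimit:{ip}:{now:%Y%m%d%H%M}"
```

Centralizing key construction also gives you a single place to document the convention and to change it later.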

Eviction policies

When maxmemory is reached, maxmemory-policy decides what to remove:

  • allkeys-lru: evict least-recently-used keys across the whole keyspace (a common choice for pure caches; Redis itself defaults to noeviction).
  • volatile-lru — evict LRU among keys with a TTL set.
  • allkeys-lfu / volatile-lfu — evict least-frequently-used (Redis 4+); better for skewed access patterns.
  • noeviction — return errors on write when full; use when data must not disappear silently.

Monitor evicted_keys and used_memory. Rising evictions under steady traffic mean your working set exceeds RAM — scale up, shard, or shorten TTLs.
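A monitoring job might reduce those two metrics to something alertable. The sketch below assumes `info` is a dict shaped like the one redis-py returns from `r.info()`; the function name and the returned field names are made up for illustration.

```python
def memory_report(info, prev_evicted=0):
    """Summarize memory pressure from an INFO-style dict.

    Only used_memory, maxmemory and evicted_keys are read. prev_evicted is
    the evicted_keys value seen on the previous poll.
    """
    used = info["used_memory"]
    limit = info.get("maxmemory", 0)
    evicted = info.get("evicted_keys", 0)
    return {
        # maxmemory 0 means "no limit", so a ratio is undefined
        "usage_ratio": (used / limit) if limit else None,
        "new_evictions": evicted - prev_evicted,
    }
```

Alerting on `new_evictions` staying above zero across several polls catches a working set that has outgrown RAM before users notice the extra cache misses.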

Caching patterns

Cache-aside (lazy loading)

The application checks Redis first. On miss, it reads Postgres, writes the result to Redis with a TTL, and returns. Invalidation happens on writes: delete the cache key or publish an invalidation event. Simple and resilient — if Redis dies, traffic falls through to the database (watch for thundering herds on cold start).
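As a minimal sketch of cache-aside, assuming `r` is a redis-py client and `load_from_db` is a hypothetical callable that runs the Postgres query and returns a JSON-serializable dict (or None):

```python
import json


def get_product(r, sku, load_from_db, ttl_seconds=300):
    """Cache-aside read: try Redis, fall back to the loader, then populate."""
    key = f"cache:product:{sku}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    product = load_from_db(sku)  # e.g. a SELECT against Postgres
    if product is not None:
        r.set(key, json.dumps(product), ex=ttl_seconds)
    return product


def invalidate_product(r, sku):
    """Call after a database write so the next read repopulates the cache."""
    r.delete(f"cache:product:{sku}")
```

Misses for products that do not exist are not cached here, so repeated lookups of unknown SKUs still reach the database; caching a short-lived "not found" marker is a possible extension.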

Write-through and write-behind

Write-through updates Redis and the database synchronously on every write — consistent but slower. Write-behind writes to Redis immediately and batches database persistence asynchronously — higher throughput, risk of data loss on crash. Reserve write-behind for analytics counters, not payment ledgers.

Stampede protection

When a hot key expires, thousands of requests may hit the database at once. Mitigations: probabilistic early expiration, per-key mutex locks (SET lock NX EX 5), or serving slightly stale data while one worker rebuilds the cache.
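Probabilistic early expiration can be sketched as a pure decision function. This follows the commonly described scheme (often called XFetch), in which each reader rolls a random "head start" proportional to how long a rebuild takes; the function and parameter names are illustrative, and `beta` above 1.0 makes early refreshes more eager.

```python
import math
import random
import time


def should_refresh_early(expires_at, rebuild_seconds, beta=1.0, now=None, rand=random.random):
    """Decide whether this request should rebuild the cache before expiry.

    The closer `now` gets to `expires_at`, the more likely a single caller
    volunteers, so rebuilds spread out instead of all landing at expiry.
    """
    now = time.time() if now is None else now
    u = 1.0 - rand()  # in (0, 1], avoids log(0)
    head_start = -rebuild_seconds * beta * math.log(u)
    return now + head_start >= expires_at
```

To use it, store the expiry time and the last measured rebuild duration alongside the cached value, and have the caller that gets True recompute and rewrite the key while everyone else keeps serving the cached copy.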

Persistence: RDB vs AOF

RDB snapshots

Redis forks a child process that writes a compact binary snapshot of memory at intervals (save 900 1 means snapshot if at least 1 key changed in the last 900 seconds). Restarts are fast and files are small, but you lose any writes made since the last snapshot. Good for caches where warm-up from Postgres is acceptable.

AOF (Append Only File)

Every write appends to a log. appendfsync always is safest; everysec (default) trades at most one second of loss for better throughput. AOF rewrite compacts the log periodically. Use AOF when Redis holds data you cannot easily reconstruct — session stores, rate-limit state across restarts.

Hybrid

Redis 4+ can combine RDB preamble with AOF for faster restarts. Test restore time on production-sized datasets — a 20 GB AOF replay can exceed your RTO.

High availability and scaling

Replication

A primary accepts writes; replicas asynchronously mirror the replication stream. Reads can scale out to replicas if you accept replication lag. Do not serve read-your-own-writes paths such as session checks from a replica: read from the primary, or issue WAIT after the write so that the required number of replicas has acknowledged it.

Redis Sentinel

Sentinel processes monitor primaries, perform automatic failover when a primary is unreachable, and publish the new primary address to clients. Run at least three Sentinel instances for quorum. Client libraries must support Sentinel discovery — hard-coding primary IPs defeats the purpose.

Redis Cluster

Data is partitioned across 16,384 hash slots. Each key maps to a slot via CRC16(key) mod 16384. Multi-key operations require all keys to be in the same slot. Hash tags arrange that: when a key contains {...}, only the text inside the braces is hashed, so {user1001}:profile and {user1001}:cart share a slot. Cluster adds horizontal write scaling but raises operational complexity; prefer a single primary with replicas until memory or CPU saturates.
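The slot mapping can be reproduced client-side in a few lines. The sketch below implements the XMODEM variant of CRC16 (polynomial 0x1021, initial value 0) and the hash-tag rule; it is meant for understanding and debugging key placement, not as a replacement for a cluster-aware client.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT, XMODEM variant: polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hash_slot(key: str) -> int:
    """Map a key to one of 16384 slots, honoring {hash tags}."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end != start + 1:  # ignore empty tags like "{}"
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384
```

With this, `hash_slot("{user1001}:profile")` and `hash_slot("{user1001}:cart")` return the same slot because only `user1001` is hashed, which is what makes a multi-key command over both keys legal in Cluster mode.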

Common production use cases

  • Session store — hash per session ID, TTL aligned with idle timeout, primary read for auth checks.
  • Rate limiting — sliding window counters in sorted sets or fixed windows with INCR + EXPIRE; pair with API rate-limiting at the gateway.
  • Distributed locks: SET lock NX EX with token verification on release; use Redlock only when you understand its limits.
  • Pub/Sub — fire-and-forget broadcast (not durable); for persisted events prefer Streams or Kafka.
  • Leaderboards and rankings — sorted sets with ZINCRBY on score events.
  • Feature flags — string keys toggled without redeploying app servers.
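The single-instance lock from the list above, with token verification on release, might look like the sketch below. It assumes `r` is a redis-py client; the `lock:` key prefix and function names are illustrative. The release step uses a small Lua script so that the compare and the delete happen atomically on the server.

```python
import uuid

# Delete the lock only if it still holds our token (compare-and-delete).
RELEASE_SCRIPT = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
end
return 0
"""


def acquire_lock(r, name, ttl_seconds=5):
    """Return a random token if the lock was taken, else None."""
    token = uuid.uuid4().hex
    if r.set(f"lock:{name}", token, nx=True, ex=ttl_seconds):
        return token
    return None


def release_lock(r, name, token):
    """Release only our own lock; returns True if a key was deleted."""
    return r.eval(RELEASE_SCRIPT, 1, f"lock:{name}", token) == 1
```

The token matters because a lock can expire while its holder is still working; without the check, a slow worker could delete a lock that now belongs to someone else. This sketch gives no guarantee across a failover, since replication is asynchronous.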

Worked example: API gateway cache and rate limits

A public REST API serves product details from Postgres. Requirements: p95 latency under 50 ms, 100 requests/minute per API key, session tokens valid for 24 hours.

Product cache

SET cache:product:SKU-42 '{"name":"Widget","price":19.99}' EX 300
GET cache:product:SKU-42
# On miss: SELECT from Postgres, then SET with EX 300

Five-minute TTL balances freshness with load. On product update webhook, run DEL cache:product:SKU-42.

Per-key rate limit (fixed window)

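# Key suffix is the minute bucket (YYYYMMDDHHMM), one counter per window
INCR ratelimit:apikey:abc123:202606081305
# Set the TTL only when INCR returns 1, so later hits do not extend the window
EXPIRE ratelimit:apikey:abc123:202606081305 60
# Reject if count > 100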

For smoother limits, store request timestamps in a sorted set and trim entries older than 60 seconds before counting — a sliding window at the cost of more memory per key.
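A sliding-window limiter along those lines could be sketched as below, assuming `r` is a redis-py client. The key layout `ratelimit:sliding:{api_key}` and the function name are hypothetical.

```python
import time
import uuid


def allow_request(r, api_key, limit=100, window_seconds=60, now=None):
    """Sliding-window check: True if this request fits within the limit."""
    now = time.time() if now is None else now
    key = f"ratelimit:sliding:{api_key}"  # hypothetical key layout
    # Drop timestamps that have aged out of the window.
    r.zremrangebyscore(key, 0, now - window_seconds)
    if r.zcard(key) >= limit:
        return False
    # Members must be unique per request; the score is the timestamp.
    r.zadd(key, {f"{now}:{uuid.uuid4().hex}": now})
    r.expire(key, window_seconds)  # idle keys clean themselves up
    return True
```

The four commands are issued separately here for readability, so two concurrent requests can both pass the count check. Wrapping them in a MULTI/EXEC transaction or a Lua script would close that gap if the limit has to be strict.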

Session hash

HSET session:sess_xyz user_id 1001 plan pro last_seen 1717852800
EXPIRE session:sess_xyz 86400

Auth middleware reads HGET session:{token} user_id from the primary. Deploy three app instances behind nginx — all share one Redis primary with one replica for read scaling of non-auth paths.

Redis vs alternatives

Need                                 | Redis                       | Alternative
-------------------------------------+-----------------------------+-----------------------------------------------
Simple key-value cache, sub-ms reads | Strong fit                  | Memcached (simpler, no persistence/structures)
Leaderboards, rate limits, pub/sub   | Native structures           | App-level logic in SQL (slower, more load)
Durable event log, replay            | Streams (limited retention) | Kafka, NATS JetStream
Static asset caching at edge         | Wrong layer                 | CDN / nginx proxy_cache
Primary transactional data           | Not recommended alone       | PostgreSQL, MongoDB
Single-process dev cache             | Overkill                    | In-memory LRU in app (lost on restart)

Redis complements your database stack — pair it with connection pooling (PgBouncer for Postgres) so cache misses do not exhaust DB connections during spikes.

Common pitfalls

  • No TTL on cache keys — memory grows until OOM or eviction surprises production.
  • Storing large values — multi-megabyte JSON blobs block the single thread; keep values small.
  • KEYS * in production — scans the entire keyspace; use SCAN iteratively.
  • Thundering herd on expiry — hot keys expiring simultaneously hammer the database.
  • Multi-key ops across Cluster slots: MGET fails with a CROSSSLOT error if its keys hash to different slots.
  • Treating Redis as the only database — persistence gaps and memory limits bite during incidents.
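  • Unbounded pub/sub subscribers: there is no backlog, so a slow consumer is disconnected once its output buffer exceeds client-output-buffer-limit and misses everything published in the meantime.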
  • No connection pooling — one TCP connection per HTTP request exhausts file descriptors.
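For the KEYS pitfall above, an incremental alternative might look like this sketch. It assumes `r` is a redis-py client talking to a single (non-Cluster) instance; the function name is illustrative, and the prefix is assumed to contain no glob metacharacters.

```python
def delete_by_prefix(r, prefix, batch_size=500):
    """Delete keys matching prefix* incrementally instead of using KEYS."""
    deleted = 0
    batch = []
    # scan_iter walks the keyspace cursor by cursor, so no single call blocks for long.
    for key in r.scan_iter(match=f"{prefix}*", count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += r.unlink(*batch)  # UNLINK frees memory in the background
            batch = []
    if batch:
        deleted += r.unlink(*batch)
    return deleted
```

SCAN may return a key more than once and can miss keys created during the iteration, so treat the result as best-effort cleanup and not as an exact snapshot.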

Production checklist

  • maxmemory set below available RAM (leave headroom for fork copy-on-write during RDB).
  • Eviction policy matches workload (allkeys-lru for caches, noeviction for mandatory data).
  • TTL on every ephemeral key; monitoring alerts on used_memory and evicted_keys.
  • Persistence configured and restore tested (RDB, AOF, or managed backup).
  • Replication with at least one replica; Sentinel or managed failover for production.
  • TLS and AUTH (or ACL users) enabled; not exposed on public interfaces.
  • Client connection pooling sized per app instance.
  • Slow-log threshold set; latency monitoring via LATENCY DOCTOR.
  • Key naming convention documented; forbidden commands blocked via ACL (FLUSHALL, KEYS).
  • Runbook for failover, memory spike, and cache warm-up after cold start.

Key takeaways

  • Redis is an in-memory data structure server — not a drop-in SQL replacement.
  • Strings, hashes, sorted sets, and streams map directly to cache, session, leaderboard, and queue patterns.
  • TTL and eviction policies define cache behavior; configure them before launch.
  • Cache-aside with stampede protection is the default pattern for read-heavy APIs.
  • Replication, Sentinel, and Cluster address HA and scale — choose the simplest option that fits.
