API gateway explained
An API gateway is the front door to your backend services — a
specialized reverse proxy that understands HTTP APIs, not just TCP connections. Where
a plain nginx config routes /api/users to one upstream and
/api/orders to another, a gateway adds policy: validate JWTs before
traffic hits your Node process, throttle abusive API keys, rewrite legacy paths to new
microservices, and aggregate three downstream calls into one mobile-friendly response.
Teams adopt gateways when a monolith splits into services and every team re-implementing
auth, CORS, and rate limits in each repo becomes unmaintainable. This guide explains
what gateways do that reverse proxies alone do not, how they fit into microservices
and BFF architectures, popular implementations, and a production checklist so you add
a gateway for the right reasons — not because a vendor slide deck said you should.
Gateway vs reverse proxy vs load balancer
These three layers overlap but solve different problems. A load balancer distributes traffic across identical instances of the same service — round-robin across four API pods. A reverse proxy terminates TLS, routes by hostname and path, and can cache static assets. An API gateway is a reverse proxy with API-aware features: OAuth/JWT validation plugins, per-consumer rate limits, request/response schema validation, API versioning headers, and sometimes protocol translation (REST to gRPC).
In practice you often stack them. Cloudflare or an AWS load balancer (an NLB at L4, an ALB at L7) handles DDoS mitigation and edge traffic distribution; nginx or Envoy sits behind it as the gateway; individual services run behind internal load balancers that only your VPC can reach. The gateway is the last hop clients see before your application logic, which makes it the place to enforce cross-cutting policies once instead of in every service.
Core capabilities
Routing and service discovery
Gateways map external paths to internal services. GET /v2/products
might route to a new catalog service while /v1/products still hits a
legacy monolith during migration. Dynamic routing integrates with Consul, etcd, or
Kubernetes service DNS so you do not hand-edit nginx configs every time a pod
reschedules. Header-based routing sends mobile clients to a lighter BFF while web
clients hit fuller payloads from the core API.
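The path-based routing described above can be sketched as a longest-prefix lookup. This is a minimal illustration, not how any particular gateway implements it; the `Route` type, the `match_route` helper, and the upstream hostnames are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    prefix: str    # external path prefix the gateway matches on
    upstream: str  # internal service address (hypothetical hostnames)


# Hypothetical routing table: /v2/products goes to the new catalog
# service while /v1/products still reaches the legacy monolith.
ROUTES = [
    Route("/v1/products", "http://legacy-monolith.internal:8080"),
    Route("/v2/products", "http://catalog.internal:8080"),
    Route("/v2", "http://core-api.internal:8080"),
]


def match_route(path: str, routes: List[Route] = ROUTES) -> Optional[Route]:
    """Return the longest-prefix route that matches on a path-segment boundary."""
    best = None
    for route in routes:
        # "/v2" must match "/v2" and "/v2/orders" but not "/v2products".
        on_boundary = path == route.prefix or path.startswith(route.prefix + "/")
        if on_boundary and (best is None or len(route.prefix) > len(best.prefix)):
            best = route
    return best
```

Matching on segment boundaries keeps a broad prefix such as `/v2` from accidentally capturing unrelated paths, and preferring the longest prefix lets a specific migration route override a catch-all.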
Authentication and authorization
The gateway validates credentials before forwarding. Common patterns:
- API keys in headers or query params — simple for server-to-server and partner integrations; rotate keys and scope them per consumer.
- JWT validation — verify signature, issuer, audience, and expiry at the edge using the IdP's JWKS endpoint. Downstream services trust headers the gateway injects (X-User-Id, X-Scopes) only if traffic cannot bypass the gateway. See our JWT guide for token structure and pitfalls.
- OAuth 2.0 token introspection — for opaque tokens, the gateway calls the authorization server to confirm validity on each request (cache results briefly to avoid hammering the IdP).
- mTLS — mutual TLS for B2B APIs where both client and server present certificates; common in financial and IoT integrations.
Authorization — deciding what a validated identity may do — can live at
the gateway (route-level ACLs) or in each service (fine-grained resource checks).
Gateways excel at coarse rules ("only admins reach /admin/*");
services should still enforce ownership ("user 42 can only read order 99 if they
placed it").
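As a rough sketch of the edge checks described above, the following assumes the token's signature has already been verified against the IdP's keys (that step is omitted here) and shows only the claim checks, a coarse route rule, and identity-header injection. The issuer, audience, scope names, and function names are hypothetical.

```python
import time
from typing import Optional

EXPECTED_ISSUER = "https://idp.example.com/"  # hypothetical IdP
EXPECTED_AUDIENCE = "orders-api"              # hypothetical audience
IDENTITY_HEADERS = {"x-user-id", "x-scopes"}  # headers only the gateway may set


def claims_are_valid(claims: dict, now: Optional[float] = None) -> bool:
    """Check issuer, audience, and expiry on an already signature-verified token."""
    now = time.time() if now is None else now
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    exp = claims.get("exp")
    return (
        claims.get("iss") == EXPECTED_ISSUER
        and EXPECTED_AUDIENCE in audiences
        and isinstance(exp, (int, float))
        and now < exp
    )


def route_allowed(path: str, claims: dict) -> bool:
    """Coarse route-level rule: only tokens carrying the admin scope reach /admin/*."""
    scopes = claims.get("scope", "").split()
    if path == "/admin" or path.startswith("/admin/"):
        return "admin" in scopes
    return True


def upstream_headers(client_headers: dict, claims: dict) -> dict:
    """Drop client-supplied identity headers, then inject gateway-verified ones."""
    cleaned = {
        name: value
        for name, value in client_headers.items()
        if name.lower() not in IDENTITY_HEADERS
    }
    cleaned["X-User-Id"] = str(claims["sub"])
    cleaned["X-Scopes"] = claims.get("scope", "")
    return cleaned
```

Fine-grained ownership checks (whether user 42 placed order 99) are deliberately absent: they need data the gateway does not have and belong in the service.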
Rate limiting and throttling
Protect backends from abuse and enforce commercial tiers. Gateways apply limits per
API key, IP, or JWT subject — 100 requests per minute on the free tier, 10,000 on
enterprise. Algorithms include fixed windows, sliding windows, and token buckets.
When limits trip, return 429 Too Many Requests with a
Retry-After header rather than letting overload cascade into database
connection exhaustion. Our rate limiting guide covers algorithm trade-offs; the
gateway is the natural enforcement point before traffic fans out to dozens of
microservices.
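One of the algorithms mentioned above, the token bucket, can be sketched in a few lines. This is an in-memory, single-process illustration under the assumption that the caller passes a monotonic clock reading; a real gateway tier would normally share counters across replicas. `TokenBucket` and `check_limit` are hypothetical names.

```python
import math
from typing import Dict, Tuple


class TokenBucket:
    """Allows bursts up to `capacity` and refills at `refill_per_sec`."""

    def __init__(self, capacity: int, refill_per_sec: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # start full
        self.updated = now

    def take(self, now: float) -> Tuple[bool, int]:
        """Try to spend one token; return (allowed, retry_after_seconds)."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True, 0
        # Seconds until one whole token is available, rounded up.
        wait = (1.0 - self.tokens) / self.refill_per_sec
        return False, math.ceil(wait)


def check_limit(buckets: Dict[str, TokenBucket], consumer: str, now: float,
                per_minute: int = 100) -> Tuple[int, Dict[str, str]]:
    """Return (status, extra_headers) for one request from `consumer`."""
    if consumer not in buckets:
        buckets[consumer] = TokenBucket(per_minute, per_minute / 60.0, now)
    allowed, retry_after = buckets[consumer].take(now)
    if allowed:
        return 200, {}
    return 429, {"Retry-After": str(retry_after)}
```

Keying the `buckets` dictionary by API key, IP, or JWT subject gives the per-consumer limits described above; `per_minute` would come from the consumer's tier.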
Request and response transformation
Gateways can strip internal headers, add correlation IDs for distributed tracing, map JSON field names for backward compatibility, or convert XML legacy payloads to JSON. Response aggregation — the backend-for-frontend (BFF) pattern — lets one mobile endpoint call user, cart, and inventory services in parallel and return a single JSON object, cutting round trips on high-latency networks.
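The aggregation idea can be illustrated with concurrent calls that are merged into one response. In this sketch the three `fetch_*` coroutines are hypothetical stubs standing in for HTTP calls to the user, cart, and inventory services; only the fan-out and merge logic is the point.

```python
import asyncio


# Hypothetical stubs: each would be an HTTP call to an internal service.
async def fetch_user(user_id: str) -> dict:
    await asyncio.sleep(0.01)
    return {"id": user_id, "name": "Ada"}


async def fetch_cart(user_id: str) -> dict:
    await asyncio.sleep(0.01)
    return {"items": 2}


async def fetch_inventory() -> dict:
    await asyncio.sleep(0.01)
    return {"in_stock": True}


async def mobile_home(user_id: str) -> dict:
    """Call the three services concurrently and return a single JSON-ready object."""
    user, cart, inventory = await asyncio.gather(
        fetch_user(user_id),
        fetch_cart(user_id),
        fetch_inventory(),
    )
    return {"user": user, "cart": cart, "inventory": inventory}
```

Because the calls run concurrently, the endpoint's latency is close to the slowest downstream call rather than the sum of all three, which is where the round-trip savings come from.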
Protocol and version management
Expose REST externally while services communicate via gRPC internally; the
gateway translates between them. Version by path (/v1/, /v2/), header
(Accept-Version: 2), or subdomain. Canary releases send a small share of traffic,
for example 5%, to a new upstream by weight, or pin specific users to it with a
header or cookie, which is safer than flipping a DNS record for all users at once.
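A canary split can be sketched with a stable hash so that a given user keeps landing on the same upstream between requests. This is one possible approach under stated assumptions, not a description of any specific gateway's feature; the `X-Canary` opt-in header and the `pick_upstream` function are hypothetical.

```python
import hashlib


def pick_upstream(headers: dict, user_id: str, canary_percent: int = 5) -> str:
    """Return "canary" or "stable" for this request."""
    # Explicit opt-in, e.g. for internal testers (hypothetical header name).
    if headers.get("X-Canary") == "1":
        return "canary"
    # Hash the user id into one of 100 buckets; the same id always gets the
    # same bucket, so a user does not flip between versions mid-session.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"
```

Raising `canary_percent` step by step widens the rollout, and setting it to 0 is an immediate rollback for everyone except explicit opt-ins.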
Managed vs self-hosted gateways
Managed gateways (AWS API Gateway, Google Apigee, Azure API Management) charge per request and integrate with cloud IAM. They shine for serverless stacks — API Gateway triggers Lambda without you running any proxy infrastructure — and for teams that want billing, developer portals, and API key management out of the box. Downsides: vendor lock-in, per-request costs that sting at scale, and config expressed in cloud-specific YAML or consoles that resist git-based review.
Self-hosted gateways run on your VMs or Kubernetes cluster:
- Kong — popular open-core gateway with a rich plugin ecosystem (auth, rate limiting, logging). Kong Gateway runs as containers; Kong Konnect adds a control plane for multi-cluster management.
- Envoy Gateway / Gloo — Envoy-based, Kubernetes-native, strong gRPC and service-mesh adjacency. Envoy is the data plane behind many meshes; a standalone gateway uses the same battle-tested proxy without full Istio complexity.
- Traefik — automatic discovery of Docker and Kubernetes services, Let's Encrypt integration, good for smaller teams already using Traefik as ingress.
- nginx with OpenResty / Lua — maximum control if your team already lives in nginx configs; you build auth and rate-limit logic yourself or via modules.
Choose managed when request volume is moderate, you want zero ops, and cloud integration matters. Choose self-hosted when you need custom plugins, predictable cost at millions of requests per day, or air-gapped deployments.
Where gateways fit in architecture
In a typical microservices layout, external clients hit one public gateway hostname. The gateway routes to domain-specific services (users, billing, notifications) that are not directly internet-exposed. Internal service-to-service calls may use a separate mesh or skip the public gateway entirely — east-west traffic between pods does not need the same API-key validation your mobile app requires.
A BFF per client type (web BFF, mobile BFF) is a thin service or gateway route that shapes responses for that platform. The web BFF might return verbose product descriptions; the mobile BFF returns thumbnails and prices only. Both share the same core services behind the gateway.
For monoliths, a gateway is often premature. A single app behind nginx with middleware for auth and rate limits is simpler until you have genuine multi-team service boundaries. Adding Kong before you have a second deployable service is complexity theater.
Security considerations
- Never trust client-supplied identity headers — strip X-User-Id on ingress; only the gateway may set it after JWT validation.
- TLS everywhere — terminate HTTPS at the gateway with modern cipher suites; use mTLS or private networking for gateway-to-service hops in zero-trust setups.
- Validate request size and content type — reject oversized bodies and unexpected Content-Type before they reach parsers in application code.
- WAF integration — many teams put a web application firewall (Cloudflare, AWS WAF) in front of the gateway for OWASP Top 10 pattern blocking.
- Secrets in gateway config — API key verification salts and JWKS cache credentials belong in a secrets manager, not git. Rotate gateway admin credentials separately from application DB passwords.
- CORS at the gateway — centralize CORS policy so individual services do not return conflicting Access-Control-Allow-Origin headers.
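The request size and content type check from the list above might look like the following pre-forwarding filter. It is a sketch that assumes header names have already been lowercased and that the body limit and allowed media types are policy choices; `MAX_BODY_BYTES`, `ALLOWED_TYPES`, and `precheck` are hypothetical names.

```python
from typing import Optional

MAX_BODY_BYTES = 1_000_000            # hypothetical 1 MB cap
ALLOWED_TYPES = {"application/json"}  # hypothetical allow-list
BODY_METHODS = {"POST", "PUT", "PATCH"}


def precheck(method: str, headers: dict) -> Optional[int]:
    """Return None to forward the request, or the HTTP status to reject it with.

    Assumes `headers` uses lowercase names.
    """
    if method.upper() not in BODY_METHODS:
        return None
    # Ignore parameters such as "; charset=utf-8" when comparing media types.
    media_type = headers.get("content-type", "").split(";")[0].strip().lower()
    if media_type not in ALLOWED_TYPES:
        return 415  # Unsupported Media Type
    length = headers.get("content-length")
    if length is not None:
        if not length.isdigit():
            return 400  # malformed Content-Length
        if int(length) > MAX_BODY_BYTES:
            return 413  # body too large
    # Bodies sent without Content-Length (chunked) still need a byte counter
    # while streaming; that part is not shown here.
    return None
```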
Observability and operations
Gateways see 100% of external API traffic — ideal for metrics, access logs, and
trace propagation. Emit structured logs with route name, consumer ID, latency, and
upstream status. Attach a W3C traceparent header at the gateway so
downstream services join a single distributed trace. Alert on elevated 5xx rates
per route, p99 latency spikes, and rate-limit saturation (many 429s may mean a
legitimate partner needs a higher quota, not that you are under attack).
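The trace propagation and structured logging described above can be sketched as follows. The `traceparent` layout (version, 32-hex trace ID, 16-hex parent ID, 2-hex flags) follows the W3C Trace Context format; the function names and log field names are hypothetical, and header names are assumed to be lowercased already.

```python
import json
import re
import secrets

TRACEPARENT_RE = re.compile(r"^00-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$")


def ensure_traceparent(headers: dict) -> str:
    """Continue a valid incoming trace or start a new one for this request."""
    incoming = headers.get("traceparent", "")
    if TRACEPARENT_RE.match(incoming) and set(incoming.split("-")[1]) != {"0"}:
        # Keep the caller's trace id and flags; the gateway gets its own span id.
        _, trace_id, _, flags = incoming.split("-")
    else:
        trace_id, flags = secrets.token_hex(16), "01"
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


def access_log_line(route: str, consumer_id: str, latency_ms: float,
                    upstream_status: int, traceparent: str) -> str:
    """Render one structured access-log entry as a JSON string."""
    return json.dumps(
        {
            "route": route,
            "consumer_id": consumer_id,
            "latency_ms": latency_ms,
            "upstream_status": upstream_status,
            "trace_id": traceparent.split("-")[1],
        },
        sort_keys=True,
    )
```

Logging the trace ID alongside route and consumer makes it possible to jump from a slow gateway log line to the matching spans in downstream services.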
Config changes are production events. Store gateway routes in git (Kong declarative
config, Envoy xDS snapshots, Terraform for AWS API Gateway) and run a config
validation step in CI before deploy. Blue-green gateway deployments (stand up a
parallel gateway tier, shift traffic via DNS or load balancer weights, roll back if
error rates climb) mirror the patterns in our CI/CD guide.
Common pitfalls
- Gateway as a god object — stuffing business logic, database queries, or complex aggregations into gateway plugins creates an undeployable monolith in Lua. Keep gateways thin; put domain logic in services.
- Single point of failure — run gateway replicas behind a load balancer; health-check them aggressively. A dead gateway takes down every API at once.
- Latency tax — every plugin (JWT verify, transform, log) adds milliseconds. Profile hot paths; cache JWKS keys and introspection results.
- Bypass routes — if internal services are reachable from the internet without passing the gateway, attackers skip your auth entirely. Network policies and private subnets matter as much as gateway config.
- Timeout mismatch — gateway timeout shorter than upstream causes false 504s; longer than client timeout wastes resources on abandoned requests. Align gateway, load balancer, and service timeouts deliberately.
- Drift between environments — staging gateway missing the production rate-limit plugin ships surprises on launch day. Promote config, not hand-tuned console clicks.
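The timeout-mismatch pitfall above lends itself to a small automated check that could run alongside other config validation. The sketch below assumes the three timeouts are available as plain numbers in seconds; `timeout_problems` is a hypothetical helper, not part of any gateway's tooling.

```python
from typing import List


def timeout_problems(client_s: float, gateway_s: float, upstream_s: float) -> List[str]:
    """Flag orderings that violate upstream <= gateway <= client."""
    problems = []
    if gateway_s < upstream_s:
        # The gateway gives up while the upstream may still answer in time.
        problems.append("gateway timeout is shorter than upstream: expect false 504s")
    if gateway_s > client_s:
        # The client has already gone away but the gateway keeps waiting.
        problems.append("gateway timeout exceeds client timeout: work continues for abandoned requests")
    return problems
```

An empty list means the ordering is consistent; for example, a 30-second client timeout, 29-second gateway timeout, and 25-second upstream timeout pass both checks.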
Production checklist
- Justify the gateway — multiple services or external API consumers with shared auth/throttle needs; not a solo monolith.
- Define routing table in version control — paths, upstreams, timeouts, and retry policy reviewed in PRs.
- Enforce auth at the edge — JWT/API key validation before upstream; strip spoofable identity headers on ingress.
- Configure per-consumer rate limits — return 429 with Retry-After; monitor saturation.
- Run at least two gateway instances — behind LB, with health checks and automatic failover.
- Propagate trace and request IDs — correlate gateway logs with service logs during incidents.
- Align timeouts end-to-end — client, gateway, LB, service.
- Lock down east-west paths — services not directly public; only the gateway faces the internet.
- Test failure modes — upstream down, slow, invalid token, oversized body; confirm correct status codes and no credential leakage in errors.
- Document the external API contract — OpenAPI spec published from the gateway's view of routes, not each team's internal README.
Key takeaways
- An API gateway is a policy-aware reverse proxy — routing plus auth, rate limits, transformation, and versioning in one place.
- Reverse proxies route traffic; gateways also enforce API contracts and centralize the cross-cutting concerns that would otherwise be duplicated in every service.
- Validate JWTs and API keys at the edge; services trust only gateway-injected identity on private networks.
- Managed gateways reduce ops; self-hosted (Kong, Envoy) wins at scale and customization.
- Keep gateways thin — no business logic; use BFFs for aggregation when needed.
- High availability and observability are non-negotiable — the gateway is your entire API's front door.
Related reading
- Reverse proxy explained — TLS termination, path routing, and nginx fundamentals beneath every gateway
- Microservices architecture explained — when service boundaries justify a gateway in the first place
- API rate limiting explained — algorithms, headers, and enforcement strategies at the edge
- Load balancing explained — how traffic distributes across gateway replicas and service instances