gRPC and Protocol Buffers explained: RPC, schemas, and streaming
gRPC is Google's open-source framework for remote procedure calls:
your service calls a function on another machine as if it were local, with a
strongly typed contract and a compact binary wire format. The payload language is
almost always Protocol Buffers (protobuf) — a schema-first
serialization format defined in .proto files and compiled into
client and server stubs in Go, Java, Python, Rust, and dozens of other languages.
gRPC rides on HTTP/2,
so one TCP connection can multiplex many concurrent RPCs, support flow control,
and carry metadata headers alongside message bodies. Inside a
microservices
mesh, gRPC is the default lingua franca for low-latency, high-throughput
service-to-service traffic. This guide covers how protobuf schemas work, the four
gRPC call types, deadlines and interceptors, versioning strategy, and when REST
or GraphQL
still wins for browser-facing APIs.
Why gRPC exists
JSON over HTTP/1.1 REST is excellent for public APIs that humans debug with curl. It is less ideal when fifty internal services exchange millions of small messages per second. JSON parsing is CPU-heavy, field names repeat in every message, and an HTTP/1.1 connection handles only one request at a time, which encourages connection sprawl unless you add pooling layers.
gRPC addresses three pain points at once:
- Contract-first APIs: the .proto file is the source of truth; breaking changes are caught at compile time instead of in production JSON parsing errors.
- Efficient encoding: protobuf uses field numbers and variable-length integers; payloads are typically 3–10x smaller than equivalent JSON and faster to serialize.
- First-class streaming: server streaming, client streaming, and bidirectional streams are part of the core model, not bolted on with WebSockets or SSE hacks.
The trade-off is ergonomics: protobuf binaries are not human-readable without tooling, browser support requires gRPC-Web proxies, and every consumer must regenerate stubs when schemas change. That is why gRPC dominates backend east-west traffic while REST and GraphQL remain common for north-south (client-to-server) APIs.
Protocol Buffers: schemas and wire format
A protobuf message is a typed struct. You declare it once in a
.proto file, then run protoc (the protocol compiler)
with a language plugin to generate structs and encode/decode helpers.
syntax = "proto3";

package orders.v1;

message CreateOrderRequest {
  string customer_id = 1;
  repeated LineItem items = 2;
  string idempotency_key = 3;
}

message LineItem {
  string sku = 1;
  int32 quantity = 2;
}

message CreateOrderResponse {
  string order_id = 1;
  int64 created_at_unix = 2;
}
Field numbers are permanent
The wire carries each field's numeric tag (1, 2, 3…), not its name.
Never reuse a field number after deletion, because old clients would
mis-decode the data. Reserve removed tags with reserved 3; and give new
fields new numbers. Unknown fields are skipped on read, which is how
backward-compatible evolution works: old binaries ignore fields they do not
know, and new binaries can populate new fields while old clients still function.
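To make the tag mechanics concrete, here is a minimal hand-rolled sketch of the wire format in Python. It handles only non-negative varints and UTF-8 strings, and the third field (note) is a hypothetical addition to LineItem made up for this illustration. Real code should use the generated classes rather than anything like this.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Return (value, next_position) for the varint starting at pos."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


VARINT, LEN = 0, 2  # the only two wire types this sketch handles


def encode_field(number: int, value) -> bytes:
    """Emit key (field number + wire type) followed by the value."""
    if isinstance(value, int):
        return encode_varint(number << 3 | VARINT) + encode_varint(value)
    data = value.encode("utf-8")
    return encode_varint(number << 3 | LEN) + encode_varint(len(data)) + data


def decode(buf: bytes, known: dict[int, str]) -> dict:
    """Decode the fields listed in `known`; skip every other field number."""
    out, pos = {}, 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x7
        if wire_type == VARINT:
            value, pos = decode_varint(buf, pos)
        elif wire_type == LEN:
            length, pos = decode_varint(buf, pos)
            value, pos = buf[pos:pos + length].decode("utf-8"), pos + length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if number in known:  # unknown numbers are parsed past and dropped
            out[known[number]] = value
    return out


# A newer writer adds hypothetical field 3; an older reader only knows 1 and 2.
payload = encode_field(1, "SKU-42") + encode_field(2, 3) + encode_field(3, "gift")
old_reader = decode(payload, {1: "sku", 2: "quantity"})
new_reader = decode(payload, {1: "sku", 2: "quantity", 3: "note"})
```

The old reader still recovers sku and quantity because it can step over field 3 using only the wire type. If field 3 were later reused with a different type, that same old reader would misinterpret the bytes, which is the reason for reserving deleted numbers.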
Scalar types and well-known types
Protobuf provides string, int32, int64,
bool, bytes, and floating-point types. Use
google.protobuf.Timestamp for instants and
google.protobuf.Duration for timeouts instead of rolling your own
epoch integers everywhere. For nullable semantics in proto3, use wrapper types
like google.protobuf.StringValue or the optional
keyword (enabled by default since protoc 3.15).
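As a small illustration of what google.protobuf.Timestamp holds, the sketch below converts a timezone-aware datetime into the (seconds, nanos) pair the message stores: whole seconds since the Unix epoch plus a non-negative nanosecond remainder. The function name to_timestamp_parts is hypothetical; generated code in most languages ships its own conversion helpers, which are preferable in practice.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_timestamp_parts(dt: datetime) -> tuple[int, int]:
    """Split an aware datetime into Timestamp-style (seconds, nanos)."""
    if dt.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; attach a timezone")
    delta = dt - EPOCH
    # timedelta normalizes so that seconds and microseconds are never negative,
    # which matches Timestamp's rule that nanos always counts forward in time.
    seconds = delta.days * 86_400 + delta.seconds
    nanos = delta.microseconds * 1_000
    return seconds, nanos
```

Python datetimes carry microsecond precision, so the nanos value produced here is always a multiple of 1,000.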
Services in the same file
RPC methods are declared alongside messages:
service OrderService {
  rpc CreateOrder(CreateOrderRequest) returns (CreateOrderResponse);
  rpc StreamOrderEvents(OrderEventsRequest) returns (stream OrderEvent);
}
The compiler generates client stubs and server interfaces or base classes
(the exact shape depends on the target language).
Your server implements CreateOrder; the framework handles
framing, compression, and status codes.
The four gRPC call types
Every gRPC method is one of four patterns. Picking the right one avoids polling loops and oversized single responses.
Unary
One request, one response — the gRPC equivalent of a normal function call or
REST POST. Most CRUD operations fit here: CreateOrder,
GetBalance, ValidateToken.
Server streaming
Client sends one request; server streams many responses. Use for log tailing, large result sets that should arrive incrementally, or live price feeds. The client reads messages until the server closes the stream or returns an error status.
Client streaming
Client streams many requests; server returns one aggregated response. Common for bulk uploads, batched metric ingestion, or file chunk assembly where the server acknowledges once all pieces arrive.
Bidirectional streaming
Both sides send a sequence of messages independently — chat protocols, collaborative editing sync, or game state replication between backend shards. Ordering is per-stream; you design application-level heartbeats and backpressure because either side can stall.
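The four shapes can be modeled with plain Python functions and generators. This is an in-process sketch with hypothetical handler names and no networking. Generated Python servicer methods follow a broadly similar pattern (iterators in, generators out), but treat that resemblance as an assumption to check against your own stubs.

```python
from typing import Iterable, Iterator


def unary(request: str) -> str:
    """One request in, one response out."""
    return request.upper()


def server_streaming(request: int) -> Iterator[int]:
    """One request in, a stream of responses out."""
    for i in range(request):
        yield i  # the caller can consume each item as soon as it is produced


def client_streaming(requests: Iterable[int]) -> int:
    """A stream of requests in, one aggregated response out."""
    return sum(requests)


def bidirectional(requests: Iterable[str]) -> Iterator[str]:
    """Responses are produced while requests are still arriving."""
    for message in requests:
        yield f"ack:{message}"
```

The bidirectional handler here answers each message in lockstep, which is only one possible policy. In a real bidirectional stream the two directions are independent, so a server may send several responses per request or none at all.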
HTTP/2 under the hood
gRPC maps each RPC to an HTTP/2 stream. Request and response metadata travel
in HEADERS frames (content-type application/grpc, custom
key-value pairs like auth tokens). Message bodies are length-prefixed protobuf
blobs in DATA frames. A single HTTP/2 connection between two pods can carry
hundreds of concurrent unary calls without opening hundreds of TCP sockets —
a major win behind
load balancers
compared to HTTP/1.1 keep-alive pools.
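The length prefix mentioned above is small enough to sketch directly. Each message in a DATA frame is preceded by a one-byte compression flag and a four-byte big-endian length. The helper names frame and unframe are hypothetical, and the sketch ignores HTTP/2 itself, so it shows only how messages are delimited inside the byte stream.

```python
import struct

HEADER = struct.Struct(">BI")  # 1-byte compressed flag + 4-byte big-endian length


def frame(message: bytes, compressed: bool = False) -> bytes:
    """Prefix a serialized message with the 5-byte gRPC message header."""
    return HEADER.pack(int(compressed), len(message)) + message


def unframe(buf: bytes) -> list[tuple[bool, bytes]]:
    """Split a byte stream back into (compressed, message) pairs."""
    messages, pos = [], 0
    while pos < len(buf):
        if len(buf) - pos < HEADER.size:
            raise ValueError("truncated message header")
        flag, length = HEADER.unpack_from(buf, pos)
        pos += HEADER.size
        if len(buf) - pos < length:
            raise ValueError("truncated message body")
        messages.append((bool(flag), buf[pos:pos + length]))
        pos += length
    return messages
```

Because every message carries its own length, a streaming RPC can send many messages on one HTTP/2 stream and the receiver can still find the boundaries.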
gRPC defines its own status codes, carried in the grpc-status trailer rather
than in the HTTP status line (which is normally 200 even for a failed call):
NOT_FOUND, DEADLINE_EXCEEDED,
RESOURCE_EXHAUSTED, UNAVAILABLE, and others. Clients should
distinguish retryable codes (UNAVAILABLE, and ABORTED when the whole operation is retried)
from permanent failures (INVALID_ARGUMENT,
PERMISSION_DENIED), the same discipline as
idempotent retries
on REST.
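A minimal sketch of that retry discipline follows. The Code enum lists a subset of the standard numeric status codes; RpcError and call_with_retry are hypothetical names standing in for whatever your client library raises and however you wrap calls. The retryable set mirrors the split described above and is an assumption to tune per service.

```python
import random
import time
from enum import IntEnum


class Code(IntEnum):
    """A subset of gRPC status codes with their standard numeric values."""
    OK = 0
    INVALID_ARGUMENT = 3
    DEADLINE_EXCEEDED = 4
    NOT_FOUND = 5
    PERMISSION_DENIED = 7
    RESOURCE_EXHAUSTED = 8
    ABORTED = 10
    UNAVAILABLE = 14


# Assumption: only these codes are worth retrying automatically.
RETRYABLE = frozenset({Code.UNAVAILABLE, Code.ABORTED})


class RpcError(Exception):
    """Hypothetical error type standing in for a client library's exception."""

    def __init__(self, code: Code):
        super().__init__(code.name)
        self.code = code


def call_with_retry(rpc, max_attempts=4, base_delay=0.05, sleep=time.sleep):
    """Run rpc(); retry retryable failures with jittered exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return rpc()
        except RpcError as err:
            if err.code not in RETRYABLE or attempt == max_attempts:
                raise  # permanent failure, or retry budget exhausted
            # Full jitter: sleep a random amount up to the current backoff cap.
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
```

Wrapping a call looks like call_with_retry(lambda: stub_call(request)), where stub_call is whatever function issues the RPC. This is only safe for RPCs that are idempotent.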
Deadlines, cancellation, and interceptors
Every gRPC call should carry a deadline (absolute time) or timeout (relative duration). When the deadline passes, the client cancels the RPC and the server should stop work — propagating the same deadline to downstream calls prevents one slow leaf service from pinning thread pools across the chain. This pairs naturally with circuit breakers at the client stub layer.
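Deadline propagation can be sketched as one absolute expiry that every hop converts back into a relative budget. The Deadline class and grpc_timeout helper below are hypothetical; the only protocol detail assumed is the grpc-timeout header format, an integer of at most eight digits followed by a unit letter, with m meaning milliseconds.

```python
import time


class Deadline:
    """An absolute expiry computed once and shared by every downstream call."""

    def __init__(self, timeout_s: float, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + timeout_s

    def remaining(self) -> float:
        """Seconds left in the budget, never negative."""
        return max(0.0, self._expires_at - self._clock())

    def expired(self) -> bool:
        return self.remaining() == 0.0


def grpc_timeout(deadline: Deadline) -> str:
    """Render the remaining budget as a grpc-timeout value in milliseconds."""
    if deadline.expired():
        # Fail fast instead of sending a call that cannot finish in time.
        raise TimeoutError("deadline exceeded before the downstream call")
    millis = max(1, int(deadline.remaining() * 1000))  # round down, never 0
    return f"{min(millis, 99_999_999)}m"  # the value is capped at 8 digits
```

A handler that received a 500 ms budget and spent 250 ms on its own work would hand the next service roughly 250m, so the leaf service never works past the point where the original caller has given up.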
Interceptors (middleware for gRPC) wrap unary and streaming calls on both client and server. Typical uses: inject trace IDs for distributed tracing, attach JWT validation, log latency histograms, enforce rate limits, and redact sensitive fields. Keep interceptors fast — they run on every RPC.
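The interceptor idea is ordinary function wrapping. The sketch below models a handler as a function of (request, metadata) and composes two interceptors around it. All names are hypothetical, including the x-trace-id metadata key; real gRPC libraries define their own interceptor interfaces, so this shows the pattern rather than any library's API.

```python
import time
import uuid
from typing import Callable

Handler = Callable[[dict, dict], dict]  # (request, metadata) -> response
Interceptor = Callable[[Handler], Handler]


def with_trace_id(next_handler: Handler) -> Handler:
    """Ensure every call carries a trace ID, reusing an incoming one if present."""
    def handler(request: dict, metadata: dict) -> dict:
        metadata.setdefault("x-trace-id", uuid.uuid4().hex)
        return next_handler(request, metadata)
    return handler


def with_latency(record: Callable[[float], None]) -> Interceptor:
    """Build an interceptor that reports each call's duration to `record`."""
    def interceptor(next_handler: Handler) -> Handler:
        def handler(request: dict, metadata: dict) -> dict:
            start = time.perf_counter()
            try:
                return next_handler(request, metadata)
            finally:
                record(time.perf_counter() - start)  # runs on errors too
        return handler
    return interceptor


def chain(handler: Handler, *interceptors: Interceptor) -> Handler:
    """Wrap handler so the first interceptor listed runs outermost."""
    for interceptor in reversed(interceptors):
        handler = interceptor(handler)
    return handler
```

Ordering matters: placing the trace interceptor first means every later interceptor, and the handler itself, can rely on the trace ID being present.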
Versioning and compatibility
Treat .proto files like database schemas: additive changes are
safe, renames and type changes are breaking unless you version the package.
Common patterns:
- Package versioning: orders.v1, orders.v2 as separate services or packages; run both during migration windows.
- Field addition only: new optional fields with new numbers; never change the type of an existing field.
- Deprecation annotations: mark fields [deprecated = true] and document removal timelines in changelogs.
- Buf or prototool linting: CI checks that block field-number reuse and enforce style before merge.
Protobuf decoders skip unknown fields, much as lenient JSON clients ignore unknown keys, but a field is identified on the wire only by its number and wire type, so server and client must agree on what each number means. Rolling deploys therefore require backward-compatible schema changes first, then consumer updates, then producer cleanup: the same staged dance as database migrations.
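The compatibility rules in this section can be expressed as a toy check over two schema snapshots. This sketch represents a message as a plain dict from field number to (name, type) and uses the hypothetical function name breaking_changes. Real linters such as Buf work on compiled descriptors and cover far more cases.

```python
Schema = dict[int, tuple[str, str]]  # field number -> (field name, field type)


def breaking_changes(old: Schema, new: Schema, reserved: frozenset = frozenset()) -> list[str]:
    """List changes between two versions of one message that break consumers."""
    problems = []
    for number, (name, ftype) in sorted(old.items()):
        if number not in new:
            if number not in reserved:
                problems.append(f"field {number} ({name}) removed without 'reserved {number};'")
            continue
        new_name, new_type = new[number]
        if new_type != ftype:
            problems.append(f"field {number} changed type {ftype} -> {new_type}")
        if new_name != name:
            # Wire-compatible, but generated code and JSON mappings change.
            problems.append(f"field {number} renamed {name} -> {new_name}")
    for number in sorted(new.keys() & reserved):
        problems.append(f"field {number} reuses a reserved number")
    return problems


v1 = {1: ("customer_id", "string"), 2: ("items", "LineItem"), 3: ("idempotency_key", "string")}
additive = {**v1, 4: ("coupon_code", "string")}  # hypothetical new field
retyped = {**v1, 3: ("idempotency_key", "int64")}
```

Here breaking_changes(v1, additive) returns an empty list, while the retyped variant is flagged. Running a check of this kind in CI against the previously released schema is what turns the rules above into an enforced gate.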
gRPC vs REST vs GraphQL
| Concern | gRPC + protobuf | REST + JSON | GraphQL |
|---|---|---|---|
| Browser clients | Needs gRPC-Web + proxy | Native | Native |
| Payload size / speed | Excellent | Moderate | Moderate (often POST) |
| Streaming | Built-in four modes | SSE / chunked hacks | Subscriptions (server push) |
| Contract enforcement | Strong (codegen) | OpenAPI optional | Schema required |
| Debugging with curl | Hard (use grpcurl) | Easy | Moderate |
| Best fit | Internal microservices | Public HTTP APIs | Flexible client queries |
Many teams expose REST or GraphQL at the edge and translate to gRPC behind an API gateway. That keeps developer experience friendly for mobile and web while preserving efficient binary RPC between core services. Our REST API design guide covers the public-surface patterns gRPC intentionally does not optimize for.
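One concrete piece of that edge translation is turning gRPC status codes into HTTP statuses for REST clients. The table below is one commonly used mapping, broadly following the guidance published alongside the google.rpc.Code definitions. Treat it as an assumption to verify against your gateway's documentation; the function name http_status is hypothetical.

```python
# gRPC status name -> HTTP status returned by the edge gateway.
GRPC_TO_HTTP = {
    "OK": 200,
    "INVALID_ARGUMENT": 400,
    "UNAUTHENTICATED": 401,
    "PERMISSION_DENIED": 403,
    "NOT_FOUND": 404,
    "ALREADY_EXISTS": 409,
    "ABORTED": 409,
    "RESOURCE_EXHAUSTED": 429,
    "UNIMPLEMENTED": 501,
    "UNAVAILABLE": 503,
    "DEADLINE_EXCEEDED": 504,
}


def http_status(grpc_code: str) -> int:
    """Translate a gRPC status name; anything unmapped becomes a 500."""
    return GRPC_TO_HTTP.get(grpc_code, 500)
```

Defaulting to 500 keeps unexpected internal codes from leaking to public clients as misleading 4xx responses.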
Security and operations
gRPC has built-in TLS support, and mutual TLS (mTLS) is standard in service meshes like Istio and Linkerd: each workload presents a certificate, and the mesh encrypts east-west traffic without application code changes. For authentication claims, pass bearer tokens in metadata headers and validate them in server interceptors, the same way you would in REST middleware.
Operational gotchas to plan for:
- L7 load balancing: naive TCP load balancers pin each HTTP/2 connection to one backend; use client-side load balancing, a service mesh, or proxies that understand gRPC routing.
- Health checks: implement the standard grpc.health.v1.Health service so orchestrators mark pods ready only when dependencies are warm.
- Reflection: enable server reflection in dev so grpcurl can discover methods without local .proto files; disable in production unless tightly access-controlled.
- Message size limits: many implementations cap inbound messages at 4 MB by default; raise the limit consciously for bulk transfers and enforce it at ingress.
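The load-balancing gotcha is easy to see in a toy simulation. The sketch below compares connection-level balancing, where each long-lived HTTP/2 connection is assigned a backend once, with per-RPC balancing. The function names and the round-robin policy are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import cycle


def connection_level(backends: list[str], clients: int, rpcs_per_client: int) -> Counter:
    """L4 behavior: a backend is chosen once per connection, then reused."""
    load, picker = Counter(), cycle(backends)
    for _ in range(clients):
        backend = next(picker)  # decided at connect time only
        load[backend] += rpcs_per_client
    return load


def request_level(backends: list[str], clients: int, rpcs_per_client: int) -> Counter:
    """L7 behavior: a backend is chosen for every individual RPC."""
    load, picker = Counter(), cycle(backends)
    for _ in range(clients * rpcs_per_client):
        load[next(picker)] += 1
    return load


pods = ["pod-a", "pod-b", "pod-c"]
pinned = connection_level(pods, clients=2, rpcs_per_client=300)
spread = request_level(pods, clients=2, rpcs_per_client=300)
```

With two clients and three pods, the pinned case leaves pod-c idle while the other two each absorb 300 RPCs. The request-level case gives each pod 200.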
Production checklist
- Define services and messages in versioned .proto packages; lint in CI.
- Generate stubs in both client and server repos (or a shared schema repo) on every schema merge.
- Set per-RPC deadlines; propagate deadlines across downstream gRPC calls.
- Add client interceptors for retries (idempotent RPCs only), tracing, and metrics.
- Use server interceptors for auth, logging, and panic recovery.
- Enable TLS; prefer mTLS for service-to-service traffic in production.
- Implement health checks and graceful shutdown (drain in-flight RPCs on SIGTERM).
- Load-test streaming paths separately — backpressure bugs show up only under sustained streams.
- Document which RPCs are idempotent and safe to retry; use idempotency keys in request messages where needed.
- Keep a REST or GraphQL edge if browser clients exist; do not expose native gRPC to browsers without a gRPC-Web proxy.
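The idempotency-key item on the checklist can be sketched as a server-side lookup keyed on the idempotency_key field from CreateOrderRequest. The OrderStore class is hypothetical and keeps state in memory with no locking. A production version would more likely rely on a unique constraint in a database.

```python
import uuid


class OrderStore:
    """Remember the result of each idempotency key so retries are harmless."""

    def __init__(self):
        self._order_by_key: dict[str, str] = {}

    def create_order(self, customer_id: str, idempotency_key: str) -> str:
        existing = self._order_by_key.get(idempotency_key)
        if existing is not None:
            return existing  # a retry: return the original result, create nothing
        order_id = uuid.uuid4().hex
        self._order_by_key[idempotency_key] = order_id
        return order_id
```

With this in place, a client that retries CreateOrder after an UNAVAILABLE error gets the original order ID back instead of creating a duplicate order.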
Key takeaways
- gRPC is contract-first RPC over HTTP/2 with protobuf payloads — optimized for fast, typed service-to-service calls.
- Protobuf schemas use numbered fields; evolve APIs by adding fields, not reusing numbers or changing types.
- Unary, server-streaming, client-streaming, and bidirectional streaming cover most backend communication patterns without polling.
- Deadlines, interceptors, and proper status-code retry logic are non-optional for production resilience.
- Use gRPC internally; keep REST or GraphQL at the public boundary unless you control every client.
Related reading
- REST API design explained — resource modeling and HTTP semantics for public APIs
- Microservices architecture explained — service boundaries and sync vs async communication
- GraphQL API design explained — flexible queries when clients need shape control
- HTTP/2 and HTTP/3 explained — multiplexing and transport features gRPC builds on