Kubernetes fundamentals explained
Kubernetes (often abbreviated K8s) is an open-source container orchestration platform. It takes the Docker images your CI/CD pipeline builds and runs them reliably across a fleet of machines: scheduling workloads, restarting failed containers, rolling out new versions without downtime, and exposing services to the network. Kubernetes is powerful but operationally heavy. This guide covers the core objects and workflows you need to read a production cluster, ship a stateless web service, and decide when simpler hosting is the smarter bet.
What problem Kubernetes solves
Running one container on one VM is easy. Running hundreds of microservices across dozens of nodes — with health checks, autoscaling, secret rotation, and zero-downtime deploys — is not. Before orchestrators, teams wrote brittle shell scripts, custom systemd units, or glued together load balancers by hand. Kubernetes provides a declarative API: you describe the desired state ("three replicas of API v2.4.1, 512 MiB RAM each, reachable on port 8080") and the control plane continuously reconciles reality toward that spec.
Under the hood, each workload still runs inside Linux containers — isolated processes sharing the host kernel. Kubernetes adds scheduling, networking, storage plugins, RBAC, and a uniform object model so platform teams can enforce standards instead of every service inventing its own deploy script.
When Kubernetes is worth it
- You run many stateless services that need independent scaling and frequent deploys.
- You have (or can hire) platform engineering capacity to operate the control plane or pay for managed K8s (EKS, GKE, AKS).
- You need multi-cloud or hybrid portability with a common deployment format.
When to skip it
- A single monolith or a handful of cron jobs — managed PaaS (Fly.io, Render, Cloud Run) or even systemd on a VPS is simpler.
- Early-stage products where operational complexity slows shipping more than it helps reliability.
- Stateful databases — run Postgres on dedicated instances or a managed service; do not treat K8s as a database host unless you know exactly why.
Cluster architecture at a glance
A Kubernetes cluster has two logical halves:
- Control plane: the brain. Components include the API server (everything talks to it), etcd (cluster state store), scheduler (assigns pods to nodes), and controller manager (runs reconciliation loops for Deployments, Services, etc.).
- Worker nodes: machines that run your containers. Each node runs kubelet (the agent that starts and stops pods), a container runtime (containerd, CRI-O), and usually kube-proxy (implements Service networking rules).
Managed offerings hide most control-plane toil, but you still reason about nodes: capacity, taints/tolerations, zone spread, and kubelet version skew.
Namespaces partition objects inside one cluster. Common patterns are prod, staging, and team-specific namespaces with RBAC boundaries.
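As a rough sketch of the namespace-plus-RBAC pattern described above, the manifest below creates a staging namespace and grants one team edit rights inside it only. The group name payments-team and the binding name are hypothetical; the edit ClusterRole is one of the built-in user-facing roles.

```yaml
# Illustrative only: namespace and group names are assumptions.
apiVersion: v1
kind: Namespace
metadata:
  name: staging
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: payments-team-edit        # hypothetical binding name
  namespace: staging              # the grant applies to this namespace only
subjects:
  - kind: Group
    name: payments-team           # assumed group from your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit                      # built-in role: read/write most namespaced objects
  apiGroup: rbac.authorization.k8s.io
```

Because a RoleBinding is namespaced, the same team gets no access to prod unless a separate binding exists there.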
Pods: the smallest deployable unit
A Pod wraps one or more containers that share a network namespace (one IP, one port space) and can share storage volumes. Most pods run a single application container plus optional sidecars (Envoy proxy, log shipper, config sync). Pods are ephemeral: when a node dies or a pod is evicted, Kubernetes creates a replacement pod with a new IP. Never depend on pod identity or local disk for anything that must survive restarts.
Pod spec fields you will touch daily:
- containers[].image: immutable tag (prefer a digest or semver tag, not :latest).
- resources.requests and limits: CPU and memory reservations. The scheduler places pods based on requests; limits cap usage. Exceeding a CPU limit throttles the container, while exceeding a memory limit gets it OOM-killed.
- livenessProbe and readinessProbe: HTTP, TCP, or exec checks. Liveness restarts unhealthy containers; readiness removes pods from Service endpoints until they can accept traffic.
- env and envFrom: inject ConfigMap/Secret values as environment variables.
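To make those fields concrete, here is a minimal Pod sketch that uses each of them. The registry host, probe paths, and the ConfigMap and Secret names are assumptions for illustration; the version, memory size, and port echo the example used earlier in this guide.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-api
  labels:
    app: my-api
spec:
  containers:
    - name: api
      image: registry.example.com/my-api:2.4.1   # hypothetical registry; pinned tag, not :latest
      ports:
        - containerPort: 8080
      resources:
        requests:              # what the scheduler reserves on the node
          cpu: 250m
          memory: 512Mi
        limits:                # hard caps: CPU is throttled, memory is OOM-killed
          cpu: "1"
          memory: 512Mi
      readinessProbe:          # gates traffic from Services
        httpGet:
          path: /healthz/ready # assumed endpoint
          port: 8080
        periodSeconds: 5
      livenessProbe:           # restarts the container on repeated failure
        httpGet:
          path: /healthz/live  # assumed endpoint
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 10
      envFrom:
        - configMapRef:
            name: my-api-config    # hypothetical ConfigMap
        - secretRef:
            name: my-api-secrets   # hypothetical Secret
```

In practice this spec would live inside a controller's pod template rather than be applied as a bare Pod.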
You rarely create bare Pods in production. Instead you use controllers that manage Pod lifecycles for you.
Deployments: declarative rollouts
A Deployment owns a ReplicaSet, which owns Pods. You change the Deployment spec — image tag, replica count, env vars — and Kubernetes performs a rolling update: gradually replace old pods with new ones while keeping minimum availability. Rollback is one command if the new version misbehaves.
Key Deployment strategies:
- RollingUpdate (default): maxSurge and maxUnavailable control how aggressively pods swap during deploys.
- Recreate: kill all old pods before starting new ones; causes downtime but is necessary for some single-replica stateful patterns.
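A minimal Deployment sketch showing where the strategy settings live. The image reference and label values are hypothetical. With three replicas, maxSurge: 1 and maxUnavailable: 0 mean the rollout may run up to four pods at once and never drops below three available ones.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
  namespace: prod
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra pod above the replica count
      maxUnavailable: 0    # never take an old pod down before its replacement is ready
  selector:
    matchLabels:
      app: my-api          # must match the template labels below
  template:
    metadata:
      labels:
        app: my-api
    spec:
      containers:
        - name: api
          image: registry.example.com/my-api:2.4.1   # hypothetical image
          ports:
            - containerPort: 8080
```

Changing the image tag and re-applying the manifest is what triggers the rolling update; switching type to Recreate and removing the rollingUpdate block gives the stop-then-start behavior instead.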
Pair Deployments with metrics and structured logs so you can detect failed rollouts (rising 5xx rate, crash loops) before users notice. A common anti-pattern is bumping replica count without checking downstream limits: your app may scale horizontally while its database connection pools exhaust the database's max_connections.
Services: stable networking inside the cluster
Because pod IPs change, clients need a stable front door. A Service is a cluster-internal virtual IP and DNS name (my-api.default.svc.cluster.local) that load-balances to matching pods via label selectors.
- ClusterIP (default) — reachable only inside the cluster; use for service-to-service calls.
- NodePort — exposes a port on every node; useful for demos, rarely for production ingress.
- LoadBalancer — provisions a cloud load balancer with a public IP; pairs with cloud provider integration.
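For illustration, a ClusterIP Service that would produce the DNS name used above, assuming the backing pods carry the label app: my-api and listen on port 8080 (both are assumptions, not requirements).

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-api
  namespace: default       # yields my-api.default.svc.cluster.local
spec:
  type: ClusterIP          # the default; shown explicitly for clarity
  selector:
    app: my-api            # assumed pod label; any ready pod with it receives traffic
  ports:
    - name: http
      port: 80             # port clients connect to on the Service
      targetPort: 8080     # port the container listens on
```

Changing type to NodePort or LoadBalancer keeps the same selector and ports and only changes how the Service is exposed.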
kube-proxy implements Service routing (iptables or IPVS modes). For HTTP routing by hostname or path, add an Ingress (or the newer Gateway API) in front of Services. Ingress controllers (nginx, Traefik, AWS ALB) terminate TLS, enforce rate limits, and route api.example.com to the API Service and app.example.com to the frontend Service.
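The host-based routing just described might look roughly like this. The sketch assumes an nginx ingress controller is installed under the class name nginx, that a TLS certificate is stored in a Secret called example-com-tls, and that Services named my-api and frontend exist on port 80; all of those names are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
  namespace: default
spec:
  ingressClassName: nginx            # assumed controller class
  tls:
    - hosts:
        - api.example.com
        - app.example.com
      secretName: example-com-tls    # hypothetical TLS Secret
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-api         # assumed API Service
                port:
                  number: 80
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend       # assumed frontend Service
                port:
                  number: 80
```

Rate limiting and similar features are controller-specific and are usually configured through annotations or controller resources, so they are left out here.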
Configuration and secrets
Hard-coding config in container images is an anti-pattern. Kubernetes separates ConfigMaps (non-sensitive key/value pairs or file blobs) from Secrets (values are only base64-encoded, not encrypted, unless you enable encryption at rest for etcd). Mount either as files or inject them as env vars.
Security hygiene:
- Never commit Secret YAML with real credentials to git — use sealed-secrets, External Secrets Operator, or cloud secret managers.
- Scope RBAC so only the ServiceAccount for a given Deployment can read its Secrets.
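- Rotate credentials on a schedule. Secrets mounted as volumes are refreshed in running pods after a short delay (subPath mounts excepted), while Secrets injected as environment variables are only picked up when the pod restarts.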
For twelve-factor apps, non-secret config (log level, feature toggles, public API URLs) belongs in ConfigMaps; database URLs and signing keys belong in Secrets with least-privilege access.
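A small sketch of that split, with hypothetical object names and keys. The Secret value is a placeholder on purpose: real credentials should come from a secret manager or a sealed/encrypted manifest, not from a file in git.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-api-config          # hypothetical name
data:
  LOG_LEVEL: info
  PUBLIC_API_URL: https://api.example.com
---
apiVersion: v1
kind: Secret
metadata:
  name: my-api-secrets         # hypothetical name
type: Opaque
stringData:                    # plain text here; the API server stores it base64-encoded under data
  DATABASE_URL: REPLACE_ME     # placeholder, never commit a real value
```

A container can load both with envFrom, or mount them as files when the application should see rotated values without a restart.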
Scaling and resource management
Horizontal Pod Autoscaler (HPA) adjusts Deployment replica count based on CPU, memory, or custom metrics (requests per second from Prometheus). Vertical Pod Autoscaler (VPA) suggests or applies new request/limit values — useful for right-sizing but can cause disruptive restarts.
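As an illustrative example, an HPA targeting a Deployment named my-api (an assumed name) on average CPU utilization. The replica bounds and the 70% target are placeholder values to tune, not recommendations.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-api
  namespace: prod
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api               # assumed Deployment name
  minReplicas: 3
  maxReplicas: 10              # cap this against downstream limits such as DB connections
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the containers' CPU requests
```

Utilization is measured against CPU requests, so this only works when the pods declare them.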
Set requests to what the pod needs under normal load; set limits high enough to absorb spikes but low enough to prevent one runaway container from starving neighbors. Pods with no requests are scheduled as if they need nothing, which invites noisy-neighbor contention. If you set only a limit, Kubernetes defaults the request to the same value, which can over-reserve node capacity. Start with measured p95 usage from staging, then tune after observing real production metrics.
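Cluster Autoscaler adds nodes when pending pods cannot be scheduled. It only helps if your cloud quota, instance types, and node group size limits allow expansion; otherwise HPA scales replicas into an unschedulable backlog. Pod disruption budgets matter in the other direction: they constrain which nodes can be drained when the cluster scales down.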
Essential kubectl workflow
kubectl is the CLI for the Kubernetes API. Commands you will use weekly:
- kubectl apply -f deployment.yaml: declarative create/update from YAML manifests or Helm/Kustomize output.
- kubectl get pods -n prod -w: watch pod status during rollouts.
- kubectl describe pod <name>: events explaining CrashLoopBackOff, image pull errors, or failed probes.
- kubectl logs <pod> -c <container> --previous: logs from the last crashed instance.
- kubectl exec -it <pod> -- /bin/sh: debug inside a running container (avoid in prod unless incident response).
- kubectl rollout undo deployment/my-api: revert to the previous ReplicaSet.
GitOps tools (Argo CD, Flux) replace manual apply with continuous reconciliation from a git repo: the same declarative model, but with audit trails and drift detection built in.
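As one possible shape of that setup, here is a sketch of an Argo CD Application that keeps a namespace in sync with a directory in git. It assumes Argo CD is installed in the argocd namespace; the repository URL and path are hypothetical.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-api
  namespace: argocd                 # assumes Argo CD runs here
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/deploy.git   # hypothetical repo
    targetRevision: main
    path: apps/my-api/prod          # hypothetical directory of manifests
  destination:
    server: https://kubernetes.default.svc   # the cluster Argo CD itself runs in
    namespace: prod
  syncPolicy:
    automated:
      prune: true                   # delete objects that were removed from git
      selfHeal: true                # revert manual changes made in the cluster
```

With automated sync enabled, a merged pull request takes the place of kubectl apply, and a manual edit in the cluster shows up as drift and is reverted.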
Common production pitfalls
- CrashLoopBackOff: the container keeps exiting or being killed shortly after start; check logs, missing env vars, a wrong entrypoint, or a liveness probe that fails before the app binds its port.
- ImagePullBackOff: wrong tag, private registry credentials missing from imagePullSecrets, or registry rate limits.
- Pending pods: insufficient CPU/memory on nodes, taints without tolerations, or PersistentVolumeClaims stuck unbound.
- Probe misconfiguration: an overly aggressive liveness probe kills slow-starting JVM/Node apps; a readiness probe hitting the wrong path sends traffic to pods that cannot serve.
- Stateful data on emptyDir: the volume is deleted when the pod is removed from its node; use PersistentVolumeClaims or external object storage.
- Over-permissioned ServiceAccounts: ServiceAccounts (especially a namespace's default account) bound to cluster-admin-equivalent roles turn any compromised pod into a lateral movement risk.
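For the last pitfall, a least-privilege sketch: a dedicated ServiceAccount that may read exactly one Secret through the API and nothing else. All object names here are hypothetical.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-api
  namespace: prod
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: read-my-api-secrets
  namespace: prod
rules:
  - apiGroups: [""]                     # core API group
    resources: ["secrets"]
    resourceNames: ["my-api-secrets"]   # only this one Secret (hypothetical name)
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-api-read-secrets
  namespace: prod
subjects:
  - kind: ServiceAccount
    name: my-api
    namespace: prod
roleRef:
  kind: Role
  name: read-my-api-secrets
  apiGroup: rbac.authorization.k8s.io
```

Workloads that never call the Kubernetes API need no Role at all and can also set automountServiceAccountToken: false on the ServiceAccount or pod spec, so no token is mounted.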
Treat Kubernetes as a distributed system: failures are normal, observability is not optional, and blast radius shrinks when you use namespaces, network policies, and pod disruption budgets for critical services.
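Two of those guardrails can be sketched in a few lines. The label app: my-api and the prod namespace are assumptions, and the NetworkPolicy only takes effect if the cluster's network plugin enforces policies.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-api
  namespace: prod
spec:
  minAvailable: 2              # voluntary disruptions (drains, upgrades) must leave 2 pods up
  selector:
    matchLabels:
      app: my-api              # assumed pod label
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: prod
spec:
  podSelector: {}              # empty selector: applies to every pod in the namespace
  policyTypes:
    - Ingress                  # no ingress rules listed, so all inbound traffic is denied
```

A default-deny policy like this is usually paired with narrower allow rules per service, added as separate NetworkPolicy objects.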
Key takeaways
- Kubernetes orchestrates containers with a declarative API — you specify desired state; controllers reconcile continuously.
- Pods run containers; Deployments manage replica count and rolling updates; Services provide stable internal networking.
- Use Ingress (or Gateway API) for HTTP routing and TLS; use ConfigMaps/Secrets for configuration, never bake secrets into images.
- Set CPU/memory requests and limits; pair HPA with realistic downstream capacity (DB pools, external APIs).
- Adopt K8s when operational maturity and scale justify it — otherwise managed PaaS or a single VM may ship faster with less toil.
Related reading
- Containers and Linux namespaces explained — what runs inside every pod: cgroups, overlayfs, and the OCI runtime stack
- CI/CD pipelines explained — build container images, run tests, and promote artifacts into cluster environments
- Load balancing and reverse proxies explained — how Ingress controllers and cloud LBs distribute traffic to pod backends
- Observability explained — metrics, logs, and traces that catch failed rollouts and resource starvation