Kubernetes Fundamentals Explained: Pods, Deployments & Services

What problem Kubernetes solves

Running one container on one VM is easy. Running hundreds of microservices across dozens of nodes — with health checks, autoscaling, secret rotation, and zero-downtime deploys — is not. Before orchestrators, teams wrote brittle shell scripts, custom systemd units, or glued together load balancers by hand. Kubernetes provides a declarative API: you describe the desired state ("three replicas of API v2.4.1, 512 MiB RAM each, reachable on port 8080") and the control plane continuously reconciles reality toward that spec.
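The reconcile idea can be sketched in a few lines of Python. This is a toy model, not Kubernetes code: the `reconcile` function, the `api-N` pod names, and the in-memory list are hypothetical stand-ins for a controller and the cluster state.

```python
import itertools

# Hypothetical name generator; real pod names are assigned by the cluster.
_ids = itertools.count(1)


def reconcile(desired_replicas, running_pods):
    """Return the pod list after one reconciliation pass (toy model)."""
    pods = list(running_pods)
    # Too few pods: create new ones until the count matches the spec.
    while len(pods) < desired_replicas:
        pods.append(f"api-{next(_ids)}")
    # Too many pods: remove the surplus.
    while len(pods) > desired_replicas:
        pods.pop()
    return pods


pods = reconcile(3, [])      # initial rollout creates three pods
pods.remove(pods[0])         # simulate a pod lost to a node failure
pods = reconcile(3, pods)    # the next pass restores the desired count
print(pods)
```

The point of the sketch is that nobody issues a "restart pod" command: the loop only compares desired state with observed state and closes the gap.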

Under the hood, each workload still runs inside Linux containers — isolated processes sharing the host kernel. Kubernetes adds scheduling, networking, storage plugins, RBAC, and a uniform object model so platform teams can enforce standards instead of every service inventing its own deploy script.

When Kubernetes is worth it

You run many stateless services that need independent scaling and frequent deploys.
You have (or can hire) platform engineering capacity to operate the control plane or pay for managed K8s (EKS, GKE, AKS).
You need multi-cloud or hybrid portability with a common deployment format.

When to skip it

A single monolith or a handful of cron jobs — managed PaaS (Fly.io, Render, Cloud Run) or even systemd on a VPS is simpler.
Early-stage products where operational complexity slows shipping more than it helps reliability.
Stateful databases — run Postgres on dedicated instances or a managed service; do not treat K8s as a database host unless you know exactly why.

Cluster architecture at a glance

A Kubernetes cluster has two logical halves:

Control plane — the brain. Components include the API server (everything talks to it), etcd (cluster state store), scheduler (assigns pods to nodes), and controller manager (runs reconciliation loops for Deployments, Services, etc.).
Worker nodes — machines that run your containers. Each node runs kubelet (agent that starts/stops pods), a container runtime (containerd, CRI-O), and usually kube-proxy (implements Service networking rules).

Managed offerings hide most control-plane toil, but you still reason about nodes: capacity, taints/tolerations, zone spread, and kubelet version skew. Namespaces partition objects inside one cluster — common patterns are prod, staging, and team-specific namespaces with RBAC boundaries.

Pods: the smallest deployable unit

A Pod wraps one or more containers that share a network namespace (one IP, one port space) and can mount the same volumes. Most pods run a single application container plus optional sidecars (Envoy proxy, log shipper, config sync). Pods are ephemeral: a crashed container is restarted in place by the kubelet, but when a node dies or a pod is evicted, its controller creates a replacement pod with a new name and a new IP. Never depend on pod identity or local disk for anything that must survive restarts.

Pod spec fields you will touch daily:

containers[].image: pin an immutable reference (prefer a digest or a fixed semver tag, not :latest, which can point to a different image on every pull).
resources.requests and limits: CPU and memory reservations. The scheduler places pods by their requests; exceeding a memory limit gets the container OOM-killed, while exceeding a CPU limit only throttles it.
livenessProbe and readinessProbe: HTTP, TCP, gRPC, or exec checks. Liveness restarts unhealthy containers; readiness removes pods from Service endpoints until they can accept traffic.
env and envFrom: inject ConfigMap/Secret values as environment variables.
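To make these fields concrete, here is a minimal sketch that builds a Pod manifest as a Python dict and prints it as JSON, which kubectl accepts alongside YAML. The registry path, probe paths (/healthz, /ready), resource values, and the ConfigMap and Secret names are illustrative assumptions, not values from a real cluster.

```python
import json

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "my-api", "labels": {"app": "my-api"}},
    "spec": {
        "containers": [
            {
                "name": "api",
                # Hypothetical registry path; pinned version instead of :latest.
                "image": "registry.example.com/my-api:2.4.1",
                "ports": [{"containerPort": 8080}],
                "resources": {
                    # The scheduler reserves the requests; limits cap usage.
                    "requests": {"cpu": "250m", "memory": "256Mi"},
                    "limits": {"cpu": "500m", "memory": "512Mi"},
                },
                # Assumed health endpoints; use whatever the app exposes.
                "livenessProbe": {
                    "httpGet": {"path": "/healthz", "port": 8080},
                    "initialDelaySeconds": 10,
                    "periodSeconds": 10,
                },
                "readinessProbe": {
                    "httpGet": {"path": "/ready", "port": 8080},
                    "periodSeconds": 5,
                },
                # Every key in these objects becomes an environment variable.
                "envFrom": [
                    {"configMapRef": {"name": "my-api-config"}},
                    {"secretRef": {"name": "my-api-secrets"}},
                ],
            }
        ]
    },
}

print(json.dumps(pod, indent=2))
```

Saving the output to a file such as pod.json and running kubectl apply -f pod.json would submit it, assuming the referenced ConfigMap and Secret already exist in the namespace.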

You rarely create bare Pods in production. Instead you use controllers that manage Pod lifecycles for you.

Deployments: declarative rollouts

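A Deployment owns ReplicaSets (one per revision), and each ReplicaSet owns Pods. When you change the pod template in the Deployment spec (image tag, env vars, resource settings), Kubernetes performs a rolling update: it creates a new ReplicaSet and gradually replaces old pods with new ones while keeping minimum availability. Changing only the replica count scales the current ReplicaSet without a rollout. Rollback is one command if the new version misbehaves.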

Key Deployment strategies:

RollingUpdate (default) — maxSurge and maxUnavailable control how aggressively pods swap during deploys.
Recreate — kill all old pods before starting new ones; causes downtime but necessary for some single-replica stateful patterns.
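The effect of maxSurge and maxUnavailable is easiest to see as arithmetic. The sketch below is a simplified model with a hypothetical helper name (`rollout_bounds`). It relies on two documented rules: both settings default to 25%, and percentages round up for maxSurge and down for maxUnavailable.

```python
def rollout_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Return (max_total_pods, min_available_pods) during a rolling update."""

    def resolve(value, round_up):
        # Absolute integers are used as-is; "N%" is taken of `replicas`.
        if isinstance(value, int):
            return value
        scaled = replicas * int(value.rstrip("%"))
        # Integer ceiling or floor division avoids float rounding surprises.
        return -(-scaled // 100) if round_up else scaled // 100

    surge = resolve(max_surge, round_up=True)
    unavailable = resolve(max_unavailable, round_up=False)
    return replicas + surge, replicas - unavailable


print(rollout_bounds(10))        # (13, 8)
print(rollout_bounds(3))         # (4, 3)
print(rollout_bounds(4, 1, 0))   # (5, 4)
```

With 10 replicas and the defaults, a rollout may run up to 13 pods at once and never drops below 8 available, so the nodes and any downstream dependencies need headroom for the surge.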

Pair Deployments with metrics and structured logs so you can detect failed rollouts (rising 5xx rate, crash loops) before users notice. A common anti-pattern is bumping replica count without checking downstream limits — your app may scale horizontally while database connection pools exhaust max_connections.
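The connection-pool problem is simple multiplication, and it is worth checking before raising a replica count. A minimal sketch, assuming each pod opens a fixed-size pool; the pool size and reserved headroom are made-up numbers, and 100 is PostgreSQL's default max_connections.

```python
def connections_needed(replicas, pool_size_per_pod):
    """Worst case: every pod opens its full connection pool."""
    return replicas * pool_size_per_pod


def max_safe_replicas(max_connections, pool_size_per_pod, reserved=0):
    """Largest replica count whose pools fit under the database limit."""
    return (max_connections - reserved) // pool_size_per_pod


MAX_CONNECTIONS = 100   # PostgreSQL's default max_connections
POOL_SIZE = 10          # assumed per-pod pool size
RESERVED = 10           # assumed headroom for migrations and admin sessions

print(max_safe_replicas(MAX_CONNECTIONS, POOL_SIZE, RESERVED))   # 9
print(connections_needed(12, POOL_SIZE) > MAX_CONNECTIONS)       # True
```

Surge pods created during a rolling update also hold connections, so the safe ceiling for steady-state replicas is usually a little lower than this number.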

Services: stable networking inside the cluster

Because pod IPs change, clients need a stable front door. A Service is a cluster-internal virtual IP and DNS name (my-api.default.svc.cluster.local) that load-balances to matching pods via label selectors.

ClusterIP (default) — reachable only inside the cluster; use for service-to-service calls.
NodePort — exposes a port on every node; useful for demos, rarely for production ingress.
LoadBalancer — provisions a cloud load balancer with a public IP; pairs with cloud provider integration.
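How a Service decides which pods receive traffic can be modeled in a few lines. This is a sketch of the matching rule only (every selector key/value must appear in the pod's labels, and only ready pods count as endpoints); the pod IPs and labels are invented, and the helper names are hypothetical.

```python
def selects(selector, labels):
    """True if every selector key/value pair is present in the pod's labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def service_dns_name(name, namespace, cluster_domain="cluster.local"):
    """Build the in-cluster DNS name for a Service."""
    return f"{name}.{namespace}.svc.{cluster_domain}"


selector = {"app": "my-api"}
pods = [
    {"ip": "10.0.1.12", "labels": {"app": "my-api", "version": "2.4.1"}, "ready": True},
    {"ip": "10.0.2.7", "labels": {"app": "my-api", "version": "2.4.1"}, "ready": False},
    {"ip": "10.0.3.4", "labels": {"app": "frontend"}, "ready": True},
]

# Only pods that match the selector and pass readiness become endpoints.
endpoints = [p["ip"] for p in pods if p["ready"] and selects(selector, p["labels"])]
print(service_dns_name("my-api", "default"), "->", endpoints)
```

Extra labels on a pod (version here) do not prevent a match, which is why the same Service keeps working across image upgrades.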

kube-proxy implements Service routing (iptables or IPVS modes). For HTTP routing by hostname or path, add an Ingress (or the newer Gateway API) in front of Services. Ingress controllers (nginx, Traefik, AWS ALB) terminate TLS, enforce rate limits, and route api.example.com to the API Service and app.example.com to the frontend Service.
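The host-based routing described above maps onto an Ingress object roughly as follows. The sketch builds the manifest as a Python dict and prints JSON for kubectl; the Service names, ports, TLS Secret name, and the nginx ingress class are assumptions for illustration and depend on which controller is installed.

```python
import json


def host_rule(host, service, port):
    """Route every path under `host` to one backend Service."""
    return {
        "host": host,
        "http": {
            "paths": [
                {
                    "path": "/",
                    "pathType": "Prefix",
                    "backend": {"service": {"name": service, "port": {"number": port}}},
                }
            ]
        },
    }


hosts = ["api.example.com", "app.example.com"]
ingress = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "Ingress",
    "metadata": {"name": "example-routes"},
    "spec": {
        # Assumes an ingress controller registered under the class "nginx".
        "ingressClassName": "nginx",
        # Hypothetical Secret holding the TLS certificate and key.
        "tls": [{"hosts": hosts, "secretName": "example-tls"}],
        "rules": [
            host_rule("api.example.com", "my-api", 8080),
            host_rule("app.example.com", "frontend", 80),
        ],
    },
}

print(json.dumps(ingress, indent=2))
```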

Configuration and secrets

Hard-coding config in container images is an anti-pattern. Kubernetes separates ConfigMaps (non-sensitive key/value pairs or file blobs) from Secrets (values are only base64-encoded, which is not encryption; they are stored in plaintext in etcd unless you enable encryption at rest). Mount either kind as files or expose them as env vars.
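A quick way to convince yourself that base64 offers no protection: anyone who can read the Secret object can reverse the encoding with one call. The credential below is a made-up example.

```python
import base64

plaintext = b"s3cr3t-password"   # made-up credential

# This is the form a value takes in a Secret manifest's `data` field.
encoded = base64.b64encode(plaintext).decode("ascii")
print(encoded)

# Decoding needs no key, so the encoding provides no confidentiality.
print(base64.b64decode(encoded).decode("ascii"))
```

Confidentiality therefore has to come from RBAC on the Secret objects, encryption at rest, and keeping manifests with real values out of git.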

Security hygiene:

Never commit Secret YAML with real credentials to git — use sealed-secrets, External Secrets Operator, or cloud secret managers.
Scope RBAC so only the ServiceAccount for a given Deployment can read its Secrets.
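Rotate credentials on a schedule. Secrets mounted as volumes are refreshed in running pods after a short delay (except subPath mounts), while Secrets injected as environment variables are only picked up when the pod restarts.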

For twelve-factor apps, non-secret config (log level, feature toggles, public API URLs) belongs in ConfigMaps; database URLs and signing keys belong in Secrets with least-privilege access.

Scaling and resource management

Horizontal Pod Autoscaler (HPA) adjusts Deployment replica count based on CPU, memory, or custom metrics (requests per second from Prometheus). Vertical Pod Autoscaler (VPA) suggests or applies new request/limit values — useful for right-sizing but can cause disruptive restarts.
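The core of the HPA calculation is a single ratio: desired replicas = ceil(current replicas × current metric / target metric). The sketch below is a simplified model of that documented formula, including the default 10% tolerance band; it leaves out stabilization windows, unready pods, and multiple metrics, and the function name and the min/max values are illustrative.

```python
import math


def desired_replicas(current, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified HPA rule for a single metric."""
    ratio = current_metric / target_metric
    # Inside the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current
    wanted = math.ceil(current * ratio)
    # The result is always clamped to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(4, 90, 60))    # 6  (CPU at 90% against a 60% target)
print(desired_replicas(4, 63, 60))    # 4  (within tolerance, no change)
print(desired_replicas(4, 30, 60))    # 2  (scale down)
print(desired_replicas(4, 300, 60))   # 10 (capped at max_replicas)
```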

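Set requests to what the pod needs under normal load; set limits high enough to absorb spikes but low enough to prevent one runaway container from starving neighbors. Pods with no requests are scheduled as if they need nothing, which invites noisy-neighbor contention and makes them among the first candidates for eviction under node pressure; a limit set without a request makes the request default to the limit, which can over-reserve capacity. Start with measured p95 usage from staging, then tune after observing real production metrics.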

Cluster Autoscaler adds nodes when pending pods cannot be scheduled. It only helps if your cloud quota, node group maximums, and available instance types allow expansion (pod disruption budgets, by contrast, constrain scale-down); otherwise HPA scales replicas into an unschedulable backlog.
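Why a pod ends up Pending can be illustrated with a toy fit check. The real scheduler filters and then scores nodes on many more criteria (taints, affinity, topology spread), so treat this as a sketch of the resource test only; the node sizes and the `first_fit` helper are hypothetical.

```python
def first_fit(pod_request, nodes):
    """Return the first node with enough unreserved CPU and memory, else None."""
    for node in nodes:
        # Free capacity is allocatable minus the sum of requests already placed,
        # not minus actual usage.
        free_cpu = node["allocatable_cpu_m"] - node["requested_cpu_m"]
        free_mem = node["allocatable_mem_mi"] - node["requested_mem_mi"]
        if pod_request["cpu_m"] <= free_cpu and pod_request["mem_mi"] <= free_mem:
            return node["name"]
    return None   # the pod stays Pending until capacity appears


nodes = [
    {"name": "node-a", "allocatable_cpu_m": 2000, "requested_cpu_m": 1800,
     "allocatable_mem_mi": 4096, "requested_mem_mi": 2048},
    {"name": "node-b", "allocatable_cpu_m": 2000, "requested_cpu_m": 1000,
     "allocatable_mem_mi": 4096, "requested_mem_mi": 3900},
]

print(first_fit({"cpu_m": 150, "mem_mi": 128}, nodes))   # node-a
print(first_fit({"cpu_m": 250, "mem_mi": 256}, nodes))   # None
```

The second pod fits on neither node (node-a lacks CPU, node-b lacks memory) even though the cluster as a whole has spare capacity, which is the situation Cluster Autoscaler reacts to.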

Essential kubectl workflow

kubectl is the CLI for the Kubernetes API. Commands you will use weekly:

kubectl apply -f deployment.yaml — declarative create/update from YAML manifests or Helm/Kustomize output.
kubectl get pods -n prod -w — watch pod status during rollouts.
kubectl describe pod <name> — events explaining CrashLoopBackOff, image pull errors, or failed probes.
kubectl logs <pod> -c <container> --previous — logs from the last crashed instance.
kubectl exec -it <pod> -- /bin/sh — debug inside a running container (avoid in prod unless incident response).
kubectl rollout undo deployment/my-api — revert to previous ReplicaSet.
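These commands are often chained into an apply, verify, roll back sequence. The sketch below only assembles and prints the command lines, so it is safe to run without a cluster. The `deploy_commands` helper and the 120s timeout are assumptions, and kubectl rollout status (not listed above) is the standard way to block until a rollout finishes or times out.

```python
import shlex


def deploy_commands(manifest, deployment, namespace, timeout="120s"):
    """Build the argv lists for apply, rollout status, and rollout undo."""
    target = f"deployment/{deployment}"
    apply = ["kubectl", "apply", "-f", manifest, "-n", namespace]
    # Exits non-zero if the rollout has not completed within the timeout.
    status = ["kubectl", "rollout", "status", target, "-n", namespace,
              f"--timeout={timeout}"]
    undo = ["kubectl", "rollout", "undo", target, "-n", namespace]
    return apply, status, undo


apply, status, undo = deploy_commands("deployment.yaml", "my-api", "prod")
for cmd in (apply, status, undo):
    print(shlex.join(cmd))
```

In a real pipeline each list could be passed to subprocess.run, running the undo command only when the status command returns a non-zero exit code.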

GitOps tools (Argo CD, Flux) replace manual apply with continuous reconciliation from a git repo — the same declarative model, but with audit trails and drift detection built in.

Common production pitfalls

CrashLoopBackOff: the container keeps exiting and is restarted with increasing back-off; check logs (with --previous), missing env vars, a wrong entrypoint, or a liveness probe that fails before the app binds its port.
ImagePullBackOff: wrong tag, private registry credentials missing from imagePullSecrets, or registry rate limits.
Pending pods: insufficient CPU/memory on nodes, taints without tolerations, or PersistentVolumeClaims stuck unbound.
Probe misconfiguration: a liveness probe that is too aggressive kills slow-starting JVM/Node apps; a readiness probe hitting the wrong path either keeps healthy pods out of rotation or sends traffic to pods that cannot serve.
Stateful data on emptyDir: an emptyDir survives container restarts but is deleted with the pod; use PersistentVolumeClaims or external object storage.
Over-permissioned ServiceAccounts: a ServiceAccount bound to cluster-admin or wildcard roles turns any compromised pod into a lateral movement risk.
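For the probe pitfall in particular, a rough estimate shows how little time the default liveness settings give a slow starter. The sketch assumes the first probe fires after initialDelaySeconds and the restart happens on the last of failureThreshold consecutive failures, one periodSeconds apart; it ignores timeoutSeconds and kubelet jitter, so read the result as an approximation. The defaults used (0, 10, 3) are the Kubernetes defaults; the 45-second startup time is an invented example.

```python
def seconds_until_restart(initial_delay=0, period=10, failure_threshold=3):
    """Approximate time before a never-healthy container is restarted."""
    # First failure at `initial_delay`, then one more per `period`.
    return initial_delay + period * (failure_threshold - 1)


def survives_startup(startup_seconds, **probe):
    """True if the app is up before the liveness probe gives up on it."""
    return startup_seconds < seconds_until_restart(**probe)


print(seconds_until_restart())                   # 20
print(survives_startup(45))                      # False
print(survives_startup(45, initial_delay=30))    # True
```

An app that needs 45 seconds to start would be restarted in a loop under the defaults, which shows up as CrashLoopBackOff even though the application itself is healthy.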

Treat Kubernetes as a distributed system: failures are normal, observability is not optional, and blast radius shrinks when you use namespaces, network policies, and pod disruption budgets for critical services.

Key takeaways

Kubernetes orchestrates containers with a declarative API — you specify desired state; controllers reconcile continuously.
Pods run containers; Deployments manage replica count and rolling updates; Services provide stable internal networking.
Use Ingress (or Gateway API) for HTTP routing and TLS; use ConfigMaps/Secrets for configuration, and never bake secrets into images.
Set CPU/memory requests and limits; pair HPA with realistic downstream capacity (DB pools, external APIs).
Adopt K8s when operational maturity and scale justify it — otherwise managed PaaS or a single VM may ship faster with less toil.
