Docker & Kubernetes Essentials

Last updated: June 2026 · By gitGood Editorial

The docker and kubectl commands you reach for daily, Dockerfile best practices, how layer caching actually works, the core k8s objects in one screen, requests vs limits, liveness vs readiness, and a step-by-step CrashLoopBackOff debug flow.

Why this matters

Containers show up in three interview moments: the platform / SRE round that grills the mechanics (what is a layer, what does a Service actually do), the system design round where you deploy whatever you just drew, and the debugging scenario - "a pod is crash-looping, walk me through it" is a stock question. The framing that keeps answers crisp: Docker packages a process with its filesystem (an image is a stack of read-only layers; a container is that stack plus a writable layer and namespaces/cgroups around a process). Kubernetes is a reconciliation loop - you declare desired state in objects, controllers work to make reality match.

Docker commands

Command	What it does	Common flags	Example
docker build	Build an image from a Dockerfile.	-t name:tag, -f path/Dockerfile, --target stage (multi-stage), --no-cache, --platform linux/amd64	`docker build -t api:local --target dev .`
docker run	Create + start a container from an image.	-d (detached), -p host:container (publish port), -e KEY=val, -v host:container (mount), --rm (clean up on exit), -it (interactive shell), --name	`docker run -d --rm -p 3000:3000 -e NODE_ENV=production api:local`
docker ps	List running containers.	-a (include stopped - where crashed containers hide), -q (IDs only)	`docker ps -a`
docker logs	Container stdout/stderr.	-f (follow), --tail 100, -t (timestamps), --since 10m	`docker logs -f --tail 100 api`
docker exec	Run a command in a running container.	-it (interactive TTY), -u user, -w dir	`docker exec -it api sh`
docker images / pull / push / tag	List local images, pull from / push to a registry, alias an image under a new tag.	images -a (include intermediate images); tag SRC TARGET before pushing to a registry path.	`docker tag api:local registry.io/team/api:1.4.2 && docker push registry.io/team/api:1.4.2`
docker stop / rm / rmi	stop sends SIGTERM then SIGKILL after a grace period; rm deletes containers; rmi deletes images.	stop -t 30 (grace seconds), rm -f (force running)	`docker stop api && docker rm api`
docker inspect	Full JSON state of any object (container, image, network, volume).	--format '{{.State.ExitCode}}' (Go template extraction)	`docker inspect --format '{{.State.OOMKilled}}' api`
docker stats	Live CPU / memory / IO per container.	--no-stream (one snapshot)	`docker stats --no-stream`
docker system prune	Delete stopped containers, dangling images, unused networks, build cache.	-a (also unused images - aggressive), --volumes (careful: data)	`docker system prune -a`
docker compose	Multi-container apps from compose.yaml. The standard local dev harness.	up -d, down, down -v (also volumes), logs -f svc, up --build, ps	`docker compose up -d --build`

Dockerfile best practices

·Multi-stage builds: build in a fat image (deps, compilers), COPY --from=build only the artifacts into a slim runtime stage. Routinely cuts images from GBs to tens of MBs.
·Small, pinned base images: FROM node:22-slim (or an alpine variant), never FROM node:latest. latest is a moving tag, so two builds of the same Dockerfile can produce different images.
·Order instructions least- to most-frequently-changing: deps manifest before source. COPY package*.json + RUN npm ci first, COPY . . last - so editing source doesn't bust the dependency cache layer.
·Use .dockerignore: exclude .git, node_modules, build output, .env. Shrinks the build context (faster sends) and keeps secrets out of layers.
·Run as non-root: USER node (or create one). A container escape from root-in-container is a much worse day. Most interview answers forget this one.
·One process per container. Sidecars and process managers belong to the orchestrator, not the image.
·Combine related RUN steps and clean up in the same layer: RUN apt-get update && apt-get install -y X && rm -rf /var/lib/apt/lists/*. Deleting files in a later layer does not shrink the image - the bytes live on in the earlier layer.
·Prefer COPY over ADD. ADD's magic (URL fetch, auto-extract tarballs) is surprising; COPY does exactly one thing.
·Never bake secrets into layers (ENV or COPY .env). They're readable with docker history. Use build secrets (--mount=type=secret) or inject at runtime.
·Use exec-form CMD ["node", "server.js"] (not shell form, which wraps the command in /bin/sh -c) so PID 1 is your process and actually receives SIGTERM - the difference between graceful shutdown and a SIGKILL once the grace period expires (10s by default for docker stop).
·Add HEALTHCHECK (or rely on k8s probes) so the platform knows "running" from "working".
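
Several of the rules above fit into one multi-stage Dockerfile. The sketch below is illustrative only: it assumes a Node project whose `npm run build` script writes to `dist/` and whose entry point is `dist/server.js`. Those names, and the `example_dockerfile` helper itself, are hypothetical.

```shell
# Prints an example multi-stage Dockerfile to stdout.
# Assumptions (not from any real project): "npm run build" emits dist/,
# and the app starts from dist/server.js.
example_dockerfile() {
  cat <<'EOF'
# --- build stage: full toolchain and dev dependencies ---
FROM node:22-slim AS build
WORKDIR /app
# Manifests first, so the install layer stays cached when only source changes
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# --- runtime stage: production deps and built artifacts only ---
FROM node:22-slim AS runtime
ENV NODE_ENV=production
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
# Drop root; the official node images ship a "node" user
USER node
# Exec form, so node is PID 1 and receives SIGTERM directly
CMD ["node", "dist/server.js"]
EOF
}

# Usage (needs Docker and a real project in the current directory):
#   example_dockerfile > Dockerfile && docker build -t api:local .
```

A matching .dockerignore (.git, node_modules, dist, .env) would normally sit next to it so that COPY . . does not drag those into the build context.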

Image layers and build cache

Each filesystem-changing Dockerfile instruction (RUN, COPY, ADD) produces a read-only layer - a filesystem diff stacked via overlayfs; instructions such as ENV or CMD only add image metadata. Layers are content-addressed and shared: ten images FROM the same base store the base once, and a push only uploads layers the registry lacks. At build time, Docker walks instructions top-down and reuses the cached layer when the instruction and its inputs are unchanged (for COPY/ADD, that means file checksums). The first cache miss invalidates every later layer - which is the entire reason for the ordering rule: COPY package.json + install deps before COPY . ., so a one-line source edit rebuilds only the cheap final layers instead of re-running npm ci. Two follow-up traps worth pre-empting: RUN apt-get update in its own layer can serve a stale package index from cache indefinitely (combine update+install in one RUN), and deleting a file in a later layer hides it but doesn't reclaim the space - the bytes still ship in the earlier layer.
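
The invalidation rule is easy to see in a toy model. The function below is a simplified sketch, not Docker's actual cache logic: it takes the step whose inputs changed and the ordered list of steps, and marks that step and everything after it as rebuilt. The `plan_rebuild` name is hypothetical.

```shell
# Toy model of build-cache invalidation (not Docker's real implementation):
# the first step whose inputs changed misses the cache, and so does every
# step after it.
#   $1     = the step whose inputs changed
#   $2...  = Dockerfile steps in order
plan_rebuild() {
  local changed="$1" miss=0 step
  shift
  for step in "$@"; do
    if [ "$step" = "$changed" ]; then
      miss=1
    fi
    if [ "$miss" -eq 1 ]; then
      echo "REBUILD  $step"
    else
      echo "CACHED   $step"
    fi
  done
}

# Source edited, manifests copied first: only the final COPY reruns.
plan_rebuild "COPY . ." \
  "FROM node:22-slim" "COPY package*.json ./" "RUN npm ci" "COPY . ."

# Source edited, everything copied up front: npm ci reruns as well.
plan_rebuild "COPY . ." \
  "FROM node:22-slim" "COPY . ." "RUN npm ci"
```

In the first ordering a source edit costs one cheap layer; in the second it also reruns the dependency install.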

kubectl commands

Mental model: every command reads or writes objects in the API server; controllers do the rest. -n <namespace> works on every command that touches namespaced objects (the default namespace is used otherwise); cluster-scoped objects such as nodes ignore it.

Command	What it does	Common flags	Example
kubectl get	List objects: pods, deploy, svc, ingress, nodes, events...	-o wide (more cols), -o yaml (full spec), -w (watch), -A (all namespaces), -l app=api (label select)	`kubectl get pods -o wide -l app=api`
kubectl describe	Human-readable detail + the Events section at the bottom - the first place to look when anything is stuck.	describe pod/<name>, describe node <name>	`kubectl describe pod api-7d4b9-x2k1f`
kubectl logs	Container logs from a pod.	-f (follow), --previous (the crashed container's logs - essential for crash loops), -c container (multi-container pods), --tail 100	`kubectl logs api-7d4b9-x2k1f --previous`
kubectl exec	Shell or command inside a running container.	-it, -c container	`kubectl exec -it api-7d4b9-x2k1f -- sh`
kubectl apply / delete	Declaratively create/update objects from manifests; delete removes them.	-f file.yaml, -f dir/, -k (kustomize), delete pod <name> (force a restart via recreation)	`kubectl apply -f k8s/`
kubectl rollout	Manage Deployment rollouts.	status deploy/api (wait/observe), undo deploy/api (rollback), restart deploy/api (rolling restart - the polite way to bounce pods), history	`kubectl rollout undo deploy/api`
kubectl scale	Set replica count by hand (HPA does it automatically).	--replicas=N deploy/<name>	`kubectl scale --replicas=5 deploy/api`
kubectl port-forward	Tunnel a local port to a pod or service - test in-cluster things without an Ingress.	svc/<name> local:remote	`kubectl port-forward svc/api 8080:80`
kubectl top	Live CPU/memory per pod or node (needs metrics-server).	top pods, top nodes, --containers	`kubectl top pods -n production`
kubectl get events	Cluster events: scheduling failures, OOMKills, probe failures, image pull errors.	--sort-by=.lastTimestamp, -A	`kubectl get events --sort-by=.lastTimestamp | tail -20`
kubectl debug	Attach an ephemeral debug container to a running pod - the answer for distroless images with no shell.	-it --image=busybox --target=<container>	`kubectl debug -it api-7d4b9-x2k1f --image=busybox --target=api`
kubectl config	Manage cluster contexts. Verify before anything destructive.	current-context, use-context <name>, get-contexts	`kubectl config current-context`

Core Kubernetes objects

Pod: Smallest deployable unit: one or more containers sharing network namespace (localhost) and volumes. Mortal by design - you almost never create bare pods; a controller owns them.
Deployment: Desired state for stateless pods: image, replica count, update strategy. Manages ReplicaSets to do rolling updates and rollbacks. The default answer for "how do I run my API."
Service: Stable virtual IP + DNS name load-balancing across pods selected by label. Types: ClusterIP (in-cluster, default), NodePort (port on every node), LoadBalancer (cloud LB). Solves "pods die and change IPs."
Ingress: L7 HTTP routing into the cluster: host/path rules + TLS termination, fulfilled by an ingress controller (nginx, ALB, Traefik). One LB fanning out to many Services instead of one LB each.
ConfigMap: Non-secret config as env vars or mounted files, decoupled from the image. Pods don't see updates to env-injected values until restarted.
Secret: Same shape as ConfigMap for sensitive values - but only base64-encoded, not encrypted, by default. Real answer: encryption at rest + RBAC, or an external manager (Vault, AWS Secrets Manager) via CSI/operator. Saying "Secrets are encrypted" is the trap.
StatefulSet: Pods with stable identity: sticky names (db-0, db-1), stable DNS, per-pod PersistentVolumes, ordered rollout. For databases, Kafka, anything where replicas aren't interchangeable.
DaemonSet: Exactly one pod per (matching) node. Node-level agents: log shippers, monitoring, CNI plugins.
Job / CronJob: Run-to-completion workloads with retries (Job); on a schedule (CronJob). Batch processing, migrations, nightly reports.
Namespace: Virtual cluster partition for grouping, RBAC scoping, and ResourceQuotas. Separates teams or environments (team-a, prod vs staging) without separate clusters.
HPA (HorizontalPodAutoscaler): Scales a Deployment's replicas on CPU/memory/custom metrics. Targets utilization relative to requests - which is why unset requests break autoscaling.
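
To make the HPA entry concrete: the controller's core arithmetic is desired = ceil(current replicas × current utilization / target utilization), where utilization is usage divided by the pod's request. The sketch below shows only that formula with integer percentages; it leaves out the tolerance band, stabilization windows, and min/max replica bounds, and the `hpa_desired_replicas` name is hypothetical.

```shell
# Simplified HPA arithmetic: desired = ceil(current * currentUtil / targetUtil),
# with utilization given as an integer percent of the pods' CPU request.
# Ignores the controller's tolerance band, stabilization windows, and
# min/max replica bounds.
hpa_desired_replicas() {
  local current="$1" current_util="$2" target_util="$3"
  # Integer ceiling division: (a + b - 1) / b
  echo $(( (current * current_util + target_util - 1) / target_util ))
}

hpa_desired_replicas 4 90 60   # 4 pods at 90% against a 60% target
hpa_desired_replicas 3 80 50   # 3 pods at 80% against a 50% target
```

The two calls print 6 and 5. With no request set there is no denominator for the utilization percentage, which is why the autoscaler cannot act on that metric.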

Resources and probes

Requests vs limits

Requests

The guaranteed floor used by the scheduler for bin-packing - a pod only lands on a node with that much unreserved capacity. Also the denominator for HPA utilization.

Limits

The hard ceiling. CPU over limit -> throttled (latency spikes, no kill). Memory over limit -> OOMKilled, exit code 137.

When to choose each

Always set requests (scheduling and autoscaling are blind without them). Memory: set limit = request - memory isn't compressible, and overcommit means random OOMKills under node pressure. CPU: set requests, and know the debate on limits - throttling hurts tail latency, so many teams skip CPU limits and let pods burst. The trade-off: only a pod whose every container has requests = limits for both CPU and memory gets the Guaranteed QoS class, which is evicted last under node pressure; skipping CPU limits leaves the pod Burstable. Exit code 137 is SIGKILL (128 + 9), which in practice usually means OOMKilled - a fact that pays for itself in debugging questions.
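
As a sketch of the "memory limit = request, no CPU limit" pattern described above, the helper below prints a minimal Pod manifest. The pod name and the helper name are hypothetical, and the numbers are placeholders rather than sizing advice.

```shell
# Prints a Pod manifest sketch. The pod name is hypothetical and the numbers
# are placeholders, not sizing advice.
example_resources_pod() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: api-resources-demo
spec:
  containers:
    - name: api
      image: registry.io/team/api:1.4.2
      resources:
        requests:
          cpu: 250m        # what the scheduler reserves on the node
          memory: 256Mi
        limits:
          memory: 256Mi    # equal to the request; exceeding it means OOMKilled
          # no cpu limit: the container may burst, so QoS is Burstable
EOF
}

# Usage (needs a cluster):
#   example_resources_pod | kubectl apply -f -
#   kubectl get pod api-resources-demo -o jsonpath='{.status.qosClass}'
```

Adding a CPU limit equal to the CPU request would move this pod to the Guaranteed class, at the cost of throttling when it tries to burst.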

Liveness vs readiness (vs startup) probes

Liveness

"Is the process irrecoverably wedged?" Failure -> kubelet restarts the container. Keep it dumb (process responds at all) - never depend on a database or downstream service.

Readiness

"Can this pod take traffic right now?" Failure -> pod is pulled from Service endpoints; no restart. Can include dependency checks; failing readiness during overload sheds load gracefully.

Startup

Disables the other probes until first success - for slow-booting apps (JVMs, big caches) so liveness doesn't kill them mid-startup.

When to choose each

The classic outage: a liveness probe that checks the database. DB blips -> every pod fails liveness -> kubelet restarts the entire fleet simultaneously -> self-inflicted total outage from a transient blip. Liveness = process health only; readiness = traffic-worthiness including dependencies; startup = patience for slow boots. Restart loops punish the wrong probe choice; traffic blackholes punish the missing one.
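
A manifest sketch of the three probes side by side may help. It assumes the app exposes a dependency-free `/healthz` endpoint and a dependency-aware `/ready` endpoint on port 3000. Those paths, the port, the pod name, and the timing values are all illustrative assumptions.

```shell
# Prints a Pod manifest sketch with all three probes.
# Assumptions: /healthz checks only the process, /ready may check
# dependencies, and the app listens on port 3000.
example_probes_pod() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: api-probes-demo
spec:
  containers:
    - name: api
      image: registry.io/team/api:1.4.2
      ports:
        - containerPort: 3000
      startupProbe:            # gates the other two until it first succeeds
        httpGet:
          path: /healthz
          port: 3000
        periodSeconds: 5
        failureThreshold: 30   # allows up to 30 * 5s = 150s to boot
      livenessProbe:           # process-only check, no dependencies
        httpGet:
          path: /healthz
          port: 3000
        periodSeconds: 10
        failureThreshold: 3
      readinessProbe:          # may include dependency checks
        httpGet:
          path: /ready
          port: 3000
        periodSeconds: 5
        failureThreshold: 2
EOF
}

# Usage (needs a cluster): example_probes_pod | kubectl apply -f -
```

Pointing the liveness probe at `/ready` instead would recreate the database-blip outage described above.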

Debugging a CrashLoopBackOff

CrashLoopBackOff means the container starts, exits, and kubelet is backing off between restarts (10s, 20s, 40s... capped at 5m). The status is the symptom; find why the process exits. Walk this in order:
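
The backoff schedule quoted above (start at 10s, double each time, cap at 5 minutes) can be written out as a small sketch. The `backoff_schedule` name is hypothetical, and this models only the delay sequence, not the kubelet itself.

```shell
# Prints the restart delays described above: start at 10s, double after
# each crash, cap at 300s (5 minutes).
#   $1 = number of restarts to list
backoff_schedule() {
  local restarts="$1" delay=10 cap=300 i
  for (( i = 1; i <= restarts; i++ )); do
    echo "restart $i: wait ${delay}s"
    delay=$(( delay * 2 ))
    if [ "$delay" -gt "$cap" ]; then
      delay=$cap
    fi
  done
}

backoff_schedule 7
```

The cap is reached on the sixth restart, so a pod that has been looping for a while retries only once every five minutes; that is why a fix can appear to "do nothing" for several minutes.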

·kubectl get pods - confirm the state and restart count; -o wide tells you if it's node-specific.
·kubectl describe pod <name> - read Events bottom-up and grab Last State: exit code and reason (OOMKilled, Error). Also catches the lookalikes: ImagePullBackOff, failed mounts, unschedulable.
·Decode the exit code: 137 = SIGKILL (128 + 9), usually OOMKilled - confirm Reason: OOMKilled in describe, then raise the memory limit or fix the leak; 1/2 = app error (go read logs); 126/127 = bad command/entrypoint (not executable / not found: typo in CMD, missing binary).
·kubectl logs <pod> --previous - the dying container's last words. --previous is the key flag; the current container may be seconds old and empty.
·Crash is at startup? Check config: missing env var, bad ConfigMap/Secret key reference (describe shows CreateContainerConfigError), wrong DB hostname, migration failing.
·Suspect the probes: a too-aggressive liveness probe (short timeout, checks a dependency) kills healthy-but-slow containers in a loop. describe shows probe failures in Events.
·Reproduce without the orchestrator: docker run the same image locally, or kubectl debug to poke around (distroless images have no shell for exec).
·Still opaque? Override the entrypoint to keep the container alive - command: ["sleep", "3600"] - then exec in and run the real command by hand to watch it fail.
·Fix, kubectl apply, kubectl rollout status deploy/<name>, and confirm restart count stops climbing.
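
The exit-code step in the flow above can be sketched as a small lookup. Codes above 128 follow the shell convention of 128 + signal number. The helper names are hypothetical, and `last_exit_code` assumes a single-container pod that has restarted at least once.

```shell
# Maps a container exit code to a first hypothesis.
decode_exit_code() {
  local code="$1"
  case "$code" in
    0)   echo "clean exit: the process finished, check why it does not stay up" ;;
    1|2) echo "application error: read the logs" ;;
    126) echo "command not executable: check permissions on the entrypoint" ;;
    127) echo "command not found: check CMD and the image contents" ;;
    137) echo "SIGKILL: usually OOMKilled, confirm the reason in describe" ;;
    143) echo "SIGTERM: the container was asked to stop" ;;
    *)
      if [ "$code" -gt 128 ]; then
        echo "killed by signal $((code - 128))"
      else
        echo "application-specific exit code: read the logs"
      fi
      ;;
  esac
}

# Reads the previous container's exit code from the pod status.
# Needs kubectl and cluster access; assumes a single-container pod
# (index 0) that has restarted at least once.
last_exit_code() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
}

decode_exit_code 137
# With a cluster: decode_exit_code "$(last_exit_code api-7d4b9-x2k1f)"
```

The output is a starting hypothesis only; the Reason field in kubectl describe and the --previous logs remain the evidence.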
