Top 50 Linux Interview Questions 2026

If you are interviewing for SRE, DevOps, platform, or backend infrastructure roles in 2026, you will get hit with Linux questions. Sometimes they are conceptual, sometimes you are dropped into a shell and told to figure out why a service is degraded. Both flavors show up.

This post covers the 50 questions that come up over and over, grouped by category, with the answers shaped the way an interviewer wants to hear them. Most of these are also the questions that separate "I have used Linux" from "I can debug Linux at 3 a.m."

Part 1: Processes and Signals (1-10)

1. What is the difference between a process and a thread on Linux?

In the Linux kernel, both are "tasks" represented by task_struct. A process has its own address space, file descriptors, and PID. A thread shares those with its parent process and has its own TID. Linux schedules threads independently.
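
You can see the task model for yourself through /proc, where every thread of a process shows up under task/. A minimal sketch, assuming a Linux /proc and inspecting the current shell (normally single-threaded, so expect one entry):

```shell
# Inspect the current shell's own tasks via /proc.
pid=$$
tgid=$(awk '/^Tgid:/ {print $2}' "/proc/$pid/status")   # thread group ID
threads=$(ls "/proc/$pid/task" | wc -l)                 # one entry per thread (TID)
echo "PID=$pid TGID=$tgid threads=$threads"
```

For a multi-threaded process the TGID is the same for every entry under task/ while each TID differs. ps -eLf shows the same thing in its LWP column.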

2. How do you list processes by memory usage?

ps aux --sort=-%mem | head -20

For interactive triage, use htop (press M to sort by memory) or top -o %MEM.

3. What does `kill -9` actually do, and when should you avoid it?

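kill -9 sends SIGKILL, which cannot be caught, blocked, or ignored, so the process gets no chance to clean up. The kernel still closes its file descriptors and sockets, but anything that needs an orderly shutdown is left behind: lock files, temp files, shared memory segments, half-finished writes, database sessions that were never closed cleanly. Try SIGTERM first and escalate only if the process does not exit.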
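
The "SIGTERM first, SIGKILL as a last resort" pattern can be sketched as a small shell function. stop_gracefully is a hypothetical helper name, not a standard tool, and the 5-second default timeout is an arbitrary assumption:

```shell
# stop_gracefully PID [TIMEOUT_SECONDS]
# Hypothetical helper: ask politely, wait, then force.
stop_gracefully() {
    sg_pid=$1
    sg_timeout=${2:-5}
    kill -TERM "$sg_pid" 2>/dev/null || return 0      # already gone
    sg_i=0
    while [ "$sg_i" -lt "$sg_timeout" ]; do
        kill -0 "$sg_pid" 2>/dev/null || return 0     # exited after SIGTERM
        sleep 1
        sg_i=$((sg_i + 1))
    done
    kill -KILL "$sg_pid" 2>/dev/null                  # last resort
}
```

Usage would look like stop_gracefully 12345 10. Note that kill -0 only checks that the PID exists, so on a busy system with PID reuse you would want a stronger identity check.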

4. Walk me through the lifecycle of a Unix process.

fork() creates a copy of the parent. exec() replaces the process image. wait() lets the parent reap the child's exit status. A child that exits before its parent calls wait() stays in the process table as a zombie until it is reaped. When a parent dies before its children, the orphans are reparented to PID 1 (systemd on most modern distros) or to the nearest subreaper.
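
The shell itself is a handy way to watch this cycle. A small sketch: "&" makes the shell fork and exec the command, and wait reaps the child and hands back its exit status (the exit code 7 is just an arbitrary value for illustration):

```shell
# "&" makes the shell fork; the child then execs the command.
sh -c 'exit 7' &
child=$!

# wait blocks until the child exits, reaps it, and returns its exit status.
status=0
wait "$child" || status=$?
echo "child $child exited with status $status"
```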

5. How do you find which process is holding a port open?

ss -tlnp | grep :8080
# or, if lsof is installed:
lsof -iTCP:8080 -sTCP:LISTEN

6. What is a zombie process and how do you fix one?

A zombie is a process that has terminated but whose parent has not called wait() to reap its exit status. The fix is to either signal the parent to reap (often by sending SIGCHLD) or kill the parent so init can reap. You cannot kill a zombie directly.
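
To find zombies and, more usefully, the parent that is failing to reap them, you can read the State and PPid fields from /proc. A minimal sketch; list_zombies is a hypothetical helper name, and it prints one "PID PPID" pair per zombie:

```shell
# list_zombies: print "PID PPID" for every process in state Z.
list_zombies() {
    for d in /proc/[0-9]*; do
        [ -r "$d/status" ] || continue
        # Processes can vanish mid-scan, so awk errors are discarded.
        awk -v pid="${d#/proc/}" '
            /^State:/ { state = $2 }
            /^PPid:/  { ppid = $2 }
            END { if (state == "Z") print pid, ppid }
        ' "$d/status" 2>/dev/null
    done
}

list_zombies
```

The PPID column is the process to look at: that is where the missing wait() lives. With procps installed, ps -eo pid,ppid,stat,comm gives the same information with a Z in the STAT column.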

7. What is the difference between SIGTERM and SIGINT?

SIGTERM (15) is the polite "please shut down" signal. SIGINT (2) is what Ctrl-C sends. Both can be caught and handled. SIGKILL and SIGSTOP cannot.
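
"Can be caught and handled" is easy to demonstrate with a shell trap. A small sketch, assuming a scratch marker file created with mktemp: the child installs a SIGTERM handler, the parent sends SIGTERM, and the handler runs its cleanup before exiting:

```shell
marker=$(mktemp)

# Child: install a SIGTERM handler, then idle. $1 is the marker file path.
sh -c '
    trap "echo cleaned-up > \"\$1\"; exit 0" TERM
    while :; do sleep 1; done
' sh "$marker" &
child=$!

sleep 1                      # give the child time to install its trap
kill -TERM "$child"
wait "$child" || true        # the handler exits 0 after writing the marker

result=$(cat "$marker")
rm -f "$marker"
echo "$result"               # prints: cleaned-up
```

Swap kill -TERM for kill -KILL and the marker stays empty, because the handler never gets a chance to run.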

8. How do you find the parent of a given process?

ps -o ppid= -p <PID>

Or read /proc/<PID>/status and look at PPid.

9. What happens when a process exceeds its cgroup memory limit?

The kernel first tries to reclaim memory inside that cgroup. If that is not enough, the OOM killer, scoped to that cgroup, picks a victim and kills it. On a containerized system this is often the process you cared about. Check dmesg or journalctl -k for "Killed process" lines.
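
On cgroup v2, each cgroup also keeps a running count of OOM kills in its memory.events file, which is handy when the kernel log has already rotated. A minimal sketch, assuming cgroup v2 is mounted at /sys/fs/cgroup; oom_kill_count is a hypothetical helper name:

```shell
# oom_kill_count FILE: print the oom_kill counter from a cgroup v2
# memory.events file.
oom_kill_count() {
    awk '$1 == "oom_kill" { print $2 }' "$1"
}

# Example against this shell's own cgroup, if it has a memory.events file.
cg=$(awk -F: '$1 == "0" { print $3 }' /proc/self/cgroup 2>/dev/null)
events="/sys/fs/cgroup${cg}/memory.events"
if [ -r "$events" ]; then
    echo "oom_kill events in ${cg}: $(oom_kill_count "$events")"
fi
```

The root cgroup has no memory.events file, so on a host shell outside any slice the example simply prints nothing.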

10. How do you change the priority of a running process?

renice -n 10 -p <PID>     # nicer (lower priority)
renice -n -5 -p <PID>     # meaner (higher priority, requires root)

For real-time scheduling, use chrt.

Part 2: File Systems and Storage (11-20)

11. What is an inode and what does it store?

An inode is the on-disk record describing a file: ownership, permissions, timestamps, size, and pointers to data blocks. The filename is not in the inode; the directory entry maps a name to an inode number.

12. What is the difference between a hard link and a symbolic link?

A hard link is another directory entry pointing to the same inode. A symbolic link is a separate inode whose contents are a path to another file. Hard links cannot cross filesystems, symlinks can. Deleting the original file leaves a symlink dangling, but a hard link is still valid.
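
This is easy to verify in a scratch directory. A short sketch (the file names are arbitrary, and stat -c assumes GNU coreutils or BusyBox):

```shell
dir=$(mktemp -d)
echo "hello" > "$dir/original"
ln "$dir/original" "$dir/hard"      # second directory entry, same inode
ln -s original "$dir/soft"          # new inode whose content is the path "original"

ino_original=$(stat -c %i "$dir/original")
ino_hard=$(stat -c %i "$dir/hard")
ino_soft=$(stat -c %i "$dir/soft")  # stat does not follow symlinks by default
links=$(stat -c %h "$dir/original") # hard link count, now 2

rm "$dir/original"
hard_content=$(cat "$dir/hard")     # still readable: the inode lives on
if [ -L "$dir/soft" ] && [ ! -e "$dir/soft" ]; then
    soft_state=dangling
else
    soft_state=ok
fi

echo "inodes: $ino_original $ino_hard $ino_soft links=$links"
echo "hard link content: $hard_content, symlink: $soft_state"
rm -rf "$dir"
```

The first two inode numbers match and the third differs. After the rm, the hard link still returns "hello" and the symlink is dangling.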

13. How do you find the largest files in a directory tree?

du -ah /var | sort -rh | head -20

Or if you only want files (not directories):

find /var -type f -exec du -h {} + | sort -rh | head -20

14. What does `df` show vs `du`?

df reports filesystem-level usage from the filesystem's own block accounting (fast, and it includes space held by files that were deleted but are still open). du walks the directory tree and adds up the files it can see, so it misses those deleted-but-open files. If df shows more usage than du, suspect open handles on deleted files: check with lsof +L1 or lsof | grep deleted.
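
You can reproduce the deleted-but-still-open situation in a few lines. A minimal sketch, assuming a Linux /proc and using file descriptor 3 as an arbitrary choice:

```shell
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1024 count=1024 2>/dev/null   # 1 MiB scratch file

exec 3<"$f"                          # keep a read descriptor open in this shell
rm "$f"                              # unlink: the name is gone, the blocks are not

target=$(readlink /proc/self/fd/3)   # readlink inherits fd 3 from the shell
size=$(wc -c <&3)                    # the data is still reachable through the fd

exec 3<&-                            # closing the last descriptor frees the space
echo "fd 3 -> $target ($size bytes were held until close)"
```

The link target ends in "(deleted)", which is the same marker lsof shows. On a real incident the fix is to restart or signal the process holding the descriptor, not to hunt for the file.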

15. How do you mount a filesystem at boot?

Add an entry to /etc/fstab:

UUID=abc-123  /data  ext4  defaults,noatime  0 2

Then systemctl daemon-reload and mount -a to test before reboot.

16. Explain the difference between ext4, xfs, and btrfs.

ext4 is the safe default - mature, fast, reliable. XFS scales better for large files and high-concurrency I/O, used heavily in databases and big-data workloads. Btrfs and ZFS (out-of-tree) offer snapshots, copy-on-write, and built-in volume management at the cost of complexity.

17. What is a tmpfs?

A filesystem that lives in RAM (and swap). Used for /tmp, /run, and ephemeral container layers. Fast but volatile.

18. How do you check disk I/O usage?

iostat -xz 1
# or:
iotop -oP

For per-process I/O, pidstat -d 1 is excellent.

19. What is journaling, and why does it matter?

A journal is a log of intended filesystem changes written before they are committed. After a crash, the system replays the journal to bring the filesystem to a consistent state. ext4 has three modes: journal (full), ordered (default), and writeback (fast, less safe).

20. How do you recover a deleted file on ext4?

Honestly: usually you cannot, unless you stop writes to that filesystem immediately. Tools like extundelete or debugfs can sometimes pull blocks back if the inode hasn't been reused. The real answer in an interview: "I treat backups and snapshots as the recovery mechanism, not undelete tools."

Part 3: Networking (21-30)

21. Walk me through what happens when I type `curl https://example.com`.

DNS resolves the hostname (typically via systemd-resolved or libc, hitting /etc/resolv.conf or a local cache). The kernel opens a TCP socket and completes the three-way handshake on port 443. TLS handshake establishes an encrypted session. curl writes an HTTP request, reads the response, prints it. Connection closes (or stays alive for keep-alive).
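
curl can report how long each of those phases took, which turns the walkthrough into something you can measure. A minimal sketch; curl_phases is a hypothetical helper name, and the values are cumulative seconds since the start of the request:

```shell
# curl_phases URL: print when each phase of the request completed.
curl_phases() {
    curl -sS -o /dev/null \
        -w 'dns=%{time_namelookup} tcp=%{time_connect} tls=%{time_appconnect} first_byte=%{time_starttransfer} total=%{time_total}\n' \
        "$1"
}
```

Run it as curl_phases https://example.com. A large gap between dns and tcp points at the network path, between tcp and tls at the handshake, and between tls and first_byte at the server itself.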

22. How do you check what is listening on which port?

ss -tlnp        # TCP listeners
ss -ulnp        # UDP listeners
ss -anp         # everything

23. What is the difference between TCP and UDP?

TCP is connection-oriented, ordered, retransmitted, congestion-controlled. UDP is fire-and-forget, unordered, no delivery guarantees. TCP is what you want by default. UDP is what you want for DNS, video, gaming, and QUIC's underlying transport.

24. How do you trace a packet's path?

mtr example.com
# or:
traceroute example.com

mtr is more useful in practice because it gives you continuous loss percentages per hop.

25. What is `ip route` and how do you read it?

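ip route lists the kernel's main routing table. The default route handles everything that does not match a more specific entry. Order in the output does not matter: the kernel picks the longest matching prefix, and the lowest metric breaks ties between routes with the same prefix length. ip route get 8.8.8.8 shows which route a destination would actually use.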

26. Difference between `iptables` and `nftables`?

nftables is the modern replacement for iptables, with a unified syntax for IPv4/IPv6/ARP and better performance. Most distros now ship nft and provide an iptables-nft shim. New rules should be written in nftables.

27. What is a Linux network namespace?

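An isolated network stack: its own interfaces, routing table, firewall rules. Containers use namespaces to give each container the illusion of its own network. Inspect named namespaces with ip netns list. Namespaces created by container runtimes usually do not appear there, so use lsns -t net to see all of them.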

28. How does NAT work on Linux?

The kernel rewrites packet source/destination addresses as they traverse the firewall. Common case: SNAT (masquerade) on outbound traffic so internal IPs appear to come from the host's public IP. Defined in nftables/iptables NAT tables.

29. How do you debug a hanging TCP connection?

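Start with ss -tnpi to see the connection state. Check window sizes, retransmits, and rtt. Use tcpdump -i any -nn 'port 443 and host x.x.x.x' for packet-level inspection. If you suspect MTU, try ping -M do -s 1472 <host>: 1472 bytes of payload plus 28 bytes of ICMP and IP headers makes a 1500-byte packet, and -M do forbids fragmentation, so a failure points to a smaller path MTU.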

30. What is conntrack and when does it bite you?

Conntrack tracks connection state for stateful firewall rules and NAT. The table has a configurable maximum size (net.netfilter.nf_conntrack_max). On busy hosts (load balancers, proxies) it can fill up, at which point the kernel drops packets for new connections. Applications just see timeouts, but dmesg logs "nf_conntrack: table full, dropping packet". Watch /proc/sys/net/netfilter/nf_conntrack_count vs the max.
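
Watching count against max is a one-liner worth having ready. A minimal sketch; conntrack_pct is a hypothetical helper name and the 80% warning threshold is an arbitrary assumption, not a kernel recommendation:

```shell
# conntrack_pct COUNT MAX: integer percentage of the table in use.
conntrack_pct() {
    echo $(( $1 * 100 / $2 ))
}

count_file=/proc/sys/net/netfilter/nf_conntrack_count
max_file=/proc/sys/net/netfilter/nf_conntrack_max

# The files only exist when the nf_conntrack module is loaded.
if [ -r "$count_file" ] && [ -r "$max_file" ]; then
    pct=$(conntrack_pct "$(cat "$count_file")" "$(cat "$max_file")")
    echo "conntrack table ${pct}% full"
    if [ "$pct" -ge 80 ]; then
        echo "WARNING: close to nf_conntrack_max"
    fi
fi
```

For example, conntrack_pct 49152 65536 prints 75.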

Part 4: Systemd and Service Management (31-37)

31. How do you create a systemd service?

Write a unit file at /etc/systemd/system/myservice.service:

[Unit]
Description=My service
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/myservice
Restart=on-failure
User=myservice

[Install]
WantedBy=multi-user.target

Then systemctl daemon-reload && systemctl enable --now myservice.

32. What is the difference between Type=simple, forking, oneshot, and notify?

simple - the process you exec is the main process. Most common.
forking - the binary forks and the parent exits. Old-school daemons.
oneshot - runs once and exits, often used for setup tasks.
notify - the service signals systemd when it is ready (using sd_notify). Best for production services with health gating.

33. How do you debug a failing service?

systemctl status myservice
journalctl -u myservice -e
journalctl -u myservice --since "10 minutes ago" -f

Status gives the last few lines and exit code. Journalctl gives the full log.

34. What does `systemd-analyze blame` do?

Shows which services took the longest to start at boot. Useful for boot-time optimization.

35. How do you set environment variables for a service?

Environment= directives in the unit file, or EnvironmentFile=/etc/myservice.env to keep secrets out of the unit (restrict the file to root, for example with chmod 600). The env file is plain KEY=value lines, one per line.

36. How do you run a service as a specific user with limited privileges?

User=myservice
Group=myservice
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true

These hardening options are interview-favorite answers because they show you understand systemd as a security boundary, not just an init system.

37. What is a timer unit?

The systemd replacement for cron. A .timer unit triggers a .service unit on a schedule. More powerful than cron because timers can depend on other units, run on calendar events, have randomized delays, and integrate with the journal.
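
A timer unit is short enough to write from memory in an interview. The sketch below writes one to a scratch directory so you can review it first. "backup" is a hypothetical unit name, and a matching backup.service must exist before the timer can be enabled:

```shell
# Write an example timer to a scratch directory for review.
unit_dir=$(mktemp -d)
cat > "$unit_dir/backup.timer" <<'EOF'
[Unit]
Description=Run backup.service daily

[Timer]
OnCalendar=daily
RandomizedDelaySec=15min
Persistent=true

[Install]
WantedBy=timers.target
EOF
cat "$unit_dir/backup.timer"
```

To use it for real, copy the file to /etc/systemd/system/, run systemctl daemon-reload, then systemctl enable --now backup.timer. systemctl list-timers shows when it will fire next. Persistent=true makes a missed run (machine was off) fire at the next boot, which is something cron does not do.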

Part 5: Performance and Debugging (38-46)

38. A server is at 100% CPU. How do you investigate?

top                           # which process
pidstat -u -t 1               # which thread (-t adds per-thread rows)
perf top -p <PID>             # which function
strace -p <PID> -c            # syscall breakdown

Mention you would also check load average vs CPU count, and look for run queue depth with vmstat 1.

39. A server is slow but CPU is low. What now?

Suspect I/O. iostat -xz 1, iotop, pidstat -d 1. Check for high %iowait in top. Then network: ss -i, nstat, ethtool -S eth0 for interface errors. Then memory pressure even when free memory exists - check /proc/pressure/* (PSI metrics).

40. What is load average and what does "load 8 on a 4-core machine" mean?

Load average is an exponentially damped moving average of runnable plus uninterruptible (D state) tasks over 1, 5, and 15 minutes. A load of 8 on a 4-core machine means roughly twice as many tasks want to run as there are cores, so tasks are queuing. Because D-state tasks count too, high load can come from disk I/O, not CPU.
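
The "load divided by cores" arithmetic is worth doing explicitly. A minimal sketch; load_per_core is a hypothetical helper name:

```shell
# load_per_core LOAD CORES: load average divided by core count, 2 decimals.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

load_per_core 8 4        # the example from the question, prints 2.00

# The same calculation for this machine's 1-minute load average.
read -r load1 _rest < /proc/loadavg
cores=$(nproc)
echo "load1=$load1 cores=$cores per-core=$(load_per_core "$load1" "$cores")"
```

Anything persistently above 1.00 per core means work is queuing. Whether it is queuing for CPU or for I/O is what vmstat's r and b columns tell you.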

41. What is the OOM killer and how do you tune it?

When the kernel runs out of memory, it picks a process to kill based on oom_score. You can bias the score with oom_score_adj (-1000 to make a process unkillable, +1000 to make it preferred). Set this in your systemd unit with OOMScoreAdjust=.

42. Walk me through `vmstat 1` output.

Columns to highlight: r (run queue depth), b (blocked on I/O), si/so (swap in/out - any nonzero is a smell), bi/bo (block I/O), us/sy/id/wa/st (CPU breakdown). Steal time (st) high = noisy neighbor on a VM.

43. What is eBPF and when should I reach for it?

eBPF is kernel-level programmability without writing kernel modules. Modern observability tools (bcc, bpftrace, Cilium, Tetragon) use it for tracing, profiling, networking, and security. In a debug context: when standard tools cannot answer "which syscall, with what args, at what frequency, from which process," reach for bpftrace.

44. What does `strace -c` give you?

A summary of all syscalls a process made: count, time spent, errors. Excellent for spotting "wait, why is this process making 50,000 stat() calls per second?"

45. What is the page cache?

The kernel's cache of recently read and written file data. It is why your free output looks alarming until you realize that most of "buff/cache" is reclaimable. To drop clean caches for a benchmark, run sync; echo 3 > /proc/sys/vm/drop_caches as root. drop_caches does not write back dirty pages, which is why sync comes first. Never do this in production to "fix" anything.
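
The number to quote instead of "free" is MemAvailable, the kernel's own estimate of how much memory can be handed out without swapping. A minimal sketch reading it from /proc/meminfo; meminfo_kb is a hypothetical helper name:

```shell
# meminfo_kb KEY: print one field of /proc/meminfo, in kB.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' /proc/meminfo
}

free_kb=$(meminfo_kb MemFree)
cached_kb=$(meminfo_kb Cached)
avail_kb=$(meminfo_kb MemAvailable)
echo "MemFree=${free_kb}kB Cached=${cached_kb}kB MemAvailable=${avail_kb}kB"
```

On a healthy, long-running box MemFree is often small while MemAvailable is large, and the difference is mostly page cache.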

46. How do you profile a running process without restarting it?

perf record -p <PID> -F 99 -g -- sleep 30, then perf report. Or bpftrace with profile probes. For Python: py-spy top --pid <PID>. For Node: node --inspect is intrusive; try 0x or clinic for production-friendly options.

Part 6: Containers and Kernel (47-50)

47. How does a Docker container differ from a VM?

A VM runs its own kernel via a hypervisor. A container shares the host kernel and uses namespaces and cgroups for isolation. Containers are lighter and faster to start, but they expose more of the host kernel as attack surface, because every container makes syscalls directly to it.

48. What namespaces does a container use?

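Typically: pid, net, mnt, uts, ipc, cgroup, and optionally user. The user namespace is the security-critical one. Many runtimes, Docker included, do not enable it by default, and without it root in the container is root on the host, limited only by dropped capabilities, seccomp, and LSM profiles.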
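
Every process exposes the namespaces it belongs to under /proc/<PID>/ns. A minimal sketch that prints them for the current shell, assuming a Linux /proc:

```shell
# Print the namespace identifiers of the current shell.
ns_count=0
for ns in /proc/self/ns/*; do
    [ -L "$ns" ] || continue
    printf '%-20s %s\n' "${ns##*/}" "$(readlink "$ns")"
    ns_count=$((ns_count + 1))
done
echo "$ns_count namespace entries"
```

Two processes are in the same namespace when the inode number in brackets matches. Comparing a containerized process's /proc/<PID>/ns entries against PID 1's (as root) shows exactly which namespaces the runtime actually unshared.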

49. What are cgroups v2 and what changed from v1?

Cgroups v2 unified the resource controllers under a single hierarchy. v1 had separate hierarchies per controller, which led to contradictory configurations. v2 is the default on modern distros and is what newer container runtimes target. Memory accounting is more accurate, and PSI (Pressure Stall Information) is exposed per cgroup.
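
A quick way to tell which one a host is running is to look for cgroup.controllers at the root of the hierarchy, a file that only exists on v2. A minimal sketch, assuming the conventional /sys/fs/cgroup mount point; cgroup_mode is a hypothetical helper name:

```shell
# cgroup_mode: print v2, v1 (which may also mean hybrid), or unknown.
cgroup_mode() {
    if [ -f /sys/fs/cgroup/cgroup.controllers ]; then
        echo "v2"
    elif [ -d /sys/fs/cgroup/memory ]; then
        echo "v1"
    else
        echo "unknown"
    fi
}

mode=$(cgroup_mode)
echo "cgroup hierarchy: $mode"
if [ "$mode" = "v2" ]; then
    echo "controllers: $(cat /sys/fs/cgroup/cgroup.controllers)"
fi
```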

50. How do you set CPU and memory limits on a process without containers?

systemd-run --scope -p MemoryMax=500M -p CPUQuota=50% myprocess

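Or write a unit file with MemoryMax= and CPUQuota=. cgexec from the older libcgroup tools also works, while nice only changes scheduling priority and does not enforce a limit. Modern answer: systemd is your cgroup manager.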

How to Use This List

Do not memorize. Build the muscle.

Pick five questions a day.
For each, run the commands on a real machine (not a screenshot).
When you get to the debugging questions, set up a contrived problem (a runaway process, a full disk) and walk yourself through the diagnosis out loud.

The interview question is a proxy. What the interviewer is really testing is whether you can sit at a strange machine, smell a problem, and start narrowing it down without panicking. That comes from reps, not flashcards.

If you want to drill these under interview pressure, gitGood has a Linux question bank covering exactly this material, plus a chat-based AI mock interview where an AI interviewer walks you through technical scenarios and probes your diagnosis out loud.
