Why this topic matters in interviews
Linux troubleshooting is still core for DevOps and SRE roles. Interviewers test whether you can diagnose live system issues safely using commands, logs, metrics and structured reasoning.
15 interview questions to prepare
For high CPU usage, find the offending process with top or htop, ps and pidstat, then examine its behavior, its threads (for example with top -H), recent deployments and logs.
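As a minimal sketch of the first step, the snippet below ranks processes by CPU from the text printed by ps -eo pid,pcpu,comm. The function name top_cpu and the sample output are illustrative assumptions, not part of any tool.

```python
def top_cpu(ps_output, n=3):
    """Return the n busiest (pid, cpu_percent, command) rows.

    Expects the text printed by `ps -eo pid,pcpu,comm`.
    """
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        pid, pcpu, comm = line.split(None, 2)
        rows.append((int(pid), float(pcpu), comm.strip()))
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


# Hypothetical sample output, used here instead of a live system.
SAMPLE = """\
  PID %CPU COMMAND
    1  0.0 systemd
  812 97.5 java
  640  3.2 nginx
"""

print(top_cpu(SAMPLE, 2))
```

Note that ps reports CPU use averaged over the lifetime of the process, so top or pidstat is the better source when the spike is recent.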
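For high memory usage, use free, top, ps and smem to see which processes hold memory, check dmesg for OOM killer events, and compare application metrics against the configured memory limits.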
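The figure that free reports as available comes from /proc/meminfo. A small sketch of reading it, assuming the usual "Name: value kB" layout; the helper names and sample values are hypothetical.

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to their integer values (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[name.strip()] = int(parts[0])
    return info


def available_percent(info):
    """Share of total memory the kernel estimates is available without swapping."""
    return 100.0 * info["MemAvailable"] / info["MemTotal"]


# Hypothetical sample; on a live host read open("/proc/meminfo").read() instead.
SAMPLE = """\
MemTotal:       16000000 kB
MemFree:          400000 kB
MemAvailable:    2000000 kB
"""

info = parse_meminfo(SAMPLE)
print(f"{available_percent(info):.1f}% available")
```

MemAvailable is usually a better signal than MemFree, because page cache that can be reclaimed counts as available but not as free.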
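For a full disk, use df and du to locate the usage, lsof to find deleted files that are still held open (for example lsof +L1), and check logs, Docker images and journal size. Clean up safely.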
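Deleted files that a process still holds open keep consuming space until the process closes them. On Linux they show up as /proc/<pid>/fd links whose target ends in " (deleted)". The sketch below scans for them; the function names are illustrative and it only sees processes the caller is allowed to inspect.

```python
import os


def is_deleted(target):
    """True if a /proc/<pid>/fd link target marks an unlinked file."""
    return target.endswith(" (deleted)")


def deleted_open_files(proc="/proc"):
    """Yield (pid, target) for open files that have already been unlinked."""
    for pid in filter(str.isdigit, os.listdir(proc)):
        fd_dir = os.path.join(proc, pid, "fd")
        try:
            for fd in os.listdir(fd_dir):
                target = os.readlink(os.path.join(fd_dir, fd))
                if is_deleted(target):
                    yield int(pid), target
        except OSError:
            # The process exited or we lack permission; move on to the next pid.
            continue
```

Calling list(deleted_open_files()) as root gives roughly the same picture as lsof for unlinked files. Restarting or signalling the owning process releases the space; removing the file again does not.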
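Load average is the average number of runnable and uninterruptible (D state) tasks over the last 1, 5 and 15 minutes. Interpret it relative to the number of CPU cores and the workload type: high load with mostly idle CPUs usually points to tasks blocked on IO rather than CPU saturation.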
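Relating load to core count is simple arithmetic. In this sketch the 0.7 and 1.0 thresholds are rule-of-thumb assumptions chosen for illustration, not fixed limits.

```python
def load_per_core(load, cores):
    """Normalize a load average by the number of CPU cores."""
    return load / cores


def describe(load, cores):
    """Rough interpretation; the thresholds are illustrative assumptions."""
    ratio = load_per_core(load, cores)
    if ratio < 0.7:
        return "headroom"
    if ratio <= 1.0:
        return "busy"
    return "saturated or blocked on IO"


# A load of 8.0 on a 4-core host means about two tasks competing per core.
print(load_per_core(8.0, 4), describe(8.0, 4))
```

On a live host, os.getloadavg() returns the three averages and os.cpu_count() the number of logical CPUs.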
For a service that will not start, use systemctl status and journalctl -u <unit>, then validate the configuration and check permissions, port conflicts, dependencies and recent changes.
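When a failure has to be checked from a script, systemctl show prints unit properties as KEY=VALUE lines, which are easier to parse than systemctl status. A minimal sketch, with hypothetical sample values and helper names:

```python
def parse_show(text):
    """Parse KEY=VALUE lines as printed by `systemctl show`."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def summarize(props):
    """One-line summary of a unit's state from its properties."""
    state = props.get("ActiveState", "unknown")
    if state == "failed":
        return f"failed: result={props.get('Result')} exit={props.get('ExecMainStatus')}"
    return state


# Hypothetical output of:
#   systemctl show -p ActiveState,SubState,Result,ExecMainStatus <unit>
SAMPLE = """\
ActiveState=failed
SubState=failed
Result=exit-code
ExecMainStatus=1
"""

print(summarize(parse_show(SAMPLE)))
```

The summary only says that the main process exited with an error; the reason is normally found in journalctl -u for the same unit.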
For connectivity problems, work up the stack: check IP addresses, routes, DNS, firewall rules and ports, using ss or netstat, ping, curl, traceroute and tcpdump.
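The layered approach can be sketched in a few lines: resolve the name first, then attempt a TCP connection, and report which layer failed. The function name check_tcp and its return labels are assumptions made for this example, and the demonstration uses a throwaway listener on the loopback interface.

```python
import socket


def check_tcp(host, port, timeout=3.0):
    """Return 'dns' if the name does not resolve, 'tcp' if the connection
    fails, and 'ok' if a TCP connection can be established."""
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns"
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError:
        return "tcp"


# Demonstration against a temporary local listener on an ephemeral port.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
result = check_tcp("127.0.0.1", port)         # listener is up
server.close()
closed_result = check_tcp("127.0.0.1", port)  # listener is gone
print(result, closed_result)
```

A "tcp" result does not distinguish a refused connection from a timeout. A refusal usually means nothing is listening, while a timeout more often points to a firewall or routing problem.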
Use ss -lntp (or netstat -lntp) to list listening TCP sockets and the processes that own them. Root privileges are needed to see processes owned by other users.
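The kernel exposes the same socket table in /proc/net/tcp. The sketch below extracts IPv4 listeners from that format. It assumes a little-endian host (x86 or typical ARM), where the address bytes appear reversed in the hex dump, and the sample rows are hypothetical.

```python
import socket
import struct


def parse_listeners(proc_net_tcp):
    """Return (ip, port) pairs for IPv4 sockets in LISTEN state (hex 0A)."""
    listeners = []
    for line in proc_net_tcp.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        local, state = fields[1], fields[3]
        if state != "0A":
            continue
        hex_ip, hex_port = local.split(":")
        # Assumption: little-endian host, so repack the value to get network order.
        ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
        listeners.append((ip, int(hex_port, 16)))
    return listeners


# Hypothetical excerpt in the layout of /proc/net/tcp.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1000        0 12345
   1: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12346
   2: 0100007F:1F90 0100007F:C350 01 00000000:00000000 00:00000000 00000000  1000        0 12347
"""

print(parse_listeners(SAMPLE))
```

This shows only addresses and ports. Mapping a socket to its owning process means matching the inode column against /proc/<pid>/fd, which is the work ss -p does for you.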
Check /etc/resolv.conf, systemd-resolved (resolvectl status), dig or nslookup, search domains and whether the upstream DNS servers are reachable.
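A first look at resolver configuration can be scripted. This sketch reads nameserver and search entries from resolv.conf text; the function name and sample file are illustrative.

```python
def parse_resolv_conf(text):
    """Return (nameservers, search_domains) from resolv.conf content."""
    nameservers, search = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        parts = line.split()
        if parts[0] == "nameserver" and len(parts) >= 2:
            nameservers.append(parts[1])
        elif parts[0] == "search":
            search = parts[1:]  # a later search line replaces an earlier one
    return nameservers, search


# Hypothetical file as often seen on hosts running systemd-resolved.
SAMPLE = """\
# managed by systemd-resolved
nameserver 127.0.0.53
options edns0 trust-ad
search corp.example.com example.com
"""

print(parse_resolv_conf(SAMPLE))
```

A nameserver of 127.0.0.53 means queries go through the local systemd-resolved stub, so the real upstream servers have to be read from resolvectl status rather than from this file.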
Use du, find and sort to locate the largest directories and files, and review log and cache directories carefully before deleting anything.
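The same search can be sketched with os.walk. The helpers below sum the apparent size of the files directly inside each directory, which differs slightly from du, since du counts allocated blocks and includes subdirectories. Names and demo files are assumptions for illustration.

```python
import os
import tempfile


def dir_sizes(root):
    """Map each directory under root to the bytes of files directly inside it."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        total = 0
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
        sizes[dirpath] = total
    return sizes


def largest(root, n=5):
    """Return the n directories holding the most bytes, biggest first."""
    return sorted(dir_sizes(root).items(), key=lambda kv: kv[1], reverse=True)[:n]


# Demonstration on a temporary tree rather than a real filesystem.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "logs"))
    with open(os.path.join(root, "logs", "app.log"), "wb") as f:
        f.write(b"x" * 4096)
    with open(os.path.join(root, "small.txt"), "wb") as f:
        f.write(b"x" * 10)
    top_dir, top_bytes = largest(root, 1)[0]
    print(os.path.basename(top_dir), top_bytes)
```

The script only reports. Deciding what is safe to delete still requires knowing which process writes the files and whether it holds them open.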
Check ownership, mode bits, ACLs, SELinux or AppArmor policy, mount options and the permissions of every parent directory (for example with namei -l).
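Parent directories are easy to overlook, because a missing execute bit anywhere on the path blocks access. A small sketch that lists mode and ownership for every component of an absolute path; the helper names are hypothetical and it does not cover ACLs, SELinux or AppArmor.

```python
import os
import stat


def path_chain(path):
    """Return every ancestor of an absolute path, root first, then the path itself."""
    path = os.path.normpath(path)
    chain = [path]
    while os.path.dirname(path) != path:
        path = os.path.dirname(path)
        chain.append(path)
    return chain[::-1]


def describe(path):
    """Format mode bits and numeric owner:group for one path."""
    st = os.stat(path)
    return f"{stat.filemode(st.st_mode)} {st.st_uid}:{st.st_gid} {path}"


print(path_chain("/var/www/app"))
print(describe("/"))
```

Running describe over each entry of path_chain for the failing file shows where access stops for a given user.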
A filesystem can run out of inodes while it still has free space. Use df -i to confirm, then find the directories that hold very large numbers of small files.
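The numbers behind df -i are available through os.statvfs. A minimal sketch, with hypothetical helper names:

```python
import os


def inode_usage_percent(total, free):
    """Percentage of inodes in use, given total and free inode counts."""
    if total == 0:
        return 0.0  # some filesystems do not report an inode count
    return 100.0 * (total - free) / total


def inode_usage(path="/"):
    """Inode usage of the filesystem that contains path."""
    st = os.statvfs(path)
    return inode_usage_percent(st.f_files, st.f_ffree)


# 1000 inodes with 250 free means 75% are in use.
print(inode_usage_percent(1000, 250))
```

When this approaches 100 percent, creating files fails with "No space left on device" even though df still shows free blocks.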
When a server or application is slow, check CPU, memory, disk IO, network, load, processes, logs, application metrics and external dependencies.
Use journalctl, tail, grep and awk to search the logs, and use timestamps and correlation IDs to build a timeline that can be lined up against recent changes.
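Building a timeline amounts to filtering by time window and correlation ID, then sorting. This sketch assumes each log line starts with an ISO 8601 timestamp followed by a space; that layout and the req= field are assumptions about the log format, not something every application produces.

```python
from datetime import datetime


def parse_line(line):
    """Split a line into (timestamp, message), assuming an ISO 8601 prefix."""
    ts, _, rest = line.partition(" ")
    return datetime.fromisoformat(ts), rest


def timeline(lines, correlation_id, start, end):
    """Return matching (timestamp, message) events inside [start, end], oldest first."""
    events = []
    for line in lines:
        ts, msg = parse_line(line)
        if start <= ts <= end and correlation_id in msg:
            events.append((ts, msg))
    return sorted(events)


# Hypothetical log lines, deliberately out of order.
LOG = [
    "2024-05-01T10:00:05 req=abc123 upstream timeout",
    "2024-05-01T10:00:01 req=abc123 request received",
    "2024-05-01T10:00:02 req=zzz999 request received",
    "2024-05-01T11:30:00 req=abc123 retry",
]

events = timeline(LOG, "req=abc123",
                  datetime(2024, 5, 1, 10, 0, 0), datetime(2024, 5, 1, 10, 5, 0))
for ts, msg in events:
    print(ts.isoformat(), msg)
```

Merging lines from several services into one sorted list like this is often what reveals the order in which components failed.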
When a server appears hung, check the console, load, IO wait, dmesg, blocked processes, storage and network. Avoid an unsafe reboot unless it is necessary.
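Blocked processes are those in uninterruptible sleep, shown with a state beginning with D in ps. A short sketch that picks them out of ps -eo pid,stat,comm output; the function name and sample are illustrative.

```python
def blocked_processes(ps_output):
    """Return (pid, command) for processes whose state starts with D.

    Expects the text printed by `ps -eo pid,stat,comm`.
    """
    blocked = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        pid, state, comm = line.split(None, 2)
        if state.startswith("D"):
            blocked.append((int(pid), comm.strip()))
    return blocked


# Hypothetical sample output.
SAMPLE = """\
  PID STAT COMMAND
    1 Ss   systemd
  450 D    jbd2/sda1-8
  812 Sl   java
  990 D+   rsync
"""

print(blocked_processes(SAMPLE))
```

Several processes stuck in D state against the same device or mount usually point to a storage or network filesystem problem, which a reboot may hide rather than explain.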
A senior engineer explains the full path: safe diagnosis, impact, the commands used, root cause, mitigation, validation and prevention.