Linux Troubleshooting: A Practical Field Guide
Most Linux problems can be narrowed down by walking through the same short checklist: read the right log, identify the service involved, and decide whether the failure is configuration, hardware or upstream. This page is that checklist, with pointers to the commands that do the work.
The general approach
Before reaching for a specific recipe, it usually pays to spend a minute on the question itself. Troubleshooting on Linux benefits from a small amount of discipline:
- Describe the symptom precisely. "Wi-Fi doesn't work" can mean several different things — the network doesn't appear, the connection fails, the connection succeeds but no traffic flows. The right fix depends on which one you have.
- Establish what changed. Did this work yesterday? Did you update packages, install a driver, change a config file, plug in new hardware? Most regressions correlate with the most recent change.
- Read the relevant log first. Linux is unusually generous with logs, and a single line is often all you need. The hard part is finding the right log.
- Isolate the layer. Is the failure in the kernel, in a system service, in a user-space application, or in a misconfiguration? Each layer has its own tools.
- Change one thing at a time. When testing a fix, change only the thing you're testing. It's tempting to apply three suggested workarounds at once; that's how partial fixes turn into long-standing mysterious behaviour.
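For the "establish what changed" step, one rough starting point is to list files that were modified recently. The sketch below is a minimal helper, not a complete audit: the function name recently_changed and its defaults (/etc, two days) are assumptions chosen for illustration.

```shell
# List regular files under a directory that were modified in the last N days.
# The defaults (/etc, 2 days) are illustrative assumptions, not recommendations.
recently_changed() {
    dir=${1:-/etc}
    days=${2:-2}
    # -xdev keeps find on one filesystem; -mtime -N means "less than N days ago".
    # Permission errors on unreadable directories are hidden, not fixed.
    find "$dir" -xdev -type f -mtime "-$days" 2>/dev/null
}
```

Running recently_changed /etc 1 after a bad update shows which configuration files were touched in the last day. It says nothing about package changes, which the package manager's own history covers.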
Reading logs
On any modern Linux distribution, journalctl is the entry point. It reads the systemd journal, which holds messages from the kernel, system services and many user-space programs.
# Recent log entries, newest at the bottom
journalctl -e
# Live tail, like 'tail -f'
journalctl -f
# Only this boot's messages
journalctl -b
# Only kernel messages
journalctl -k
# Just a specific service
journalctl -u NetworkManager.service
# Only errors and worse
journalctl -p err -b
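When the error list is long, it can help to see which program is producing most of it. The following is a sketch that assumes journalctl's default "short" output format (timestamp, host, then source[pid]:); the function name count_by_source is hypothetical.

```shell
# Count journal lines per source, reading "short" format output on stdin.
# Assumes the fifth whitespace-separated field is "name[pid]:" or "name:".
count_by_source() {
    awk '
        /^--/    { next }   # journalctl marker lines such as "-- Boot ... --"
        /^[ \t]/ { next }   # indented continuation lines of multi-line messages
        NF >= 5 {
            src = $5
            sub(/\[[0-9]+\]:$/, "", src)   # "sshd[812]:" -> "sshd"
            sub(/:$/, "", src)             # "kernel:"    -> "kernel"
            count[src]++
        }
        END { for (s in count) printf "%d %s\n", count[s], s }
    ' | sort -rn
}
```

A typical invocation would be journalctl -b -p err -o short --no-pager | count_by_source, which prints the noisiest source first.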
The older log files in /var/log/ are still useful and on many distributions are kept in parallel. /var/log/syslog (Debian/Ubuntu) or /var/log/messages (Red Hat family) duplicates much of what's in the journal. /var/log/Xorg.0.log covers an X11 graphical session, if you're running one. Application-specific logs typically live in /var/log/<application>/.
dmesg shows the kernel ring buffer specifically, which is useful when the issue involves hardware, drivers or modules. On distributions that restrict access to the ring buffer, run it with sudo:
# Most recent kernel messages
dmesg -T | tail -50
# Live tail
dmesg -wT
Identifying services with systemctl
Most things that fail to start or fail to keep running on Linux are systemd units. If the systemd terminology is new, the systemd basics page walks through what units, targets and the journal actually are. Two commands cover most of what you need from a diagnostic standpoint:
# What's not running cleanly right now?
systemctl --failed
# What's the state of a specific service?
systemctl status NetworkManager.service
# Restart it
sudo systemctl restart NetworkManager.service
# Stop it from starting on boot
sudo systemctl disable cups.service
The status output includes the most recent log lines for the service, which often show the immediate cause of a failure. If they don't, follow up with journalctl -u <service> for the wider context.
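The two steps, listing failed units and then reading each unit's log, can be chained. This is a minimal sketch: failed_unit_names and show_failed_logs are hypothetical helper names, and the parsing assumes the column layout that systemctl --failed prints with --no-legend.

```shell
# Extract unit names from "systemctl --failed --no-legend" output on stdin.
# The name is the first field, or the second when a status bullet precedes it.
failed_unit_names() {
    awk '{
        for (i = 1; i <= 2; i++)
            if ($i ~ /\.[a-z]+$/) { print $i; break }
    }'
}

# Print the last 20 journal lines from this boot for every failed unit.
show_failed_logs() {
    systemctl --failed --no-legend --plain | failed_unit_names |
    while read -r unit; do
        printf '=== %s ===\n' "$unit"
        journalctl -u "$unit" -b -n 20 --no-pager
    done
}
```

Calling show_failed_logs prints one block per failed unit, which is often enough to see whether several failures share a single cause.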
Boot problems
Boot issues are a common source of panic because they happen before the system you're used to is available. They split roughly into three categories:
Bootloader fails
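You see a GRUB prompt rather than a boot menu, or no menu at all. The usual recovery path is a live USB of the same distribution: boot it, mount your installed system, chroot into it, and run grub-install followed by update-grub. The update-grub wrapper is specific to Debian and Ubuntu; elsewhere the equivalent is grub-mkconfig -o /boot/grub/grub.cfg, or grub2-mkconfig on the Red Hat family. The Arch Wiki's GRUB article is a good reference even on non-Arch systems.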
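As a rough illustration of that recovery path, the helper below prints the command sequence without running it, so it can be reviewed first. It is a sketch under several assumptions: a BIOS (non-UEFI) install, a Debian-family system, an unencrypted root partition with no separate /boot, and /mnt as the mount point. The function name grub_repair_plan is hypothetical, and the device names must come from your own lsblk output.

```shell
# Print (do not run) a typical chroot sequence for reinstalling GRUB from a live USB.
# $1 = root partition of the installed system, $2 = disk that holds the bootloader.
# Assumes BIOS boot and a Debian-family system; UEFI also needs the EFI partition mounted.
grub_repair_plan() {
    root_part=$1
    disk=$2
    cat <<EOF
sudo mount $root_part /mnt
sudo mount --bind /dev /mnt/dev
sudo mount --bind /proc /mnt/proc
sudo mount --bind /sys /mnt/sys
sudo chroot /mnt grub-install $disk
sudo chroot /mnt update-grub
EOF
}
```

For example, grub_repair_plan /dev/sda2 /dev/sda prints six commands to check against your actual partition layout before typing any of them.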
Boots but hangs or panics
Hold Shift (BIOS) or Esc (UEFI) at boot to bring up the GRUB menu. Pick an older kernel from the "Advanced options" submenu if one is available. If that works, the issue is likely with the newest kernel or with a driver it loaded; rolling back the kernel package and reporting the bug to the distribution is usually the right next step.
If even the older kernel fails, boot in single-user (rescue) mode and investigate from there. On most distributions you can press e on the GRUB menu entry, append single or systemd.unit=rescue.target to the line that starts with linux, and boot the edited entry with Ctrl+X or F10.
Boots but no graphical session
You get a text login prompt where you expected a desktop. The display manager (gdm, sddm, lightdm) has failed to start. systemctl status display-manager and the corresponding journalctl output usually identify the cause — frequently a graphics driver problem.
Network problems
Most modern desktop Linux distributions use NetworkManager. A short diagnostic loop:
# Is NetworkManager running?
systemctl status NetworkManager
# What does it think is going on?
nmcli general status
nmcli device
nmcli connection show
# Is anything coming from the kernel?
dmesg -T | grep -iE 'wlan|wifi|eth|enp|wlp'
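If NetworkManager reports a connection but traffic still doesn't flow, probing outward one hop at a time helps isolate the layer. The sketch below is one way to do that. The names classify_network and check_network are hypothetical, and the probe targets (1.1.1.1 and example.org) are assumptions you may want to replace. A gateway that ignores ping will be misreported as a local link failure.

```shell
# Map three probe results (0 = ok, non-zero = failed) to the layer most likely at fault.
classify_network() {
    gateway=$1
    internet=$2
    dns=$3
    if [ "$gateway" -ne 0 ]; then
        echo "local link: cannot reach the default gateway"
    elif [ "$internet" -ne 0 ]; then
        echo "upstream: gateway reachable, but nothing beyond it"
    elif [ "$dns" -ne 0 ]; then
        echo "dns: IP traffic flows, but names do not resolve"
    else
        echo "ok: link, routing and DNS all respond"
    fi
}

# Run the probes: default gateway, a public IP address, then a name lookup.
check_network() {
    gw=$(ip route show default | awk '/^default via/ { print $3; exit }')
    if [ -n "$gw" ] && ping -c 1 -W 2 "$gw" >/dev/null 2>&1; then g=0; else g=1; fi
    if ping -c 1 -W 2 1.1.1.1 >/dev/null 2>&1; then i=0; else i=1; fi
    if getent hosts example.org >/dev/null 2>&1; then d=0; else d=1; fi
    classify_network "$g" "$i" "$d"
}
```

Running check_network prints a single line naming the first layer that failed, which tells you whether to look at the adapter, the router or the resolver.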
For Wi-Fi specifically — adapter not detected, connections that fail, sluggish or dropping connections — the Wi-Fi troubleshooting guide walks through the steps in detail.
Hardware problems
If a device works in another OS but not on Linux, the question is almost always whether the right driver is loaded. lspci -k shows PCI devices and the kernel module bound to each:
# Show PCI devices and which driver is in use
lspci -k
# Same for USB devices
lsusb -t
# Currently loaded kernel modules
lsmod
If a device appears in lspci or lsusb but no driver is bound, the kernel either doesn't have the driver or doesn't recognise the hardware. Possible answers: enable a non-free firmware repository (Debian/Ubuntu), install linux-firmware (most distributions), upgrade to a newer kernel, or install a vendor-provided driver package.
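Scanning lspci -k output by eye gets tedious on machines with many devices. A small filter can list only the devices that have no "Kernel driver in use" line. This is a sketch that assumes the usual lspci -k layout (device line at column zero, indented detail lines beneath it); the function name unbound_pci_devices is hypothetical.

```shell
# Read "lspci -k" output on stdin and print devices with no bound kernel driver.
unbound_pci_devices() {
    awk '
        /^[^ \t]/ {                      # a new device line starts at column zero
            if (dev != "" && !bound) print dev
            dev = $0
            bound = 0
            next
        }
        /Kernel driver in use:/ { bound = 1 }
        END { if (dev != "" && !bound) print dev }
    '
}
```

Use it as lspci -k | unbound_pci_devices. Not every line it prints is a problem, since some devices work without a dedicated driver, but a network or graphics controller in the list is a strong hint.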
Performance problems
"My system is slow" is rarely useful on its own. The tools that turn it into something specific:
# Live CPU and memory usage by process
htop
# Disk I/O by process
sudo iotop
# Per-device read/write statistics, system-wide, refreshed every 2 seconds
sudo iostat -xz 2
# Where has the disk gone? (-x stays on the root filesystem, skipping /proc, /sys and other mounts)
sudo du -xh --max-depth=1 / | sort -h
sudo ncdu /
# How fast is the boot itself?
systemd-analyze
systemd-analyze blame
Two specific patterns come up often. The first is a process that is quietly using all the RAM and pushing the system into swap; htop sorted by memory usage will show it. The second is a Btrfs root filling up with old snapshots, which can make the system grind even when there appears to be space free; sudo btrfs filesystem usage / reports how the space is actually allocated.
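For the first pattern, a non-interactive equivalent of sorting htop by memory is handy when you want to paste the result somewhere. The sketch below assumes input lines of the form "pid rss command", with RSS in KiB as the procps version of ps reports it; the function name top_memory is hypothetical.

```shell
# Read "pid rss_kib command" lines on stdin and print the N largest by resident memory.
# Only the first word of the command name is shown.
top_memory() {
    n=${1:-5}
    sort -k2,2nr | head -n "$n" |
        awk '{ printf "%d MiB\t%s (pid %s)\n", $2 / 1024, $3, $1 }'
}
```

With procps installed, ps -eo pid=,rss=,comm= | top_memory 5 prints the five largest processes by resident memory.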
For the "disk is full and I have no idea why" case specifically, the find the largest files page walks through the most reliable patterns and includes a ready-to-use script.
Application crashes
For applications launched from the desktop, the journal is usually the right place to look:
# Recent crashes recorded by systemd-coredump (if installed)
coredumpctl list
# Most recent crash, in detail
coredumpctl info -1
For an application you're launching from the terminal, the crash output is usually printed there directly. Failing that, run it under strace -f or a debugger; that level of detail is rarely needed for end-user problems, but it is worth knowing the option exists.
One specific cause is common enough to call out: when an application says it can't read or write a particular file, the issue is almost always permissions rather than the application itself. The file permissions reference covers the model and the usual fixes.
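A permission problem is not always on the file itself: a parent directory without search permission blocks access just as effectively. The following sketch lists the permissions of a path and every directory above it. The function name path_perms is hypothetical; where util-linux is installed, namei -l gives a similar view.

```shell
# Show permissions for a path and each of its parent directories, up to the root.
path_perms() {
    p=$1
    while :; do
        ls -ld -- "$p"
        # Stop at the filesystem root, or at "." for a relative path.
        if [ "$p" = "/" ] || [ "$p" = "." ]; then
            break
        fi
        p=$(dirname -- "$p")
    done
}
```

For example, path_perms /var/www/site/index.html prints one line per component; look for a directory missing the x bit for the user the application runs as.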
When to ask for help
If you've narrowed the problem down and need help, the information you bring matters. A useful question normally includes:
- Your distribution and version (cat /etc/os-release).
- The kernel version (uname -r).
- Exactly what you did and what happened.
- The first error message in the log — quoted, not paraphrased.
- What you've already tried.
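The mechanical parts of that list can be gathered in one go. This is a minimal sketch: the function name help_report and the choice of five error lines are assumptions, and the description of what you did and what you tried still has to be written by hand.

```shell
# Collect distribution, kernel version and the first logged errors of this boot.
help_report() {
    if [ -r /etc/os-release ]; then
        # Source the file inside a command substitution so its variables do not leak.
        distro=$(. /etc/os-release && printf '%s' "${PRETTY_NAME:-unknown}")
    else
        distro=unknown
    fi
    printf 'Distribution: %s\n' "$distro"
    printf 'Kernel: %s\n' "$(uname -r)"
    # The journal section is skipped on systems without journalctl.
    if command -v journalctl >/dev/null 2>&1; then
        printf 'First errors this boot:\n'
        journalctl -b -p err -q --no-pager 2>/dev/null | head -n 5 || true
    fi
}
```

Redirecting the output to a file (help_report > report.txt) gives you something to paste into a forum post verbatim, which keeps the error messages quoted and not paraphrased.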
The distribution-specific forums (Ask Ubuntu, Fedora Discussion, the Arch BBS, the debian-user mailing list) are usually where the people most likely to recognise your problem already are.
Related reading on this site
- Essential Linux commands — reference for the commands used above.
- Fix Wi-Fi on Linux — a detailed walkthrough specific to wireless networking.
- How to install Ubuntu LTS — if your troubleshooting concludes that a reinstall is the right answer.
- The Linux kernel — understanding kernel versions and how to roll back to an older one.