How to Diagnose High Server Load on Linux: A Practical Troubleshooting Guide

What Does "High Load" Actually Mean?

When sysadmins say a server is "under high load," they usually mean one of three things: CPU saturation, memory pressure, or I/O bottlenecks. The tricky part is that symptoms often overlap. This guide gives you a systematic framework to quickly identify the real culprit and resolve it.

Step 1: Check the Load Average

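Run uptime or top to see the load averages for the last 1, 5, and 15 minutes. As a rule of thumb, a load average that stays above your CPU core count means more tasks want to run than the machine can service. For example, a load of 8.0 on a 4-core server means that, on average, about twice as many tasks were running or waiting as there are cores. On Linux the figure also counts tasks in uninterruptible sleep (typically waiting on disk I/O), so a high value does not always point to the CPU.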
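
The comparison against core count can be scripted. The following is a minimal sketch, assuming a Linux system with /proc/loadavg and the coreutils nproc command; load_per_core is a hypothetical helper name chosen for illustration, not a standard tool.

```shell
#!/bin/sh
# Sketch: express a load average as a multiple of the core count.
# load_per_core is a hypothetical helper name, not a standard tool.
load_per_core() {
    # $1 = load average, $2 = number of CPU cores
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Live usage on Linux: the first field of /proc/loadavg is the 1-minute
# average, and nproc prints the number of available processing units.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load1 _rest < /proc/loadavg
    echo "load per core: $(load_per_core "$load1" "$(nproc)")"
fi
```

A result that stays well above 1.00 is worth investigating with the steps that follow.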

However, a high load average alone doesn't tell you why. Move to the next steps to identify the cause.

Step 2: Identify CPU-Heavy Processes

Open top and press P to sort by CPU usage. Look for processes consistently using high percentages. Common culprits include:

Runaway PHP-FPM or Node.js worker processes
Unoptimized database queries (MySQL, PostgreSQL)
Backup or compression jobs running during peak hours
Cryptocurrency mining malware (a security red flag)

For a more detailed view, use htop (on Debian or Ubuntu, install it with sudo apt install htop).
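
For a non-interactive snapshot, for example from a cron job or over a slow SSH session, ps can produce a similar list. This is a sketch only: cpu_hogs is a hypothetical helper, and the threshold of 50 is an arbitrary illustration rather than a recommended value.

```shell
# Sketch: list processes whose CPU share meets a threshold, without opening top.
# cpu_hogs is a hypothetical helper; it expects "PID %CPU COMMAND" rows on stdin.
cpu_hogs() {
    # $1 = minimum %CPU to report; NR > 1 skips the header row
    awk -v min="$1" 'NR > 1 && $2 + 0 >= min + 0 { print $1, $2, $3 }'
}

# Live usage (procps ps). Note that ps reports CPU time averaged over the
# lifetime of each process, so short spikes show up more clearly in top.
# ps -eo pid,pcpu,comm --sort=-pcpu | cpu_hogs 50
```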

Step 3: Check Memory and Swap Usage

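Run free -h to see RAM and swap utilization. Pay attention to the available column rather than free, since Linux uses otherwise idle RAM for cache. If available memory is low and swap usage is high or growing, the system is short on physical RAM and is paging to disk, which causes severe performance degradation.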

In top, press M to sort by memory. Identify which processes are consuming the most RAM. For persistent memory leaks, compare usage over time with:

watch -n 5 free -h
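
To turn the same information into a single number you can log or alert on, one option is to read /proc/meminfo directly. The sketch below assumes a kernel that exposes the MemAvailable field; mem_available_pct is a hypothetical helper name.

```shell
# Sketch: percentage of RAM the kernel estimates is available for new work.
# mem_available_pct is a hypothetical helper; it reads a meminfo-style file
# (default /proc/meminfo) and uses the MemTotal and MemAvailable fields.
mem_available_pct() {
    awk '/^MemTotal:/ { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%.1f\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

# Live usage:
# mem_available_pct
# To see whether the system is actively swapping, watch the si/so columns of:
# vmstat 5
```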

Step 4: Investigate I/O Bottlenecks

High I/O wait is a common cause of load spikes. Check it with iostat (part of the sysstat package):

iostat -x 1 5

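Look at the %iowait value in the avg-cpu section. The first report shows averages since boot, so judge by the later samples. As a rough rule of thumb, values consistently above 20% suggest a disk I/O problem, and the per-device %util column shows which disk is busiest. Use iotop to identify which process is hammering the disk: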

sudo iotop -o

Common I/O culprits: database write storms, log rotation, large file uploads, or an indexer running amok.
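
If iotop is not installed, a rough alternative is to look for tasks in uninterruptible sleep (state D), which usually means they are blocked on I/O. This is a sketch under that assumption; d_state_tasks is a hypothetical helper name.

```shell
# Sketch: pick out tasks in uninterruptible sleep (state D), which usually
# means they are blocked on I/O. d_state_tasks is a hypothetical helper that
# expects "STATE PID COMMAND" rows on stdin.
d_state_tasks() {
    # NR > 1 skips the header row printed by ps
    awk 'NR > 1 && $1 ~ /^D/ { print $2, $3 }'
}

# Live usage (procps ps):
# ps -eo state,pid,comm | d_state_tasks
```

A process that appears in this list on repeated runs is a reasonable first suspect.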

Step 5: Check Network Activity

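Heavy network traffic can also drive up load, because handling it consumes CPU time and can leave processes waiting. Use nload to see real-time bandwidth per interface, or iftop to break the traffic on an interface down by connection: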

sudo apt install iftop -y
sudo iftop -n

A sudden traffic spike could indicate a DDoS attack, a viral traffic event, or a misconfigured service generating excessive outbound traffic.
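
When neither tool is available, the kernel's per-interface byte counters give a rough throughput figure. The sketch below assumes an interface named eth0, which may differ on your system; rate_mbit is a hypothetical helper name.

```shell
# Sketch: convert two byte-counter samples into an average rate in Mbit/s.
# rate_mbit is a hypothetical helper: $1 = first sample, $2 = second sample,
# $3 = seconds between them.
rate_mbit() {
    awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN { printf "%.2f\n", (b - a) * 8 / s / 1000000 }'
}

# Live usage, assuming an interface named eth0 (yours may differ):
# rx1=$(cat /sys/class/net/eth0/statistics/rx_bytes); sleep 5
# rx2=$(cat /sys/class/net/eth0/statistics/rx_bytes)
# rate_mbit "$rx1" "$rx2" 5
```

Comparing the result with the link speed shows how close the interface is to saturation.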

Step 6: Examine System Logs

Check the system journal for error messages around the time load spiked:

sudo journalctl -xe --since "1 hour ago"

Also check application-specific logs in /var/log/, particularly nginx/error.log, mysql/error.log, and syslog. Exact paths vary by distribution and configuration.
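
One pattern worth searching for explicitly is the kernel's out-of-memory killer, since it ties a load spike back to memory pressure. The following is a sketch only; count_oom_kills is a hypothetical helper, and the loose match is an assumption made because the exact wording differs between kernel versions.

```shell
# Sketch: count kernel out-of-memory messages in log text read from stdin.
# count_oom_kills is a hypothetical helper; the match is deliberately loose
# because the exact wording varies between kernel versions.
count_oom_kills() {
    # grep -c exits non-zero when the count is 0, so mask that status
    grep -ci 'out of memory' || true
}

# Live usage: -k limits output to kernel messages, --no-pager prints directly.
# sudo journalctl -k --since "1 hour ago" --no-pager | count_oom_kills
# Errors and worse from all units in the same window:
# sudo journalctl -p err --since "1 hour ago" --no-pager
```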

Quick Reference: Commands at a Glance

Issue to Check	Command
Load average	`uptime`
CPU usage by process	`top` or `htop`
Memory / swap usage	`free -h`
Disk I/O	`iostat -x 1 5`
Per-process I/O	`sudo iotop -o`
Network throughput	`sudo iftop -n`
System logs	`journalctl -xe`

When to Escalate vs. Optimize

If the high load is caused by legitimate traffic growth, the fix is vertical scaling (more RAM/CPU) or horizontal scaling (more servers behind a load balancer). If it's a runaway process or misconfiguration, stop the process and fix the root cause. If it looks like a security incident (unexpected processes, outbound traffic to unknown IPs), isolate the server immediately and begin an incident response.