Server Load Explanation
The load average tries to measure the number of active processes at any time and is used as a rough measure of CPU utilization. It is simplistic and poorly defined, but far from useless. High load averages usually mean that the system is being used heavily and that response time is correspondingly slow. What's high? Ideally, you'd like a load average under, say, 3. Ultimately, 'high' means high enough that you don't need uptime to tell you that the system is overloaded. Load averages are reported as three figures, covering the past 1, 5, and 15 minutes.
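As a rough illustration of the three time windows, the following Python sketch reads the same figures directly from the operating system. It assumes a Unix-like host where os.getloadavg() is available; the describe_load helper is a name made up for this example.

```python
import os


def describe_load(averages):
    """Label a (1-minute, 5-minute, 15-minute) load average triple."""
    windows = ("1 min", "5 min", "15 min")
    return ", ".join(f"{w}: {v:.2f}" for w, v in zip(windows, averages))


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute averages on Unix-like
    # systems. It raises OSError if the figures cannot be obtained and is
    # missing entirely on platforms such as Windows (AttributeError).
    try:
        print(describe_load(os.getloadavg()))
    except (OSError, AttributeError):
        print("load average not available on this platform")
```

The formatting is kept separate from the system call so it can be tried with any three numbers, including the ones shown in the command output in this article.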
Checking the server's load

There are a few different ways to keep an eye on your server's load. First, log in to your server over SSH.
Method 1 - using the uptime command:

The uptime shell command produces the following output:

    [pax:~]% uptime
      9:40am  up 9 days, 10:36,  4 users,  load average: 0.02, 0.01, 0.00

It shows the current time, the time since the system was last booted, the number of users logged in, and something called the load average.
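If you want to use the uptime figures in a script, one option is to pull the three numbers out of the line with a regular expression. This is a minimal sketch, not a robust parser: the parse_uptime name and the pattern are assumptions of this example, and the pattern only covers the "load average:" / "load averages:" wording with a period as the decimal separator.

```python
import re

# Matches the trailing "load average: a, b, c" part of an uptime line.
# Some systems print "load averages:" and separate the values with spaces
# only, so the commas are optional here (an assumption of this sketch).
LOAD_RE = re.compile(r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)")


def parse_uptime(line):
    """Return the (1-min, 5-min, 15-min) load averages from an uptime line."""
    match = LOAD_RE.search(line)
    if match is None:
        raise ValueError(f"no load average found in: {line!r}")
    return tuple(float(value) for value in match.groups())


# Sample line taken from the uptime output shown in this article.
sample = "9:40am up 9 days, 10:36, 4 users, load average: 0.02, 0.01, 0.00"
print(parse_uptime(sample))
```

To feed it live data you could capture the command's output, for example with subprocess.run(["uptime"], capture_output=True, text=True), and pass the resulting stdout string to parse_uptime.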
Method 2 - using the procinfo command:

On Linux systems, the procinfo command produces the following output:

    [pax:~]% procinfo
    Linux 2.0.36 (root@pax) (gcc 2.7.2.3) #1 Wed Jul 25 21:40:16 EST 2001 [pax]

    Memory:      Total        Used        Free      Shared     Buffers      Cached
    Mem:         95564       90252        5312       31412       33104       26412
    Swap:        68508           0       68508

    Bootup: Sun Jul 21 15:21:15 2002    Load average: 0.15 0.03 0.01 2/58 8557

The load average appears on the last line of this output, after the boot time.
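The five values after "Load average:" have the same layout as the Linux /proc/loadavg file: the three load averages, then running/total scheduling entities, then the most recently assigned process ID. The sketch below splits such a line into named fields; the LoadAvg type and the parse_loadavg function are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    one: float      # 1-minute load average
    five: float     # 5-minute load average
    fifteen: float  # 15-minute load average
    running: int    # scheduling entities currently runnable
    total: int      # scheduling entities that exist on the system
    last_pid: int   # PID of the most recently created process


def parse_loadavg(text):
    """Parse a line in /proc/loadavg format, e.g. '0.15 0.03 0.01 2/58 8557'."""
    one, five, fifteen, entities, last_pid = text.split()
    running, total = entities.split("/")
    return LoadAvg(float(one), float(five), float(fifteen),
                   int(running), int(total), int(last_pid))


# Sample values taken from the procinfo output shown in this article.
info = parse_loadavg("0.15 0.03 0.01 2/58 8557")
print(info.one, info.five, info.fifteen, info.running, info.total)
```

On a Linux host, the contents of /proc/loadavg can be read with an ordinary open() call and passed to the same function.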
Method 3 - using the w command:

The w command produces the following output:

    [pax:~]% w
      9:40am  up 9 days, 10:35,  4 users,  load average: 0.02, 0.01, 0.00
    USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU  WHAT
    mir      ttyp0    :0.0              Fri10pm  3days  0.09s  0.09s  bash
    neil     ttyp2    12-35-86-1.ea.co  9:40am   0.00s  0.29s  0.15s  w

Notice that the first line of the output has the same format as the output of the uptime command.

Method 4 - using the top command - preferred:

The top command is a more recent addition to the UNIX command set that ranks processes according to the amount of CPU time they consume. It produces the following output:

      4:09am  up 12:48,  1 user,  load average: 0.02, 0.27, 0.17
    58 processes: 57 sleeping, 1 running, 0 zombie, 0 stopped
    CPU states:  0.5% user,  0.9% system,  0.0% nice, 98.5% idle
    Mem:   95564K av,  78704K used,  16860K free,  32836K shrd,  40132K buff
    Swap:  68508K av,      0K used,  68508K free                 14508K cached

      PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
     5909 neil      13   0   720  720   552 R       0  1.5  0.7   0:01 top
        1 root       0   0   396  396   328 S       0  0.0  0.4   0:02 init
        2 root       0   0     0    0     0 SW      0  0.0  0.0   0:00 kflushd
        3 root     -12 -12     0    0     0 SW<     0  0.0  0.0   0:00 kswapd
      ...

We like to use the top command because it also shows server uptime, memory information, and a list of processes that you can sort by CPU usage and other criteria.
Other system monitoring tools - SIM (System Integrity Monitor)

The folks at R-fx networks have developed this utility, which has a variety of features such as:

- Ability to auto-restart the system, with a definable critical load level
- System load monitor with customizable warnings and actions
- Configurable priority changes for services at the warning or critical load level

For more information on SIM, please visit the R-fx networks SIM page.
Which one to choose?

So, what is a good load, what is a bad load, and what lies in between? Anything around 1.0 and below is fine; try to keep your regular load average under 1.0. If you notice your server slowing down, check the load first. When your regular average starts to creep up to around 2.0, your server is very busy and you should consider getting another machine or upgrading your hardware. Here, "regular average" means the load during normal daytime operation, when the system isn't processing all your logs or backing up data. An overloaded server can lead to many problems and should always be avoided.
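The rule of thumb above can be turned into a small check. This is only a sketch under stated assumptions: the classify_load name and the three labels are invented for the example, the 1.0 and 2.0 cut-offs are the article's own guidance, and the optional cpu_count divisor is an addition that the article does not mention (it defaults to 1, which leaves the article's thresholds unchanged).

```python
def classify_load(load, cpu_count=1):
    """Return 'fine', 'elevated' or 'very busy' for a load average.

    Thresholds follow the article: around 1.0 and below is fine, around
    2.0 and above is very busy. Dividing by cpu_count is an assumption
    of this sketch, not something the article states.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    per_cpu = load / cpu_count
    if per_cpu <= 1.0:
        return "fine"
    if per_cpu < 2.0:
        return "elevated"
    return "very busy"


# Try the check against a few example load averages.
for value in (0.02, 1.4, 2.3):
    print(value, classify_load(value))
```

Apply the check to the regular average rather than to short spikes, since a brief peak during log processing or backups is expected.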