Examining Linux Load Averages

In a previous article I shared a few ways I use top as a systems administrator to get a quick overview of the health of a server. I was promptly taken to task by an anonymous commenter for not being as in-depth as he would have liked. Fortunately, we are coming up on another season of load testing new machines, so brushing up on what the numbers mean seems to coincide nicely. As you will soon see though, glossing over the load average the first time was intentional. Digging into the three magic numbers is not as straightforward as one would like to believe; as Harry said to the goblin, it's complicated.

At a very high level, the three numbers that make up the load average are a representation of the amount of work the server has been asked to do over the past 1, 5, and 15 minutes. In a perfectly loaded, one-CPU server, the load average would be "1.00". In a perfectly loaded, two-CPU server, the load average would be "2.00". As a rule of thumb, the load average can be read as the number of CPU cores the server would have needed to keep up with the work it was asked to do. For example, if your server load average is 8.00, but you only have six CPU cores available, the server would have needed two more cores to be able to process all of the work it was being asked to do in the time frame monitored by the load average.
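To make the rule of thumb concrete, here is a minimal Python sketch of that arithmetic. The function name cores_short is made up for illustration; os.getloadavg() and os.cpu_count() are standard library calls, and the sketch assumes a Unix-like system where the load average is available. It only encodes the simplified rule of thumb above, not everything that feeds into the numbers.

```python
import os


def cores_short(load_average: float, cpu_cores: int) -> float:
    """Rule-of-thumb shortfall: how many more cores the server would have
    needed to keep up. Returns 0.0 when the load fits in the cores it has.
    (Hypothetical helper, named for this example only.)"""
    return max(0.0, load_average - cpu_cores)


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5, and 15 minute load averages.
    # It raises OSError if the load average cannot be obtained.
    averages = os.getloadavg()

    # os.cpu_count() can return None, so fall back to a single core.
    cores = os.cpu_count() or 1

    for minutes, load in zip((1, 5, 15), averages):
        print(
            f"{minutes:>2} min: load {load:.2f} on {cores} cores, "
            f"short by {cores_short(load, cores):.2f}"
        )
```

With the numbers from the example above, cores_short(8.00, 6) returns 2.0, which is the two extra cores the server would have needed.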