How you deal with these load-causing processes is up to you and depends
on a lot of factors. In some cases, you might have a script that has gone
out of control and is something you can easily kill. In other situations, such
as in the case of a database process, it might not be safe simply to
kill the process, because it could leave corrupted data behind. Plus, it
could just be that your service is running out of capacity, and the real
solution is either to add more resources to your current server or add
more servers to share the load. It might even be load from a one-time job
that is running on the machine and shouldn't impact load in the future,
so you can just let the process complete. Because so many different things
can cause processes to tie up server resources, it's hard to list them
all here, but hopefully, being able to identify the causes of your high
load will put you on the right track the next time you get an alert that
a machine is slow.
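As a starting point for that identification, one minimal sketch (not from the article itself) is to sort the process table by resource usage and work from the top:

```shell
# Show the top CPU consumers; swap --sort=-%cpu for --sort=-%mem
# when memory, not CPU, seems to be driving the load.
ps -eo pid,user,%cpu,%mem,args --sort=-%cpu | head -n 10

# If a runaway script turns up and is safe to stop, try a polite
# TERM first, and escalate to KILL only if it is ignored
# (1234 is a placeholder PID):
#   kill 1234
#   kill -9 1234
```

For a database or other stateful service, prefer the service's own shutdown command over kill, for exactly the data-corruption reason mentioned above.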

Kyle Rankin is a Systems Architect in the San Francisco Bay Area and the author of a number of books, including The Official Ubuntu Server Book, Knoppix Hacks and Ubuntu Hacks. He is currently the president of the North Bay Linux Users' Group.


Just to comment on the question: I used to manage a few FreeBSD boxes (as internet gateway, mail server, DNS server and Samba server) for a company that had virtually no budget for its IT department.

We did not have brand-name hardware. One of the servers was even built from parts I scavenged from old boxes.

Most of them were very stable, with uptimes of more than 365 days.

The only time we had to switch off a server was for a planned electricity upgrade by the electricity department, with a possible outage of more than five hours just after midnight. Other than that, there was one case of a bad sector on one of the mirrored hard disks.

In my opinion, it might not be Linux or the OS that causes low uptime. Rather, it's the applications we run on it that bring down the system most of the time.

Thanks for the info. I guess Debian stable has, from my two years of experience, had kernel security updates at least every 3 months.

I was waiting to see which would come out first, Ubuntu 10.04 or CentOS 5.5, and had decided that whichever was released first is what I would use (for now), knowing that Ubuntu would be supported for 5 years and CentOS 5.5 for roughly the same time frame, about 7 years from release.

Anyway, I've installed Ubuntu, and it's been running nicely for 14 days now. When I ssh into the server, it shows an interesting summary of system information: system load, swap usage, CPU temperature, users logged in and usage of /home. Then it tells me how many packages can be updated and how many are security updates, which I thought was kind of nice.

During the install it asked if I wanted to set up unattended-upgrades for security updates. I know it can be a little scary, but this is just a home server, so I agreed. Now every day it checks for security updates, installs any it finds and sends me an email of what was done (thanks must go to Kyle for his great tutorials on setting up a mail server with Postfix. Thanks, Kyle!).
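For reference, that setup roughly corresponds to apt configuration like the following. The directives are standard unattended-upgrades options on Debian/Ubuntu, though exact origin syntax varies a little between releases, and the email address is a placeholder:

```
# /etc/apt/apt.conf.d/20auto-upgrades -- refresh package lists and
# run unattended-upgrades once a day
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";

# /etc/apt/apt.conf.d/50unattended-upgrades -- pull only from the
# security pocket, and mail a report (address is illustrative)
Unattended-Upgrade::Allowed-Origins {
        "${distro_id}:${distro_codename}-security";
};
Unattended-Upgrade::Mail "admin@example.com";
```

The Mail directive needs a working local MTA (such as the Postfix setup the commenter mentions) for the reports to go anywhere.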

I've been running Gentoo these last few years, and the only thing that shortens the uptime on my servers is when a kernel comes out with new features or security updates. That aside, I can break a 365-day uptime with ease on these boxes.

The only time I had real problems keeping a box running for any length of time was when I finally figured out that it had some badly manufactured memory in it. I swapped the junk out for some name-brand sticks, and it ran as I expected it to.

With Linux, when I have any problems with operation, I suspect hardware issues before digging around in the OS.

I use a mix of distributions and have gotten 1-2 year uptimes on most of them. In production, I'm also resistant to updating something just so the version number is higher, so as long as I have reasonably stable applications and reliable power, it's not too difficult to maintain a high uptime. The main enemy these days is kernel upgrades, which, again, I resist unless there is a good reason (security) to do so.

