In this article I'm going to go over some of the details of how Apache works and uses memory. As you will see, the Apache processes that serve requests grow to hold the amount of memory they need for a particular request, and they keep holding that memory even when they are serving something simple like a .css file or an image.

For this reason, the first limitation you are likely to encounter running Apache is a limitation on memory. I will explore this topic and show you how you can do the same.

The first thing you need to determine about Apache is what mode it is running in. If your plan is to use Apache with PHP via mod_php, then the prevailing wisdom for many years has been that there may be library extensions in PHP that are not thread safe. For this reason, people have tended to avoid running Apache as a threaded server, for fear that scripts may randomly die, producing the dreaded 500 Internal Server Error pages we all know and love.

This fear may be overstated depending on the nature of your PHP code, but the default Apache packages I've seen almost always use the prefork MPM (multi-processing module).

Prefork vs Worker?

So the first question you might be asking yourself is: which mode is my current Apache server running in? There is a simple, safe way to determine this: run the httpd (Apache) program with a specific command line switch. Programs like which and whereis can help you locate the binary if you don't know where it is.

Well, this looks promising: httpd -l should give us a list of modules!

/usr/sbin/httpd -l

Compiled in modules:
core.c
prefork.c
http_core.c
mod_so.c

So as you can see, there's a small group of core modules compiled into Apache, and in this case the server is running prefork and NOT running worker (aka the threaded server).
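If you want to script this check, here is a minimal Python sketch that picks the MPM out of the `httpd -l` listing. The `detect_mpm` helper and the `MPM_SOURCES` table are names I made up for illustration, and the sample text is simply the output shown above. One caveat: this only works when the MPM is compiled into the binary. On builds that load the MPM as a shared module it will not show up in this list, and `httpd -V` (which prints a "Server MPM" line) is the better place to look.

```python
# Source files that identify a compiled-in MPM in `httpd -l` output.
# (Illustrative table; extend it if your build uses another MPM.)
MPM_SOURCES = {
    "prefork.c": "prefork",
    "worker.c": "worker",
    "event.c": "event",
}


def detect_mpm(httpd_l_output):
    """Return the compiled-in MPM name, or None if no known MPM is listed."""
    for line in httpd_l_output.splitlines():
        name = line.strip()
        if name in MPM_SOURCES:
            return MPM_SOURCES[name]
    return None


# The listing from the server examined in this article.
sample = """Compiled in modules:
core.c
prefork.c
http_core.c
mod_so.c
"""

print(detect_mpm(sample))  # prints: prefork
```

To run it against a live server, you could feed it the real output, for example from `subprocess.run(["/usr/sbin/httpd", "-l"], capture_output=True, text=True).stdout`.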

What is Prefork?

In Unix, "forking" is a way for a parent program to essentially clone a copy of itself as a separate child program. For this reason, when you run ps or top you'll see a number of different httpd processes running.

So as you can see, the MaxSpareServers of 20 matches the number of child processes we have. This is because this particular server has been running for a while and has seen some periods where it was receiving enough traffic that Apache forked additional child processes, keeping those 20 child servers around. If, however, we restart Apache, what will we find (assuming there is little to no load when it restarts)?

I'm guessing nobody will be surprised to find that we have exactly 8 child processes, thanks to the "StartServers 8" configuration item. Now you might be interested in the other parameters. Take the time to read about them in the Apache documentation! I am going to talk about the "MaxRequestsPerChild 4000" parameter.

MaxRequestsPerChild

So you might be wondering at this point whether Apache ever kills one of its child processes.

The answer is yes! Controlling the recycling of child processes is what the MaxRequestsPerChild parameter does. With a value of 4000, each child services up to 4000 HTTP requests before it exits and is replaced by a newly forked child process.

You also might be wondering why Apache doesn't just keep a child process going forever. One reason is to return memory to the operating system.

If we run ps aux | more, so we can see the column headers, we'll find that several of the columns display memory statistics.

With these flags for the ps command, we can now see the VSZ and RSS columns, which display memory usage for the process, are showing a variety of different numbers (all of which are in KB) to indicate the amount of memory the process requires. Or do they?

As it turns out, these numbers give you a general idea of what the process would require in a vacuum, but they don't really give you accurate information. This is because a substantial amount of the memory needed for the code segments of the program lives in shared libraries, which can be shared by all the processes that need them without each one having to make a separate copy of the code (hence the word "shared" in the name). We need a better utility to look at the real memory usage.
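To make the over-counting concrete, here is a small Python sketch of the naive approach: add up the RSS column for every httpd row in `ps aux` output. The `sum_rss_kb` function is a hypothetical helper, and the sample rows are invented numbers for illustration, not measurements from a real server. Whatever total it prints counts the shared library pages once per process, which is exactly why it overstates real usage.

```python
def sum_rss_kb(ps_aux_output, command_substring="httpd"):
    """Add up the RSS column (KB) for rows whose command contains command_substring."""
    lines = ps_aux_output.strip().splitlines()
    header = lines[0].split()
    rss_col = header.index("RSS")
    cmd_col = header.index("COMMAND")
    total = 0
    for line in lines[1:]:
        # COMMAND can contain spaces, so split only up to that column.
        fields = line.split(None, cmd_col)
        if len(fields) > cmd_col and command_substring in fields[cmd_col]:
            total += int(fields[rss_col])
    return total


# Invented sample rows in the usual `ps aux` layout (values are hypothetical).
sample = """
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      1001  0.0  0.5 300000 12000 ?        Ss   10:00   0:00 /usr/sbin/httpd
apache    1002  0.0  1.2 310000 30000 ?        S    10:00   0:01 /usr/sbin/httpd
apache    1003  0.0  1.1 309000 28000 ?        S    10:00   0:01 /usr/sbin/httpd
root      2001  0.0  0.1  10000  2000 pts/0    R+   10:05   0:00 ps aux
"""

print(sum_rss_kb(sample))  # prints: 70000
```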

We can use pmap to look at a specific child process by passing in its process id:

At the very bottom of the pmap output is a summary breaking down memory use into mapped, private and shared memory.
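If you'd rather get at a private/shared split from a script, Linux exposes similar per-mapping data in /proc/<pid>/smaps. The sketch below is my own illustration, not what pmap does internally: it adds up the Private_* and Shared_* fields, which is a related but not identical measure to pmap's writeable/private figure. `summarize_smaps` is a hypothetical helper, and the sample text is a trimmed, made-up excerpt in the smaps format.

```python
def summarize_smaps(smaps_text):
    """Return (private_kb, shared_kb) summed over all mappings in smaps text."""
    totals = {
        "Private_Clean": 0,
        "Private_Dirty": 0,
        "Shared_Clean": 0,
        "Shared_Dirty": 0,
    }
    for line in smaps_text.splitlines():
        key, sep, rest = line.partition(":")
        # Field lines look like "Private_Dirty:      1020 kB".
        if sep and key in totals:
            totals[key] += int(rest.split()[0])
    private_kb = totals["Private_Clean"] + totals["Private_Dirty"]
    shared_kb = totals["Shared_Clean"] + totals["Shared_Dirty"]
    return private_kb, shared_kb


# Made-up excerpt with two mappings: shared program text and a private heap.
sample = """
00400000-0044a000 r-xp 00000000 08:01 123456    /usr/sbin/httpd
Size:                296 kB
Rss:                 200 kB
Shared_Clean:        200 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
01a00000-01b00000 rw-p 00000000 00:00 0         [heap]
Size:               1024 kB
Rss:                1024 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         4 kB
Private_Dirty:      1020 kB
"""

print(summarize_smaps(sample))  # prints: (1024, 200)
```

On a real system you would pass in the contents of /proc/<pid>/smaps for one of the httpd children, which usually means running as root or as the user that owns the process.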

If we key in on the writeable/private number, we come away with an understanding that this particular process is using about 23 MB of memory. If we assume that most of the other Apache processes will be in the same range, then our total memory use for 20 processes is going to be about 460 MB (20 x 23 MB), so call it roughly 500 MB. While not completely accurate, it's closer to the truth than if we just looked at what ps was telling us and added those numbers up.
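The back-of-the-envelope math is easy to put into a couple of Python lines. The 20 children and 23 MB per child come from the example above; the 1024 MB memory budget is a number I picked purely for illustration, and both function names are made up for this sketch.

```python
def estimate_total_mb(children, per_child_mb):
    """Rough total private memory if every child uses about per_child_mb."""
    return children * per_child_mb


def max_children(available_mb, per_child_mb):
    """How many children of that size fit into a given memory budget."""
    return available_mb // per_child_mb


print(estimate_total_mb(20, 23))  # prints: 460
print(max_children(1024, 23))     # prints: 44 (1024 MB is a hypothetical budget)
```

The second number is the kind of figure you would compare against your configured limit on child processes, after leaving headroom for the operating system and anything else running on the box.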

The other thing that should be obvious from scanning through the pmap output is that there are a lot of different libraries being used, and also a lot of Apache modules. This is a good place to start going through the list and asking what those modules do, and whether or not they are appropriate for your site.

If you want to explore memory utilization further, I suggest you locate a copy of this amazing Python memory use script written by Pádraig Brady, who was an engineer for Red Hat and now works for Facebook.

It reports the total memory use by a process -- in essence summarizing the use by all the forked children. Give it a try.

Armed with this information you now have a way of determining how to configure your apache server conf file, to make the most efficient use of available server memory, as well as gain insight into whether you have enough resources to serve the amount of traffic you expect and encounter.

So, in reading this I come up with the question: "Would it be a good idea then to set a high max child and a low MaxRequests, to allow serving under load but lock memory up for as short a time as possible?" Let's say values of 100 and 10, respectively.

There is a significant cost to creating a new process, so trying to force process recycling isn't going to solve any problems. The best thing to do is to go through the list of apache modules and remove those that aren't required by your site. Typically there are numerous apache modules that you don't need.
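To put a rough number on that cost, here is a quick Python sketch comparing how many child processes have to be forked to serve the same traffic under the two settings. It is a simplification that assumes every child serves its full quota and ignores spare-server management; the one million request workload and the `forks_needed` name are made up for illustration.

```python
def forks_needed(total_requests, max_requests_per_child):
    """Minimum number of child processes forked to serve total_requests."""
    # Integer ceiling division: a partially used child still costs one fork.
    return -(-total_requests // max_requests_per_child)


requests = 1_000_000  # hypothetical workload

print(forks_needed(requests, 4000))  # prints: 250
print(forks_needed(requests, 10))    # prints: 100000
```

With a limit of 10, Apache would be forking 400 times as often to serve the same traffic, and each fresh child grows its memory back toward the same size after only a handful of requests.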