I'm experiencing some strange memory management behaviour on Squeeze with Kernel 2.6. I'm trying to get to the bottom of this with nmon.

Sometimes this will freeze up the system for 30 seconds to 5 minutes.

The server has 4GB of RAM.

The figures from nmon are:

+1.6GB memfree.

cached dropped by ~350MB

swapcached dropped by ~80MB

swap free jumped by ~200MB

active memory dropped by ~1.3GB

When this happens, the server isn't near any limits. Very soon after it happens, the active memory creeps back up, along with cached... and naturally memfree drops.

This doesn't appear to be a runaway process. The OS just seems to reallocate a bunch of RAM very suddenly, then slowly reallocates it back. Swappiness shouldn't be a factor because swap is barely ever touched.

Is there any way to track what is happening? Why does this free memory suddenly appear in seconds, only to be given back to the cache within 30 minutes?
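One way to track this without relying on an interactive terminal is to leave a small logger running in the background, so the samples are already on disk when the freeze hits. The following is only a minimal sketch: the function names, the 5-second interval, the 200MB threshold and the log file name are all assumptions made for illustration. It reads /proc/meminfo and records the same counters nmon reports whenever one of them moves sharply between samples.

```python
import time

# Counters matching the nmon figures in the question (names as in /proc/meminfo).
FIELDS = ("MemFree", "Cached", "SwapCached", "SwapFree", "Active")


def parse_meminfo(text):
    """Return {field: value} for lines such as 'MemFree:  123456 kB'."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def deltas_mb(before, after, fields=FIELDS):
    """Change in MB for every tracked field present in both samples."""
    return {
        f: (after[f] - before[f]) / 1024.0
        for f in fields
        if f in before and f in after
    }


def watch(path="/proc/meminfo", interval=5, threshold_mb=200,
          log_path="meminfo-deltas.log"):
    """Append a line to log_path whenever a counter jumps by threshold_mb.

    interval, threshold_mb and log_path are hypothetical defaults; tune them.
    Runs until interrupted.
    """
    with open(path) as fh:
        previous = parse_meminfo(fh.read())
    while True:
        time.sleep(interval)
        with open(path) as fh:
            current = parse_meminfo(fh.read())
        changes = deltas_mb(previous, current)
        if any(abs(v) >= threshold_mb for v in changes.values()):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            summary = " ".join(
                "%s=%+.0fMB" % (k, v) for k, v in sorted(changes.items())
            )
            with open(log_path, "a") as log:
                log.write("%s %s\n" % (stamp, summary))
        previous = current
```

Calling `watch()` from a detached session would start the loop. During a stall the logger may itself be delayed, so a gap between timestamps in the log is also useful information about how long the system was unresponsive.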

The 'bunch' is the 1.6GB. It's odd. The swap is 2GB, but mostly unused (50MB occasionally shows up in there). vmstat -SM 3 would be hard to run in a timely fashion: when the system experiences this, the terminal is sometimes unresponsive. Applications freeze, we occasionally get Bug: Soft Lockup in the syslog, and their load values get artificially inflated. Everything slowly comes back.
– mgjk, Jun 19 '12 at 21:27

I suspect the load afterwards is a backlog of activity. Base load on the server is a steady 2-3% with a 5% average load. I recently brought that down from higher figures by removing spamd, changing awstats jobs etc. The server is a shared web host with pop3 and imap on Dovecot. It's overtaxed, but this kind of performance problem and this kind of weird memory behaviour is abnormal, even for a server which is under steady load.
– mgjk, Jun 19 '12 at 21:33

Oh, and no hint of the OOM killer... lots of memory free.
– mgjk, Jun 19 '12 at 21:35

Given all this extra information, I would say that the primarily active process is switching, say from httpd to dovecot. While that happens you will see a lot of swapping, but only for this limited time. My suggestion is that you disable the OOM killer, and hence Linux's weird memory management, and make it a little easier to foretell what will happen, by setting vm.overcommit_memory = 2. That way you'll see when your applications are actually running out of memory, or don't have enough to begin with: the moment you start them.
– Igor Galić, Jun 20 '12 at 8:16
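Before changing vm.overcommit_memory, it may be worth checking how much commit headroom the machine currently has, since with a value of 2 the kernel refuses allocations beyond CommitLimit. The sketch below is illustrative only (the helper names are hypothetical): it reads the current mode from /proc/sys/vm/overcommit_memory and compares CommitLimit with Committed_AS from /proc/meminfo.

```python
def parse_kb(text, field):
    """Return the integer value of one /proc/meminfo field, or None."""
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and name.strip() == field and parts:
            return int(parts[0])
    return None


def commit_headroom_mb(meminfo_text):
    """CommitLimit minus Committed_AS in MB, or None if either is missing."""
    limit = parse_kb(meminfo_text, "CommitLimit")
    committed = parse_kb(meminfo_text, "Committed_AS")
    if limit is None or committed is None:
        return None
    return (limit - committed) / 1024.0


def report(meminfo_path="/proc/meminfo",
           mode_path="/proc/sys/vm/overcommit_memory"):
    """Return (current overcommit mode, commit headroom in MB)."""
    with open(mode_path) as fh:
        mode = int(fh.read().strip())
    with open(meminfo_path) as fh:
        headroom = commit_headroom_mb(fh.read())
    return mode, headroom
```

If `report()` shows a small or negative headroom while the mode is still 0, switching to 2 would probably make running services fail to allocate memory straight away, so the limit (swap size and vm.overcommit_ratio) may need adjusting first.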