Our Mephisto install kills Mongrels and causes Apache to pound the
CPU. This started when we moved to Apache+mod_proxy_balancer+Mongrel.
Here's what we know:
The following things are working OK, except when used in the
combination listed above: mongrel, mongrel_rails, MySQL, Apache,
mod_proxy_balancer. We believe these are all OK because we moved five
other Rails apps to this configuration on the same box, and they are
working fine.
Here's what we have figured out:
Connecting directly to the mongrel instance works, sort of. It's S L
O W. Clicking on a monthly blog archive link takes about 5 seconds,
most of which is spent in the MySQL database. I turned the
production.log up to debug, and found that Mephisto (0.7.3) makes
dozens of round trips to build the monthly archive page.
Here's the weird part:
Mephisto is slow but OK until we put Apache in front of the Mongrels.
Then things go sideways. The mongrels become unresponsive, and
Apache's CPU usage spikes. A single unresponsive mongrel thread
causes Apache's utilization to stay about 50% forever, and a few
dozen of them peg the processor.
Here's some stuff we have explored:
* mod_proxy_balancer vs. ProxyPass to a single Mongrel. Same problem
either way.
* Monit. Very cool tool for diagnosing the problem, and killing the
runaway processes. The problem occurs too frequently to just let
monit bounce everything.
* Apache server status. The trouble requests look like this:
CPU Client Request
0-1 4401 0/5/5 R 30.88 348 0 0.0 0.06 0.06 ? ? ..reading..
Note CPU time is huge. These seem to hang around forever, but the CPU
gets pegged long before we run out of threads.
* Mongrel -USR1 debugging. Sample thread entry at the bottom of this
post, nothing out of the ordinary in the other logs.
* Shutdown portion of Mongrel's main log also shown below. Could this
explain the problem?
The part that mystifies me:
* Why is Mongrel fine without Apache, but not with it?
* When the Mongrels start to suffer, why does Apache's CPU usage peg?
I would expect Apache's threads to have about 0% CPU utilization
while waiting on a struggling Mongrel.
I am happy to try other experiments anyone might suggest, or to
provide more detail from other logs.
Thanks,
Stuart Halloway
www.relevancellc.com
[Environment]
Mac OS X
Mephisto 0.7.3
Most recent version of all gems (nothing on edge)
[Excerpt from threads.log]
Mon Feb 26 14:37:16 PST 2007 REQUEST /archives/2006/5
0.0.0.0:8111 -- THREADS: 3 -----
KEYS:
-- #<Thread:0x35e4030>: [:__inspect_key__, :started_on]
-- #<Thread:0x36f8c78>: [:__inspect_key__, :started_on]
-- #<Thread:0x35e352c>: [:__inspect_key__, :started_on]
[Excerpt from mongrel.log when killing unresponsive mongrels]
** TERM signal received.
Mon Feb 26 14:37:06 PST 2007: Reaping 3 threads for slow workers
because of 'shutdown'
Waiting for 3 requests to finish, could take 60 seconds.object.log
ERROR: undefined method `class' for #<Liquid::Strainer:0x3132ba4>
object.log ERROR: undefined method `class' for #<Liquid::Strainer:
0x3132ba4>
Mon Feb 26 14:37:13 PST 2007: Reaping 3 threads for slow workers
because of 'shutdown'
Waiting for 3 requests to finish, could take 60 seconds.object.log
ERROR: undefined method `class' for #<Liquid::Strainer:0x3132ba4>
Mon Feb 26 14:37:20 PST 2007: Reaping 3 threads for slow workers
because of 'shutdown'
Waiting for 3 requests to finish, could take 60 seconds.Mon Feb 26
14:37:30 PST 2007: Reaping 3 threads for slow workers because of
'shutdown'
Waiting for 3 requests to finish, could take 60 seconds.Mon Feb 26
14:37:36 PST 2007: Reaping 2 threads for slow workers because of
'shutdown'