I have an install of SugarCRM. It is heavy on memory, which is a separate issue to deal with. In the meantime, any page that attempts to use more than 128M of memory results in the Apache (mod_php) process halting with a segmentation fault:

[notice] child pid 6852 exit signal Segmentation fault (11)

If I set the PHP memory limit to 128M, then I never get the signal 11; I just get the normal PHP error telling me it cannot allocate any more memory. If I set the PHP memory limit to greater than 128M - even slightly - then any process veering into that memory causes the segmentation fault and a white screen/broken connection for the user.

I'm using PHP 5.3.21 on a CentOS 6.2 server with the Atomic repository. It is a production server with no compilers present, so recompiling the Apache processes to produce core dumps is not possible.

We have APC 3.1.13 installed, and need that for some sites. It is (I believe) disabled for my SugarCRM site.

I'm not sure what other details are needed to diagnose this. I'm hoping there are a few obvious things that I can look at, if someone here has encountered something similar. What is it about that 128M that causes the crash if memory usage exceeds it?

Edit:

The plot thickens. This test script runs, with 256M set as the memory_limit, until the PHP process runs out of memory at around 256M. It does not cause a segmentation fault; it runs and is terminated gracefully:
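Something along these lines (a minimal sketch of that kind of test, not the exact script; the function name `grow_memory` and the 1 MB chunk size are made up for illustration):

```php
<?php
// Hypothetical test helper: keep allocating 1 MB strings until PHP's own
// accounting shows at least $extraBytes more memory in use than at the start.
function grow_memory($extraBytes)
{
    $start  = memory_get_usage();
    $chunks = array();

    while (memory_get_usage() - $start < $extraBytes) {
        // Each chunk is held in the array so it cannot be freed early.
        $chunks[] = str_repeat('x', 1048576);
    }

    return count($chunks);
}

// To provoke the "Allowed memory size ... exhausted" fatal error, run with
// memory_limit = 256M and ask for more than the limit allows, e.g.:
// grow_memory(512 * 1024 * 1024);
```

A script like this only exercises plain allocation, so it says nothing about what happens when the limit is hit deep inside recursion or during file and database work.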

So we are down to something that SugarCRM (a plain PHP/MySQL application) does that causes the segmentation fault, but only when memory_limit is set to more than 128M and a page is reached that exceeds that 128M threshold. That is going to be really tricky to narrow down. It is unclear whether the trigger is one specific line or construct in the application, or a whole sequence of things it happens to do (for example, triggering the garbage collector, or unsetting objects that have database connections open) that together trip a PHP bug.

I would not be surprised if this turns out to be a server problem, with perhaps a programming workaround.

A seg fault may be expected when doing that kind of thing. Personally I would hope PHP would handle it a bit better, but maybe that's par for the course for PHP? So long as memory_limit is set to 128M, it does seem to be handled nicely. When I set it to 256M, it seg faults instead.

By following the execution of the application step by step, it seems to be getting into a recursive loop as it writes its initial cache files (SugarCRM does a lot of caching). It is hard to follow, but I suspect it is caused by a lack of error handling: it writes a file, then just expects the file to be there, without checking the result of fopen(). We are running SELinux, and I am now wondering if that is getting in the way. It certainly has in other apps, where PHP's is_writeable() says "fine, you can write a file here", but then when the write is attempted, SELinux kicks in and says, "no way are you writing that file here". SugarCRM asks the first question, then does not always check whether the write really succeeded.
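For illustration, this is the kind of checking I mean. It is a minimal sketch, not SugarCRM code: the helper name `write_cache_file` is hypothetical, and it simply treats the actual fopen()/fwrite() results as the source of truth instead of trusting is_writeable() beforehand.

```php
<?php
// Hypothetical helper, not part of SugarCRM: write a cache file and verify
// every step, returning false as soon as anything fails.
function write_cache_file($path, $contents)
{
    // fopen() returns false if the file cannot be opened, whatever
    // is_writeable() claimed earlier (e.g. when SELinux denies the write).
    $handle = @fopen($path, 'wb');
    if ($handle === false) {
        return false;
    }

    $written = fwrite($handle, $contents);
    fclose($handle);

    // fwrite() returns false on error, or the number of bytes written.
    if ($written === false || $written !== strlen($contents)) {
        return false;
    }

    // Drop PHP's cached stat data before confirming the file exists.
    clearstatcache(true, $path);
    return is_file($path);
}
```

A caller can then stop and report the failure when this returns false, rather than carrying on as if the cache file were in place.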

So - I expect this can be considered a programming issue now? It is still an issue with the server and PHP not playing nice with each other to my mind, but adequate bug fixes in SugarCRM should work around that.

2 Answers

Get rid of APC, and replace it with one of its alternatives, eAccelerator or XCache. It's been the cause of years' worth of mysterious crashes similar to this in my production environments, all of which went away as soon as APC was gone.

Thanks. I did try removing APC, and unfortunately that did not fix anything. I have had to put it back because one of the shops relies on it for query caching. It is a pain to install, so I will look at the other suggestions for the longer term.
– Jason, Jan 31 '13 at 10:47

This is as close as I have got, and probably as far as I am going to get for now:

SugarCRM does not error-check its file writing commands properly, so is sometimes unaware that a file has not been written.

SugarCRM writes a lot of cache files when it constructs its run-time files on first running.

When constructing these cache files, some incorrect permissions resulted in some files not being written correctly.

The missing files put SugarCRM into an endless recursive loop, with the cache-writing method calling itself recursively.

With PHP's memory_limit set to greater than 128M, the process segmentation faults rather than reporting that it has run out of memory. At 128M or below, the PHP process is killed more cleanly. I can only assume this is a PHP bug, or a bug in a plugin (not APC). Where the 128M value comes from, I don't know.
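The recursion described above could be bounded with a simple guard. The following is a hypothetical illustration of the pattern, not SugarCRM's actual code: `load_cache`, the `$rebuild` callback, and the attempt limit are all names and values I have made up.

```php
<?php
// Hypothetical illustration, not SugarCRM code: load a cache file, rebuilding
// it if it is missing, but give up after a bounded number of attempts instead
// of recursing until memory runs out.
function load_cache($path, $rebuild, $attemptsLeft = 2)
{
    if (is_file($path)) {
        return file_get_contents($path);
    }

    if ($attemptsLeft <= 0) {
        throw new RuntimeException("Cache file could not be created: $path");
    }

    // Ask the caller-supplied callback to (re)create the file, then retry.
    call_user_func($rebuild, $path);
    clearstatcache(true, $path);

    return load_cache($path, $rebuild, $attemptsLeft - 1);
}
```

Without the `$attemptsLeft` check, a rebuild step that silently fails to write the file would make this function call itself forever, which matches the behaviour I was seeing.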

I fixed it by reinstalling SugarCRM and setting all files and directories to 777, after which it works. Permissions will now need to be wound back until I find where it is failing. A ticket to SugarCRM (community edition) will be useful, as will a ticket to PHP (though I expect I will be unable to give them the core dumps they inevitably request, so that may be a dead end).