Description:
------------
Apache 1.3.33 sits and spins if PHP 4.3.10 is not compiled with --enable-debug.
RedHat 9. Apache 1.3.33. PHP 4.3.10.
Config.status:
./configure --with-apxs=/usr/local/apache/bin/apxs --with-mysql=/usr/local/mysql --prefix=/usr/local/php_4.3.10 --with-mcrypt=/usr/local/lib --with-gd --with-jpeg-dir=/usr/lib --with-zlib-dir=/usr/lib --with-png-dir=/usr/lib --enable-memory-limit
Running php.ini-recommended, with displaying errors turned on, max exec time 120 seconds, max memory use 25 MB.
I have a huge body of code that implements a component framework. It involves large object trees. I do a tremendous amount of object reference passing.
I have a situation where, if I load a certain number of object trees on a page, Apache will sit and spin at 99% CPU utilization. One object tree fewer, it works fine. One more, it spins (where spinning means forever: the max exec time interrupt never occurs, the process does not continue to grow, and there are no messages in the logs).
I am sure that this bug is being invoked due to some error in my code ... probably calling a method on an invalid object reference, which always seems to confuse the script engine.
However this code has worked flawlessly in the past ... and it works flawlessly if I compile PHP with --enable-debug.
I realize this isn't enough info to go on. I'm posting it in the hopes that maybe others have seen similar behavior.
If you want to contact me, I get about 2000 pieces of SPAM a day. To contact me directly please use the form at:
http://www.yml.com/Contact_Yermo.html
so I can add you to my white list.
Actual result:
--------------
Running httpd in gdb and getting it to spin to 99.9% utilization:
This GDB was configured as "i386-redhat-linux-gnu"...
(gdb) run -X -F
Starting program: /usr/local/apache/bin/httpd -X -F
Program received signal SIGINT, Interrupt.
0x4011cf77 in _int_malloc () from /lib/libc.so.6
(gdb) where
#0 0x4011cf77 in _int_malloc () from /lib/libc.so.6
#1 0x4011d810 in _int_realloc () from /lib/libc.so.6
#2 0x4011c10d in realloc () from /lib/libc.so.6
#3 0x40308f5d in _erealloc (ptr=0x200f00, size=35, allow_failure=0)
at /usr/local/src/php-4.3.10/Zend/zend_alloc.c:329
#4 0x40313c65 in add_string_to_string (result=0xbfff9150, op1=0xbfff9150,
op2=0x95e7ad4) at /usr/local/src/php-4.3.10/Zend/zend_operators.c:1029
#5 0x403243f2 in execute (op_array=0x81eef54)
at /usr/local/src/php-4.3.10/Zend/zend_execute.c:1508
#6 0x40324c13 in execute (op_array=0x81b97c8)
at /usr/local/src/php-4.3.10/Zend/zend_execute.c:1686
#7 0x40324c13 in execute (op_array=0x810af5c)
at /usr/local/src/php-4.3.10/Zend/zend_execute.c:1686
#8 0x40316cab in zend_execute_scripts (type=8, retval=0x0, file_count=3)
at /usr/local/src/php-4.3.10/Zend/zend.c:900
#9 0x402f04af in php_execute_script (primary_file=0xbfffeb80)
at /usr/local/src/php-4.3.10/main/main.c:1736
#10 0x40328cda in apache_php_module_main (r=0x80ff654, display_source_mode=0)
at /usr/local/src/php-4.3.10/sapi/apache/sapi_apache.c:54
#11 0x40329709 in send_php (r=0x80ff654, display_source_mode=0, filename=0x0)
at /usr/local/src/php-4.3.10/sapi/apache/mod_php4.c:621
#12 0x403298ad in send_parsed_php (r=0x80ff654)
at /usr/local/src/php-4.3.10/sapi/apache/mod_php4.c:636
etc.
The kicker is that if I compile PHP with --enable-debug it will not spin. However I get endless memory leak messages such as:
/usr/local/src/php-4.3.10/Zend/zend_execute.c(789) : Freeing 0x0D20E5DC (44 bytes), script=/usr/local/WWW/mobie.yml.com/html/mobie/content_server/publish.php
/usr/local/src/php-4.3.10/Zend/zend_variables.c(123) : Actual location (location was relayed)
Last leak repeated 1165 times
/usr/local/src/php-4.3.10/ext/xml/xml.c(262) : Freeing 0x0D20C524 (5 bytes), script=/usr/local/WWW/mobie.yml.com/html/mobie/content_server/publish.php
Last leak repeated 6677 times
/usr/local/src/php-4.3.10/ext/xml/xml.c(647) : Freeing 0x0D20A214 (12 bytes), script=/usr/local/WWW/mobie.yml.com/html/mobie/content_server/publish.php
Last leak repeated 6076 times
/usr/local/src/php-4.3.10/ext/xml/xml.c(258) : Freeing 0x0D209D84 (12 bytes), script=/usr/local/WWW/mobie.yml.com/html/mobie/content_server/publish.php
Last leak repeated 6677 times
/usr/local/src/php-4.3.10/Zend/zend_API.c(842) : Freeing 0x0D2080C4 (12 bytes),


[2005-01-13 00:58 UTC] yml at yml dot com

Bug 31525 may be related to this one. Under PHP 4.3.10 the problems range from an object reference error, to a core dump, to Apache just spinning endlessly.
However, under PHP 5.0.3 the same section of code causes consistent and immediate errors. I've reproduced a much smaller body of code that demonstrates the PHP 5 problem. My intuition is that it's related. At least this one is demonstrable.
All info is contained in bug 31525 and in this article with sample code:
http://www.yml.com/homepage.html?COMP=clog_list&cmd=detail&cs_clog_entries_ref=149
(make sure the URL isn't chopped)
It may very well be something I'm doing since I'm doing some complicated object trees and reference passing, however I would think that in no case should the $this reference just get dropped or become an unknown type.

Repeated the experiments using php4-STABLE-200501130530 using the same configure line as before.
When configured without --enable-debug this one segfaults instead of spinning. When compiled with --enable-debug it displays lots of leak messages but no buffer overrun messages.
Please see the 5.0.3 bug I also filed which is one that I was able to create a relatively small sample script for. I believe it may be the same bug because at one time at the point where php 4.3.10 segfaulted it output an error saying that '$this' was not a valid object .. which is what's happening consistently in the 5.0.3 sample script.
For this 4.3.11-dev bug I don't have a sample script. It's a case of a very large body of code where one particular setup causes the fault; change anything in the code and the fault moves. (symbol table corruption? buffer overrun?)
Running httpd in gdb with -F -X:
backtrace:
(gdb) run -F -X
Starting program: /usr/local/apache/bin/httpd -F -X
Program received signal SIGSEGV, Segmentation fault.
0x403090de in _erealloc (ptr=0x95d7728, size=16, allow_failure=0)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_alloc.c:328
328 REMOVE_POINTER_FROM_LIST(p);
(gdb) where
#0 0x403090de in _erealloc (ptr=0x95d7728, size=16, allow_failure=0)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_alloc.c:328
#1 0x40313e15 in add_string_to_string (result=0xbffeb804, op1=0xbffeb804,
op2=0x83a3060)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_operators.c:1029
#2 0x40324547 in execute (op_array=0x8fb55dc)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_execute.c:1494
#3 0x40324e07 in execute (op_array=0x86f38ec)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_execute.c:1690
#4 0x40324e07 in execute (op_array=0x86f308c)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_execute.c:1690
#5 0x40324e07 in execute (op_array=0x81f1e3c)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_execute.c:1690
#6 0x40324e07 in execute (op_array=0x839c454)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_execute.c:1690
#7 0x40324e07 in execute (op_array=0x847b1cc)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_execute.c:1690
#8 0x40324e07 in execute (op_array=0x8494eec)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_execute.c:1690
#9 0x40324e07 in execute (op_array=0x86f38ec)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_execute.c:1690
#10 0x40324e07 in execute (op_array=0x86f308c)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_execute.c:1690
#11 0x40324e07 in execute (op_array=0x81f1e3c)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_execute.c:1690
#12 0x40324e07 in execute (op_array=0x839c454)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_execute.c:1690
#13 0x40324e07 in execute (op_array=0x847b1cc)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_execute.c:1690
#14 0x40324e07 in execute (op_array=0x8141758)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_execute.c:1690
#15 0x40324e07 in execute (op_array=0x81bc264)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend_execute.c:1690
#16 0x40316e5b in zend_execute_scripts (type=8, retval=0x0, file_count=3)
at /usr/local/src/php4-STABLE-200501130530/Zend/zend.c:900
#17 0x402f064b in php_execute_script (primary_file=0xbffff000)
at /usr/local/src/php4-STABLE-200501130530/main/main.c:1739
#18 0x40328ece in apache_php_module_main (r=0x80ff634, display_source_mode=0)
at /usr/local/src/php4-STABLE-200501130530/sapi/apache/sapi_apache.c:54
#19 0x403298fd in send_php (r=0x80ff634, display_source_mode=0, filename=0x0)
at /usr/local/src/php4-STABLE-200501130530/sapi/apache/mod_php4.c:621
#20 0x40329aa1 in send_parsed_php (r=0x80ff634)
at /usr/local/src/php4-STABLE-200501130530/sapi/apache/mod_php4.c:636

Thank you for this bug report. To properly diagnose the problem, we
need a short but complete example script to be able to reproduce
this bug ourselves.
A proper reproducing script starts with <?php and ends with ?>,
is max. 10-20 lines long and does not require any external
resources such as databases, etc.
If possible, make the script source available online and provide
a URL to it here. Try to avoid embedding huge scripts into the report.

[2005-01-14 07:04 UTC] yml at dtlink dot com

Unfortunately this is one of those bugs that I have not been able to create a short reproducing script. I think it's a symbol table corruption problem probably due to a buffer overflow problem in the parser code somewhere. If I change the PHP code slightly the location of the segfault changes. If I compile with --enable-debug it stops segfaulting.
If you are interested, with some work, I can provide you a test machine to log into with all my code on it and exact instructions on how to reproduce this problem. You are welcome to use my hardware to diagnose this problem.
You may wish to look at bug http://bugs.php.net/31525 for which I do have a test script that I believe is related to this bug.
I have added your sniper at php.net email to my whitelist, so please feel free to contact me directly. I am very motivated to help track this bug down and as I mentioned before it may very well be due to something I'm doing in my code. Aside from providing you a box to log into where the bug is demonstrated, is there anything else I can do to help track this down?

Yes, I make a lot of use of recursion. It's a system that constructs trees of objects based on an XML source file.
I do a tremendous amount of object reference passing as well.
The first version was built in 2001 and it's been growing since. I've run into several PHP symbol table corruption bugs since that time; most I haven't bothered to report since they're so difficult to reproduce. PHP versions since 4.3.4 have been /much/ more stable; until this bug which seems to be caused by a particular combination of object trees.

It can do that, but not infinite recursion. The limit is about 2500 levels of function calls deep, but that may vary quite a bit depending on your situation.

[2005-01-21 09:24 UTC] yml at dtlink dot com

Maybe we are miscommunicating. This is not an infinite recursion problem. As you can see in the bug reports below, if I compile PHP with --enable-debug it works like a champ (i.e. no spinning) and clearly lists PHP /INTERNAL/ buffer overruns in the debug log.
This is not an endless recursion problem. It is most likely a symbol table or stack management bug.
At most I might have 20 levels of recursion when I parse XML documents but no more.
As I've mentioned a few times I believe it may be similar or even related to a 100% reproducible PHP 5.0 bug that I have a test script for.
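One way to check a depth figure like the one above is to count nesting levels while walking a tree. This is only a minimal sketch, assuming the tree is represented as nested arrays; `maxDepth` and `$tree` are hypothetical names and are not taken from the code discussed in this report:

```php
<?php
// Hypothetical helper: returns the deepest nesting level found in a tree
// of nested arrays. The top-level array counts as level 1.
function maxDepth($node, $level = 1)
{
    $deepest = $level;
    if (is_array($node)) {
        foreach ($node as $child) {
            // Only arrays add another level; scalar leaves do not.
            if (is_array($child)) {
                $depth = maxDepth($child, $level + 1);
                if ($depth > $deepest) {
                    $deepest = $depth;
                }
            }
        }
    }
    return $deepest;
}

// Illustrative data only.
$tree = array('a' => array('b' => array('c' => 'leaf')), 'd' => 'leaf');
echo maxDepth($tree), "\n";
```

A figure well below the roughly 2500-level limit mentioned earlier in the thread would support the point that recursion depth is not the cause.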

[2005-03-06 05:53 UTC] yml at dtlink dot com

Problem found!
Turns out I have been incorrectly using ()'s in my returns from methods that return references as in:
function &someFunc()
{
return( $somevar );
}
This did not generate any errors and has worked for /ages/ in a very large body of code.
However, it looks like this was causing the sit and spin phenomenon in rare cases, and also the array references/symbol table corruption problem I was noticing (at least in test cases that were very reliably reproducing the problem).
Makes one wonder why it worked at all.
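For comparison, here is a minimal sketch of the form without parentheses. The PHP manual's section on returning references warns that `return ($var);` returns the result of an expression rather than the variable itself; how the engine reacts to that has varied between versions. The `Registry` class and its members are hypothetical names used only for illustration, written in PHP 4 style:

```php
<?php
// Hypothetical container in PHP 4 style (var, no visibility keywords).
class Registry
{
    var $items = array('a' => 'original');

    // Return by reference: name the variable directly, no parentheses.
    function &getItem($key)
    {
        return $this->items[$key];
    }
}

$registry = new Registry();

// The call site must also bind by reference with =&.
$item =& $registry->getItem('a');
$item = 'changed';

// The write through $item is visible in the container.
echo $registry->items['a'], "\n";
```

Both halves matter: the `&` in the function declaration and the `=&` at the call site. Dropping either one gives the caller a copy instead of a reference.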