So.
I'm just going to make some notes.
Two bits of the stack traces have:
#0 0x00007f6db5adb36a in __strchr_sse2 () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f6db5a8d8d8 in putenv () from /lib64/libc.so.6
No symbol table info available.
And:
#0 0x00007f6db5a8d88d in getenv () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f6db5a80d76 in setlocale () from /lib64/libc.so.6
I've known setlocale is not thread safe - but apparently neither is getenv() + putenv()
From http://man7.org/linux/man-pages/man3/getenv.3.html
> Concurrently calling this function is safe, provided that the environment remains unchanged.
http://man7.org/linux/man-pages/man7/attributes.7.html under 'Other safety remarks - env'
> Functions marked with env as an MT-Safety issue access the
> environment with getenv(3) or similar, without any guards to
> ensure safety in the presence of concurrent modifications.
>
> We do not mark these functions as MT-Unsafe, however, because
> functions that modify the environment are all marked with
> const:env and regarded as unsafe. Being unsafe, the latter
> are not to be called when multiple threads are running or
> asynchronous signals are enabled, and so the environment can
> be considered effectively constant in these contexts, which
> makes the former safe.

So this might be an issue with something altering the environment while other threads are reading it.
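To illustrate what the man page remarks above mean in practice, here is a minimal C sketch of serializing environment access behind a mutex. The names `env_lock`, `locked_setenv` and `locked_getenv` are hypothetical and not part of PHP or glibc. It only helps if every caller in the process goes through these wrappers; library code such as setlocale() calls getenv() directly, so it would not be protected by a lock like this.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical process-wide lock; every environment access must use it. */
static pthread_mutex_t env_lock = PTHREAD_MUTEX_INITIALIZER;

/* Set NAME=VALUE while holding the lock. Returns 0 on success, -1 on error. */
int locked_setenv(const char *name, const char *value)
{
    pthread_mutex_lock(&env_lock);
    int rc = setenv(name, value, 1);   /* 1 = overwrite an existing value */
    pthread_mutex_unlock(&env_lock);
    return rc;
}

/* Copy the value of NAME into BUF (of size LEN) while holding the lock.
 * Copying matters: the pointer returned by getenv() may become invalid
 * once another thread modifies the environment.
 * Returns 0 on success, -1 if NAME is unset or BUF is too small. */
int locked_getenv(const char *name, char *buf, size_t len)
{
    int rc = -1;
    pthread_mutex_lock(&env_lock);
    const char *value = getenv(name);
    if (value != NULL && strlen(value) < len) {
        strcpy(buf, value);
        rc = 0;
    }
    pthread_mutex_unlock(&env_lock);
    return rc;
}
```

A caller would use `locked_setenv("MAGICK_THREAD_LIMIT", "1")` in place of a bare putenv(), and `locked_getenv()` with a local buffer in place of a bare getenv().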
One of the stacks appears to be something writing "MAGICK_THREAD_LIMIT=1" into the environment.
The same thread limit can also be set by editing the policy.xml file that was installed by ImageMagick, which will be on your system somewhere.
Either editing or adding an entry for thread, like:
<policy domain="resource" name="thread" value="1"/>
Can you try doing that, and seeing if that at least removes those entries from your system?

Please could you also try to find what is calling setlocale and seeing if that can be disabled?

[2019-03-01 07:15 UTC] pascal dot nobus at webservice dot be

Danack:
- there is no ImageMagick on this system.
- sites are Wordpress-5.1, Wordpress-4.9.9, Drupal-8.2.6, so no special things that is calling setlocale.
It seems to me that disabling opcache helped a bit (4 segfaults in 24h, instead of 10-20).
Another thought, that it was hardware or the fact that Apache is mpm_event (with PHP compiled as mod_php), was also ruled out because I see the same effect on other servers (including ones compiled with mpm_prefork).

> there is no ImageMagick on this system.
Whether or not ImageMagick is on the system, something is calling putenv with the string "MAGICK_THREAD_LIMIT=1". From your crash log:
#1 0x00007f6db12af58e in zif_putenv (execute_data=<optimized out>, return_value=0x7f6d9eff2730)
at /usr/local/src/php-7.1.26/ext/standard/basic_functions.c:4178
setting = 0x7f6d829f19a8 "MAGICK_THREAD_LIMIT=1"
setting_len = 21
p = 0x7f6d423e0a4b ""
env = 0x7f6d6c02f778
pe = {putenv_string = 0x7f6d8b26b438 "MAGICK_THREAD_LIMIT=1", previous_value = 0x0, key = 0x7f6d423e0a38 "MAGICK_THREAD_LIMIT", key_len = 19}
> so no special things that is calling setlocale.
Again, the crash log says that's exactly where one of the crashes comes from:
#1 0x00007f6db5a80d76 in setlocale () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007f6db12e5de0 in zif_setlocale (execute_data=<optimized out>, return_value=0x7f6d9bfec640)
For reference, I can see that there are some setlocale calls in Drupal: https://github.com/drupal/core/blob/6864b728155310851b3919e41c0d32941c5e62ae/lib/Drupal/Core/DrupalKernel.php#L1028
It's not guaranteed to be the cause, but seeing as that is where the errors are occurring, it does seem worth the effort to track these down and try removing them to see if that fixes the problem.

[2019-03-01 16:46 UTC] pascal dot nobus at webservice dot be

I just had another crash, and yes: all is pointing now towards setlocale.
(gdb) bt full
#0 0x00007f6db5a8d88d in getenv () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f6db5a80d76 in setlocale () from /lib64/libc.so.6
No symbol table info available.
In the script that caused the crash I saw:
setlocale(LC_ALL, 'nl_NL');
Because it's impossible to scan all our websites, I added setlocale to disable_functions in php.ini.
The website that was calling this function didn't report any errors, nor an error in the php-log.
For the MAGICK_THREAD_LIMIT=1 thing:
As you can see in our modules list: no ImageMagick compiled in (it couldn't be, as it is not on our servers).
However in the WP-plugin woocommerce I did find this call
wp-content/plugins/woocommerce/includes/class-wc-regenerate-images-request.php
@putenv( 'MAGICK_THREAD_LIMIT=1' );
There's a reason for this: https://core.trac.wordpress.org/ticket/36534
I tried it myself with a script but no crashes.
I have no idea how to prevent these crashes, but maybe the reason for this crash lies with the earlier setlocale.
As setlocale in the php-docs say:
The locale information is maintained per process, not per thread. If you are running PHP on a multithreaded server API like IIS, HHVM or Apache on Windows, you may experience sudden changes in locale settings while a script is running, though the script itself never called setlocale(). This happens due to other scripts running in different threads of the same process at the same time, changing the process-wide locale using setlocale().
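As a side note to the per-process behaviour described in that quote, POSIX.1-2008 also provides per-thread locales through newlocale() and uselocale(). The following is only a sketch under the assumption of a Linux/glibc system; the helper name `format_in_locale` is hypothetical, and this is not what PHP 7.1's setlocale() does. It avoids changing the process-wide locale, but it does not by itself address the environment race shown in the stack traces.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdio.h>

/* Format VALUE with one decimal place using the named locale,
 * switching the locale for the calling thread only.
 * Returns 0 on success, -1 if the locale cannot be created
 * (for example because it is not installed). */
int format_in_locale(const char *locale_name, double value,
                     char *buf, size_t len)
{
    locale_t loc = newlocale(LC_ALL_MASK, locale_name, (locale_t)0);
    if (loc == (locale_t)0) {
        return -1;
    }
    locale_t previous = uselocale(loc);   /* affects this thread only */
    snprintf(buf, len, "%.1f", value);
    uselocale(previous);                  /* restore before freeing loc */
    freelocale(loc);
    return 0;
}
```

With a locale such as nl_NL installed, passing its name would format the number with that locale's decimal separator without affecting other threads in the same process.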

I commented on that wordpress bug.
Imagick::setResourceLimit(\Imagick::RESOURCETYPE_THREAD, 1); should be safe to use (if wrapped in a check for whether Imagick exists).
But it isn't required anyway if the appropriate entry is set to one in policy.xml.
Pascal - please can you update the ticket in a few days' time to say whether disabling the other setlocale / putenv calls eliminates the crashes?
I'm going to leave the ticket open for now, to think about it.

[2019-03-01 20:24 UTC] pascal dot nobus at webservice dot be

I will report if the problem is fixed by putting setlocale in disable_functions.
Is it possible that this has something to do with it:
7.0.0 Support for the category parameter passed as a string has been removed. Only LC_* constants can be used as of this version.
(the crashes came after upgrading from 5.6)
For the MAGICK_THREAD_LIMIT:
@putenv( 'MAGICK_THREAD_LIMIT=1' );
isn't safely wrapped so that it only runs for Imagick.
It's in the constructor of class WC_Regenerate_Images_Request, which is used for many processes (including WP_Image_Editor_GD).
And of course there is no policy.xml if Imagick isn't installed at all.
However, I'm not certain that this crash wasn't a result of a previous error with setlocale.

I had several days without any segfaults now.
So it's pretty safe to conclude that Apache MPM-event together with mod_php causes segfaults when a script does something with the environment or locales.
So it's not only non-thread-safe, but also causing crashes.
I have no idea whether this is something that can be fixed within PHP/Apache.

Active support for PHP 7.1 ended months ago[1], so this issue will
not be fixed. If you experience the same problems with an
actively supported PHP version, please re-open the ticket and
state the PHP version.
[1] <http://php.net/supported-versions.php>