Improving cron spawning and other non-blocking HTTP requests

Description

The order of preference for transport methods in the HTTP API is cURL, streams, fsockopen. However, cURL and streams cannot perform non-blocking requests, whereas fsockopen can. Therefore, fsockopen should be the highest priority transport method for non-blocking HTTP requests.

Here's an example. I have a script at http://ctftw.com/sleep.php which sleeps for 5 seconds.

This request does not block the page because fsockopen returns immediately after sending the request.
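As a rough sketch of what such a fire-and-forget request looks like at the socket level (this is not the original test script, and not the WP_Http_Fsockopen code): the helper names build_get_request() and fire_and_forget() are made up for illustration, and the sketch assumes plain HTTP with no redirects or SSL. The only wait is the connect itself; the response is never read.

```php
<?php
// Hypothetical helper: build a minimal HTTP/1.1 GET request for a URL.
function build_get_request( string $url ): string {
	$parts = parse_url( $url );
	$path  = $parts['path'] ?? '/';
	if ( isset( $parts['query'] ) ) {
		$path .= '?' . $parts['query'];
	}
	return "GET {$path} HTTP/1.1\r\n"
		. "Host: {$parts['host']}\r\n"
		. "Connection: close\r\n\r\n";
}

// Hypothetical helper: write the request and close the socket straight away,
// without reading the response. Returns true if the whole request was written.
function fire_and_forget( string $url, float $connect_timeout = 1.0 ): bool {
	$parts = parse_url( $url );
	$port  = $parts['port'] ?? 80; // Assumption: plain HTTP only.

	$fp = @fsockopen( $parts['host'], $port, $errno, $errstr, $connect_timeout );
	if ( ! $fp ) {
		return false;
	}

	$request = build_get_request( $url );
	$written = fwrite( $fp, $request );
	fclose( $fp ); // Do not wait for the 5 second sleep on the other end.

	return $written === strlen( $request );
}
```

Calling fire_and_forget( 'http://ctftw.com/sleep.php' ) should return as soon as the request has been written, rather than after the remote script finishes sleeping.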

Cron Spawning

This is a benefit to core because it improves the cron spawner (and can potentially fix #8923). The cron spawner uses a timeout of 0.01 seconds and a non-blocking request, but actually takes longer than 0.01 seconds.

We can therefore improve the cron spawner by setting fsockopen as the preferred transport method for non-blocking HTTP requests.

In an attempt to address #8923, we can change the cron request timeout to 1 second. If fsockopen is used, the request is lightning fast at ~0.001 seconds. If it's not available and the HTTP API falls back to cURL or streams, then it takes ~1.1 seconds, which is the same time it takes currently. (Hopefully that makes sense.)

Change History (51)

Not sure what's going on here, but it needs some further investigation.

fsockopen writes the data and then calls fclose() on the handle.

Streams writes the data, uses the stream_set_blocking() function to create a non-blocking request, and then calls fclose() on the handle.

cURL uses curl_exec() followed by curl_close(), which could be causing it to wait. We "recently" changed the headers handling of the call, and that might have affected it.

At one point in time, we had a different order for blocking/non-blocking, as the PHP HTTP Extension didn't support non-blocking requests. When we removed that, the requests appeared to be working, so I'm wondering if something changed in the way commands are executed. After reading a few pages on cURL, it appears that DNS timeouts might come into it as well, as might long connection times (i.e. 300ms of latency whilst waiting for the host to respond might cause curl_exec() to hang for up to 900ms).

If non-blocking requests weren't working at all, I'd expect to see a lot of reports of slow blogs, given it's in use by the cron system.

I see similar speed issues against local resources (e.g. google.com.au returns in 1.5 seconds usually, except for fsockopen with a non-blocking request, which does it in 0.08 seconds).

Looking at the stream_set_blocking() function, it only operates on _socket_ and _local file_ resources; HTTP resources can't use non-blocking mode by themselves. Streams would need to use stream_socket_client( 'tcp://...' ) or 'ssl://...' and send HTTP headers manually in order to be able to use non-blocking requests (effectively turning it into a Sockets class rather than a "streams" class). This would explain why 0.01s timeouts for cron don't work for some users, as the connection needs to be made, which often takes longer.
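To illustrate the point about raw sockets, here is a minimal sketch that opens a tcp:// socket, switches it to non-blocking mode and writes the HTTP headers by hand. The function name send_nonblocking_get() is hypothetical and this is not what WP_Http_Streams does; it also assumes the request is small enough to go out in a single write. Note that the connect still has to complete first, which is where a very short timeout can fall over.

```php
<?php
// Hypothetical helper: send a GET over a raw TCP socket without waiting
// for the response. Returns true if the full request was written.
function send_nonblocking_get( string $host, int $port, string $path, float $connect_timeout = 1.0 ): bool {
	// The connect itself is still subject to $connect_timeout.
	$socket = @stream_socket_client( "tcp://{$host}:{$port}", $errno, $errstr, $connect_timeout );
	if ( false === $socket ) {
		return false;
	}

	// stream_set_blocking() works here because this is a socket resource.
	stream_set_blocking( $socket, false );

	// HTTP headers have to be written manually on a raw socket.
	$request = "GET {$path} HTTP/1.1\r\n"
		. "Host: {$host}\r\n"
		. "Connection: close\r\n\r\n";

	// Assumption: a request this small is written in one go.
	$written = fwrite( $socket, $request );
	fclose( $socket );

	return $written === strlen( $request );
}
```

For an 'ssl://' target the same idea applies, but the TLS handshake adds to the time spent before the first byte can be written.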

Looking at cURL, it supports asynchronous requests through its multi interface (curl_multi_init() and curl_multi_exec()). However, it appears that you still need to call curl_multi_exec() in a loop to ensure that the request actually takes place "in the background".

The curl_multi functions aren't non-blocking either (well, they are, but they don't actually allow you to perform a non-blocking request). Your script needs to wait for a result from curl_multi_exec() in a do-while loop. If you call curl_multi_exec() once and then carry on, the handle will be killed by PHP during cleanup, and you risk killing the handle before it has completed its request (e.g. it might be in the middle of sending data).

Yeah, it needs to be called a few times whilst it processes the connection request at least. With multi_exec I'd want a shutdown hook to finalise the connection.

Pushing fsockopen to the start of the queue for non-blocking requests is a no-brainer short term, and it is probably better moved to the start longer term as well (for the simple fact that it's controlled by us and, as a result, has a few bugs of its own, but fixable ones).

Yeah, it needs to be called a few times whilst it processes the connection request at least. With multi_exec I'd want a shutdown hook to finalise the connection.
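A minimal sketch of the shutdown-hook idea, assuming the cURL extension is loaded. The function name start_background_request() is hypothetical, and this is not the code from any patch on this ticket. The transfer is kicked off with one curl_multi_exec() call, and a shutdown function keeps driving it until it has finished, so the handle is not torn down mid-request. The PHP process still stays alive until the transfer completes; only the main script body is spared the wait.

```php
<?php
// Hypothetical helper: start a request now and finalise it at shutdown.
function start_background_request( string $url ): array {
	$easy = curl_init( $url );
	curl_setopt( $easy, CURLOPT_RETURNTRANSFER, true ); // Do not echo the body.

	$multi = curl_multi_init();
	curl_multi_add_handle( $multi, $easy );

	// First call starts the transfer (DNS lookup, connect, etc.).
	curl_multi_exec( $multi, $still_running );

	// Keep driving the transfer at shutdown until it is done.
	register_shutdown_function(
		function () use ( $multi, $easy ) {
			do {
				$status = curl_multi_exec( $multi, $still_running );
				if ( $still_running ) {
					// Wait for socket activity instead of spinning.
					curl_multi_select( $multi, 1.0 );
				}
			} while ( $still_running && CURLM_OK === $status );

			curl_multi_remove_handle( $multi, $easy );
		}
	);

	return array( $multi, $easy );
}
```

The rest of the page can carry on after start_background_request() returns; the cost is moved to the end of the request rather than removed.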

Excellent idea. 18738-curl_multi.patch does this. I've run this against the HTTP unit tests and there are no failures. There are no tests that use the blocking flag, but this patch doesn't break anything. And the output from the test script is:

We control the way the fsockopen class works in every respect, so we can catch when it doesn't work.

It'd be nice to have two classes support non-blocking correctly, especially the ones given priority.

fsockopen was originally added last, as the built-in libraries were supposed to do it better than we would in PHP. Ultimately, bug reports have always pointed to the opposite, often with server configs we just can't work around (safe mode issues and cURL not being able to do DNS lookups, off the top of my head).

Streams (fopen) is also great at failing to spawn cron on setups using .local DNS. Often seen on Macs, fopen() fails to open a loopback connection; it is also seen in Ubuntu installs occasionally. This is what prompted me to come and find this ticket again.

It always fails if the transport used is WP_Http_Streams, which is the default if cURL is disabled.
I know I can use ALTERNATIVE_WP_CRON, but lots of users complain (and I can reproduce it on my dev server) that it blocks the request, and I also reproduced that in develop.

I spent a while tracking this one down and fixing it before I copped on and checked your bug tracking system.

My solution was simply to fix the timeout issue where it turns the passed timeout into ceil(timeout) in seconds. Since PHP 5.2.3 you can pass the timeout to cURL in milliseconds, so I just altered it to use that method instead. I verified that the cron does get run, but even though the original code used the same timeout for both, changing the connect timeout to 10ms may have repercussions.

If libcurl is built to use the standard system name resolver, that portion of the connect will still use full-second resolution for timeouts with a minimum timeout allowed of one second.

That's just the name resolver though, which shouldn't be an issue for the cron anyway. Using the milliseconds option ends up behaving much more like what is expected, even if you combine it with the curl_multi approach.
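The rounding problem described above can be shown with a little arithmetic. This is only a sketch: the helper names are hypothetical, and apply_curl_timeout_ms() assumes the cURL extension is loaded (CURLOPT_TIMEOUT_MS and CURLOPT_CONNECTTIMEOUT_MS are the millisecond options referred to).

```php
<?php
// Whole-second rounding: a 0.01 second timeout becomes a full second.
function timeout_in_whole_seconds( float $timeout ): int {
	return (int) ceil( $timeout );
}

// Millisecond precision: a 0.01 second timeout stays at 10ms.
function timeout_in_milliseconds( float $timeout ): int {
	return (int) round( $timeout * 1000 );
}

// Hypothetical helper: apply the same millisecond timeout to both the
// connect phase and the overall transfer of a cURL handle.
function apply_curl_timeout_ms( $handle, float $timeout ): void {
	$ms = timeout_in_milliseconds( $timeout );
	curl_setopt( $handle, CURLOPT_CONNECTTIMEOUT_MS, $ms );
	curl_setopt( $handle, CURLOPT_TIMEOUT_MS, $ms );
}
```

With the cron spawner's 0.01 second timeout, the first helper gives 1 and the second gives 10, which is the difference between waiting up to a second and waiting up to 10ms.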

As an additional consideration, is there any reason to spawn cron smack in the middle of init, where it can seriously hold up page load? I would think shutdown might be a more appropriate hook, but I'm not sure if I am missing the reasons it got stuffed into init in the first place.

Hi @johnbillion - do you think we can get this in trunk early for testing for 5.1? Even if its use is gated by a constant that defaults to false, it would be good to get some more widespread testing instead of punting because of a SO post. Without the changes being in core, I think that testing is unlikely to happen. If it works reliably, even for some hosting configurations, I think it could have a significant positive performance impact.

This diff looks pretty good. Thanks! wp-cron seems to be a good place to start and probably has the least chance of breaking stuff. There are some other related tickets, though, that suggest using it in more places:

You're more adventurous than I am. I think we could remove the constant in 5.2 (or default to true) if there aren't any major bugs reported in 5.1. We should urge hosts running FPM and recent PHP versions to set the constant to true once 5.1 is released.

Let's commit it without a constant for now, and write a dev-note to inform people that it's there. We can ask hosts to specifically test it during beta/RC, too. If it turns out to be problematic, we can turn it off during beta.

This should make cron spawning faster by ensuring requests to wp-cron.php return immediately regardless of transport method. It is enabled only on recent PHP versions with fastcgi, due to historical bugs and availability of fastcgi_finish_request(). This needs testing on a range of platforms, to help determine if it's safe to use in other contexts also.
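As a rough sketch of the technique described in that commit message (not the committed code): the receiving script closes the connection to the caller first and only then does its slow work. The function name close_connection_early() is hypothetical, and no PHP version check is shown because the thread does not say which versions qualify.

```php
<?php
// Hypothetical helper: finish the HTTP response as early as possible so the
// caller is not kept waiting. Returns true if the connection was closed.
function close_connection_early(): bool {
	// Keep the script running even after the client has gone away.
	ignore_user_abort( true );

	// fastcgi_finish_request() is only defined under FastCGI SAPIs such as
	// PHP-FPM; on other SAPIs fall through and report that nothing was done.
	if ( function_exists( 'fastcgi_finish_request' ) ) {
		return fastcgi_finish_request();
	}

	return false;
}

// Usage at the top of a long-running endpoint:
$closed = close_connection_early();
// ... slow work (e.g. running scheduled events) would continue here ...
```

When $closed is false, the caller still has to rely on its own timeout, which is the situation the earlier comments in this ticket describe.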