My Server Response Times Suck! What Gives?

This is a question that I was hoping someone might be able to answer for me.

My website's response times are highly erratic, and more often than not, slow into the seconds. With Debug enabled on my XF forum, load times are anywhere between 0.4 seconds and 4 seconds or more. I was using Nginx + PHP/FastCGI + XCache but switched over to PHP-FPM + APC on Sunday. Average server load dropped marginally as did memory usage, but server responsiveness utterly blows when I have just around 150 users on the site at one time.

My DNS records are also pumped through CloudFlare, so there's that too. Helps with the caching effort quite a bit. Doesn't help with the server response times.

The SQL queries are taking almost no time to execute (0.2 seconds down to 0.05 during quieter periods), so it can't be that. Upon checking htop on the VPS, CPU usage on the various php-fpm processes is fluctuating constantly, peaking at around 15%.

The load and memory stats are attached - I made the tweaks mentioned above on Sunday, to no avail. Was abandoning Apache all those months ago an unproductive waste of my time? Any ideas or suggestions you could give will be greatly appreciated.

It's been getting worse the past few weeks. I wanted to see if I could improve it, so I switched from one particular FastCGI configuration over to FPM. It doesn't seem to have had much of the desired effect, and if anything, it's worse at times when there are more users on.

I'm just wondering if I'm missing something massive, because I hear all this noise about how amazing Nginx is supposed to be (I switched over to it once before but moved back to Apache because of slow performance issues and memory ballooning). Every time I've tried it, it's sucked.

I can spew out config files and what-not, just in case there's one golden value that happens to be mis-configured. Short of that, I can't see any solution other than bailing and going back to Apache yet again (which I really don't want to do).

Do you have a "direct" subdomain set up in CloudFlare? If you do, see if you can access that, to see if it's faster. It might be named something other than "direct". http://direct.ukofequestria.co.uk

This will give you a breakdown of each HTTP request by script and static elements. It will show you hierarchically what is loading when and exactly how long it's taking so you should be able to see what's holding up your page loads.

Apologies for the delay in coming back to this. I "paused" CloudFlare for a little while, and didn't see any sort of improvement, so figured I'd turn it back on for the time being.

Page loading times on debug mode appear to be anywhere between 0.2 and 5 seconds, with SQL queries not going above 0.05 seconds. So the problem lies within the page processing in Nginx/PHP-FPM somewhere.
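To put firmer numbers on that spread, something like the following could be run from a machine outside the VPS. It is only a rough sketch: `time_request`, `summarize`, and `sample` are names made up for illustration, and the timing includes DNS and connection setup, so it will not match the XenForo debug figure exactly.

```python
import statistics
import time
import urllib.request


def time_request(url, timeout=10.0):
    """Return seconds from starting the request until the first body byte arrives."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read(1)
    return time.perf_counter() - start


def summarize(samples):
    """Summarize a list of timings (in seconds) as min, median, and max."""
    ordered = sorted(samples)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }


def sample(url, count=20, pause=0.5, timer=time_request):
    """Time `count` requests to `url`, pausing between them, and summarize."""
    timings = []
    for _ in range(count):
        timings.append(timer(url))
        time.sleep(pause)
    return summarize(timings)
```

Calling `sample()` once with the CloudFlare-fronted hostname and once with the direct one, at a busy time and again at a quiet time, should show whether the slowness follows the proxy or the origin server.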

This VPS was set up with the intention of using Nginx - we moved away from a server using Apache. I'm now tempted to install Apache, set up the vhosts and .htaccess and just switch back over to Apache/mod_php to see what happens.

Do you have a "direct" subdomain set up in CloudFlare? If you do, see if you can access that, to see if it's faster. It might be named something other than "direct".

Yep. Enquire via PM if you'd like to get involved and get the domain that bypasses the reverse proxy. I don't make these things too obvious to guess, just in case someone with less than honourable intentions wants to do something.

Your load times aren't great but they're hardly absurd on my end. You could do with a lot of server optimizations though.

Your cache headers are awful. You have a lot of static content with extremely low cache times set. See the list below:

Two hours is an inappropriately low cache time for static content. This is stuff that most people set their cache headers to a year or more on, and you've got it set to reload several times a day for regular visitors. I imagine as an MLP forum you end up with a lot of regular visitors and could reduce your server load if they were loading static content from cache instead of your server.
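If you want to check this yourself, here is a minimal sketch that flags assets with a short or missing `max-age`. It assumes you have already collected the `Cache-Control` header for each asset into a dict. The helper names and the one-week threshold are my own choices for illustration, not anything XenForo or CloudFlare defines.

```python
import re

ONE_WEEK = 7 * 24 * 3600  # assumed threshold; pick whatever suits your content


def max_age_seconds(cache_control):
    """Extract max-age from a Cache-Control header value, or None if absent."""
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control or "", re.IGNORECASE)
    return int(match.group(1)) if match else None


def short_lived(assets, threshold=ONE_WEEK):
    """Return (url, max_age) pairs whose max-age is missing or below threshold."""
    flagged = []
    for url, header in assets.items():
        age = max_age_seconds(header)
        if age is None or age < threshold:
            flagged.append((url, age))
    return flagged
```

A two-hour header (`max-age=7200`) would be flagged here, while a one-year header (`max-age=31536000`) would pass.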

Serving your static content from Imgur is not a good idea. You have article image previews and some icons being served from Imgur, so your site relies on Imgur's responsiveness. Imgur is not a CDN. It's an image sharing site. If you're using Cloudflare then you're actually bypassing your CDN by serving images from Imgur. If you served them locally, they'd be served through Cloudflare.
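To find which images are coming from somewhere other than your own domain, a quick sketch like this can scan a saved copy of a page. `ImageHostCollector` and `external_images` are hypothetical names, and it only looks at `<img src>` attributes, so images pulled in through CSS would not show up.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageHostCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def external_images(html, site_host):
    """Return image URLs whose host is neither site_host nor a subdomain of it."""
    collector = ImageHostCollector()
    collector.feed(html)
    external = []
    for src in collector.sources:
        host = urlparse(src).hostname  # None for relative URLs
        if host and host != site_host and not host.endswith("." + site_host):
            external.append(src)
    return external
```

Anything this returns is fetched straight from the third-party host and never passes through Cloudflare.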

There's also something wrong with your Google Analytics code. It's failing even with my tracking block turned off and it is greatly increasing your reported page load time. This can make page load times look obscene in some browsers that wait for it. How are you implementing Analytics? I don't recommend using the built-in XenForo setting. Manually enter it into your footer template and update it as needed when changes are made to the Analytics API.

Thanks very much for all the suggestions. The improvements suggested by the benchmarking tools are useful, albeit negligible compared to the delay being caused by the web server.

So, I did a bit of diagnosing on the server, and just happened to come along with the idea of testing the I/O of the server. I have several containers on the particular provider I am using - they are all on different VPS Nodes.

Here's one result from one of my containers on one particular VPS Node:

Maybe it's just me, but the disk speed on the server doesn't really seem up to scratch. I have filed a ticket with the provider with these results, and it could indicate a problem with the Node's RAID, or some other utilisation issue.
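For anyone wanting to repeat this kind of check on their own containers, a crude sequential write test can be sketched as below. It is not the tool that produced the results in this post. `write_throughput` is a made-up helper, the 64 MB default size is arbitrary, and a single sequential write says nothing about the random I/O a forum actually generates.

```python
import os
import tempfile
import time


def write_throughput(directory, total_bytes=64 * 1024 * 1024, block_size=1024 * 1024):
    """Write total_bytes sequentially to a temp file, fsync it, and return MB/s."""
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            written = 0
            while written < total_bytes:
                handle.write(block)
                written += block_size
            handle.flush()
            os.fsync(handle.fileno())  # force the data to disk before stopping the clock
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the temp file
    return (written / (1024 * 1024)) / elapsed
```

Running it a few times at peak and off-peak hours, on each node, gives comparable figures to attach to a support ticket.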

Edit: Scratch that. Rebooted the VPS at 2:30 this morning and got this result...

Unless another customer on the node is doing really stupid things at "peak" times, I really can't understand what the problem might be, other than a PHP-FPM or OS configuration issue. I am currently running this on CentOS 6. Could there be any benefit in trying with Debian or another OS?