We recently upgraded to vBulletin 4 and have been fighting a hard battle to get speed back to where it used to be.

We've done a lot of server side optimisation which has made a big difference, but we still have a long way to go. We now feel that front end optimisations will be where we can yield the big impact for users.

I'd be really, really grateful for your expert views on these results and where we might best focus our efforts.

It looks like you may need to do a little work on back-end scaling as well. How is it hosted, if you don't mind me asking (cloud, VPS, shared, dedicated boxes)? Particularly the database. Throwing SSDs at the database can have a huge impact for something like a discussion forum, if it's an option available to you.

Otherwise there's basically a lot of block-and-tackle work: combining the CSS requests and getting the JavaScript out of the way. I also recommend testing on IE8 or newer, because IE7 and below are pretty pathological and represent an extreme worst case.
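
For the "getting the JavaScript out of the way" part, one common pattern is to inject non-critical scripts only after the load event, so they stop competing with CSS and rendering. A minimal sketch (the script URL is a placeholder, and the `doc` parameter is an assumption added so the helper can be exercised outside a browser):

```javascript
// Sketch: defer non-critical scripts until after onload so they stop
// competing with stylesheet downloads and rendering. The URL below is
// made up; `doc` is passed in rather than using the global document.
function lazyLoadScript(src, doc) {
  var s = doc.createElement("script");
  s.src = src;
  s.async = true; // don't block the parser when it does load
  doc.head.appendChild(s);
  return s;
}

// In the page itself (browser only):
// window.addEventListener("load", function () {
//   lazyLoadScript("/clientscript/widgets.js", document);
// });
```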

(05-30-2012 09:10 PM)andydavies Wrote: You need to prioritise the request loading e.g. CSS first, JS last, lazy load as much as you can

Is there no way you can merge up the CSS files, i.e. serve one CSS file only?

The number of DOM elements is going to be on the large side (but that's always an issue with discussion boards)

There are opportunities for sprites too...

Thanks. Will draw up a list of all those obvious gotchas and start prioritising I guess.

Hoping that someone might spot "the big one" in those results

Is the CPU maxing out an issue? We've seen the impact of certain user activity in IE and on lower screen resolutions, and we're wondering whether that suggests lower-end machines are struggling more than faster ones...

(05-31-2012 05:36 AM)pmeenan Wrote: It looks like you may need to do a little work on back-end scaling as well. How is it hosted, if you don't mind me asking (cloud, VPS, shared, dedicated boxes)? Particularly the database. Throwing SSDs at the database can have a huge impact for something like a discussion forum, if it's an option available to you.

We've got what I think is a pretty mega back-end.

We're running a virtual set-up on a private cloud, with a physical capacity of 48 cores and 384 GB RAM, and 4 x 128 GB SSDs in RAID 10 for the DB. The virtual layout includes 8 webservers, a couple of Varnish servers, a couple of memcache servers, plus one master and two slave DBs, and we devolve search to Solr. We've had the guys from Percona fine-tune the DBs and queries.

We're doing a bit more work on Memcache and Varnish optimisation, but feeling we're likely to yield more from front end optimisation now.

Quote:Otherwise there's basically a lot of block-and-tackle work: combining the CSS requests and getting the JavaScript out of the way. I also recommend testing on IE8 or newer, because IE7 and below are pretty pathological and represent an extreme worst case.

Yeah. The slog lies ahead, I guess. Any clues as to the biggest wins in this front-end work would be very gratefully received. Keen to get the biggest gains first.

FYI, there are Dynatrace configurations available in the Dulles location (IE 7 and IE 8) which capture a Dynatrace run and let you download the session (and also let you share it with everyone else so you can work off of the same one).

I'm running a test right now but you generally want to look at the hot spots.

Yes, IE 6 and 7 have HORRIBLE JavaScript performance, particularly around inefficient selectors. If there is JavaScript on your site that does something like $(".someclass")... then the older IEs will need to traverse the entire DOM (slowly) every time it is called. That will cause CPU pegging and gaps in the waterfall, and it will certainly be worse on older computers with older browsers.
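
To make the selector point concrete, here's roughly what the old engines must do for a bare class selector (no getElementsByClassName, no querySelectorAll, so every node gets visited), sketched over a plain object tree. The tree shape and class names are invented for illustration:

```javascript
// Sketch of what a class-selector lookup costs on IE6/7: walk every node
// and test its className, so cost grows with DOM size. On a 3k-element
// page this is why a bare $(".someclass") pegs the CPU.
function findByClass(node, cls, out = []) {
  if (node.className && node.className.split(" ").includes(cls)) {
    out.push(node);
  }
  for (const child of node.children || []) {
    findByClass(child, cls, out);
  }
  return out;
}

// The jQuery-level fix: anchor the selector at an id so only a small
// subtree is scanned, e.g. $("#posts").find(".quote") instead of
// $(".quote") -- an id lookup is a single hash hit, not a full walk.

// Tiny stand-in tree for illustration:
const tree = {
  className: "page",
  children: [
    { className: "post quote", children: [] },
    { className: "post", children: [{ className: "quote", children: [] }] },
  ],
};
console.log(findByClass(tree, "quote").length); // 2
```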

It took close to 4 seconds of CPU time to execute in the test that I ran.

If you open up the "Pure Paths" UI and sort by CPU time (descending) it will sort the code execution from most expensive to least.

Looks like the Facebook JavaScript was the 2nd most expensive, with 2 seconds of execution time. I haven't looked at it closely, but if you have an option to tell the Facebook code the ID of the widgets it needs to populate, that would reduce the overhead a lot (assuming it is scanning for a specific class).
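
If the widgets are XFBML-based, the Facebook SDK's FB.XFBML.parse call accepts an optional DOM node, so the scan can be scoped to one container instead of the whole document. A hedged sketch (the container id is made up, and `fb` is injected in place of the global FB object so the wrapper can be exercised with a stub):

```javascript
// Sketch: limit the Facebook SDK's widget scan to a single container
// instead of rescanning a 3k-element document. `fb` stands in for the
// global FB object; the container id in the usage comment is invented.
function parseFacebookWidgets(fb, container) {
  if (container) {
    fb.XFBML.parse(container); // scan only this subtree
  } else {
    fb.XFBML.parse(); // full-document scan: the expensive default
  }
}

// Browser usage:
// parseFacebookWidgets(FB, document.getElementById("fb-like-container"));
```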

3k+ DOM elements is a pretty huge DOM (which is why it is so expensive). Not sure if there's anything you can do about that, but it would be worth checking whether there are easy options with the layout that may be inflating it.
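
A quick way to check the DOM weight from the browser console is `document.getElementsByTagName("*").length`. The same count over any node-like tree, as a sketch for checking templates outside the browser:

```javascript
// Sketch: count nodes in any tree with a `children` array -- the same
// number the console one-liner above reports for a live page.
function countNodes(node) {
  let n = 1;
  for (const child of node.children || []) {
    n += countNodes(child);
  }
  return n;
}

console.log(countNodes({ children: [{ children: [] }, { children: [] }] })); // 3
```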

Ah great. That's brilliant having it on the Dulles install.

Thanks so much for your help. It's really shining a light on things

I've been having a good dig and am a bit confused about this...using your first test as an example...

When I sort by CPU time within Pure Paths, I commonly see that last-child problem at 3400+ ms and a readystatechange event on <window> at 2500+ ms (both of which therefore seem to be problems)

What is throwing me is that their start times are roughly 7s and 11.5s. So I looked back at the waterfall to see if that is where we are getting the gaps/spread-out items and CPU maxing, but the problem seems to begin much earlier, i.e. around 3s

I looked to see what JS is going on at around 3s, and actually it says the start time for the HTML is 3.38s - which contradicts the waterfall showing it running from 0-1.5s

The first significantly slow JS seems to start at 6.3s with a CPU time of 765ms. Again, too late according to the waterfall

Should the timings from Dynatrace and the Waterfall synchronise?

How can I see which JS is causing problems at the point in time where we are getting the big gaps and CPU max in the waterfall?

Am I missing something? Maybe it isn't JS that is holding up the beginning of these waterfalls.

Probably worth adding that I'm more interested in the cached views at the moment, as the problem is impacting logged-in users most.

Sorry, we start dynatrace before we start up the browser so the times won't line up exactly. If you go to the Timeline view you will be able to see the javascript execution and the matching network waterfalls. If you double-click on the javascript (particularly any long-running ones) it will take you to the call tree.

(06-02-2012 01:48 AM)pmeenan Wrote: Sorry, we start dynatrace before we start up the browser so the times won't line up exactly. If you go to the Timeline view you will be able to see the javascript execution and the matching network waterfalls. If you double-click on the javascript (particularly any long-running ones) it will take you to the call tree.

Thanks. Off to dig!

It's tricksy stuff, this... Any insights that anyone is able to provide will be rewarded with lashings of positive karma (swappable for beer in all good local ale houses) and masses of respect and smilies