stop this pagespeed madness

Hey, listen, apologies if I’ve just wasted 579 milliseconds of your life getting you this far but, hopefully, it will have been half a second well spent. You won’t regret waiting a while for it and, with luck, you’ll want to spend a few more minutes digesting this controversial topic: Pagespeed or, to put it another way, how long it takes for a web page to render in a viewer’s browser.

As people who make websites, our job is to make sure everything we have control over – the processing speed, efficiency, consistency and robustness of our hosting platform, the images we use and the tools we use to make the website – is as good as it can be, within reason. Here’s a simple fact: if Pagespeed is your #1 goal, go make a single, plain HTML file in Notepad (other simple text editors are available) and call it index.html. Include only the bare-necessity headers & meta and write some minimal plain text. No CSS, no JS, no external fonts, no CDN and certainly no images and, with any half-decent hosting, you’ll achieve a sub-second page load. Job done. Sit back and congratulate yourself.
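For the avoidance of doubt, that whole file could look something like this – a complete, dependency-free page (title and text are placeholders, obviously):

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>My fast page</title>
</head>
<body>
  <p>Hello. Nothing here to wait for.</p>
</body>
</html>
```

No external requests at all – one round trip to the server and the browser has everything it needs.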

Apologies for this little graphic; I don’t need it to make a point but you, dear reader, expect something to break up the flow of text (it goes back to childhood and learning to read, by the way). Anyway, back to the point… It’s not that easy, because our clients and their potential clients expect more. They expect to be ‘entertained’ by pin-sharp graphics, twiddly bits and interesting, compelling text, and that’s where it all starts going downhill where page speed is concerned. At that point, everything thereafter becomes a compromise and/or a test of the size of the client’s wallet.

Why the iceberg?

I use an iceberg because it illustrates nicely what we, as makers of websites, can influence and what we have no real control over. That which sits above the waterline we can, and absolutely should, control and influence greatly – that’s our job alongside design & functionality. The portion under the water, however, we can’t, not really.

Let’s work it back from the viewer of our website. We have no clue what device they’re viewing on, we don’t know which browser and we certainly have no idea (and never will) about how they’ve set up their browser: what features they have turned on and what they’ve unwittingly disabled or changed. We don’t know what anti-malware or other resource-sucking apps they’re running on their device, or what other websites they may have open in other tabs or windows, all working away in the background draining their resources, which will impact on the rendering of our website before their eyes. As an example, in 2018 it was estimated that around 1% of users worldwide had disabled JavaScript – so you know that twinkly menu you took hours making? It ain’t going to work for them! What’s more, a raft of new privacy-focused browsers is hitting the streets, like the Tor Browser (built on Firefox), which give even more scope for hapless users to screw up their web-browsing experience. By the way, on average 10% of Tor users have JS disabled – I wonder if they know?

The thing is, we’re in a battle against stupidity and misinformation. I bet you could pick up any PC or Web magazine and find articles on how to block this and how to disable that on the web. I read just yesterday of an image-blocking Chrome extension. Why, for god’s sake, and how can we, as makers of websites, legislate against that? The answer is that we can’t. We can only ever hope to meet the needs of most of the people, most of the time.

So there is our viewer with his app-heavy, under-resourced and hacked browser trying to view the website we made and, against all the odds, it starts rendering on his or her screen – slowly. Possibly the worst scenario. But, wait. At the same time as our website is downloading its resources, he’s playing Call of Duty whilst watching the latest episode of GoT and, frankly, he’s only looking at our website because he’s bored and needs something for his fingers to do while GoT has an ad break and he’s waiting to respawn and, meanwhile in an adjacent room, the rest of the family are streaming a movie on Prime, the kids are, well, doing what kids do on their phones while wireless cameras continually monitor the perimeter of his home. In other words, we have no control over the quality, bandwidth and contention of his internet connection and what’s more, in most homes, WiFi signal quality is poor due to interference and mis-configuration.

Against all odds, he still continues to be able to view the website we lovingly made, albeit a bit more slowly than we’d have liked.

Anecdotally speaking

The words that strike fear into the ears of any self-respecting web developer: hearing from a client, “My mate said that when he visits our website our logo doesn’t appear”, to which your response is always “It looks fine from here” – but you need to investigate. I had this recently when a client called and said that a customer of hers was unable to buy a product. Now, this was quite a high-ticket item so I wanted to investigate. I called up the chap and, rather than bombard him with questions, I just asked where the problem was. He said that he could add the item to the basket, go to checkout and fill out all of his information but, when he clicked on “pay now”, nothing happened. This was a site using Stripe payments. Then I asked him to go to https://www.whatsmybrowser.org/ and read me the results over the phone, top down. Three lines down we got to “Javascript enabled” NO. “Ah,” said I, “That’s most likely the problem – why do you have JavaScript turned off?” To which he replied, “I think my son did it when he came round recently, because I was having problems with windows opening and closing randomly”. Says it all really, but the key message is: you cannot develop for and consider all eventualities, especially nowadays when we have literally millions of device/screen/processor combos, to which you then add random acts of self-inflicted cyber terrorism and legitimate ‘CSS modifiers’ to aid accessibility!
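One cheap defence against exactly this scenario, by the way, is a visible warning for the JS-disabled visitor. This is a generic sketch, not the actual markup from that checkout:

```html
<noscript>
  <p>This checkout needs JavaScript to take payment.
     Please enable it in your browser settings and reload the page.</p>
</noscript>
```

Anything inside `<noscript>` only renders when scripting is off, so our chap would at least have known why “pay now” did nothing.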

Output from whatsmybrowser.org

the internet

The Internet, that’s the problem! Seriously, just like our viewer’s computer, browser and a million other local factors, the journey between our viewer’s screen and our finely-tuned, max-performance server is paved with potholes, craters and booby traps that nobody can foresee on any given day, hour, minute or second. Many end users probably wrongly assume that when they visit a website a direct connection is opened up to the web server host, but the truth is that their browser is opening up many simultaneous ‘channels’ to various servers around the planet and, yes, the lion’s share is probably coming from the host, but there’ll be traffic from a CDN, font services, jQuery hosts etc., all coming together to compose the masterpiece that is your website, and all of those resources have to travel through a complex array of wires, fibre-optic cables, hubs, routers, internet switches, caching mechanisms, software filters, malware checkers and DNS servers twixt server and browser.

Page Speed Slow

From a practical point of view, some of these ‘potholes’ on the journey break and stop working. Rather than the internet grinding to a halt when, say, a $100 ethernet switch in Tibet suffers a hardware failure, routing protocols (BGP between networks, spanning-tree and similar algorithms within local ones) essentially, in the first instance, plot the most direct route between two places (nodes) and then, if something along the path breaks, work out the second most direct, and so on. It can be a tortuous route.
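To give a flavour of the path-finding going on, here’s a toy sketch of shortest-path routing over a made-up network. The node names and link costs are entirely invented for illustration; real internet routing uses protocols like BGP with far richer policies and metrics:

```javascript
// Toy network: link "costs" between made-up nodes (lower = faster).
const links = {
  London:   { Paris: 1, NewYork: 4 },
  Paris:    { London: 1, Tibet: 6, NewYork: 5 },
  NewYork:  { London: 4, Paris: 5, Auckland: 9 },
  Tibet:    { Paris: 6, Auckland: 3 },
  Auckland: { NewYork: 9, Tibet: 3 },
};

// Dijkstra's algorithm: cheapest total cost from `start` to every node.
function cheapestRoutes(start) {
  const dist = { [start]: 0 };
  const done = new Set();
  while (true) {
    // Pick the unvisited node with the smallest known distance so far.
    let node = null;
    for (const n of Object.keys(dist)) {
      if (!done.has(n) && (node === null || dist[n] < dist[node])) node = n;
    }
    if (node === null) break;
    done.add(node);
    // Relax its neighbours: keep the cheaper of old and new routes.
    for (const [next, cost] of Object.entries(links[node])) {
      const candidate = dist[node] + cost;
      if (dist[next] === undefined || candidate < dist[next]) dist[next] = candidate;
    }
  }
  return dist;
}

const routes = cheapestRoutes("London");
// London → Auckland goes via Paris and Tibet (1 + 6 + 3 = 10),
// beating the direct-ish hop through NewYork (4 + 9 = 13).
```

Delete the Paris–Tibet link and rerun: the algorithm simply settles on the next-best path, which is roughly what happens, at vastly greater scale, when that $100 switch dies.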

The point is that this is all magic in a ‘black-box’ and stuff we, as makers of websites, don’t need to concern ourselves with BUT the physical data path and software filters, largely beyond our control or influence, can have a significant impact on pagespeed, especially if you’ve become obsessed with sub-second performance.

Content delivery networks

In so many ways, much of the technology that stands between server and browser is designed to overcome some other problem. Arguably, if we all had super-fast Gigabit internet connections at home, in the office and on the move, then there would be no need for caching and CDNs. We have them because, initially, they compensated for slow dial-up connections but, now, it’s all about efficiency and the drive for near-instant end-user gratification. Let’s talk about CDNs, or Content Delivery Networks.

Let’s say that your server is located in New Zealand but the viewer of your website is in the UK. Clearly it’s inefficient to drag your images, CSS and JavaScript files across the world in order to display them; not only does that impact page load times, it also puts additional strain on the internet’s backbone communications network. So, what if we had a bunch of servers located at strategic points around the globe which store a copy of your ‘static files’ – those that rarely change, like your photos, CSS and JS files? Then, when someone connects to your website from the UK, all that needs to be transmitted across the world from your host is mostly text wrapped in HTML tags, with the static content served from a CDN node in, say, Germany. The impact of this approach on the load placed on the big fibre-optic trunks responsible for trans-continental delivery can be significant, and the end user may see a web page materialising slightly faster.

The problem is that not all CDNs are equal. At the end of the day, a CDN is just one or more computers strung together with some clever software and, just like our computers at home, they slow down if we ask too much of them. Also consider that, unless you’re spending megabucks on a professional-grade CDN, you’ll be sharing the CDN’s resources with many – possibly hundreds or thousands – of other connections at any one time, all in contention for those precious CPU cycles and bandwidth.

A CDN does not automatically mean faster page load times, though it can deliver them; it just depends on the level of resources available to your specific data request at any one moment, and that boils down to how much the CDN network owner is prepared to invest in their infrastructure. In theory, there’s nothing to stop me getting my old 486 computer out of the loft, loading it up with some Open Source software, putting it on the web and calling it a CDN to serve your images. Trust me, that would slow down your page load speed! A good example of a free-to-use CDN is that provided by WordPress through the JetPack plugin – it can be configured to serve your images from the WordPress CDN, but tests have revealed that using it can substantially slow down image delivery: the exact opposite of what you wanted to achieve by using it.

The key message is that using a CDN is not always a good thing and not always required. Many of my clients, for example, are UK-based and their target audience is in the UK; for them, a CDN is, in my opinion, an unnecessary potential pothole in the journey from server to browser. I have only two clients which serve an international market and, for these, I do use a CDN in the form of CloudFlare. You can think of a CDN as nothing more than a dynamic, geo-aware caching service.
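At its simplest, the ‘geo-aware’ part boils down to steering each viewer to the edge node closest to them. A hypothetical sketch – the regions, node names and latency figures below are invented purely for illustration; real CDNs use anycast routing and live measurements:

```javascript
// Hypothetical round-trip latency (ms) from each viewer region to each edge node.
const latencyMs = {
  UK:         { Germany: 18,  Virginia: 85,  Auckland: 270 },
  US:         { Germany: 95,  Virginia: 12,  Auckland: 140 },
  NewZealand: { Germany: 280, Virginia: 150, Auckland: 8 },
};

// Return the edge node with the lowest latency for a given viewer region.
function nearestEdge(region) {
  const edges = latencyMs[region];
  return Object.keys(edges).reduce((best, node) =>
    edges[node] < edges[best] ? node : best);
}

const ukEdge = nearestEdge("UK"); // a UK viewer gets served from Germany
```

Which is exactly the New Zealand-server, UK-viewer scenario above: the HTML still crosses the world, but the heavy static files come from nearby.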

Caching in

Another part of the server-browser journey is the cache. A cache is a place where stuff is stored to be used again and again without the need to continually re-fetch it from the server. Makes a lot of sense. If I make a website with the same image at the top of each page and you navigate between several pages, there’s no point in repeatedly re-fetching it – better to store a copy locally and re-use it over and over again. The trouble is that caches exist in a number of places on the data journey – they certainly exist in your browser and on your computer, they exist with your internet service provider, and they exist on CDN networks, ethernet switches and at various nodes around the internet – all designed to reduce traffic and the load on infrastructure.

You’ll all have seen this: you edit a web page which you then view in another browser and, guess what? Your changes aren’t showing. After questioning your sanity, you force a page refresh and, lo and behold, those changes appear as if by magic – that’s a cache at work. Now, how long a cache holds on to your images, CSS, JS etc. is almost anyone’s guess, and most browsers will allow you some control over that – I actually use a Chrome extension called “Cache Killer” to make sure I’m not viewing cached work. When testing page load speeds, caching won’t make much difference to the results but, when a real end user views your website, it will. Out of the box, WordPress doesn’t appear to force any caching rules, although you can use a plugin to force matters: if, for example, your website remains largely static for long periods, you can tell the browser cache to hang on to static resources for a long time – hours, weeks or even months. Best left alone, though, unless you have a really good reason to want to influence matters.
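Under the bonnet, the browser’s “do I keep this or re-fetch it?” decision largely comes down to the Cache-Control header the server sent with the resource. Here’s a simplified sketch of that freshness check – real browsers also honour Expires, ETag revalidation, Vary and more, so treat this as an illustration only:

```javascript
// Decide whether a cached copy is still "fresh" or must be re-fetched.
// cachedAt and now are timestamps in milliseconds; cacheControl is the
// raw header value the server sent alongside the resource.
function isFresh(cacheControl, cachedAt, now) {
  // no-store / no-cache: never serve straight from cache.
  if (/no-store|no-cache/.test(cacheControl)) return false;
  const match = /max-age=(\d+)/.exec(cacheControl);
  if (!match) return false;                 // no lifetime given: go back to the server
  const maxAgeMs = Number(match[1]) * 1000; // max-age is expressed in seconds
  return now - cachedAt < maxAgeMs;
}
```

So a header of `Cache-Control: public, max-age=2592000` is the “hang on to it for a month” case described above, and that’s exactly what the WordPress caching plugins are setting on your behalf.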

filtering the mess with DNS

Next step on the journey, we come to what are best described as ‘filters’ – mostly software that aims to give you a better browsing experience by filtering out bad, illegal and harmful content or applying ‘parental controls’. These filters can exist within your ISP’s infrastructure, within your broadband router/hub and within software apps on your phones, tablets and computers – sometimes in all of them. Filters are just that – they take in web content, look for anything which falls outside of their policies and either block that content or allow it through. To speed things up, they will also use blacklists and, sometimes, whitelists – predefined lists of bad & good web content – and these lists are often shared globally.

Like most other things on the internet, these ‘filters’ are simply computers or networks of computers, but all processing and filtering takes a finite amount of time, and that is an overhead and a pothole in the road between server and browser.

The world wide web absolutely relies upon something called DNS to make it work. In short, every device on the internet has an IP address, and DNS maps a real domain name like ancientgekkery.com to that sequence of numbers. DNS works like a telephone directory and saves us all having to remember sequences of numbers when we want to visit Amazon or Google. At some point, though, some bright spark decided that, while they’re matching domain names against numeric addresses, why not add some filtration? In that way, individual ISPs can exert some control over what you see or don’t see and try to keep us all safe. Most end users will be oblivious to their DNS settings and only stumble across them when they’re unable to visit a certain website and raise the issue with support or, as happened quite recently, when the WordPress CDN IP addresses found their way onto a blacklist, rapidly shared around the web, so that overnight images and other WordPress elements failed to show on millions of websites worldwide. The addresses were pretty quickly removed from the main blacklists, but it took over a week for some ISP DNS servers to update and reflect the changes. That’s the power of DNS – and a weakness, because all that filtering takes time, adding an overhead to the time it takes for a website to render on a user’s screen, and you, as the maker of websites, have no control over that.
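That week-long lag is mostly down to caching again: every resolver keeps its own copy of each DNS answer and serves it until the record’s TTL (time to live) runs out. A toy sketch of the mechanism – the name, addresses, TTL and timings are all invented for illustration:

```javascript
// Sketch of DNS propagation delay: a resolver caches each answer and keeps
// handing it out until that record's TTL expires. Times here are in seconds.
function makeResolver(ttl) {
  const cache = new Map(); // name -> { address, expiresAt }
  return {
    resolve(name, now, lookupUpstream) {
      const hit = cache.get(name);
      if (hit && now < hit.expiresAt) return hit.address; // cached answer still valid
      const address = lookupUpstream(name);               // ask the authoritative source
      cache.set(name, { address, expiresAt: now + ttl });
      return address;
    },
  };
}

// The authoritative record changes shortly after t=0, but a resolver that
// cached the old answer keeps serving it until the one-day TTL expires.
let authoritative = "203.0.113.1";
const resolver = makeResolver(86400); // TTL: one day
const first  = resolver.resolve("example.test", 0,   () => authoritative);
authoritative = "203.0.113.2";        // the record is updated upstream...
const second = resolver.resolve("example.test", 100, () => authoritative);
// ...but `second` is still the old address: the cached copy hasn't expired.
const later  = resolver.resolve("example.test", 86400, () => authoritative);
```

Multiply that by thousands of independently-run resolvers, some of which ignore TTLs and cache for far longer than they should, and you get the blacklist hangover described above.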

By the way, as someone who makes websites – which DNS are you using? That of your ISP? If so, I strongly recommend using one of the ‘public’ DNS servers – Google’s own (8.8.8.8/8.8.4.4), CloudFlare’s (1.1.1.1/1.0.0.1) or OpenDNS (208.67.222.222/208.67.220.220), although the latter tends to have a little more lag than the others, in my experience. All of these will update faster than your ISP’s DNS, and it’s easy to make the change either at your device or at your hub/router, or both. Details for most configurations here: https://developers.google.com/speed/public-dns/docs/using

WordPress and page load times

WordPress is a content management system (CMS). You may see it as a blogging platform, an online shop or just a website but, at its heart, it’s a CMS. CMS is just a grand title for techies to drop into conversations at parties (?) but it’s worth looking at what a CMS is and why it’s responsible for the majority of sloooooow page speeds.

Back in the early days of the World Wide Web and the birth of HTML there were no CMSs, but the concept existed. Publishing software already had the idea of “write once, use many”: we stored lumps of pre-formatted text, and maybe images, and gave each one a ‘tag’ – a sort of reference. Then, when we needed to use that image or text elsewhere in a document, we simply referenced the tag and, boom, the content would appear. That, in itself, was a major time saver. The concept appeared in the SGML standard (a forerunner of HTML used in document publishing), and Tim Berners-Lee and others carried it forward into HTML.

But where are these standard blocks of content stored? The answer was in a database. At the time we used a very expensive Oracle database, queried using SQL, and each time we called a tag into a document the content would be looked up in the database, fetched over a network and placed in our document. It was pure magic! Today the Open Source MySQL database is widely used in its place, sitting quietly behind all the common CMS packages on the market and, largely, doing a great job.
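The “write once, use many” idea is easy to sketch: content blocks live in one store (in a real CMS, a database) and documents just reference them by tag. The tag names and content here are invented, and a real CMS would be querying MySQL rather than an in-memory object, but the principle is the same:

```javascript
// A stand-in for the CMS database: tagged blocks of reusable content.
const blocks = {
  header: "<h1>Ancient Geekery</h1>",
  footer: "<p>Thanks for reading.</p>",
};

// Expand every {{tag}} in a document by looking the tag up in the store,
// much as a CMS queries its database each time it assembles a page.
// Unknown tags are left alone rather than silently dropped.
function render(template) {
  return template.replace(/\{\{(\w+)\}\}/g, (whole, tag) =>
    tag in blocks ? blocks[tag] : whole);
}

const page = render("{{header}}<p>Body text.</p>{{footer}}");
```

Change the block once and every page that references its tag picks up the new version – and, equally, every page view costs a database lookup, which is precisely where the CMS page-speed overhead comes from.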