Video

Talk transcript

Today I’m here to talk about a different approach to working with web apps, which was a big mindset shift for me, so I’m very excited to share it with you!

But before I get into the technical bits, I’d like to talk a bit about water, as different ways of accessing it have a lot of interesting parallels to web apps!

As you probably know, water is a very precious resource. A clean water supply is the single most important determinant of public health.

The plaque on this photo is in the memory of William Pranket in the village of Stoke Sub Hamdon, where his well was the only source of water until the early 20th century.

Improvements in delivering water to people are relatively recent, and far from consistent around the World.

A good example is Hong Kong, where until 1964 water rationing was a constant reality, occurring more than 300 days per year.

In the worst crisis, water was delivered only every 4 days for 4 hours each time.

In this photo you can see how bad the situation became, people had to queue in the sun for hours to get some water from the public well.

So these shared wells suffer from a few different problems:

People need to potentially go very far from their house, and queue while the water is drawn for the people in front of them

It’s impossible to completely isolate the well and the water source from contamination (as you can see in this illustration), which leads to doubtful safety

Relying on a single groundwater reserve or spring, each well is exposed to droughts, making water temporarily unavailable at times

While current municipal water systems of course aren’t perfect either, the service is much better than public wells.

They make access to water quick, by bringing it close to the point of use through an extensive network.

Water coming out of the tap is safe – it’s difficult to compromise the sealed, underground pipes.

Supply is reliable – the available water sources are interconnected and widely distributed, getting rid of inconsistencies.

Unfortunately a lot of time facts alone are not quite enough to enact change…

This is the map of the infections in the Broad Street cholera outbreak in Soho, London, as drawn by a physician John Snow in his study in 1854.

All the incidents were centred around the Broad Street pump, a public water access point.

Snow’s hypothesis was that contaminated water, not air, was the source of illness and he designed the World’s first double blind experiment to verify his theory.

But since people have always been people, the scientific evidence left the Members of Parliament unmoved and no substantive action was taken.

That is, until 4 years after the study, when the historic heatwave of 1858 came.

Temperatures soaring well above 40°C made the level of the river Thames drop dramatically, exposing more than 2 metres of raw sewage, which quickly started to ferment and caused an unbearable stink.

As the recently rebuilt Houses of Parliament were right next to the river, the stench suddenly created a very strong political will for action where the science could not, so the drafting and passage of new laws to fund a new sewer scheme – on a scale the World hasn’t seen before – happened in just 18 days.

So how do these stories relate to the Web?

As you might have guessed, wells represent the current centralised app server paradigm.

But what do I mean by “centralised app server”? I used LAMP in the title of the talk, but what I’m going to talk about is true for all stacks, where there is a constantly running application daemon on a server somewhere, responsible for dynamically responding with some web content, be it written in PHP, Ruby, Java, Go, Javascript, .NET, Python or anything else.

I call them “app servers” to create a distinction with web servers (like Apache or Nginx), ones which you need no matter what, to listen and send HTTP responses even if you just have the simplest HTML file to show.

So the characteristics of this architecture is:

Executing some application code for most requests – at the very least rendering templates

Content often stored in a database – where constructing a page might mean complicated queries

Orchestrating multiple server instances for scaling can be tricky, both from an application code and infrastructure perspective

So what’s wrong with this approach?

It’s just a bit too easy to have problems with performance, security, reliability and resource use.

Let’s have a look at these one by one!

If a web application needs to execute code and do database queries, no matter how well it is optimised, the response will take some time to prepare.

Caching can help with the issue, but it’s tricky to get right as I’ll explain in a moment.

Also, unless you have a massive hosting budget and some dedicated dev ops time to set up instances in different regions, and tie them together with load balancers, the further people are from your server, the bigger their network delay be – if you ever used the web in Australia, it’s not fun.

Google AMP and Facebook Instant Articles wouldn’t be around if this wasn’t a huge problem, but is creating walled gardens really a great solution?

So caching is a sensible first step for a speedier response, before getting on the journey with instance scaling.

But for a web stack which consists of multiple layers to produce the final content, caching might end up being implemented in various places.

Unfortunately the complexity of multi layer cache can quickly spiral out of control.

Invalidating cache is one of the hardest problems in software development, so doing it in multiple places and between multiple disciplines becomes truly mind bending.

So what started as a simple web site or web app will start to require some serious infrastructure and tooling work.

It feels like there isn’t a week passing nowadays without a serious security vulnerability exposed in one back-end framework or another, and possibly even more undisclosed ones silently exploited.

And it’s not because they are poorly written, but the more they try to do for us, the bigger the potential attack surface is, especially in the hands of less security conscious users and developers, which let’s be honest, is most of us.

While fixing a security problem might mean a few sleepless nights for developers, a mass data breach can easily mean a collapsed business due to evaporating trust.

Despite this, keeping on top of security updates is usually very low on most web team’s todo list.

It seems there is no perfect world with web sites: it’s a problem if no one bothers to visit them, or if they suddenly see a lot of visitors and they stop responding.

What’s the worst is that in most cases they can’t even show read only content, so even though the server side app should really only be needed for user interaction, the content itself also becomes inaccessible - Reddit is the perfect example for this.

According to a study by Jonathan Koomey, a Stanford University professor:

Data centres worldwide are estimated to have ten million unused servers, which could be hiding a potential cost of 30 billion dollars

So around 30% of the world’s physical servers are sitting in a “comatose” or “zombie” state, as no information or compute services have been processed for at least six months

Even more surprisingly, a similar percentage of virtual machines are also zombies, despite them being more expensive and easier to manage

If shut down, the servers could reduce electricity load by at least 4GWs, not even mentioning the cost of manufacturing and housing them

These numbers are just the tip of the iceberg (as servers which are idle most of the time but do some work every now and then are excluded), so the potential cost and environmental savings are huge

Now after having talked about the problem, let me walk you through the promise of this fabled new way of doing things.

The piped water in my little story at the beginning represented a new approach, which can be described as “distributed asset delivery with serverless computing”.

So there are two important parts in this – admittedly quite long – name:

Firstly, the distributed asset delivery means finding a way to deliver a web app’s assets (be that HTML, CSS, Javascript, images, fonts, videos, and so on) from a web server as close to visitors as possible

Secondly, to handle any sort of user interaction which needs data manipulation, communication, payments or any action really, in a way which avoids running your own, persistent app servers

So what are the characteristics of this approach?

Content and assets are preprocessed as much as possible before deployment

Any dynamic functionality and data is delivered through APIs – accessible both at compile time and in the browser client

Static assets are served up from a distributed host – typically a CDN

So how does something built with this architecture would look like from a visitor’s point, or in other words the front-end side?

If you have a look at how progressive web apps work, they pretty much embody the idea:

You start out with an application shell built from static assets, very similar to native mobile apps, just without the extra friction of discovery and explicit download

The content inside can be pre-rendered for the first load, but a Single Page Application takes over navigation and data loading on the client side, once the Javascript app bootstrapped

With browsers implementing Service Workers, this means complete offline support using the existing app shell and already downloaded data

So once running, these apps only need to connect to APIs, which can be served up completely independently of where the application assets are originally from

So this provides us with a potential for a big mindset shift, but first of all let me give you a quick introduction to CDNs. The acronym stands for Content Delivery Networks, and they are mostly known for delivering videos and images as a way to take some of the heavy load off web servers.

But there’s no reason for us to stop there, why not use CDNs as the first point of contact right from the DNS response?

Interestingly this would take us one step closer to Tim Berners-Lee’s original vision of a distributed web, where even catastrophic events can’t take the entire network down – in stark contrast with the AWS S3 outage this February, which took several massive services down, despite only happening in just one of their data centres in the US. And the biggest irony was that Amazon engineers themselves were unable to update their status dashboard as it relied on the same host.

That said, today’s CDNs are still centrally controlled networks of servers, but the largest ones have hundreds of independent endpoints (or in industry jargon: Points of Presence) around the World – including places which are not the US and Europe, I can assure you they do exist!

What makes using CDNs easier nowadays is that they are more and more scriptable through APIs, so for example easier to flush content, even just partially.

My current favourite host is Netlify, they are pretty much the Heroku of smart CDN based hosting and also do some great open source work in this space.

But the future can be completely distributed: IPFS – which stands for InterPlanetary File System – mixed up ideas and technologies from BitTorrent and the blockchain amongst others.

But since CDNs can only work with static content, for any interaction requiring computing we still need to connect to an API.

These APIs of course do need a server to run on, but it doesn’t need to be a constantly running one provisioned and managed by you, eating money even when not in use or getting overloaded so you need to scramble to scale.

Serverless functions (or Function as a Service) can be a great solution for a lot of backend needs, especially when they are connected to a managed database to persist state and data.

But Sports Direct, Deliveroo, Uber or any other firms employing zero hours contractors are waking up to the reality that they can’t wiggle out of all their responsibilities…

Despite all the hype, these serverless functions of course still won’t write, deploy and maintain themselves…

So before writing a single line of code, you can be even better off considering servicefull APIs, or in other words hosted services offered as APIs.

While the idea has been around since Web 2.0 was a thing, nowadays you can find great services even for core functions like:

Auth0 for authentication and authorisation

Firebase for storing user data

Contentful for creating content on an editor friendly interface, and so on

So unless you’re dealing with a service already on a massive scale, the developer time and worry you can save will more than pay for the cost of these “humble sweatshops”.

If you think these metaphors were a bit contentious, you’re totally right. But rather than having the intention of making fun of the people who are suffering, I’d actually like to use this platform to dedicate a few sentences to them.

I think we – the tech community as a whole – aren’t doing a great job at giving enough respect to people around us. Which is in stark contrast with who much we obsess about computers and gadgets.

So if you take away only one thing from this talk, let that be taking a moment every now and then to think about the impact of your work and your words on the people around you and in the wider World, especially if they are not like you!

Right, after this little digression, let’s see how these solutions help with the issues I talked about earlier.

I split each of these between the two parts, so the distributed asset delivery via CDNs and serverless computing.

So performance with CDNs is a joy to see really, since they are only dealing with files and each visitor will get them from the nearest endpoint, load times are pretty much instant.

At the same time, computing on serverless infrastructure automatically scales to whatever the demand is (except for the most extreme cases).

While in your own server no one stops you from shooting yourself in the foot, in a Function as a Service you have to work within limits, so for example on AWS Lambda any execution has to finish within 300 seconds, which might sound like a lot, but knowing the constraints will make you think about splitting up, measuring and optimising functions earlier.

Just a bunch of files on a CDN is practically impossible to hack, but even better, there’s not much in it to do, so it’s unlikely many people will even try.

The virtual machines and containers running your serverless code are transient by nature, so they don’t stick around long enough to give people time to hack into.

Also, once they are spun down, security updates can be applied right away without any disruption, so next time they run, you’ll get a freshly updated container image.

Due to their distributed nature, CDNs are very resilient to denial of service attacks, but even if a well resourced adversary manages to take one down, migrating files to another one is not difficult.

Serverless infrastructure mostly operates on containers, which are very quick to spin up, so unless you go completely nuts with dependencies, even cold starts will be at an acceptable speed.

Again, since CDNs serve only files, there’s not much computing they need to do other than routing. But also, by using and endpoint close to visitors, less network hops will be used, saving on using resources on each switch on the way.

And with serverless, idle time is minimised by cleaning up containers after a very short time, while at the same time you also don’t need to over-provision due to the ability to scale on demand.

These translate to big savings both in terms of monetary and environmental resources.

So how can you get started, you must be thinking “surely it’s a lot of work to do all these new things”?

Well, it used to be, and only big and tech heavy organisations like the Guardian, Google or Alibaba could afford to develop custom solutions for them in house. But hey, it’s 2017 and I work for a studio where we often don’t have more that 6 weeks to turn projects around from business problems to a few versions deployed, so I wouldn’t be here if there weren’t tools available to do most of the hard work for me!

Static website generators have been around for years now, and people have great times with them producing websites based around content. But these are static web sites, lacking the modern browser based powers of web apps which people more and more expect now.

So it was only a matter of time until a new generation emerged, which I call “progressive web app pipelines” and will explain them in a moment.

But first let’s have a look at what static website generators do, just so it’s clear what I’m talking about. The poster child for these is Jekyll, which is the Ruby based generator behind Github Pages. It’s been around for ages and is remarkably good at producing simple and fast websites, taking it way beyond the original role of generating documentation sites for open source projects.

The way it works is taking Markdown content files with YAML metadata and putting them in various Liquid templates. This process does a great job at producing HTML and CSS, but Javascript is just an afterthought, stuck in the jQuery era of DOM selectors.

So what do these new progressive web app pipelines do beyond that?

This diagram is from Gatsby, an open source tool which just emerged from a full rewrite with the final 1.0 version released last Thursday
So what makes it a big deal?

First, it takes more than files as input, there are already plugins out there which can dump data from APIs like Contentful’s, Twitter’s or Instagram’s

Second, it aggregates all the content and assets and creates a GraphQL API to work with them, so you can have custom queries co-located with front-end components, including generating image sizes appropriate for various contexts

Third, it generates a fully formed PWA together with pre-generated JSON data files, ready to be deployed on a static host

Some key ingredients in this are:

React, with its mature support for server side (or in this case compile time) generation and great ecosystem of tools like the Redux state manager

And Webpack with its similarly huge ecosystem, as an advanced build tool providing the pipeline and plugins for processing all assets

But I don’t want to look like I’m blindly worshipping Gatsby, there’s another similar open source tool called Phenomic which is also well worth checking out, especially if you don’t want to use React for any reason, as they are working on supporting multiple Javascript single page application choices.

What makes all of these tools tick though is a big, fat asset compiler, which does the hard work at build time. What this means is some high level orchestration code, which makes – in these cases – Webpack and its plugins work smoothly together and lets you focus on application code.

A few years back I used to dread setting toolchains up, as it meant either a lot of hand coding and head scratching or trying to understand and strip code out of boilerplates. But in both cases usually ending up with compilation times slowing down a few weeks into development.

Luckily nowadays that’s mostly history with incremental builds, live browsers refresh, hot module replacement and state replays, without having to spend a lot of time manually piecing these together.

For dynamic content and data on the input side of these pipelines there are headless CMSs and hosted database APIs. What this means in practice is that website content writers can work on a nice visual editing interface and the content output can be accessed only via an API, making it a headless CMS.

This in contrast with the more monolithic CMSs responsible for rendering content in their on templates and locking you into their ways of doing things. Similarly, to store user generated data you can use something like Firebase and use the public content (like comments, uploads, and so on) as input in your static site. This way any time there’s a change either in your code or the data sources, you can trigger a rebuild using a web hook.

As the final part of my talk I would like to share a few cases studies with you.

A lot of you probably already read articles on Smashing Magazine, but in case not, it’s a big, international publishing company, mostly writing about front-end development.

A year or so ago they decided to replace their overstretched Wordpress back-end with a static website generator. They used Jekyll at first, then switched to Metalsmith to be able to use Handlebars templates which can also be rendered on the front-end.

They have a fair amount of content, so using static compilation out of the box meant that the linearly increasing compilation time spiralled to 2 hours for each change which of course wasn’t very useful.

Moving on to using incremental builds (mostly by making sure to only process images if they actually changed by comparing timestamps) and splitting content per language, they managed to go down to 2-5 minutes builds depending on the number of images added.

In developing countries like India, where devices are increasingly catching up in processing power and having modern browsers, but storage space is still precious and networks are a bit patchy, the impact of progressive web apps with offline capabilities and are much more important.

So Twitter set out to write a lightweight web client as an alternative to their downloadable apps.

Since Twitter is far from being static content and the feed is tailored to each user, they had to put in place a small Node service for initial server side rendering.

But the browser application itself is making full use of PWA features and CDN based asset delivery, so they managed to offer a great experience while cutting costs by an order of magnitude.

I honestly see no reason why they couldn’t use a lot of this on the desktop site too, so curious to see when will they make the jump,

And finally as a full stack example, there’s a service called A Cloud Guru.

They offer AWS courses, quizzes, certifications, forums, user profiles, payments and so on, just like any other similar business.

But what started as an experiment for them became a principle: they don’t use a single server of their own.

They have a lot of interesting articles on their Medium blog about the journey, but what was the most interesting part for me is that despite being AWS specialists, they have very little of their own services running in Lambda. Instead, working with a more servicefull ethos by using existing services, they found that they could focus on their own application and at the same time became part of multiple communities, spreading the money and love around instead of relying on just one big cloud provider.

As you could see there are a lot of different ways to put this approach into use and methods you can deploy independently, so the one big thing I’d like to leave with you is: When designing your service architecture, ask yourself the question if you really need to have a server running somewhere.

And of course there are many perfectly valid reasons to have them, on the current project I’m on we’re actually developing an API and some transactional pages with Ruby on Rails, as the service is for a children’s hospital with plans to roll out throughout the NHS, we didn’t want to freak them out with something they’d struggle to find maintainers for or would have concerns around data we can’t answer – that said, the hospital employee interface is going to be a PWA which uses Rails as an API only!

But as a father of two, I can assure you that while bringing your own babies into existence is for sure an exhilarating thing (especially in the conception stage), but the responsibility of looking after them is going to stick, so you have to be careful with how many can you comfortably handle.

While git and Github brought tremendous improvement to how people can manage changes and collaborate on their software and simple textual content, most people still work with various binary formats and could benefit greatly from a similar boost in the way they work.

This effort would not only make Github much more useful by making binary file comparison possible, but also contribute hugely to efforts like the semantic web by compiling a library of the best open source parsers (similar to Linguist for code) which can be used by anyone to analyse binary files.

Of course crunching through massive files would be very costly, but depending on complexity and usefulness each format could be processed to a various degree.

1. Gather metadata

As a first step only metadata would be extracted from the file. This would give a high level idea of what changed, making it possible for people to do quick sanity checks whether the changes look right.

This might also mean improving the diff UI to accommodate for these richer comparisons which could also benefit code, but I’ll leave that for another post. Suffice to say that pull requests should be much more suited to (near) real-time collaboration, and not just on code. And that using semantic understanding to go beyond a simple text diff should be used for visualisation (and the tools are actually already available for code via the syntax definitions in Linguist used only for code colouring).

It’s difficult to foresee what kind of possibilities these would bring, but I think they could have to potential to bring open, global collaboration to a lot of new fields like design, engineering, music, etc.

Also it feels the time is getting right for this, with technologies like containers which enable using (often compiled) tools for processing these binary files with much less hassle when it comes to setting environments up with all dependencies installed. If you thought Docker is only for deploying web services, think again.

The problem of staying mobile with two (or more) children

My second little one was one month old just the other day, so we started to talk about some ideas with my wife Ivana how will she be able to stay mobile when I go back to work. She has a hybrid bike now which was converted to electric to help her get around during this last pregnancy, but with the added weight of the motor and battery she won’t be able to also have two children plus all the stuff in the basket on the bike.

The “normal” reaction at this point for most people would be to go and buy a family car, but unless you live in a remote countryside location it only makes sense to drive if you’re disabled or if you’re a conservative politician.

Not for us though, especially after having lived for a while in Amsterdam, where people just get a cargo bike (bakfiets) instead. There are tons of brands to choose from (Bakfiets, Christiania Bikes, De Fietsfabriek, Nihola, Winther, Urban Arrow, WorkCycles, Dolly Bike, Cangoo, Yuba, etc etc and these are just the ones with family friendly models) if you live in an advanced cycling country like The Netherlands or Denmark, but here in the United Kingdom the options are a bit more limited and prices much more inflated – usually the same in pounds as they are in euros.

The bikes come in many different sizes and shapes, but since Ivana’s main concern with her current bike was stability, we instantly ruled the two wheeled versions out. The more solid box tricycles are a bit heavier though and living in Stamford Hill now there’s a bit of a hill to climb to go anywhere, so we also decided that having a trike with electric assist is a must.

An e-trike it is then, but which one?

With these in mind I set out to do some research which took a good few hours so I thought I should share it in case anyone else finds themselves in the same situation.

Unfortunately a lot of these family cargo bikes cost thousands of pounds even without the electric motor, so our shortlist of the “best” is what you can get between £2000-£3000 once you include all the essentials like rain cover tent and so.

There are some cheaper Chinese models, but since we don’t have a garage to protect the trike from the elements (or want to spend loads on maintenance) these wouldn’t be the best invesment on the long term.

So without further ado, here are the top 4 (the titles link to the official web sites). I won’t list detailed specs, but instead write up my impression and what I found out through some research and forum dig.

The best ones in no particular order

With a starting price of £1899 this is the cheapest of the lot. Babboe frames are designed in The Netherlands and made in China, the electric parts come from Taiwan, the wood is European, the Shimano gears are Japanese, the Magura brakes are German and finally the cargo bikes are assembled in The Netherlands. Some of the earlier models were getting some bad reputation a few years ago, but it seems that the constructions has improved a lot and Babboe now offers an impressive 5 years warranty on the frame of the bikes. I also had the chance to see and ride a 4 years old model which was in fairly good shape given the age and the fact that it was used daily to haul heavy tools.

Quality concerns set aside, actually riding the bike didn’t feel that great. It was a bit difficult to keep it on course in (the admittedly quite bumpy) Bethnal Green Gardens park.

So all in all it didn’t seem to be an awful choice, but make sure to give it a ride before getting one.

After a brief stint selling the brand Velo Electrique which were super affordable “upcycled” Chinese bikes (like the Velo Cargo E250 starting at £1455), British designer Jeremy Davies set up a locally made line of bikes under the moniker Boxer cycles. I actually had a good email exchange with him and the passion for quality shined through, which combined with his previous experience designing offshore power systems sounds really convincing.

The Shuttle itself looks really cool with the custom vinyl designs and the specs are also nice, like for example all around hydraulic disc brakes. It also comes with a rain tent which I think is an essential, after which it goes at £2880.

Unfortunately there isn’t a model readily available to be tried in London as of now (the nearest shop with one on hand is Kids and Family Cycles in Dorset), though Jeremy was very accomodating by offering to go out of his way meeting up while doing a delivery near London. If you are any of the London bike shop people reading this, please get at least one in stock for trials!

Being £300 more than the Big-E, the Curve-E starts at £2199. Looking only at the spec sheets the differences between the two models are minor. Nicer brakes, beefier motor, fancier finish, that’s about it.

Sitting on the bike and riding around a bit the difference starts to shine through, even though the actual model we tried was a standard one without electric motor, it handled better and felt nicer to ride than the Big-E. Despite my initial skepticism towards Babboe, riding on this new model made me a convert so we ordered one right after trying it.

I haven’t done too much research on this particular bike, but looking at the Dutch electric bike specialist Juizz it seemed to be the next cheapest option up from the Babboes and Christiania one of the “classic” cargo bike brands made in Denmark, so definitely one to try and look into.

So how do I try and decide?

We were really lucky to have found an open day nearby in Bethnal Green. The two chaps from BikeWorks were really friendly and helpful, answering a lot of our questions. So the fact that BikeWorks is actually

a really cool social enterprise supporting and teaching people from disadvantaged backgrounds,

the place where we got Ivana’s current bike which was a great buy and still going strong,

they throw in a free rain tent (worth £120) if you order your Babboe at the trial event,

made me get the bike through them right then and there.

But anyway, it’s very difficult to decide purely from specs and reviews whether for example you prefer indirect or direct steering (Babboes are former, Shuttle and Christiania the latter), so make absolutely sure that you try at least a couple of different brands and models.

As well as BikeWorks, another good place to go and check bikes out at is London Green Cycles, which is a shop next to Regent’s Park with a big selection. Slightly annoyingly though they don’t have prices for any of the electric models on their website, if you’re the owner reading this please do put them up there.

If you want to do some further research yourself, the German website Nutzrad has a massive catalogue of different bikes which is a good starting point.

Or if you’re in Hackney / East London, drop me a message and come around to try ours in a couple of weeks :)