Tag: decoupling software

Websites and web applications often have unpredictable traffic patterns. After all, growth in traffic is really what we’re after. However, even with the best stress testing, it’s sometimes difficult to predict which areas of the site will get inundated, or how the site will scale.

“Degrade gracefully” describes an architecture built specifically to scale back smoothly under load, without a site-wide outage. What do we mean by that? We mean building in operational switches to turn off components of the site. Have a star rating on pages? Build an on/off switch so your operations team can disable it if necessary. Have site-wide comments, or robust search? Allow those features to be disabled. If possible, architect in a read-only mode for your site that you can turn on in a really difficult situation. By operationalizing these components, you give the operations team more flexibility and reduce the likelihood of a complete outage.
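As a rough illustration of what an operational switch could look like, here is a minimal Python sketch. It assumes the switches live in a small JSON file that the operations team can edit; the class name `FeatureSwitches`, the file layout, and the feature names `star_ratings` and `comments` are all hypothetical, not taken from any particular framework.

```python
import json
from pathlib import Path


class FeatureSwitches:
    """Operational on/off switches backed by a JSON file (hypothetical layout)."""

    def __init__(self, path, defaults):
        self.path = Path(path)
        self.defaults = dict(defaults)

    def _load(self):
        # Re-read the file on every check so a flipped switch takes effect
        # without a deploy. A missing or malformed file means "no overrides".
        try:
            data = json.loads(self.path.read_text())
        except (OSError, ValueError):
            return {}
        return data if isinstance(data, dict) else {}

    def is_enabled(self, name):
        overrides = self._load()
        return bool(overrides.get(name, self.defaults.get(name, False)))

    def set(self, name, enabled):
        # Persist the override so every process reading the file sees it.
        overrides = self._load()
        overrides[name] = bool(enabled)
        self.path.write_text(json.dumps(overrides, indent=2))


def render_page(switches, article_html):
    """Build a page, skipping any optional component that is switched off."""
    parts = [article_html]
    if switches.is_enabled("star_ratings"):
        parts.append("<div class='stars'>...</div>")
    if switches.is_enabled("comments"):
        parts.append("<div class='comments'>...</div>")
    return "\n".join(parts)
```

Re-reading the file on each check is the simplest thing that works; on a busy site you would probably cache the result for a few seconds or keep the switches in a shared store instead.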

Your recent social media campaign has gone viral. It’s what you’ve been dreaming about and pinning your hopes on, and all of your hard work is now coming to fruition. Tens of thousands of internet users, hordes of them in fact, are now descending on your website. Only one problem: it went down!

That’s a situation you want to avoid. Luckily, there are some best practices for avoiding scenarios like the one I described. In engineering, the approach is called “degrading gracefully”: the site continues functioning, but with the heaviest features disabled.

Browsing Only, But Still Functioning

One way to do this is for your site to have a browsing-only mode. On the database side, you can keep functioning with a read-only database. With a switch like that, your site will continue to serve pages while pointed at any of your read-only replicas. What’s more, you can easily load balance across those replicas and keep your site up and running.
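Here is one way the browsing-only switch might be sketched in Python. This is an illustration under assumptions, not a real database driver: `ConnectionRouter`, `ReadOnlyModeError`, and `post_comment` are hypothetical names, and the "connections" are plain strings standing in for real connection objects.

```python
import itertools


class ReadOnlyModeError(Exception):
    """Raised when a write is attempted while the site is in browsing-only mode."""


class ConnectionRouter:
    """Routes reads across replicas and refuses writes in read-only mode."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self.read_only = False  # the operational switch
        self._next_replica = itertools.cycle(self.replicas)

    def for_read(self):
        # Simple round-robin load balancing across the read-only replicas;
        # fall back to the primary if no replicas are configured.
        if self.replicas:
            return next(self._next_replica)
        return self.primary

    def for_write(self):
        if self.read_only:
            raise ReadOnlyModeError("site is in browsing-only mode")
        return self.primary


def post_comment(router, text):
    """Example write path that degrades to a friendly message."""
    try:
        conn = router.for_write()
    except ReadOnlyModeError:
        return "Comments are temporarily disabled. Please try again later."
    return f"saved via {conn}: {text}"
```

With `router.read_only = True`, pages that only read keep working against the replicas, while anything that writes gets a polite message instead of an error page.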

Decoupling

In software development, decoupling involves breaking apart components or pieces of an application that should not depend on one another. One way to do this is to use a queuing system such as Amazon’s SQS to allow pieces of the application to queue up work to be done. This makes those pieces asynchronous, i.e., they return right away instead of waiting for the work to finish. Another way is to expose services internal to your site through web services. These individual components can then be scaled out as needed. This makes them more highly available, and reduces the need to scale your memcache, web servers or database servers – the hardest ones to scale.
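The queuing idea can be sketched with nothing but Python’s standard library. To be clear, this is a stand-in: an in-process `queue.Queue` plays the role that a hosted queue such as SQS would play between separate machines, and `handle_signup`, `worker`, and the `send_welcome_email` task name are hypothetical.

```python
import queue
import threading

work_queue = queue.Queue()  # stand-in for a hosted queue such as SQS
sent = []                   # records processed jobs in place of a real side effect


def handle_signup(email):
    """Web request handler: enqueue the slow work and return right away."""
    work_queue.put({"task": "send_welcome_email", "to": email})
    return "202 Accepted"


def worker():
    """Background worker: drains the queue independently of web requests."""
    while True:
        job = work_queue.get()
        try:
            if job is None:  # sentinel value: shut the worker down
                break
            sent.append(job["to"])  # a real worker would send the email here
        finally:
            work_queue.task_done()


# Start one worker; in production these would be separate processes or hosts
# that you can scale out without touching the web tier.
worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()

status = handle_signup("reader@example.com")
work_queue.join()  # only for the demo: wait until the job has been processed
```

The point of the design is that `handle_signup` never waits on the slow work. If the workers fall behind during a traffic spike, the queue simply grows while the site stays responsive.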

Identify Features You Can Disable

Typically, your application will have features that are superfluous, or that are not part of the core functionality. Perhaps you have star ratings, or some other components that are heavy. Work with the development and operations teams to identify the areas of the application that are heaviest, and that would warrant disabling when the site comes under heavy load.
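If you are not sure which features are the heavy ones, a little instrumentation can help start that conversation. The sketch below is one possible approach, assuming wall-clock time is a reasonable first proxy for cost; the decorator `timed_feature`, the helper `heaviest_features`, and the two example render functions are hypothetical.

```python
import time
from collections import defaultdict
from functools import wraps

timings = defaultdict(list)  # feature name -> list of durations in seconds


def timed_feature(name):
    """Decorator that records how long each call to a feature takes."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                timings[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


def heaviest_features(limit=3):
    """Return (name, average seconds) pairs, slowest first."""
    averages = {name: sum(d) / len(d) for name, d in timings.items() if d}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:limit]


@timed_feature("star_ratings")
def render_star_ratings():
    time.sleep(0.05)  # simulates an expensive aggregate query
    return "*****"


@timed_feature("related_posts")
def render_related_posts():
    return "..."  # simulates a cheap, cached lookup
```

After some real traffic, `heaviest_features()` gives you a ranked shortlist of candidates for an off switch. Database load and memory use matter too, so treat the ranking as a starting point rather than the final word.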

Once you’ve done all that, document how to disable and re-enable those features, so other team members will be able to flip the switches if necessary.