E-Commerce Tip: Load Testing: A Critical Step

The busy holiday shopping season is only a month away and the National Retail Federation forecasts 12 percent growth in online sales[1] over last year. Sounds like good news, doesn’t it? It is, if you can keep your virtual doors open and sales flowing around the clock. Recently, Rackspace offered up five tips to prep your e-commerce site for the holiday crunch[2]. We invited some of our e-commerce partners to provide a guest blog post looking at those tips and offering further advice.

Submodal[3] is a web design studio and a Rackspace e-commerce partner that works with companies of all sizes – from startups to established consumer brands. Here, Submodal Director of Engineering Jason Gordon discusses why load testing is a critical step for your e-commerce site.

We know how crucial load testing is: it’s the reason such a detailed approach has been developed to address this aspect of web application deployment. Here’s what you need to know about load testing and all the issues that come with it.

WHAT IS LOAD TESTING?

When someone uses the term “load testing,” they are usually talking about a larger set of tasks involved in delivering a responsive web application.

For us, load testing is the final step in the testing process, during which all performance tests are run simultaneously over a period of time to see how the system will hold up for a couple of days. It’s similar to how automotive engineers run an engine by itself for a few months straight to test its safety and capabilities.

“Load” is the number of concurrent connections to a system. Since we specialize in Magento, load is discussed here in the context of a Magento e-commerce application. We also typically run another CMS in tandem with Magento, and together they determine what types of users a system will have: non-authenticated shoppers, authenticated shoppers and content administrators. A performance plan should include tests designed for each of these three user types, because each creates a different load on the system.

LEAD UP TO TESTING

Before the performance-testing phase, it’s important to follow a few development best practices. First, minimize HTTP connections as much as possible by:

Merging and compressing JavaScript files. Ideally, this process is part of an automated build; but on small projects it is often performed manually by combining the JS files into one large file that is then run through Yahoo’s compressor[4], or a similar tool.

Merging and compressing CSS files, using the same technique as with JavaScript.

Using image sprites as much as possible to ensure all images load quickly and at the same time.

Serving static assets from a CDN; we generally recommend a cloud solution like Rackspace Cloud Files[5]. A system that supports origin pull eliminates the need for software that manages assets on the CDN (which is a difficult task in itself).
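The merge step can be sketched in a few lines of Python (a stand-in for a real automated build; in practice the resulting bundle would then be run through a minifier such as Yahoo's compressor):

```python
from pathlib import Path

def merge_assets(sources, target):
    """Concatenate JS (or CSS) files into a single bundle so a page
    makes one HTTP request for them instead of one per file.
    A real build would then minify the bundle; this sketch only
    performs the merge."""
    merged = "\n".join(Path(src).read_text() for src in sources)
    Path(target).write_text(merged)
    return target
```

The same function covers both the JavaScript and the CSS case, since the technique is identical.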

The next step is to determine which parts of each page don’t change and are cacheable. In our applications, most parts fall into this category. For the sections that don’t fit this bill, we often employ Ajax to fill in the “holes,” or Varnish’s ESI (“edge side includes”) technique.
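As a sketch of the ESI approach (syntax varies by Varnish version: vcl_backend_response in Varnish 4+, vcl_fetch in 3.x; the /esi/cart fragment URL is a hypothetical example):

```
# Varnish VCL: tell Varnish to parse ESI tags in backend responses
# (Varnish 4+; in Varnish 3 the subroutine is vcl_fetch)
sub vcl_backend_response {
    set beresp.do_esi = true;
}
```

```
<!-- In the cached page template, the per-user "hole": -->
<div class="minicart">
    <esi:include src="/esi/cart" />
</div>
```

Varnish caches the surrounding page once, and fetches only the small fragment per user.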

Implementing a very lightweight code review process lets us identify obvious performance issues in new code. Running code tests with a tool such as JMeter during development makes performance and load testing much easier later on, and identifies poorly performing code earlier in the process.

WHAT’S ACTUALLY BEING TESTED?

The process essentially tests from left to right; as a request comes into the system, it gets jiggled around, and then comes back to the browser. The goal is to determine how many of these types of requests a system can handle. Here’s a simplified view of the request flow:

The load balancer decides which app server to hit

The app server checks the static cache (Varnish). We use nginx here because each request consumes very few resources, allowing many requests to be offloaded from Apache

On a miss, the request is routed to the second web server (Apache)

The request is then routed to either Magento or the CMS, where the software stack boots up

The software will make requests to a database cache (memcached)

On a miss, the database will be queried (mostly reads)

After all cache hits/misses and database activity, the response is passed back up the chain and through the load balancer

The response is HTML containing many references to static assets, which will be served by the CDN
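The chain above can be modeled as a toy lookup, with dicts standing in for Varnish, memcached and the database (the function and key names are illustrative, and real Varnish of course only caches cacheable pages):

```python
def handle_request(url, varnish, memcached, database):
    """Toy model of the request chain: static cache -> app -> DB cache -> DB."""
    if url in varnish:            # Varnish hit: Apache/Magento never run
        return varnish[url]
    # Miss: the request falls through to Apache, which boots Magento or the CMS.
    key = "page:" + url
    if key in memcached:          # database cache (memcached) hit
        body = memcached[key]
    else:
        body = database[url]      # query the database (mostly reads)
        memcached[key] = body
    varnish[url] = body           # response is cached on the way back out
    return body
```

The goal of load testing is to find how many of these lookups per second the real stack sustains, and at which layer requests start piling up.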

Here’s what is being tested:

How many concurrent requests can a system handle at maximum load?

What are response times for all test paths, and are they acceptable?

What points in the chain are consuming the most hardware resources?

Are there any obvious failures caused by large data sets, many users, many products, many orders and other factors?

The first step is to define the set of requests that makes up the basis of how users will interact with the site, such as:

The front page (most important page)

Any “hole punching” URLs

Shopping cart page, full and empty

Checkout page

User account pages

Random types of URLs that generate 404s. It is crucial to minimize how many 404s the software stack has to deal with.

We use Apache Bench (ab)[6] to quickly benchmark each request in the test list. This is when the reverse proxy cache is tuned to ensure it is getting as many hits as possible. ab will also give a basic sense of how many requests per second the system can handle for each URL.
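If ab is not at hand, the core of that measurement can be sketched in Python. A throwaway local server stands in for the real site here so the example is self-contained; N and CONCURRENCY correspond roughly to ab's -n and -c flags:

```python
import http.server
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Throwaway local server standing in for the site under test.
class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"<html>front page</html>"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/"

def fetch(_):
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        resp.read()
        return resp.status, time.perf_counter() - start

N, CONCURRENCY = 200, 10  # roughly "ab -n 200 -c 10"
t0 = time.perf_counter()
with ThreadPoolExecutor(CONCURRENCY) as pool:
    results = list(pool.map(fetch, range(N)))
elapsed = time.perf_counter() - t0

print(f"{N / elapsed:.0f} requests/sec, "
      f"slowest {max(r[1] for r in results) * 1000:.1f} ms")
server.shutdown()
```

ab reports far more detail (percentile latency tables, transfer rates), which is why it remains the quicker tool for per-URL benchmarking.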

Next, we create scripts to generate test data and simulate user interaction. Although we create some stand-alone scripts for data generation, the main goal is to make a large suite of scripts for JMeter[7]. A script will be made for each test path, cookies and all. JMeter scripts are easy to make into templates, and can be copied and pasted to create new ones.

A list of necessary scripts:

Randomly create products

Randomly create orders (fill up a cart, then checkout)

Randomly create customer accounts (can combine this with creating an order script)

Make changes in the admin interface (creating categories, configuration changes).
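The first of those scripts can be sketched as follows. The CSV columns are hypothetical – a real Magento product import carries many more attributes – but they are enough to drive a load-test fixture:

```python
import csv
import random
import string

def random_products(n, path="products.csv"):
    """Write n throwaway products as CSV rows for import into the
    test catalog. Columns here are a minimal, made-up subset."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["sku", "name", "price", "qty"])
        for i in range(n):
            sku = "TEST-" + "".join(random.choices(string.ascii_uppercase, k=6))
            writer.writerow([
                sku,
                f"Load Test Product {i}",
                round(random.uniform(5, 500), 2),
                random.randint(0, 1000),
            ])
    return path
```

Running it with a large n is what fills the catalog, so that bottlenecks which only appear with a realistic amount of data surface during load testing.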

The idea is to run these scripts both individually and simultaneously to find performance limits. This will fill up a database with lots of data so that during load testing you can identify bottlenecks that occur only when a good amount of data is present.

Then comes the testing system setup. Laptops quickly run out of capacity for testing, so load testing tools usually get set up on cloud machines. During load testing, several machines will run for several days.

TESTS AND ANALYSIS

Now it’s time to actually run tests. Because running tests can be very time consuming, it’s important to ensure we get the most information possible out of each test run and avoid wastefully repeating the same test. To do so, we create a spreadsheet with the following information:

Name of the person running the test

The date, time and duration of the run

A clearly defined hypothesis

A statement of what has changed

A set of metrics being tracked during the test

A post-run capture of the results of the metrics, best visualized with a graph of the metric over time.
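The shape of such a record can be sketched as a CSV row appended per run; every value below is a made-up example, the point is the fields:

```python
import csv
from datetime import datetime

# Hypothetical example of one test-run record.
run = {
    "tester": "jgordon",
    "started": datetime(2011, 11, 1, 9, 0).isoformat(),
    "duration_min": 120,
    "hypothesis": "Doubling memcached memory will halve database reads",
    "change": "memcached -m 1024 -> -m 2048",
    "metrics": "db reads/sec; p95 response time",
    "results": "db reads/sec fell 42%; p95 went from 235 ms to 180 ms",
}

with open("test_runs.csv", "a", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(run))
    if f.tell() == 0:  # first run: write the header row
        writer.writeheader()
    writer.writerow(run)
```

Whether this lives in a spreadsheet or a CSV in the repository matters less than filling it in for every single run.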

Using a set of monitoring tools allows us to obtain accurate data about actual test performance in simple visual graphs: Cacti[8] for capturing metrics, MONyog[9] for monitoring the database and statsd[10] to put stats logging into the code so we can monitor code performance in real time. By saving this information in spreadsheets, we achieve provable results and can identify the most effective tests.
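statsd's wire protocol is simple enough to sketch by hand: plain-text metrics fired over UDP, so instrumentation adds almost no overhead to the hot path (the metric names below are illustrative):

```python
import socket
import time
from contextlib import contextmanager

# statsd listens for plain-text metrics over UDP (default port 8125).
STATSD_ADDR = ("127.0.0.1", 8125)
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def timing(metric, ms):
    # Wire format is "<name>:<value>|<type>"; "ms" marks a timer.
    sock.sendto(f"{metric}:{ms}|ms".encode(), STATSD_ADDR)

@contextmanager
def timed(metric):
    """Wrap a slow code path so its duration is reported to statsd."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timing(metric, int((time.perf_counter() - start) * 1000))

# Example: instrument a hypothetical hot path in the application.
with timed("magento.cart.load"):
    time.sleep(0.01)  # stand-in for the real work
```

Because UDP is fire-and-forget, the instrumented code is unaffected even when no statsd daemon is listening.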

Once we churn through the performance tests and apply all possible optimizations, we start running longer tests. A nice side effect of all this setup is that it puts in place a monitoring system that can be used throughout the life of the site, which is very helpful during new software releases.