I'm a web developer in Norfolk. This is my blog...

In the last couple of weeks I’ve been working on a PHP web app. Nothing unusual there, except this was the first time we’d used PHP 7 in production. We discussed the possibility a while back, and eventually decided that for certain projects we’d use PHP 7 without waiting another year or so (or maybe longer) for a version of Debian stable with it by default. I wanted to talk about how our experience has been using it in production.

Background

Until recently, we’d never really had a fixed stack at work - it was largely based on personal preferences and experience. For many jobs, especially content-based sites, we generally used WordPress - it has its issues, but it does fine for a lot of work. For more complex websites, I tended to use CodeIgniter because I’d learned it during my previous job and knew it fairly well, but I was not terribly happy with it - it’s a bit too basic and simplistic, as well as being somewhat behind the times, and I only really kept using it through inertia. For mobile app backends, I tended to use Django, partly for the admin interface, and partly because Django REST Framework makes it quick and easy to build a REST API in a way that wasn’t viable with CodeIgniter.

This state of affairs couldn’t really continue. I love Python and Django, but I was the only one at work who had ever used Python, so if I got hit by a bus there would have been no-one who could take over from me. As for CodeIgniter, it was clearly falling further and further behind the curve, and I was sick of it and looking to replace it. Ideally we needed a PHP framework, as both my colleague and I knew PHP.

I’d also been playing around with Laravel on a few little projects, but I didn’t get the chance to use it for a new web app until autumn last year. Around the same time, we hired a third developer, who also had some experience using Laravel. In addition, the presence of Lumen meant that we could use that for apps or services that were too small to justify Laravel. We therefore decided to adopt Laravel as our default framework - in future we’d only use something else if there was a particular justification for it. I was rather sad to have to abandon Django for work, but pleased to have something more modern than CodeIgniter for PHP projects.

This also enabled us to standardize our new server builds. Over the last year or so I’ve been pushing to automate what we can of our server setup using Ansible. We now have two standard stacks that we plan to use for future projects. One is for WordPress sites and consists of:

Debian stable

Apache

MySQL

PHP 5.6

Memcached

Varnish

The other is for Laravel or Lumen web apps or APIs and consists of:

Debian stable

Nginx

PHP 7

PostgreSQL

Redis

It took some time to decide what we wanted to settle on, and indeed we had a mobile app backend that went up around Christmas time that we wrote with Laravel, but deployed to Apache with PHP 5.6 because when we first pushed it up PHP 7 wasn’t out yet. However, given that Laravel 5 already had good support for PHP 7, we decided we’d consider it for the next app. I tend to use PostgreSQL rather than MySQL these days because it has a lot of nifty features like JSON fields and full text search, and using an ORM minimises the learning curve in switching. Redis is much more versatile than Memcached. Both were therefore vital parts of our stack.

Our first PHP 7 app

As it happened, we had a Laravel app in the pipeline that was ideal. In the summer of last year, we were hired to make an existing site responsive. It turned out not to be viable - the site was built with Zend Framework, which none of us had ever touched before, and the front end used a lot of custom widgets and fields tied together with RequireJS. The whole thing was rather unwieldy and extremely difficult to maintain and develop. Eventually, we decided to tell the client it wasn’t worth developing further and offer to rewrite the whole thing from scratch using Laravel and AngularJS, with Browserify used to handle JavaScript modules. The basic idea was quite simple, it was just the implementation that was overly complex, and AngularJS made it possible to do the same kind of thing with a fraction of the code, so a rewrite in only a few weeks was perfectly viable.

I’d already built a simple prototype to demonstrate the viability of a from-scratch rewrite using Laravel and Angular, and once the client had agreed to the rewrite, we were able to work on this further. As the web app was going to be particularly useful on mobile devices, I wanted to ensure that the performance was as good as I could possibly make it. By the time we were looking at deploying it to a server, three months had passed since PHP 7 had been first released, and I figured that was long enough for the most serious issues to be resolved, and we could definitely do with the very significant speed boost we’d get from using PHP 7 for this app.

I use Jenkins to run my unit tests, and so I decided to try installing PHP 7 on the Jenkins server and using that to run the tests. The results were encouraging - nothing broke as a result of the switch. We therefore decided that when we deployed it, we’d try it with PHP 7, and if it failed, we’d switch to PHP 5.6.

I opted to use FPM with Nginx rather than Apache and mod_php. Since the web app was purely custom we didn’t really need things like .htaccess, and while the amount of static content was limited, Nginx might well perform better for this use case. The results are fairly encouraging - the document for the home page is typically being returned in under 40ms, with the uncached homepage taking around 1.5s in total to load, despite having to load several external fonts. In its current state, the web app scores a solid 93% on YSlow, which I’m very happy with. I don’t know how much of that is down to using PHP 7, but choosing to use it was definitely a good call. I have had absolutely zero issues with it so far.

Summary

As always, you should bear in mind that your needs may not be the same as mine, and it could well be that you need something that PHP 7 doesn’t yet provide. However, I have had a very good experience with PHP 7 in production. I may have had to jump through a few more hoops to get it up and running, and there may be some level of risk associated with using PHP 7 when it’s only been available for three months, but it’s more than justified by the speed we get from our web app. Using a configuration management system like Ansible means that even if you do have to jump through some extra hoops, it’s relatively easy to automate that process so it’s not as much of an issue as you might think. For me, using PHP 7 with a Laravel app has worked as well as I could have possibly hoped.

It’s quite common to have to integrate an external API into your web app for some of your functionality. However, it’s a really bad idea to have requests be sent to the remote API when running your tests. At best, it means your tests may fail due to unexpected circumstances, such as a network outage. At worst, you could wind up making requests to paid services that will cost you money, or sending push notifications to clients. It’s therefore a good idea to mock these requests in some way, but it can be fiddly.

In this post I’ll show you several ways you can mock an external API so as to prevent requests being sent when running your test suite. I’m sure there are many others, but these have worked for me recently.

Mocking the client library

Nowadays many third-party services realise that providing developers with client libraries in a variety of languages is a good idea, so it’s quite common to find a library for interfacing with a third-party service. Under these circumstances, the library itself is usually already thoroughly tested, so there’s no point in you writing additional tests for that functionality. Instead, you can just mock the client library so that the request is never sent, and if you need a response, then you can specify one that will remain constant.

I recently had to integrate Stripe with a mobile app backend, and I used their client library. I needed to ensure that I got the right result back. In this case I only needed to use the Token object’s create() method. I therefore created a new MockToken class that inherited from Token, and overrode its create() method so that it only accepted one card number and returned a hard-coded response for it:

from stripe.resource import Token, convert_to_stripe_object
from stripe.error import CardError

class MockToken(Token):

    @classmethod
    def create(cls, api_key=None, idempotency_key=None,
               stripe_account=None, **params):
        if params['card']['number'] != '4242424242424242':
            raise CardError('Invalid card number', None, 402)
        response = {
            "card": {
                "address_city": None,
                "address_country": None,
                "address_line1": None,
                "address_line1_check": None,
                "address_line2": None,
                "address_state": None,
                "address_zip": None,
                "address_zip_check": None,
                "brand": "Visa",
                "country": "US",
                "cvc_check": "unchecked",
                "dynamic_last4": None,
                "exp_month": 12,
                "exp_year": 2017,
                "fingerprint": "49gS1c4YhLaGEQbj",
                "funding": "credit",
                "id": "card_17XXdZGzvyST06Z022EiG1zt",
                "last4": "4242",
                "metadata": {},
                "name": None,
                "object": "card",
                "tokenization_method": None
            },
            "client_ip": "192.168.1.1",
            "created": 1453817861,
            "id": "tok_42XXdZGzvyST06Z0LA6h5gJp",
            "livemode": False,
            "object": "token",
            "type": "card",
            "used": False
        }
        return convert_to_stripe_object(response, api_key, stripe_account)

Much of this was lifted straight from the source code for the library. I then wrote a test for the payment endpoint and patched the Token class:

This replaced stripe.Token with MockToken so that in this test, the response from the client library was always going to be the expected one.

If the response doesn’t matter and all you need to do is be sure that the right method would have been called, this is easier. You can just mock the method in question using MagicMock and assert that it has been called afterwards, as in this example:

Mocking lower-level requests

Sometimes, no client library is available, or it’s not worth using one as you only have to make one or two requests. Under these circumstances, there are ways to mock the actual request to the external API. If you’re using the requests module, then there’s a responses module that’s ideal for mocking the API request.

Note the use of the @responses.activate decorator. We use responses.add() to set up each URL we want to be able to mock, and pass through details of the response we want to return. We then make the request, and check that it was made as expected.

Summary

I’m pretty certain that there are other ways you can mock an external API in Python, but these ones have worked for me recently. If you use another method, please feel free to share it in the comments.

Udemy have very kindly provided some vouchers for free access to their course, “Build Web Apps with ReactJS and Flux” for me to give away to subscribers. To redeem them, follow the link above and use the voucher code MatthewDalysBlog.

There are only 50 in total, and they’re available on a first-come, first-served basis, so I suggest you redeem them sooner rather than later.

In the last year or so, React.js has taken the world of web development by storm. A major reason for this is that it makes it possible to build isomorphic web applications - web apps where the same code can run on the client and the server. Using React.js, you can create a template that will be executed on the server when the page first loads, and then the same template can be used to re-render the content when it’s updated, whether that’s via AJAX, WebSockets or another method entirely.

What is React.js?

A lot of people get rather confused over this issue. It’s not correct to compare React.js with frameworks like Angular.js or Backbone.js. It’s often described as being just the V in MVC - it represents only the view layer. If you’re familiar with Backbone.js, I think it’s reasonable to compare it to Backbone’s views, albeit with its own templating syntax. It does not provide the following functionality like Angular and Backbone do:

Support for models

Any kind of helpers for AJAX requests

Routing

If you want any of this functionality, you need to look elsewhere. There are other libraries around that offer this kind of functionality, so if you want to use React as part of some kind of MVC structure, you can do so - they’re just not a part of the library itself.

React.js uses a so-called “virtual DOM” - rather than re-rendering the view from scratch when the state changes, it instead retains a virtual representation of the DOM in memory, updates that, then figures out what changes are required to update the existing DOM and applies them. This means it only needs to change what actually changes, making it faster than other client-side templating systems. Combined with the ability to render on the server side, React allows you to build high-performance apps that combine the initial speed and SEO advantages of conventional web apps with the responsiveness of single-page web apps.
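To give a feel for the idea, here’s a deliberately tiny sketch of diffing two in-memory representations and producing only the changes needed. This is not React’s actual algorithm - the diff function and the patch format are made up for illustration, and it only handles a flat list of text items:

```javascript
// Compare an old and a new "virtual" list of text items and return the
// operations needed to turn the old one into the new one.
// Illustrative only: a hypothetical helper, not part of React.
function diff(oldNodes, newNodes) {
    var patches = [];
    var length = Math.max(oldNodes.length, newNodes.length);
    for (var i = 0; i < length; i++) {
        if (i >= oldNodes.length) {
            // The new list is longer: add the extra item
            patches.push({ type: 'append', index: i, text: newNodes[i] });
        } else if (i >= newNodes.length) {
            // The new list is shorter: drop the leftover item
            patches.push({ type: 'remove', index: i });
        } else if (oldNodes[i] !== newNodes[i]) {
            // Same position, different content: replace it
            patches.push({ type: 'replace', index: i, text: newNodes[i] });
        }
        // Identical items produce no patch, so the real DOM is left alone
    }
    return patches;
}

var patches = diff(['a', 'b', 'c'], ['a', 'x', 'c', 'd']);
// patches now holds one 'replace' for index 1 and one 'append' for index 3
```

The unchanged items never appear in the list of patches, which is the point: only the parts of the page that differ need to be touched.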

To create components with React, it’s common to use an XML-like syntax called JSX. It’s not mandatory, but I highly recommend you do so as it’s much more intuitive than creating elements with Javascript.

Getting started

You’ll need a Twitter account, and you’ll need to create a new Twitter app and obtain the security credentials to let you access the Twitter Streaming API. You’ll also need to have Node.js installed (ideally using nvm) - at this time, however, you can’t use Node 4.0 because of issues with Redis. You will also need to install Redis and hiredis - if you’ve worked through my previous Redis tutorials you’ll have these already.

We’ll be using Gulp.js as our build system, and Bower to install some client-side packages, so they need to be installed globally:

$ npm install -g gulp bower

We’ll also be using Compass to help with our stylesheets:

$ sudo gem install compass

With that all done, let’s start work on our app. First, run the following command to create your package.json:

$ npm init

I’m assuming you’re well-acquainted enough with Node.js to know what this does, and can answer the questions without difficulty. I won’t cover writing tests in this tutorial, but set your test command to gulp test and you should be fine.

Planning our app

Now, it’s worth taking a few minutes to plan the architecture of our app. We want to have the app listen to the Twitter Streaming API and filter for messages with any arbitrary string in them - in this case we’ll be searching for “javascript”, but you can set it to anything you like. That means that part of the app needs to be listening all the time, not just when someone is using the app. Also, it doesn’t fit neatly into the usual request-response cycle - if several people visit the site at once, we could end up with multiple connections to fetch the same data, which is really not efficient, and could cause problems with duplicate tweets showing up.

Instead, we’ll have a separate worker.js file which runs constantly. This will listen for any matching messages on Twitter. When one appears, rather than returning it itself, it will publish it to a Redis channel, as well as persisting it. Then, the web app, which will be the index.js file, will be subscribed to the same channel, and will receive the tweet and push it to all current users using Socket.io.

This is a good example of a message queue, and it’s a common pattern. It allows you to create dedicated sections of your app for different tasks, and means that they will generally be more robust. In this case, if the worker goes down, users will still be able to see some tweets, and if the server goes down, the tweets will still be persisted to Redis. In theory, this would also allow you to scale your app more easily by allowing movement of different tasks to different servers, and several app servers could interface with a single worker process. The only downside I can think of is that on a platform like Heroku you’d need to have a separate dyno for the worker process - however, with Heroku’s pricing model changing recently, since this needs to be listening all the time it won’t be suitable for the free tier anyway.

First let’s create our gulpfile.js:

var gulp = require('gulp');
var jshint = require('gulp-jshint');
var source = require('vinyl-source-stream');
var buffer = require('vinyl-buffer');
var browserify = require('browserify');
var reactify = require('reactify');
var mocha = require('gulp-mocha');
var istanbul = require('gulp-istanbul');
var coveralls = require('gulp-coveralls');
var compass = require('gulp-compass');
var uglify = require('gulp-uglify');

var paths = {
    scripts: ['components/*.jsx'],
    styles: ['src/sass/*.scss']
};

gulp.task('lint', function () {
    return gulp.src([
        'index.js',
        'components/*.js'
    ])
    .pipe(jshint())
    .pipe(jshint.reporter('jshint-stylish'));
});

gulp.task('compass', function() {
    gulp.src('src/sass/*.scss')
        .pipe(compass({
            css: 'static/css',
            sass: 'src/sass'
        }))
        .pipe(gulp.dest('static/css'));
});

gulp.task('test', function () {
    gulp.src('index.js')
        .pipe(istanbul())
        .pipe(istanbul.hookRequire())
        .on('finish', function () {
            gulp.src('test/test.js', {read: false})
                .pipe(mocha({ reporter: 'spec' }))
                .pipe(istanbul.writeReports({
                    reporters: [
                        'lcovonly',
                        'cobertura',
                        'html'
                    ]
                }))
                .pipe(istanbul.enforceThresholds({ thresholds: { global: 90 } }))
                .once('error', function () {
                    // Exit with a non-zero status so a failed run is reported as a failure
                    process.exit(1);
                })
                .once('end', function () {
                    process.exit(0);
                });
        });
});

gulp.task('coveralls', function () {
    gulp.src('coverage/lcov.info')
        .pipe(coveralls());
});

gulp.task('react', function () {
    return browserify({ entries: ['components/index.jsx'], debug: true })
        .transform(reactify)
        .bundle()
        .pipe(source('bundle.js'))
        .pipe(buffer())
        .pipe(uglify())
        .pipe(gulp.dest('static/jsx/'));
});

gulp.task('default', function () {
    gulp.watch(paths.scripts, ['react']);
    gulp.watch(paths.styles, ['compass']);
});

I’ve added tasks for the tests and JSHint if you choose to implement them, but the only ones I’ve actually used are the compass and react tasks. The compass task compiles our Sass files into CSS, while the react task uses Browserify to take our React components and various modules installed using NPM and build them for use in the browser, as well as minifying them. Note that React and lodash are installed with NPM, so thanks to Browserify we’re going to be able to use them both in the browser and on the server.

Most of this file should be fairly straightforward. We set up our connection to Twitter (you’ll need to set the various environment variables listed here using the appropriate method for your operating system), and a connection to Redis.

We then stream the Twitter statuses that match our filter. When we receive a tweet, we log it to the console (feel free to comment this out in production if desired), publish it to a Redis channel called tweets, and push it to the end of a Redis list called stream:tweets. When an error occurs, we output it to the console.
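As a rough sketch of that last step, here’s roughly what the tweet handler boils down to. It assumes a client object exposing publish() and rpush(), as the Node Redis client does; the FakeRedis stand-in and the handleTweet name are hypothetical, used here so the snippet runs without a Redis server:

```javascript
// Hypothetical in-memory stand-in for a Redis client, for illustration only
function FakeRedis() {
    this.published = [];
    this.lists = {};
}

FakeRedis.prototype.publish = function (channel, message) {
    this.published.push({ channel: channel, message: message });
};

FakeRedis.prototype.rpush = function (key, value) {
    this.lists[key] = this.lists[key] || [];
    this.lists[key].push(value);
};

// Called once for every tweet that matches the filter
function handleTweet(client, tweet) {
    var payload = JSON.stringify(tweet);

    // Tell anything subscribed to the channel about the new tweet
    client.publish('tweets', payload);

    // Keep a copy at the end of the list so later page loads can read it
    client.rpush('stream:tweets', payload);
}

var client = new FakeRedis();
handleTweet(client, { text: 'I love javascript' });
```

Because the same serialised tweet goes to both the channel and the list, the web app can treat live and stored tweets identically.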

Let’s use Bootstrap to style the app. Create the following .bowerrc file:

This includes some dependencies from Compass, as well as Bootstrap. We won’t be using any of the Javascript features of Bootstrap, so we don’t need to worry too much about that.

Next, we need to create our view files. As React will be used to render the main part of the page, these will be very basic, with just the header, footer, and a section where the content can be rendered. First, create views/index.hbs:

{{> header }}
<div class="container">
    <div class="row">
        <div class="col-md-12">
            <div id='view'>{{{ markup }}}</div>
        </div>
    </div>
</div>
<script id="initial-state" type="application/json">{{{state}}}</script>
{{> footer }}

As promised, this is a very basic layout. Note the markup variable, which is where the markup generated by React will be inserted when rendered on the server, and the state variable, which will contain the JSON representation of the data used to generate that markup. By passing that data through, you can ensure that the instance of React on the client has access to the same raw data as was passed through to the view on the server side, so that when the data needs to be re-rendered, the client can do so correctly.
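One thing worth bearing in mind is that the state is written into a script tag unescaped, so a tweet containing the text </script> could end the tag early. A minimal sketch of one way to guard against that, assuming you control how the state string is built (serializeState and readState are hypothetical helper names, not part of any library):

```javascript
// Serialise data for embedding inside a <script type="application/json"> tag.
// Replacing "<" with its JSON unicode escape means the output can never
// contain "</script>", while still being valid JSON.
function serializeState(data) {
    return JSON.stringify(data).replace(/</g, '\\u003c');
}

// Reading it back needs nothing special: JSON.parse decodes the escape
function readState(text) {
    return JSON.parse(text);
}

var embedded = serializeState([{ text: 'Nice </script> trick' }]);
var restored = readState(embedded);
```

The restored data is identical to the original, so the client-side code doesn’t need to know the escaping happened.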

We’ll also define partials for the header and footer. The header should be in views/partials/header.hbs:

Here we’re using Babel, which is a library that allows you to use new features in Javascript even if the interpreter doesn’t support them. It also includes support for JSX, allowing us to require JSX files in the same way we would require Javascript files.

// Get dependencies
var express = require('express');
var app = express();
var compression = require('compression');
var port = process.env.PORT || 5000;
var base_url = process.env.BASE_URL || 'http://localhost:5000';
var hbs = require('hbs');
var morgan = require('morgan');
var React = require('react');
var Tweets = React.createFactory(require('./components/tweets.jsx'));

Here we include our dependencies. Most of this will be familiar if you’ve used Express before, but we also use React to create a factory for a React component called Tweets.

// Set up connection to Redis
var redis, subscribe;
if (process.env.REDIS_URL) {
    redis = require('redis').createClient(process.env.REDIS_URL);
    subscribe = require('redis').createClient(process.env.REDIS_URL);
} else {
    redis = require('redis').createClient();
    subscribe = require('redis').createClient();
}

// Set up templating
app.set('views', __dirname + '/views');
app.set('view engine', "hbs");
app.engine('hbs', require('hbs').__express);

// Register partials
hbs.registerPartials(__dirname + '/views/partials');

// Set up logging
app.use(morgan('combined'));

// Compress responses
app.use(compression());

// Set URL
app.set('base_url', base_url);

// Serve static files
app.use(express.static(__dirname + '/static'));

This section sets up the various dependencies of our app. We set up two connections to Redis - one for handling subscriptions, the other for reading from Redis in order to populate the view.

We also set up our views, logging, compression of the HTTP response, a base URL, and serving static files.

Our app only has a single view. When the root is loaded, we first of all fetch all of the tweets stored in the stream:tweets list. We then convert them into an array of objects.

Next, we render the Tweets component to a string, passing through our list of tweets, and store the resulting markup. We then pass through this markup and the string representation of the list of tweets to the template.
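The data handling in that view can be sketched on its own, without Express, Redis or React. In this sketch, rawTweets mimics what reading a Redis list gives you (an array of JSON strings), and render is a hypothetical stand-in for rendering the Tweets component to a string; buildViewContext and renderPlain are names I’ve made up for illustration:

```javascript
// Turn the raw list items into objects, then produce the two values
// the template needs: the rendered markup and the serialised state.
function buildViewContext(rawTweets, render) {
    var tweets = rawTweets.map(function (item) {
        return JSON.parse(item);
    });
    return {
        markup: render(tweets),
        state: JSON.stringify(tweets)
    };
}

// Hypothetical stand-in for the real component rendering
function renderPlain(tweets) {
    return '<ul>' + tweets.map(function (tweet) {
        return '<li>' + tweet.text + '</li>';
    }).join('') + '</ul>';
}

var context = buildViewContext(
    ['{"text":"first"}', '{"text":"second"}'],
    renderPlain
);
// context.markup and context.state are what get passed to the template
```

Both values are derived from the same array, which is what guarantees the client starts from exactly the data the server rendered.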

// Listen
var io = require('socket.io')({}).listen(app.listen(port));
console.log("Listening on port " + port);

// Handle connections
io.sockets.on('connection', function (socket) {
    // Subscribe to the Redis channel
    subscribe.subscribe('tweets');

    // Handle receiving messages
    var callback = function (channel, data) {
        socket.emit('message', data);
    };
    subscribe.on('message', callback);

    // Handle disconnect
    socket.on('disconnect', function () {
        subscribe.removeListener('message', callback);
    });
});

Finally, we set up Socket.io. On a connection, we subscribe to the Redis channel tweets. When we receive a tweet from Redis, we emit that tweet so that it can be rendered on the client side. We also handle disconnections by removing the message listener we registered for that socket.

Creating our React components

Now it’s time to create our first React component. We’ll create a folder called components to hold all of our component files. Our first file is components/index.jsx:

First of all, we include React and the same Tweets component we require on the server side (note that we need to specify the .jsx extension). Then we fetch the initial state from the script tag we created earlier. Finally we render the Tweets component, passing through the initial state, and specify that it should be inserted into the element with an id of view. Note that we store the initial state in data - inside the component, this can be accessed as this.props.data.

This particular component is only ever used on the client side - when we render on the server side, we don’t need any of this functionality since we insert the markup into the view element anyway, and we don’t need to specify the initial data in the same way.

Next, we define the Tweets component in components/tweets.jsx:

var React = require('react');
var io = require('socket.io-client');
var TweetList = require('./tweetlist.jsx');
var _ = require('lodash');

var Tweets = React.createClass({
    componentDidMount: function () {
        // Get reference to this item
        var that = this;

        // Set up the connection
        var socket = io.connect(window.location.href);

        // Handle incoming messages
        socket.on('message', function (data) {
            // Insert the message
            var tweets = that.props.data;
            tweets.push(JSON.parse(data));
            tweets = _.sortBy(tweets, function (item) {
                return item.created_at;
            }).reverse();
            that.setProps({data: tweets});
        });
    },
    getInitialState: function () {
        return {data: this.props.data};
    },
    render: function () {
        return (
            <div>
                <h1>Tweets</h1>
                <TweetList data={this.props.data} />
            </div>
        )
    }
});

module.exports = Tweets;

Let’s work our way through each section in turn:

var React = require('react');
var io = require('socket.io-client');
var TweetList = require('./tweetlist.jsx');
var _ = require('lodash');

Here we include React and the Socket.io client, as well as Lodash and our TweetList component. With React.js, it’s recommended that you break up each individual part of your interface into a single component - here Tweets is a wrapper for the tweets that includes a heading. TweetList will be a list of tweets, and TweetItem will be an individual tweet.

var Tweets = React.createClass({
    componentDidMount: function () {
        // Get reference to this item
        var that = this;

        // Set up the connection
        var socket = io.connect(window.location.href);

        // Handle incoming messages
        socket.on('message', function (data) {
            // Insert the message
            var tweets = that.props.data;
            tweets.push(JSON.parse(data));
            tweets = _.sortBy(tweets, function (item) {
                return item.created_at;
            }).reverse();
            that.setProps({data: tweets});
        });
    },

Note the use of the componentDidMount method - this fires when a component has been rendered on the client side for the first time. You can therefore use it to set up events. Here, we’re setting up a callback so that when a new tweet is received, we get the existing tweets (stored in this.props.data, although we copy this to that so it works inside the callback), push the tweet to this list, sort it by the time created, and set this.props.data to the new value. This will result in the tweets being re-rendered.
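One caveat with the sort: created_at arrives as a string, so sorting on it directly compares text rather than dates. Here’s a minimal sketch of an alternative that compares parsed timestamps and returns a new array instead of modifying the existing one. The insertTweet helper is hypothetical, and the example uses ISO 8601 timestamps because those are the only format Date.parse is guaranteed to handle - other formats depend on the Javascript engine:

```javascript
// Add a newly received tweet (a JSON string) to a list of tweets and
// return a new list sorted newest first. The original array is untouched.
function insertTweet(tweets, rawMessage) {
    var tweet = JSON.parse(rawMessage);
    return tweets.concat([tweet]).sort(function (a, b) {
        // Compare as timestamps, not as strings
        return Date.parse(b.created_at) - Date.parse(a.created_at);
    });
}

var existing = [
    { text: 'older', created_at: '2015-09-13T10:00:00Z' }
];
var updated = insertTweet(
    existing,
    JSON.stringify({ text: 'newer', created_at: '2015-09-13T10:05:00Z' })
);
// updated has the newer tweet first; existing still has one item
```

Returning a fresh array also avoids changing the data the component was originally given.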

    getInitialState: function () {
        return {data: this.props.data};
    },

Here we set the initial state of the component - it sets the value of this.state to the object passed through. In this case, we pass through an object with the attribute data defined as the value of this.props.data, meaning that this.state.data is the same as this.props.data.

    render: function () {
        return (
            <div>
                <h1>Tweets</h1>
                <TweetList data={this.props.data} />
            </div>
        )
    }
});

module.exports = Tweets;

Here we define our render function. This can be thought of as our template. Note that we include TweetList inside our template and pass through the data. Afterwards, we export Tweets so it can be used elsewhere.

Next, let’s create components/tweetlist.jsx:

var React = require('react');
var TweetItem = require('./tweetitem.jsx');

var TweetList = React.createClass({
    render: function () {
        var that = this;
        var tweetNodes = this.props.data.map(function (item, index) {
            return (
                <TweetItem key={index} text={item.text}></TweetItem>
            );
        });
        return (
            <ul className="tweets list-group">
                {tweetNodes}
            </ul>
        )
    }
});

module.exports = TweetList;

This component is much simpler - it only has a render method. First, we get our individual tweets and for each one define a TweetItem component. Then we create an unordered list and insert the tweet items into it. We then export it as TweetList.

Our final component is the TweetItem component. Create the following file at components/tweetitem.jsx:

var React = require('react');

var TweetItem = React.createClass({
    render: function () {
        return (
            <li className="list-group-item">{this.props.text}</li>
        );
    }
});

module.exports = TweetItem;

This component is quite simple. It’s just a single list item with the text set to the value of the tweet’s text attribute.

That should be all of our components done. Time to compile our Sass and run Browserify:

$ gulp compass

$ gulp react

Now, if you make sure you have set the appropriate environment variables, and then run node worker.js in one terminal, and node index.js in another, and visit http://localhost:5000/, you should see your Twitter stream in all its glory! You can also try it with Javascript disabled, or in a text-mode browser such as Lynx, to demonstrate that it still renders the page without having to do anything on the client side - you’re only missing the constant updates.

Wrapping up

I hope this gives you some idea of how you can easily use React.js on both the client and server side to make web apps that are fast and search-engine friendly while also being easy to update dynamically. You can find the source code on GitHub.

Hopefully I’ll be able to publish some later tutorials that build on this to show you how to build more substantial web apps with React.

As I mentioned in an earlier post, I recently had the occasion to use Varnish to improve the performance of a website that otherwise would have been unreliable and unusably slow due to WordPress making an excessive number of queries. The difference it made was nothing short of staggering, and I’m not exaggerating when I say it saved the day. I now use Ansible for provisioning new WordPress sites, and Varnish is now a standard part of my WordPress site setup playbook.

However, Varnish can be quite fiddly to configure, and it was something of a baptism of fire for me to learn how to configure it appropriately for this use case. I did make a few mistakes that caused problems down the line, so I thought I’d share the details of how I got it working for that particular site.

What is Varnish?

Varnish Cache is a web application accelerator also known as a caching HTTP reverse proxy. You install it in front of any server that speaks HTTP and configure it to cache the contents. Varnish Cache is really, really fast. It typically speeds up delivery with a factor of 300 - 1000x, depending on your architecture.

In other words, you run it on the usual HTTP port (port 80), move your usual web server to a different port, and configure it, and it will cache web pages so they can be served more quickly to subsequent visitors.

Be warned - Varnish is not something where you can generally stick with the default settings. The default behaviour does make a lot of sense, but in practice almost no-one will be able to get away with leaving the configuration unchanged.

Installing Varnish

If you’re using Debian or a derivative such as Ubuntu, Varnish is available via apt-get:

$ sudo apt-get install varnish

You may also want to install the documentation:

$ sudo apt-get install varnish-doc

If you’re using Apache I’d also recommend installing libapache2-mod-rpaf and enabling it with sudo a2enmod rpaf - without this, Apache will log all incoming requests as coming from the same server.

I’m assuming you already have a normal web server installed. I’ll assume you’re using Apache, but it shouldn’t be hard to adapt these instructions to work with Nginx. I’m also assuming that the site you want to use Varnish for is a WordPress site with WooCommerce and W3 Total Cache installed. However, this is only for example purposes. If you want to use Varnish for a different web app, you’ll need to plan your caching strategy around that web app yourself.

Please also note that this is using Varnish 4.0, which is the version available with Debian Jessie. If you’re using an older operating system, you may have Varnish 3.0 in the repositories - be warned, the configuration language changed in Varnish 4.0, so the examples here will not work with older versions of Varnish.

By default, Varnish runs on port 6081, which is fine for testing it out, but not for production. When it’s time to go live, you’ll need to open up /etc/default/varnish and edit the value of DAEMON_OPTS to something like this:

DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,256m"

Note that the -a flag sets the address and port Varnish listens on for incoming requests, so :80 means port 80 on all interfaces.

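If you’re using an operating system that uses systemd, such as Debian Jessie, this alone won’t be sufficient, because the systemd unit for Varnish doesn’t read /etc/default/varnish and has its own copy of these options. Copy the packaged unit file (on Jessie it should be /lib/systemd/system/varnish.service) to /etc/systemd/system/varnish.service, change the -a :6081 in its ExecStart line to -a :80, and then run sudo systemctl daemon-reload so systemd picks up the change.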

Next, we need to move our web server to a different port. We’ll use port 8080. Replace the contents of /etc/apache2/ports.conf with this:

# If you just change the port or add more ports here, you will likely also
# have to change the VirtualHost statement in
# /etc/apache2/sites-enabled/000-default
# This is also true if you have upgraded from before 2.2.9-3 (i.e. from
# Debian etch). See /usr/share/doc/apache2.2-common/NEWS.Debian.gz and
# README.Debian.gz

NameVirtualHost *:8080
Listen 8080

<IfModule mod_ssl.c>
    # If you add NameVirtualHost *:443 here, you will also have to change
    # the VirtualHost statement in /etc/apache2/sites-available/default-ssl
    # to <VirtualHost *:443>
    # Server Name Indication for SSL named virtual hosts is currently not
    # supported by MSIE on Windows XP.
    Listen 443
</IfModule>

<IfModule mod_gnutls.c>
    Listen 443
</IfModule>

You’ll also need to change the ports for the individual site files under /etc/apache2/sites-available, as in this example:

<VirtualHost *:8080>
    ServerAdmin webmaster@localhost

    DocumentRoot /var/www
    <Directory />
        Options FollowSymLinks
        AllowOverride All
    </Directory>
    <Directory /var/www/>
        Options FollowSymLinks MultiViews
        AllowOverride All
        Order allow,deny
        allow from all
    </Directory>

    ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
    <Directory "/usr/lib/cgi-bin">
        AllowOverride None
        Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
        Order allow,deny
        Allow from all
    </Directory>

    ErrorLog ${APACHE_LOG_DIR}/error.log

    # Possible values include: debug, info, notice, warn, error, crit,
    # alert, emerg.
    LogLevel warn

    CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>

Writing our VCL file

Next, we come to our Varnish configuration proper, which resides at /etc/varnish/default.vcl. VCL stands for Varnish Configuration Language, and it has a syntax somewhat reminiscent of C.

The default behaviour for Varnish is as follows:

It does not cache requests that contain cookie or authorization headers

It does not cache responses which the backend HTTP server indicates should not be cached

It will only cache GET and HEAD requests

This behaviour is unlikely to meet your needs. We’ll therefore work through the Varnish config file I wrote for this WordPress site in the hope that it will teach you enough to adapt it to your own needs.

    # Even if no cookies are present, I don't want my "uploads" to be cached due to their potential size
    if (req.url ~ "/wp-content/uploads/") {
        return (pass);
    }

    # any pages with captchas need to be excluded
    if (req.url ~ "^/contact/") {
        return (pass);
    }

    # Check the cookies for wordpress-specific items
    if (req.http.Cookie ~ "wordpress_" || req.http.Cookie ~ "comment_") {
        # A wordpress specific cookie has been set
        return (pass);
    }

    # allow PURGE from localhost
    if (req.method == "PURGE") {
        if (!client.ip ~ purge) {
            return (synth(405, "Not allowed."));
        }
        return (purge);
    }

    # Bypass the cache if the request is a no-cache request from the client
    if (req.http.Cache-Control ~ "no-cache") {
        return (pass);
    }

    # Try a cache-lookup
    return (hash);
}

sub vcl_backend_response {
    set beresp.grace = 5m;
}

Let’s take a closer look at the first part of the config:

vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

Here we define that we’re using version 4.0 of VCL, and that the host to use as a back end is port 8080 on the same server. If your normal HTTP server is running on a different port, you will need to set it here. Also, note that you can use a different host as the backend.

acl purge {
    "127.0.0.1";
    "localhost";
}

We also set which hosts can trigger a purge of the cache, namely localhost and 127.0.0.1. The web app hosted on the server can then make an HTTP PURGE request to a given path, which will clear that path from the cache. In our case, W3 Total Cache supports this - if it’s a custom web app, you’ll need to implement this functionality yourself to clear the cache when new content is added.

Next, we start the vcl_recv subroutine. This is where we define our rules for deciding whether or not to serve content from the cache. Let’s look at our first rule:

Here, we declare that we should never cache any PUT, PATCH, DELETE or POST requests, on the basis that these change the state of the application. This ensures that things like contact forms will work as expected.

Note that we’re getting the value of req.method to determine the HTTP verb used. The req object has many other properties we’ll see being used.
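A minimal sketch of such a rule, assuming VCL 4.0 and that these four methods are the only ones you want to exclude, might look like this. It’s an illustration rather than the exact listing from the site’s config:

```vcl
sub vcl_recv {
    # Requests that change state are always handed to the backend
    # and their responses are never cached
    if (req.method == "POST" || req.method == "PUT" ||
        req.method == "PATCH" || req.method == "DELETE") {
        return (pass);
    }
}
```

In a real config this check would sit at the top of the existing vcl_recv rather than in a sub of its own, so that it runs before any of the URL rules.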

# Never cache cart, account, checkout or addons
if (req.url ~ "^/(cart|my-account|checkout|addons)") {
    return (pass);
}

# Never cache adding to cart
if ( req.url ~ "\?add-to-cart=" ) {
    return (pass);
}

# Never cache admin or login
if ( req.url ~ "^/wp-(admin|login|cron)" ) {
    return (pass);
}

# Never cache WooCommerce API
if ( req.url ~ "wc-api" ) {
    return (pass);
}

Next, we define a series of regular expressions, and if the URL (represented by req.url) matches one of them, then the request is passed straight through to Apache and the response is not cached. In this case, we never want to cache the following sections:

The shopping cart, checkout, addons page or account page

The Add to cart button

The WordPress admin and login screen, and cron requests

The WooCommerce API

You’ll need to consider which parts of your site must always serve the latest content and which don’t need everything to be fully up to date. Typically admin areas and anything interactive must not be cached, while the front page is usually fine.

Cookies, even ones set on the client side such as those for Google Analytics, can prevent content from being cached. To prevent this, you need to configure Varnish to discard these cookies before passing them on to Apache. In this case, we want to exclude Google Analytics and various WordPress cookies.
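As a rough sketch of how that kind of cookie stripping can be done in VCL 4.0, using the built-in regsuball and regsub functions: the cookie names below (the Google Analytics ones, plus wp-settings-* and wordpress_test_cookie for WordPress) are assumptions for illustration, so check which cookies your own site actually sets.

```vcl
sub vcl_recv {
    if (req.http.Cookie) {
        # Strip Google Analytics cookies (_ga, _gat, __utma, __utmz and so on)
        set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(_ga|_gat|__utm[a-z]+)=[^;]*", "");

        # Strip WordPress cookies that don't change what an anonymous
        # visitor sees (this particular list is an assumption)
        set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(wp-settings-[^=]*|wordpress_test_cookie)=[^;]*", "");

        # Tidy up any separator left at the start of the header
        set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");

        # If nothing is left, remove the header so the request can be cached
        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }
}
```

Any cookie not named in these patterns is left alone, so a logged-in user’s session cookie still reaches Apache.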

# Static content unique to the theme can be cached (so no user uploaded images)

Here we allow static content that’s part of the site theme to be cached since that doesn’t change often, so we unset the cookies for that request.
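A rule along these lines might look something like the following sketch. The path assumes the standard WordPress theme directory, so adjust it if your theme assets live elsewhere:

```vcl
sub vcl_recv {
    # Theme assets are the same for every visitor, so drop any cookies
    # on the request and let it be served from the cache
    if (req.url ~ "^/wp-content/themes/") {
        unset req.http.Cookie;
    }
}
```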

# Even if no cookies are present, I don't want my "uploads" to be cached due to their potential size
if (req.url ~ "/wp-content/uploads/") {
    return (pass);
}

Here we prevent any user-uploaded content from being cached, since uploads can be large and, unlike the theme files, can change often.

# any pages with captchas need to be excluded
if (req.url ~ "^/contact/") {
    return (pass);
}

Captchas must obviously never be cached since that will break them. In this case, we assume that the contact form has a captcha, so it gets excluded from the cache.

# Check the cookies for wordpress-specific items
if (req.http.Cookie ~ "wordpress_" || req.http.Cookie ~ "comment_") {
    # A wordpress specific cookie has been set
    return (pass);
}

Here we check for remaining WordPress-specific cookies. These would indicate that a user is signed in, in which case we may want to serve them all the latest content rather than displaying content from the cache.

# allow PURGE from localhost
if (req.method == "PURGE") {
    if (!client.ip ~ purge) {
        return (synth(405, "Not allowed."));
    }
    return (purge);
}

Remember where we allowed the local server to clear the cache? This section actually carries out the purge when it receives a request from an authorised client.

# Bypass the cache if the request is a no-cache request from the client
if (req.http.Cache-Control ~ "no-cache") {
    return (pass);
}

Here we check to see if the Cache-Control HTTP header is set to no-cache. If so, we pass it straight through to Apache.

    # Try a cache-lookup
    return (hash);
}

This is the last rule under vcl_recv, because it only reaches this point if the request has got past all the other rules. It tries to fetch the page from the cache. If the page is not in the cache, it passes it on to Apache and will cache the response.

sub vcl_backend_response {
    set beresp.grace = 5m;
}

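This is where we set the grace period, which is how long Varnish may keep serving a cached response after it has expired while it fetches a fresh copy from Apache. Here we’ve set it to 5 minutes. Note that this is separate from how long a response is cached for in the first place: that is controlled by beresp.ttl, which comes from the backend’s caching headers or, failing that, from Varnish’s default of 120 seconds.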
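It’s worth being clear that beresp.grace and beresp.ttl are two different settings: ttl is how long a response counts as fresh, and grace is how much longer a stale copy may still be served. If you do want an explicit five-minute cache lifetime, a sketch along these lines would set both. The 5m values are just examples, and setting beresp.ttl like this overrides whatever lifetime the backend’s headers asked for, which may not be what you want:

```vcl
sub vcl_backend_response {
    # Treat each response as fresh for five minutes (example value)
    set beresp.ttl = 5m;

    # After that, allow a stale copy to be served for up to five more
    # minutes while Varnish fetches a replacement in the background
    set beresp.grace = 5m;
}
```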

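With that done, we should be ready to restart Varnish and Apache. Varnish only re-reads its VCL on a reload, so changing the port it listens on needs a full restart, and restarting Apache as well is the safe option. If you are using an operating system with systemd, then the following commands should do it:

$ sudo systemctl restart apache2.service
$ sudo systemctl restart varnish.service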

For those not yet using systemd, try this instead:

$ sudo service apache2 restart
$ sudo service varnish restart

If you then visit your site and inspect the HTTP headers using your browser’s dev tools, you’ll notice the new HTTP header X-Varnish in the response. This tells you that Varnish is up and running. When a response has been served from the cache, X-Varnish contains two IDs rather than one. If you make sure you’re logged out, you should see that if you load a page, and then load it again, the second response is noticeably quicker.
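If you’d like something more explicit than X-Varnish while testing, one common approach is to add a response header of your own in vcl_deliver. This is a sketch rather than part of the config above, and the X-Cache header name is just a convention I’ve assumed here:

```vcl
sub vcl_deliver {
    # obj.hits counts how many times this cached object has been served;
    # zero means the response has just come from the backend
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```

With this in place, the first request for a page should show X-Cache: MISS and a repeat request shortly afterwards should show X-Cache: HIT.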

Installing and configuring Varnish is a relatively quick and easy way of helping your website scale to be able to serve many more users, and if the site becomes popular all of a sudden, it can make a huge difference as to whether the site can stand up to the load or not. If you need more information on how to configure Varnish for your own needs, I recommend consulting the excellent documentation.
