Pressing buttons in the right order

Context: I recently posted about using Apache as a reverse proxy for Google App Engine development. See that post for the motivation for monkeying with Real Web Servers when what you really want to be doing is developing your GAE app. The quick version is that you can reduce your local page load time by a factor of 20 or more, depending on how many static assets (images, JavaScript, stylesheets) your app has.

Proxying with Apache was a big improvement over using the single-threaded, I/O-blocking dev_appserver.py directly, but it was by no means perfect. I noticed that if I hadn’t hit Apache for a while, it’d take roughly 10 seconds before my next request would be served. If you know what you’re doing, it’s probably possible to configure Apache not to do this. But I don’t know what I’m doing, and Apache configuration isn’t exactly friendly.

Instead, I switched to using nginx, which is sometimes described as a reverse proxy first and webserver second. It’s insanely fast at serving static files, uses little memory, and configuring it turned out to be easy. Also, who can resist that logo?

How to set up nginx as a reverse proxy

First, install nginx. With Homebrew on OS X this is just brew install nginx.

Next, we need to configure nginx to work as a reverse proxy. The following configuration did the trick for me. I put this file at /usr/local/etc/nginx/kadev.conf, as the include refers to a file mime.types in that directory.

Notice how DRY this config is compared to the Apache equivalent. It’s really easy to add extra server sections if you have multiple development servers.

Line 19 is where the magic happens. It tells nginx to check first for a file matching the path for a request. If a file exists, nginx serves it directly. Otherwise, nginx forwards the request to the proxied server, in this case creaky old dev_appserver.py.
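To make that decision rule concrete, here is a rough Python sketch of the same "file first, proxy otherwise" logic. It is only a model of the behaviour described above, not how nginx is implemented. The function name route_request and the upstream address http://localhost:8080 are assumptions made for illustration, so substitute whatever address your dev server actually listens on.

```python
import os


def route_request(static_root, request_path, upstream="http://localhost:8080"):
    """Model the proxy's choice for a single request.

    Returns ("static", file_path) when a regular file matching the request
    path exists under static_root, otherwise ("proxy", upstream_url).
    The upstream default is an assumed dev server address.
    """
    root = os.path.realpath(static_root)
    candidate = os.path.realpath(os.path.join(root, request_path.lstrip("/")))

    # Refuse anything that resolves outside the static root (e.g. "/../x").
    inside_root = candidate.startswith(root + os.sep)

    if inside_root and os.path.isfile(candidate):
        return ("static", candidate)
    return ("proxy", upstream + request_path)
```

Calling route_request("/path/to/app", "/images/logo.png") would return a "static" result only if that file exists on disk. Every other request, including ones for directories, falls through to the "proxy" branch. The sketch ignores query strings and URL decoding, which a real server has to handle.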

Finally, run nginx with sudo nginx -c ~/path/to/config/file.conf. sudo is needed to listen on port 80. If you’re running on another port, you don’t need it.

Now you should be able to use your proxy to serve your static files quickly so you can get on with development.

Debugging tips

The above configuration tells nginx to listen on port 80, so if you have Apache running you will need to disable it first. On OS X, open System Preferences, select Sharing, then uncheck “Web Sharing”. Or you can tell it to exit with sudo apachectl -k stop.

If you installed with brew install nginx, logs will be stored at /usr/local/Cellar/nginx/1.0.7/logs by default. If something isn’t working, tail -f the files in that directory.

If you needed to do anything else to get it running, please leave a comment to help the next person.

When using the Google App Engine SDK in development mode, you have probably noticed that dev_appserver.py is incredibly slow. This is because all requests – even requests for static files like javascript, stylesheets or images – are handled sequentially by a single thread. Take a look at the timeline below. Does the left side look familiar?

It’s possible to get a big speed boost by setting up Apache as a reverse proxy in front of your dev server. All requests for static assets will be handled by Apache, which is blazing fast compared to dev_appserver.py. The 35 second request above is fulfilled in only 2 seconds, with most of the static files loading in parallel (see the right side).

Update 2011-11-15: It turns out nginx is even quicker at serving static files, uses less memory, and is easier to configure too! From here on I recommend reading this post instead, which tells you how to set up the same thing with nginx rather than Apache.

How to set this up

First, enable Virtual Hosts in Apache. Edit /etc/apache2/httpd.conf and go to line 623. Uncomment the line for vhosts, so it looks like the following.

/etc/apache2/httpd.conf

    # Virtual hosts
    Include /private/etc/apache2/extra/httpd-vhosts.conf

Next, open /etc/apache2/extra/httpd-vhosts.conf and insert something like the following. Fellow KA devs shouldn’t have to edit much, but if you’re working on a different app you will obviously have to change the static directories. Look in app.yaml to see the full list of statically served paths.

This allows you to access your dev server via something other than localhost, which is needed for the virtual host to work. If you don’t already have --address=0.0.0.0 as a parameter to dev_appserver.py, you will need to add this.

Also, Apache needs to be enabled. The easiest way to do this is to go to Sharing under System Preferences and check the “Web Sharing” item. If you already have it enabled, you may need to clear and check it again to force a restart. If it doesn’t start, check your config syntax with apachectl -St.

This setup should work on OS X Lion. Small changes might be needed for other OSes. If you had to tweak anything, mention it in the comments.

Sometimes people don’t agree on the contents of the tracked .hgignore file in the repository root. For example, I don’t like having *orig in .hgignore as having backup files show up when I grep is annoying. I solved that problem by removing the *orig pattern and telling other repository users about hg purge.

But today I found another way to deal with different opinions for ignore files. Hidden away on the Mercurial wiki is a nice tip about per-user hgignore files. In a repository’s hgrc you can reference an arbitrary file to be used in addition to the tracked .hgignore file. No more .hgignore wars!

I use an excellent if simple program called Caffeine on OS X. Its only purpose is to temporarily prevent your computer from automatically sleeping or displaying the screensaver. A similar program called Insomnia is available for Windows, but I dislike its UI.

So, I built Caffeinated. It’s a port of Caffeine that runs on Windows. The UI is straight-up lifted from Caffeine, and the entire program is pretty much just a usable wrapper around the SetThreadExecutionState function from the Windows API.
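For the curious, the core of such a wrapper can be sketched in a few lines of Python with ctypes. This is not Caffeinated's actual source. The helper names execution_state_flags and set_awake are made up for illustration. The ES_* values are the flag constants documented for SetThreadExecutionState.

```python
import ctypes
import sys

# Flag values documented for the Windows SetThreadExecutionState function.
ES_SYSTEM_REQUIRED = 0x00000001   # keep the system from sleeping
ES_DISPLAY_REQUIRED = 0x00000002  # keep the display from turning off
ES_CONTINUOUS = 0x80000000        # state persists until changed again


def execution_state_flags(keep_display_on=True):
    """Build the flag value that asks Windows to stay awake."""
    flags = ES_CONTINUOUS | ES_SYSTEM_REQUIRED
    if keep_display_on:
        flags |= ES_DISPLAY_REQUIRED
    return flags


def set_awake(enabled, keep_display_on=True):
    """Turn the stay-awake request on or off. Returns True on success.

    Hypothetical helper: it does nothing and returns False off Windows.
    """
    if sys.platform != "win32":
        return False
    func = ctypes.windll.kernel32.SetThreadExecutionState
    func.argtypes = [ctypes.c_uint32]
    func.restype = ctypes.c_uint32
    # Passing ES_CONTINUOUS alone clears any earlier request.
    flags = execution_state_flags(keep_display_on) if enabled else ES_CONTINUOUS
    # The API returns the previous state, or 0 if the call failed.
    return func(flags) != 0
```

The request belongs to the calling thread, so a real tray application needs to keep that thread alive for as long as the "awake" state should last.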

Despite its simplicity, I find it useful, and maybe you will too. Read more about it, or download it now.

Since we started using the massively convenient GAE Mini Profiler, we have been surprised to discover that we spend a significant amount of time reading files from disk. Here’s a particularly extreme example:

This was contrary to my understanding that App Engine tries to cache almost any code-related file read. After investigating, I found that App Engine does try to cache templates – see the source of template.py. But it turns out this only works when you render a template directly with webapp.template.render, and not when you use Django’s inclusion_tag.
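To see why a cache in front of the template loader matters, here is a minimal sketch of the idea in plain Python. It is not App Engine's or Django's code. CachingTemplateLoader and its read_file argument are hypothetical names, and the disk_reads counter is there only to make the effect visible.

```python
class CachingTemplateLoader:
    """Load each template source from disk at most once (illustrative only)."""

    def __init__(self, read_file):
        # read_file is any callable that takes a path and returns its text.
        self._read_file = read_file
        self._cache = {}
        self.disk_reads = 0

    def load(self, path):
        # Only the first request for a path touches the disk.
        if path not in self._cache:
            self.disk_reads += 1
            self._cache[path] = self._read_file(path)
        return self._cache[path]


def read_from_disk(path):
    """Plain uncached read, standing in for a loader with no cache."""
    with open(path) as f:
        return f.read()
```

With loader = CachingTemplateLoader(read_from_disk), rendering the same template fifty times costs one file read. A code path that bypasses the cache and calls read_from_disk directly pays for all fifty, which matches the pattern the profiler showed.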

To verify this, I put together a basic page with some repeated template use and used opensnoop (and after discovering that tool, I need to learn more about DTrace) to observe changes to the filesystem. Here’s the result when using inclusion_tag. You can see the template simple_student_info.html was loaded over and over again: