"…we are not pans and barrows, nor even porters of the fire and torchbearers, but children of the fire, made of it…" — Ralph Waldo Emerson, The Poet

Monthly Archives: January 2013

I’ve been playing with Google App Engine for a while now and overall have found it very easy to work with. When I mention GAE to people, most have never heard of App Engine, so I thought I would provide some background and highlight my experience so far.

Wikipedia describes Google App Engine as a “platform as a service”. Basically, you upload your code and Google hosts your application, scaling it automatically based on usage. The service is free below a certain threshold, and you can pay for more bandwidth, storage, etc. as needed.

What I like:

The Udacity CS253 class uses App Engine and walked through lots of pieces, providing a great intro. If you already know something about back-end development, the docs are great and the interface for everything is pared down to simple and minimal.

No server administration needed. With my minimal sysadmin experience, not having to set up or administer machines, install tools or a database, or worry about scaling is a HUGE boon.

Python!

Lots of stuff comes built in:

Database

Blobstore for file storage

Memcache

Email utilities

Automatic scaling. I think this is a hugely cool idea. It has a downside that bit me (discussed below), but the fact that App Engine can automatically scale up to handle additional traffic on your site is awesome.

Database migration from one installation to another is trivial.

Problems I’ve seen:

No built-in Blobstore migration tool for moving data to another account.

Deployment limitation: you can’t host the app at a secondary domain; only the primary domain and its aliases can be used. There is an open issue for this.

Low-traffic apps often incur startup costs, because Google’s resource balancing will shut down an idle app, so new requests have to spin up a new instance. This seems to be a bigger problem for Java apps, which are slower to start up, but my beta test site was subject to it too.

App Engine was down for a few hours during our beta testing, reporting error code 121, which I discussed in a previous post.

I have been working recently on a new project: building a site for a small programming competition in February (UPDATE: site is up now at http://challenge13.jaybridgerobotics.com) and putting to good use what I learned in CS253. After years of ICFP competitions, it is fun to implement my own competition site. The interesting parts are all back-end design and the admin interface, so they are pretty much invisible to the user, but neat nonetheless.

Submission page that accepts new submissions and reports the date, score, and a link to the grader output details for each submission

Use of cookies and appropriate password hashing per lessons in CS253.
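As a rough illustration of the cookie and password handling mentioned above, here is a minimal sketch using only the Python standard library. It is not the site's actual code and not necessarily the scheme taught in CS253: the choice of PBKDF2, the iteration count, the `COOKIE_SECRET` value, and all function names are assumptions made for the example.

```python
import hashlib
import hmac
import os

# Hypothetical server-side secret; a real app would load this from config.
COOKIE_SECRET = b"replace-me"
# Assumed iteration count, chosen only for illustration.
ITERATIONS = 100000


def hash_password(password, salt=None):
    """Return 'salt_hex$hash_hex' using salted PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()


def check_password(password, stored):
    """Re-hash the candidate with the stored salt and compare."""
    salt_hex, _ = stored.split("$")
    expected = hash_password(password, bytes.fromhex(salt_hex))
    # compare_digest avoids leaking information through timing
    return hmac.compare_digest(expected.encode("utf-8"),
                               stored.encode("utf-8"))


def make_cookie(user_id):
    """Return 'user_id|signature' so tampering can be detected."""
    sig = hmac.new(COOKIE_SECRET, user_id.encode("utf-8"),
                   hashlib.sha256).hexdigest()
    return user_id + "|" + sig


def read_cookie(cookie):
    """Return the user id if the signature is valid, otherwise None."""
    user_id, _, _ = cookie.rpartition("|")
    if user_id and hmac.compare_digest(
            make_cookie(user_id).encode("utf-8"), cookie.encode("utf-8")):
        return user_id
    return None
```

The point of the design is that the server never stores the plain password, and the cookie carries a signature, so a user cannot simply edit the cookie to become someone else.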

The most interesting pieces are under-the-hood or part of the Administrative interface:

User submissions (zipped files) are checked for valid size/extension and stored in the google Blobstore.

API for the grader machines to a) query for new submissions, b) download them from the Blobstore and c) report results.

Authentication for the grader machines

Submission grading queue with error recovery to ensure all submissions are graded (even if a grader crashes)

Admin ability to re-run submissions (for example on update to the grader scripts), tracking of the grader versions etc

Admin ability to inspect user submissions for fraud and abuse and disable or throttle abusive activity

Event Log for admins to monitor activity on site.

Automatically shut down the site for submissions when the contest ends
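One way to get the "every submission is graded even if a grader crashes" behavior is to lease each submission to a grader for a limited time and put it back in the queue if no result arrives. The sketch below is only an in-memory illustration of that idea, not the site's implementation; the class name, method names, and the ten-minute default lease are all assumptions, and a real App Engine app would keep this state in the datastore.

```python
import time


class GradingQueue:
    """In-memory sketch of a grading queue with lease-based recovery."""

    def __init__(self, lease_seconds=600, clock=time.time):
        self.lease_seconds = lease_seconds
        self.clock = clock   # injectable so the behavior is easy to test
        self.pending = []    # submission ids waiting for a grader, oldest first
        self.leased = {}     # submission id -> lease expiry time
        self.results = {}    # submission id -> reported result

    def submit(self, submission_id):
        """Add a new submission to the back of the queue."""
        self.pending.append(submission_id)

    def _requeue_expired(self):
        """Move submissions whose lease ran out back to the pending list."""
        now = self.clock()
        expired = [sid for sid, expiry in self.leased.items() if expiry <= now]
        for sid in expired:
            del self.leased[sid]
            self.pending.append(sid)

    def claim(self):
        """Hand the oldest waiting submission to a grader, or return None."""
        self._requeue_expired()
        if not self.pending:
            return None
        sid = self.pending.pop(0)
        self.leased[sid] = self.clock() + self.lease_seconds
        return sid

    def report(self, submission_id, result):
        """Record a result; return False if the submission is not leased."""
        if submission_id not in self.leased:
            return False
        del self.leased[submission_id]
        self.results[submission_id] = result
        return True
```

A grader that crashes simply never calls `report`, so its lease expires and the next `claim` hands the same submission to another grader.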

It’s a nice project. The Udacity class was great prep, and I have learned a bit more about email authentication, API design, and other robustness requirements for this site. It will be fun to see what sort of attacks we get when it goes live.

We are beta testing the programming competition site I built for an upcoming Jaybridge Robotics recruiting event. Yesterday the site went down and pretty much all page queries returned a 503 server error. Looking in the log history, even simple GET requests like favicon.ico were timing out:

Info: 2013-01-15 10:21:06.818
This request caused a new process to be started for your application,
and thus caused your application code to be loaded for the first time.
This request may thus take longer and use more CPU than a typical
request for your application.

Investigating this warning revealed what many developers of low-traffic sites consider a flaw that makes GAE unusable. The flip side of GAE automatically scaling up your site when traffic increases is that it will also shut down your site when there isn’t any traffic, which means your entire process may have to be started before a request can be serviced. Since there is a 30-second timeout on requests, if your app’s startup is slow (especially a problem with Java, though it shouldn’t apply so much to this Python app), users will see greater latency, or errors if the request can’t be serviced in time.

This “feature” is core to the GAE service, but Google now offers options for paid subscribers to minimize the risk. You can pay for a minimum number of idle instances, which will ensure your app is always ready to serve new requests. There is also an “Always On” feature which should help. I will update this when I switch to the paid service and learn more about it.

Another solution, not recommended by Google, is to keep your site warm with regular queries of some sort. This is discouraged as a waste of bandwidth; understandably, Google would rather turn you off and spool you back up on demand to save resources.
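For completeness, the keep-warm approach amounts to something like the following sketch, run from some machine outside App Engine. The URL, the five-minute interval, and the function names are all hypothetical; the fetch and sleep functions are passed in as parameters only so the loop can be exercised without touching the network.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code   # the server answered, but with an error status
    except OSError:
        return None       # connection problem or timeout


def keep_warm(url, rounds, interval_seconds=300,
              fetch=fetch_status, sleep=time.sleep):
    """Request url `rounds` times, pausing between requests.

    Returns the list of status codes seen, one per request.
    """
    statuses = []
    for i in range(rounds):
        statuses.append(fetch(url))
        if i < rounds - 1:
            sleep(interval_seconds)
    return statuses
```

Calling `keep_warm("http://example.invalid/ping", rounds=12)` would then issue one request every five minutes for about an hour, with `example.invalid` standing in for a real site.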

So yesterday my site went down, but I hadn’t made any changes and was not sure why. I am still not quite satisfied with the explanation (though the site is working well again now). Looking at the traffic to the site, I saw it dropped to nothing during the time the site was down, -20hrs to -6hrs in the graph below:

Some requests were also logging this disturbing warning:

Warn: 2013-01-15 10:21:06.818
A problem was encountered with the process that handled this request,
causing it to exit. This is likely to cause a new process to be used
for the next request to your application. (Error code 121)

In the end it looks like it was an App Engine problem and not something on my end: lots of other people reported the same Error code 121 during the same window, and their apps had the same issues.

The problem was apparently temporary and I haven’t seen it since. As for spooling up new instances of the app, our grading servers check for new submissions periodically by design, so this wasn’t an issue for us. We also upgraded to the paid version and requested 1 idle instance, which should further mitigate the issue.