A web development framework is a tool for programming dynamic websites. This page collects some of my thoughts on some existing web development frameworks. I've never operated a high-traffic website so the following may be all wrong.

summary

My first choice for a new project would be to use Ruby on Rails on Amazon with Elastic Beanstalk. The database would be either Amazon DynamoDB or Amazon RDS. DynamoDB isn't described below but it's like Cassandra without secondary indices. I'd choose DynamoDB if transactions and indices were unneeded and if huge write throughput might be needed later, and I'd choose RDS otherwise.

If DynamoDB were chosen and the project later turned into a big successful company, then I'd switch to Cassandra (DynamoDB and Cassandra are not API-compatible, but their architectures are so similar that it seems it would be easy to port from DynamoDB to Cassandra, unless you were using DynamoDB's conditional write functionality).
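To make the porting caveat concrete: a conditional write applies an update only if the item's current state matches an expectation, and the check and the write happen as one atomic step. The sketch below illustrates the idea with an in-memory dict. `ConditionalStore` and `put_if` are hypothetical names made up for this illustration; this is not the DynamoDB API.

```python
import threading


class ConditionalStore:
    """Toy in-memory key/value store with a compare-and-set style write.

    Hypothetical illustration only; this is not the DynamoDB API.
    """

    def __init__(self):
        self._items = {}
        self._lock = threading.Lock()

    def put_if(self, key, new_value, expected):
        """Write new_value only if the stored value equals `expected`.

        `expected=None` means "only write if the key does not exist yet".
        Returns True if the write was applied, False otherwise.
        """
        with self._lock:  # the check and the write happen atomically
            if self._items.get(key) != expected:
                return False
            self._items[key] = new_value
            return True

    def get(self, key):
        with self._lock:
            return self._items.get(key)


store = ConditionalStore()
created = store.put_if("user:1:credits", 10, expected=None)  # key absent, so applied
stale = store.put_if("user:1:credits", 99, expected=7)       # expectation is wrong, rejected
updated = store.put_if("user:1:credits", 9, expected=10)     # expectation matches, applied
```

An application that relies on this kind of server-side check would need an equivalent mechanism in whatever store it is ported to, which is why the port stops being easy.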

(Python or Java) Google App Engine

Google App Engine (or Appengine, or GAE) is awesome -- they'll supposedly take care of all the ops headaches if you scale, and they have an integrated deployment environment with logging, deployment, database connectivity, and an admin/debug console all set up for you. They also force you/teach you to code your database accesses in a scalable style. The downsides are:

it's a little slow, unreliable, and expensive (see http://www-cs-students.stanford.edu/~silver/gae.html ). My guess is that the slowness is a consequence of the use of the 'Megastore' database, which is a NoSQL-style database whose defining feature is a limited form of transaction -- the possibility of doing transactions may come with a substantial speed hit. I wish they would get the latency down and the uptime up. I haven't looked at it for a while, so perhaps they have.

not total control -- you're stuck in their environment, and if you need to do something weird they didn't think of (for us, it was supporting a simple custom protocol for large XML file uploads from a client) then you're out of luck. If you need a non-pure-Python library that they haven't ported to their platform, you're out of luck. As time marches on, however, the list of things they haven't thought of is growing smaller.

Google has a nasty habit of unpredictably abandoning products, and unpredictably raising prices on products. In the case of Appengine, at least they promise to give you 3 years of warning before killing it. It's clear that Amazon's AWS is a huge success and will be around for longer than 3 years -- that's not clear with GAE. I wish they would promise to keep it around for a decade, or six years at the very least.

you can't trust their self-reported uptime measures (see http://www-cs-students.stanford.edu/~silver/gae.html ), because sometimes various services go down without the whole thing going down, and they don't always report that. I wish they were better about reporting these; hopefully they have gotten better since I last looked.

you'll be pretty tightly wedded to them, requiring a significant rewrite to migrate off later if you want to do that. I wish AppScale were officially supported.

My guess is that no business will want to host a primary product on Google App Engine for these reasons (again, see http://www-cs-students.stanford.edu/~silver/gae.html for more), unless at the least the latency and uptime for standard requests, and the cost and speed of updating large datasets, have improved since I last checked (and probably unless they extend the length of the won't-kill-it-for-3-years promise). If anyone working on Appengine is reading this, the latency may be a hard problem, but at least two important elements could be fixed by managerial fiat: cap the cost of any single mapreduce job at something similar to what the analogous operation would cost on AWS, and promise that GAE will stick around for 10 years.

However, Google App Engine seems like an ideal choice for personal projects that do not involve lots of system-wide analysis or system-wide updates to data (e.g. where each user has a little bit of data and it's very rare that you need to touch all the users) and that you want to be slightly scalable just in case; for prototype products that you want to put up publicly but that you're not sure you want to actively improve unless they catch on; for internal sites; or for less important applications that you wouldn't spend time optimizing anyway. Another reason to prototype in App Engine is that it forces you to use horizontally scalable database access patterns, such as avoiding joins.

I'm not sure, but I don't think Google itself uses Google App Engine for any of its core product offerings; I think it uses it for some internal sites and some low-priority offerings (such as http://www.google.com/moderator/ ).

If you can stand these caveats, and if you don't like Ruby on Rails, then choose Google App Engine. I loved many things about it and I wish I could use it for everything. Unless you are already experienced at this sort of thing, it may save you a bunch of time by sparing you from figuring out how to set up stuff like monitoring and admin consoles, and from initially worrying about keeping your site up under load.

If you use it, watch the App Engine blog -- http://googleappengine.blogspot.com/ -- they've been very good about improving Appengine and getting rid of almost all of the gotchas that used to plague it, save the ones above, and when they announce an update it almost always has some awesome new functionality.

(Python) Flask

(Ruby) Sinatra

(Ruby) Ruby on Rails

The most popular kitchen-sink framework (popular not in terms of numbers, that would be some PHP thing, but in terms of some sort of subjective 'mindshare').

js

Ember and Angular and Knockout seem to be the popular ones right now. Also note Node.js for the server-side. Not sure which is best. Never used any of them. See http://addyosmani.github.com/todomvc/ .

(Python) Django

Kitchen-sink Python framework. I've never used it so I don't know whether to recommend it. My inclination is to say that if you want a kitchen-sink framework (and I think you do if you want a traditional, server-generated, dynamic website that spans many webpages), you should use Ruby on Rails instead.

Which language?

I don't have enough experience to say when to use JavaScript.

I'd say if you are making a traditional, server-generated, dynamic website that spans many webpages, use Ruby and Ruby on Rails. If you are making an API, either Ruby or Python is good (Ruby's Grape is particularly nice). If you are making a small 'app', either Flask or Sinatra is good.

More programmers know Python than Ruby, but more junior web developers seem to know Ruby. However, Python is being taught in a bunch of schools and grad students use it, so that may change. Python is also the easier language to learn, or to read if you don't know it.

So I guess my advice regarding which language is a little conflicting. Sorry, I can't make up my mind. If I had to choose today, for an application in which the API was important, I'd make the API in Grape and then the website in Ruby on Rails using ActiveResource models.

Which database?

If you chose Google App Engine, you use their database. That's the whole point, because the database stuff is (so I'm told) what is hardest to scale.

If you aren't too worried about scaling to high numbers of simultaneous writes, and you just want to use the standard thing, use PostgreSQL. The big caveat with SQL is that it's hard (but not impossible if you have a lot of money) to horizontally scale to very high numbers of simultaneous writes without sharding.

If you aren't too worried about scaling to high numbers of simultaneous writes, and you need transactions, use PostgreSQL.

If you aren't too worried about scaling to high numbers of simultaneous writes, and you want the database to enforce schema-based consistency conditions on data in the database, use PostgreSQL.
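A minimal sketch of the last two points, transactions and schema-enforced consistency. It uses Python's built-in `sqlite3` module as a stand-in so that it runs without a database server; the table, the column names, and the `transfer` helper are made up for illustration, and PostgreSQL's syntax and behavior differ in the details.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE accounts (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)  -- schema-level rule
    )
    """
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 50), ("bob", 10)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit was rolled back too.
        return False


ok = transfer(conn, "alice", "bob", 30)         # fits within alice's balance
rejected = transfer(conn, "alice", "bob", 100)  # would make alice's balance negative
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

The second transfer fails on the debit, and the credit that was already applied inside the same transaction is undone, so the application never has to clean up a half-finished update itself.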

If you aren't too worried about scaling to a huge database, or you can shard, and you could use the useful and interesting operations that it provides (but only within each shard), use Redis. The big caveats with Redis are (a) your data set must fit into your server memory, (b) no joins.

If you are worried about scaling to a high number of simultaneous writes, and you don't need transactions, and you're willing to spend more time developing because you will be using a simple key/value store, use Cassandra. The main caveats are that (a) there are no joins, (b) there are no transactions, and (c) it is less popular (so it has fewer libraries and is less tested) than the others listed here. In theory Cassandra could even be a little easier to handle operationally than SQL because you don't have to deal with setting up masters and slaves and failover, but the potential immaturity of the platform kinda cancels that out. You also need to make sure that you have enough nodes in your Cassandra ring so that if one goes down during high traffic, you don't experience cascading failure due to the load shifting from the dead node to the other nodes, pushing them over the brink too (this problem exists with any horizontal scaling setup, though).
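The cascading-failure concern comes down to arithmetic: when a node dies, its share of the traffic is spread over the survivors. Here is a rough sketch, assuming load is spread evenly across nodes and ignoring replication (real rings are rarely perfectly balanced). The function names and the capacity numbers are hypothetical.

```python
import math


def load_per_node(total_load, nodes, failed=0):
    """Requests/sec each surviving node must absorb, assuming an even spread."""
    survivors = nodes - failed
    if survivors <= 0:
        raise ValueError("no surviving nodes")
    return total_load / survivors


def survives_failures(total_load, nodes, node_capacity, failed=1):
    """True if the survivors stay at or under capacity after `failed` nodes die."""
    return load_per_node(total_load, nodes, failed) <= node_capacity


def min_nodes(total_load, node_capacity, tolerated_failures=1):
    """Smallest ring size that can lose `tolerated_failures` nodes at peak load."""
    return math.ceil(total_load / node_capacity) + tolerated_failures


# Hypothetical numbers: 9,000 req/s at peak, each node handles up to 2,000 req/s.
peak, capacity = 9000, 2000
five_ok = survives_failures(peak, nodes=5, node_capacity=capacity)  # 9000/4 = 2250 > 2000
six_ok = survives_failures(peak, nodes=6, node_capacity=capacity)   # 9000/5 = 1800 <= 2000
needed = min_nodes(peak, capacity)                                  # ceil(4.5) + 1 = 6
```

With these numbers a five-node ring copes fine while healthy (1,800 req/s per node) but tips over as soon as one node is lost, which is exactly the situation to avoid.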

It's said to be hard to switch data models after your application starts scaling. The 'safest' model in terms of horizontal scaling is one without joins (because it's hard to scale joins over multiple shards). If you know you will want to scale in this fashion eventually, you may want to make 'no joins' a rule from the beginning. Using Cassandra, Redis, or Mongo may help you to impose this discipline upon yourself or your team (or Google App Engine, but then that's hard to switch away from in any case).
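A small sketch of why cross-key queries such as joins are awkward once data is sharded: a lookup for one key touches a single shard, while a query that cuts across keys has to visit every shard. The shard count, the hashing scheme, and the record layout are illustrative assumptions, not the behavior of any particular database.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size
shards = [dict() for _ in range(NUM_SHARDS)]  # each dict stands in for one server


def shard_for(key):
    """Pick a shard from a stable hash of the key (Python's hash() is salted per run)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def save_user(user_id, name, posts):
    """Denormalized record: the user's posts live with the user, so no join is needed."""
    shards[shard_for(user_id)][user_id] = {"name": name, "posts": list(posts)}


def get_user(user_id):
    """Single-shard read: only the shard that owns the key is contacted."""
    return shards[shard_for(user_id)].get(user_id)


def users_who_posted(word):
    """Cross-key query: every shard must be scanned (a scatter-gather)."""
    matches = []
    for shard in shards:
        for user_id, record in shard.items():
            if any(word in post for post in record["posts"]):
                matches.append(user_id)
    return sorted(matches)


save_user("u1", "Ann", ["hello world", "redis notes"])
save_user("u2", "Bob", ["cassandra ring sizing"])
save_user("u3", "Cy", ["hello again"])

ann = get_user("u1")
greeters = users_who_posted("hello")
```

Keeping each request answerable by `get_user`-style lookups is what a 'no joins' rule buys: adding shards adds capacity, whereas the scatter-gather query gets more expensive with every shard added.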

Another way to put it:

If you need joins, use PostgreSQL. If you need transactions, use PostgreSQL unless your dataset will remain small (in which case you can consider Redis also).

If you only need a simple key-value store and you want scalability, use Cassandra.

If the total size of each shard of your dataset will remain small, consider whether you'd like Redis's unique features, e.g. computing set intersection, union, and difference, or getting the member with the highest ranking in a sorted set (Redis is also fast).

If you want to reduce development time by using a document-oriented datastore with lightweight map/reduce, at the cost of potentially more painful scaling than Cassandra, use Mongo.
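For readers unfamiliar with the map/reduce style mentioned in the last item, here is a minimal pure-Python sketch of the idea over a list of schemaless documents. It does not use MongoDB or its API; the documents and the function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical schemaless "documents", as a document store might hold them.
orders = [
    {"customer": "ann", "total": 30, "items": ["book"]},
    {"customer": "bob", "total": 12},
    {"customer": "ann", "total": 8, "coupon": "SPRING"},
]


def map_order(doc):
    """Map step: emit (key, value) pairs for one document."""
    yield doc["customer"], doc["total"]


def reduce_totals(key, values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def map_reduce(docs, mapper, reducer):
    """Group mapper output by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


spend_by_customer = map_reduce(orders, map_order, reduce_totals)
```

The documents do not need to share a schema; the map step only reads the fields it cares about, which is part of what makes this style quick to develop against.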