Friday, April 11, 2008

Google is off to a good (bad?) start with both of these in its management of the App Engine release.

Of the 120+ issues logged by beta testers, a few have been closed as wontfix or duplicate; most have no response at all from the App Engine team. I can't think of any other company that I've filed an issue with that took that long to get back to me. The good ones get back within hours.

The one exception I have seen is for the urllib issue, where gu...@python.org, presumably Guido, wrote

Providing a urllib replacement implemented on top of urlfetch shouldn't be particularly hard. If someone is willing to produce one, I'd be happy to review it and, if it passes muster, try to get it added.

Paraphrased: "maybe if you do our work for us we'll consider it."

WTF!

This isn't OSS, where "if you want something, do it yourself" is at least a semi-valid response. App Engine developers are all currently beta testing a product that Google hopes to eventually charge for. We're doing Google a favor. (Context: the replacement Guido wants is a piece of code that will only ever be useful on App Engine, and is something Google should have done in the first place instead of making urlfetch a public API. This is not code with a use case outside of App Engine.)
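
For a sense of scale, here is a minimal sketch of what a urllib-style wrapper over urlfetch might look like. It is not Google's code and not anything Guido has reviewed. The names urlopen, UrlfetchResult, StubResponse and stub_fetch are made up for illustration, and passing the HTTP method as a string is an assumption about urlfetch. The stub stands in for urlfetch.fetch so the sketch runs outside App Engine.

```python
import io


class UrlfetchResult(io.BytesIO):
    """File-like response body, so callers can read() it as they would with urllib."""


def urlopen(url, data=None, fetch=None):
    """urllib-flavored wrapper: GET by default, POST when data is given."""
    if fetch is None:
        # Only importable inside the App Engine runtime.
        from google.appengine.api import urlfetch
        fetch = urlfetch.fetch
    # Assumption: fetch() accepts the method as a string.
    method = "POST" if data is not None else "GET"
    response = fetch(url, payload=data, method=method)
    result = UrlfetchResult(response.content)
    result.code = response.status_code
    result.headers = response.headers
    return result


class StubResponse:
    """Hypothetical stand-in for the object urlfetch.fetch() returns."""

    def __init__(self, content, status_code, headers):
        self.content = content
        self.status_code = status_code
        self.headers = headers


def stub_fetch(url, payload=None, method="GET"):
    # Pretend the remote server answered with a tiny feed.
    return StubResponse(b"<rss></rss>", 200, {"content-type": "text/xml"})


page = urlopen("http://example.com/jobs", fetch=stub_fetch)
body = page.read()
```

A real replacement would also have to cover redirects, error codes, and the rest of the urllib surface, which is where the actual work is.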

Maybe I'm over-sensitive, but this really rubs me the wrong way.

I hope Google can (a) put enough engineers on this that they can actually respond to issues, and maybe start closing some, and (b) remember that when you're selling a product, "why don't you fix it if it bothers you" is a poor response.

Thursday, April 10, 2008

App Engine sure has caused a stir. Some of the competition is already scared, with reason.

But who is App Engine's real competition?

In a lot of ways, App Engine is in a class by itself. It competes on the high end with Amazon Web Services. But it also competes on the low end with every shared host out there. And thanks to the integration of Google authentication and the application directory you could also make a case that in an orthogonal way it competes with Facebook's application API.

At the low end, App Engine is a big deal for Python developers and anyone else who is allergic to PHP. Historically, you've really had to look hard for low end hosting that offered anything else. And as everyone who has given products away to colleges knows, Free is a fantastic hook to get developers to try out your platform. Once it's open for all, App Engine is going to become the preferred option for developers with the itch to write a toy or proof of concept and show it off to the world.

Less obviously (to developers, anyway), App Engine is also a big deal for businesses that aren't quite big enough to hire a sysadmin, or who are big enough but still prefer not to deal with that complexity. (You thought hiring skilled developers was hard? If anything, hiring skilled sysadmins is harder.)

I suspect there are a substantial number of companies in the uncomfortable situation of really needing more performance than shared hosting offers, but not wanting the complexity of taking the next step, to dedicated servers with dedicated sysadmins.

Of course, given App Engine's constraints, porting such applications to it is only going to be an option in a few cases. The question is, are managers of new projects farsighted enough to see this problem coming and realize that App Engine insures against it?

At the high end, AWS is the only real competition to App Engine, but as most observers have pointed out, they are different beasts. AWS offers far more flexibility, at the cost of far more hours from your ops department. (Although App Engine's datastore is a lot more sophisticated than AWS's SimpleDB, so the capabilities of AWS aren't a strict superset of App Engine's.) Contrary to the Joyent assertion linked earlier, it isn't necessarily stupid to trade flexibility for convenience. App Engine just works to an unprecedented degree in the field of high-end scalability.

As with anything this disruptive, there's been a certain amount of hysteria. Even people who should know better have repeated the idea that "nobody will want to acquire a product built on App Engine because you're locked in." This is stupid. Depending on a proprietary platform hasn't stopped products built on Oracle from being acquired, or products using AWS, or even products built on a proprietary UNIX. (Yes, those still exist.) Nobody will care if you build on App Engine, except maybe Microsoft and Yahoo. And even they can be pragmatic; Hotmail ran on BSD when Microsoft acquired them.

Lock-in is a real issue, but not because App Engine will keep you from being acquired, and not because Google will screw you once they have you in their clutches -- that would scare off new customers and thus be bad business. Lock-in is an issue because evolving requirements might make App Engine's confines less of a good fit than they were when you started. If you have to start adding servers at AWS or Rackspace to handle things you can't handle within App Engine, App Engine loses most of its value.

Over three years ago (!), I wrote a screen scraper to turn the Python Job Board into an RSS feed. It didn't make it across one of several server moves since then, but now I've ported it to Google's App Engine: the new unofficial python job board feed.
I'll be making a separate post on the Google App Engine business model and when it makes sense to consider the App Engine for a product. Here I'm going to talk about my technical impressions.
First, here's the source. Nothing fancy. The only App Engine-specific API used is urlfetch.
Unfortunately, even something this simple bumps up against some pretty rough edges in App Engine. It's going to be a while before this is ready for production use.
The big one is scheduled background tasks. (If you think this is important, star the issue rather than posting a "me too" comment.) A related need is a task queue that would let those scheduled tasks be split into bite-size pieces. Google needs that in order to allow scheduled tasks that (a) can't turn into runaway processes but (b) can still accomplish an arbitrary amount of work.
If there were a scheduled task api, my feed generator could poll the python jobs site hourly or so, and store the results in the Datastore, instead of having a 1:1 ratio of feed requests to remote fetches.
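
Until then, the closest approximation is to refresh lazily on request. The following is a minimal sketch of that idea, not the feed's actual source: CachedFeed, fake_fetch and the injectable clock are hypothetical names, and the cache lives in instance memory where a real version would keep the stored copy in the Datastore.

```python
import time


class CachedFeed:
    """Serve a stored copy of the feed; refetch only when it is older than max_age."""

    def __init__(self, fetch, max_age_seconds=3600, clock=time.time):
        self._fetch = fetch            # callable that does the remote fetch
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so the demo needs no sleeping
        self._body = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._max_age
        if stale:
            self._body = self._fetch()
            self._fetched_at = now
        return self._body


# Demo with a fake remote site and a fake clock.
fetch_count = [0]
now = [0.0]


def fake_fetch():
    fetch_count[0] += 1
    return "<rss>jobs v%d</rss>" % fetch_count[0]


feed = CachedFeed(fake_fetch, max_age_seconds=3600, clock=lambda: now[0])
first = feed.get()    # cache is empty, so this fetches
second = feed.get()   # served from the stored copy
now[0] = 3600.0       # an hour later
third = feed.get()    # stale, so this fetches again
```

This still makes some unlucky reader wait on the remote fetch once an hour, which is exactly what a real scheduled task would avoid.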
While you can certainly create a cron job to fetch a certain url of your app periodically, and have that url run your "scheduled task," things get tricky quickly if your task needs to perform more work than it can accomplish in the small per-page time allocation it gets. Fortunately, I expect a scheduled task api from App Engine sooner rather than later -- Google wants to be your one stop shop, and for a large set of applications (every web app I have ever seen has had some scheduled task component) to have to rely on an external server to ping the app with this sort of workaround defeats that purpose completely.
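
To make the "tricky" part concrete, here is a rough sketch of the bookkeeping the workaround forces on you: each ping does as much as fits in its time budget and reports where to resume. run_slice and its parameters are hypothetical, and the sketch assumes the resume index gets persisted somewhere (the Datastore, say) between pings.

```python
import time


def run_slice(items, start, handle, budget_seconds, clock=time.time):
    """Handle items[start:] until the time budget is spent.

    Returns the index to resume from on the next ping, or None when all
    the work is done.
    """
    deadline = clock() + budget_seconds
    index = start
    while index < len(items):
        if clock() >= deadline:
            return index  # out of time; the next ping picks up here
        handle(items[index])
        index += 1
    return None


# Demo with a fake clock that advances one second per handled item.
ticks = [0]
done = []


def handle(item):
    done.append(item)
    ticks[0] += 1


jobs = ["a", "b", "c", "d", "e"]
resume_at = run_slice(jobs, 0, handle, budget_seconds=3, clock=lambda: ticks[0])
finished = run_slice(jobs, resume_at, handle, budget_seconds=3, clock=lambda: ticks[0])
```

In the demo the first ping gets through three items and the second ping finishes the other two. A task queue would do this slicing for you.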
Another rough edge is in the fetch api. Backend fetches like mine need a callback api so that a slow remote server doesn't cause the fetch to fail forever from being auto-cancelled prematurely. Of course, this won't be useful until scheduled tasks are available. I'm thinking ahead. :)
Finally, be aware that fatal errors are not logged by default. If you want to log fatal errors, you need to do it yourself. The main() function is a good place for this if you are rolling your own simple script like I am here.
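
A minimal sketch of that pattern, using only the standard logging module. handle_request is a hypothetical stand-in for whatever the script actually does, and this is not copied from the feed's source.

```python
import logging


def handle_request():
    # Hypothetical request handler that blows up.
    raise RuntimeError("remote site returned garbage")


def main(handler=handle_request):
    try:
        handler()
    except Exception:
        # logging.exception records the message plus the full traceback.
        logging.exception("fatal error while handling request")
        raise  # re-raise so the request still fails visibly
```

The script would call main() as its entry point. The re-raise matters: swallowing the exception after logging it would turn a fatal error into a silent blank page.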

Sunday, April 06, 2008

I have been lucky enough to fill our recent open positions with people who know Python as well as Java. Half of the (6 person) company now falls in that category and prefers Python, and 2 of the others have played with Python and liked it at least well enough not to object. So the boss has conceded that it makes sense to go the Python route for our next project.

We're going to be doing a web, "next gen" version of our existing client-server project, which is mostly simple CRUD but does have 1000+ tables in its current incarnation. So we really need something that can autogenerate 90+% of the CRUD or we will go insane.
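
To illustrate what "autogenerate" means here, the sketch below derives a form description from the schema itself, so nobody writes per-table boilerplate. It uses the standard library's sqlite3 purely so it runs anywhere; it is not how DBMechanic, the django admin, or any framework below does it. describe_form, WIDGETS and the customer table are made up for the example.

```python
import sqlite3

# Hypothetical mapping from declared column types to form widgets.
WIDGETS = {"INTEGER": "number", "REAL": "number", "TEXT": "text"}


def describe_form(conn, table):
    """Build a list of form fields for `table` from the database schema."""
    # Table names can't be bound as SQL parameters, so check against the catalog.
    known = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in known:
        raise ValueError("unknown table: %r" % table)
    fields = []
    rows = conn.execute('PRAGMA table_info("%s")' % table)
    for cid, name, col_type, notnull, default, pk in rows:
        if pk:
            continue  # assumption: primary keys are generated, not user-edited
        fields.append({
            "name": name,
            "widget": WIDGETS.get(col_type.upper(), "text"),
            "required": bool(notnull),
        })
    return fields


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, credit_limit REAL)")
form = describe_form(conn, "customer")
```

Here form comes out as a required text field for name and an optional number field for credit_limit. Multiply the hand-written alternative by 1000+ tables and the appeal is obvious.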

The trouble is, I still don't really like any of the Python web options 100%. (I like the web options in other languages less, but I'm a perfectionist.)

Django is well documented, its admin app is something everyone else envies, and newforms looks decent, but the ORM blows and I'm not fond of the template engine either. (Pre-emptive pedantry: yes, I know I can "import sqlalchemy." Please stop saying that like it means something; I'm not interested in defining models twice -- once for real work with SA, and once for interop with the rest of django.) Apparently django-sqlalchemy got far enough in PyCon sprints that it's kinda usable so working on that would be an option. Of course even then there is no guarantee the django core would accept it into mainline, and maintaining it as a "vendor branch" would probably suck. If django used a DSCM like Mercurial I might be willing to do that, but svn is just too painful so that is a real risk.

I don't see a way to generate a page containing just a CRUD interface for table X with the django admin app. The admin app really is a monolithic application, not something you can easily re-use pieces of.

Regexps suck for url mapping.

Pylons is not well documented and after keeping an eye on this for something like 18 months I don't think this is a problem that will be solved, for whatever reasons. On the other hand, SA + mako is a very sane default, and both of those are well documented so it's really only core Pylons that suffers from doc crapitude, and core Pylons is fairly small. IRC responsiveness mitigates this further.

Pylons still doesn't have a good CRUD (or even high-level manual form generation) solution, which has bugged me for even longer than the docs. I can't fathom how people can tolerate writing this kind of boilerplate in 2008. Formalchemy gets about 30% of the way there. DBMechanic requires TG2 atm, although apparently hacking it to run on Pylons may not be too much effort; I would guess around 20% of the effort to get the django-sa project really usable.

TG2 is of course very bleeding edge and although I like genshi's syntax in theory, in practice XML templates irritate the hell out of me. (Very verbose, xinclude sucks compared to "inheritance," and incorporating rich dynamic content -- i.e., user-generated, like forum posts, that needs to include html tags -- is a PITA. Not to mention that having to write "a &gt; b" when you mean "a > b" bugs me all out of proportion to the actual inconvenience it inflicts on me.) Still, better than the django templates.

I'm skeptical that TG2 is a big enough value add to want to add it (in its unfinished state) as a dependency vs rolling our own on Pylons. But DBMechanic does look like it could be exactly what I want in a CRUD generator.

web.py seems like more of a tech demo than a real product. I don't see any signs of a CRUD or form generator. reddit, probably the largest web.py site at least in terms of page views, moved to Pylons.

Zope 3 is alone in being really production ready without running from svn. Grok does do a good job of smashing zcml and z3c.form looks okay but lives up to the Zope reputation of complexity. (Field managers, widget managers -- are these the same things? -- widget modes, ...) AFAIK relational dbs are still second-class citizens in zope, and with all due respect to zodb it is no postgresql. OTOH there is z3c.sqlalchemy which gives me hope. Finally: you have to manually restart zope (per the Grok tutorial) after changing your .py files? Seriously?

Bottom line, Zope might actually be a decent option if we had a Zope expert on staff but we do not and I am not willing to tackle the learning curve alone.

Nevow: form handling is in flux. The new hotness is "pollenation forms," but that is svn-only and the api "will probably change."

Zope and Nevow both have their own xml-based templates that predate genshi but are similar to it. Something like Nevow's Stan is obviously useful for programmatic template generation but it's not yet clear if that's going to be something we need. Probably only if we have to write our own form generator. If so, I suspect ripping a standalone Stan out of Nevow would be straightforward.

(Spyce of course never really got any traction to speak of. It's time for me to let it go quietly into the night and leverage someone else's framework.)

Conclusion: I think porting DBMechanic to Pylons is our best option. DBMechanic seems designed to be more flexible than the django admin app. Django would be my second choice.

Wednesday, April 02, 2008

After reading a blog post titled "The Abysmal State of Python IDEs" (which I won't link to because it's misinformative, but it's easy to google by title), I wondered how the author managed to pick such a lousy group of IDEs to try. He tried "ActiveState" (does he mean PythonWin?), DrPython, SPE, and ScrIDE, only one of which is in the top 10 google hits for Python IDE.

The google top 10 include Eric, Wing IDE, Radio Userland, SPE, PyDev, and Komodo. The Yahoo and MSN top 10s are similar. Except for Radio Userland, this is a much better group to start with, and one that in fact does include what I think are the only 3 Python IDEs worth trying.

So how does a newbie end up picking such a lousy group of IDEs to try? The only likely possibility seems to be that he went to the top google hit, the python.org wiki page. Or possibly he went off of the top MSN hit, the c2 wiki Python IDE page. Both are (rather, were) heaping wads of products that mostly weren't IDEs at all, or were IDEs for other languages that happened to include Python syntax coloring.

Syntax coloring and maybe a Run button doesn't qualify you as a Python IDE in 2008, guys. (Sorry, IDLE.) Integrated means you need to integrate something nontrivial, preferably a debugger, although gui builders can also count.

So I organized the python.org IDE page by feature set and moved the non-IDEs to the Editors page, even if a pedant would note that they were IDEs, just not really for Python. That's not what 99.9% of people are looking for when they go to a Python IDE page, so let's be useful rather than pedantic. I also elided the non-IDEs from the c2 page.