Saturday, August 25, 2007

Have you ever noticed that smart, interesting people have weird technical hangups? Often, they take a good idea to its logical conclusion in such a manner that it dominates their lives. For instance:

I'm an open source fanatic. I'll put up with a lesser product if it means the difference between being open source or not. For instance, I think Apple has a better desktop experience than Ubuntu does, and I also think they have slicker laptops than Dell has. However, I refuse to buy a MacBook because it's not open source, despite the fact that all the people around me have MacBooks--even my heroes Guido van Rossum and Bram Moolenaar.

My buddy Mike C. hates OOP. Mike's a wicked sharp guy from MIT, so he's earned the right to his opinion. (As an aside, it's strange how vehemently many Lisp hackers hate OOP, despite the fact that OO systems exist for Common Lisp.) It'd be one thing if he were simply a fan of Scheme over Java, but Mike often codes in Python and refuses to use OOP.

My co-worker Alex J. has many strange technical hangups. He's also been coding in Haskell for nine years, which I think qualifies him for "smart and interesting". Alex hates RDBMSs because of the way they use disks. Clearly in the 70s, RDBMSs were an important way to abstract disk usage, but he argues that memory is plentiful enough these days that we should no longer optimize for disk usage. The thing that irritates him most is that when you query a database, you don't know whether it's going to be able to answer the query from memory or have to incur several 9ms seek time penalties in order to answer the query. He calls this a "leaky abstraction". Sounds somewhat reasonable, right? Except Alex takes it to the point where it literally pains him to work in any company that uses an RDBMS.

I have a previous co-worker named Jesse M. who hates the Web. He's a senior architect at IronPort Systems; he's the type of guy who can get big things done. Jesse's a user interface purist and he argues that the Web is a fundamentally broken user interface. The fact that every Web site looks and behaves slightly differently is horrible from a user interface perspective. His pet peeve is that text areas have scroll bars while the page itself also has a scroll bar. Personally, I had never thought to be irritated by that. It's reasonable for a user interface purist to gripe about the Web, but Jesse takes it to what might be considered unreasonable levels. He almost entirely refuses to use the Web. He occasionally has to write scripts to scrape data from Web pages so that he can avoid using a Web browser. In fact, he refused the adoption of a company wiki because that would require a Web browser.

My buddy Sam R. loves asynchronous networking, but hates writing asynchronous code. These days Erlang is popular enough that people are beginning to see that you can write asynchronous code without breaking everything into callbacks. However, Sam figured this stuff out years ago. He was the inspiration behind Stackless Python and wrote the mail server for eGroups (i.e. Yahoo Groups). There are cases where the performance benefits of asynchronous networking are simply unnecessary; cases where it'd be nice to simply use the built-in server libraries. However, Sam would rather write everything from scratch--hacking at the innermost bits of Python--than put up with synchronous networking APIs or having to write Twisted code. By the way, I'm a convert to Sam's religion. I think Twisted is awful ;)

Guido van Rossum has an interesting hangup. He built restrictions into the very syntax of Python in order to force programmers to write more readable code. Did you ever edit a piece of C where the indentation didn't match the braces? Guido's response was to write a language where it's not possible to indent the code in a way that conflicts with the meaning of the code.

Similarly, in many functional languages, you can write code like

a = if winner: 1
    else: 2

The "if" statement is itself an expression that returns a value. You can do this in all of the functional languages and in Ruby too. However, Guido feels that the following is more readable:

if winner: a = 1
else: a = 2

Guido's response: Python makes a distinction between expressions and statements that prevents you from writing the code in a way that Guido feels is less readable.
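For what it's worth, Python 2.5 did add a conditional expression (PEP 308), although the "if" statement itself is still a statement and not an expression. Here's a minimal sketch comparing the two forms; pick() and pick_with_statement() are made-up names used only for illustration:

```python
def pick(winner):
    # Conditional expression (Python 2.5+): the condition sits in the
    # middle, and the whole thing is an expression that yields a value.
    a = 1 if winner else 2
    return a


def pick_with_statement(winner):
    # The statement form from the post; it returns the same thing.
    if winner:
        a = 1
    else:
        a = 2
    return a


print(pick(True), pick(False))
```

Note that the condition goes in the middle rather than up front, so it still doesn't read like the functional-language version above.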

Well, since it's my blog, I can confess to one more over-the-top hangup. I love style guides. I think following style guides improves code readability. In fact, I have the style guides for C (BSD), Java, Python, and Perl mostly memorized. Sounds like a good idea, right? The problem is that I'm so obsessed with style guides that I have a hard time reading code that doesn't follow the style guide. During code reviews, if a chunk of code doesn't follow the smallest requirement in the Python style guide, I'm so distracted that I can barely focus at all on what the code does or whether it has any bugs.

It's interesting to see how smart, interesting people can often take reasonable ideas to their logical conclusions in such a way that those ideas dominate their lives. Sometimes, the result is not only disproportionate to the subject at hand, but even occasionally harmful to them overall.

Thursday, August 23, 2007

Yesterday, my Dish Networks satellite went out and took 20 minutes to come back.

Also yesterday, my wife's Bluetooth headset kept disassociating itself from her phone. She had to reset it several times.

Last night, my access point stopped working. From the access point, I could ping my DNS server. From my laptop, I could establish a wired and/or wireless connection to the access point, but from my laptop, I could not ping my DNS server. The same was true of my wife's laptop. I hadn't changed anything on my AP in months. Finally, I gave up, restored it to factory defaults, and set it up again from scratch. Now it works.

What's going on? Was there a solar flare I didn't hear about? Might it be that my 100-year-old house is not providing "clean" electricity? Any ideas? Weird.

Monday, August 20, 2007

It's interesting to me that while a modern Web application seems to have a shelf life of two years, popular programming languages never die. This isn't news, but I thought I'd just point out a few:

FORTRAN

FORTRAN is still a favorite among scientists.

COBOL

COBOL is still alive and well in ERPs and banking systems.

C

C isn't dead by a long shot. Kernels (e.g. Linux) and interpreters (e.g. Python) are still written in C.

Lisp

Even though Lisp was first written about nearly 50 years ago, Lisp is still used at various companies like Orbitz, and rest assured that as long as Paul Graham lives, he'll never stop talking about it ;)

APL

APL seems dead, but it's not. Every once in a while, I'll meet a strange hacker who can translate a long algorithm into a single magical incantation of funny symbols in APL.

Forth

Forth is alive and well at the firmware level.

Pascal

Pascal's not dead. It's still being taught as a first programming language.

Ada

Ada is still being used by the military.

JavaScript

You might ask why I bring up JavaScript since it's clearly everywhere. I'm sure that if its designer knew that it was going to be the single most widespread programming language interpreter on the planet, it might have gotten a bit more of a design review ;) JavaScript is wonderful and horrible at the same time, and I highly doubt it will die within my lifetime, even though it'd be wonderful if we could replace it.

Prolog

I recently found out that Prolog is still being used in various natural language processing contexts.

Sed and Awk

Many hackers still use Sed and Awk in shell scripts when the complexity of a larger language like Perl isn't justified.

BASIC

It makes me sad, but there are still kids who learn programming by way of BASIC.

Smalltalk

Smalltalk is still alive and well in projects like Squeak and Seaside.

Assembly

One might think that the only reason to code in assembly is to write the backend for a compiler or to write boot code for an operating system, but assembly is still used anytime your resources are scarce, and you need to code close to the machine.

So the next time someone says that Java is dead, know that he's dead wrong. Java will be around at least as long as COBOL.

Saturday, August 18, 2007

Ruby has a curious approach to protecting instance variables, constants, and private methods.

I've often heard Java programmers criticize Python because it doesn't enforce privacy in any way. Personally, I think that it'd be great if Python could be fully sandboxed like JavaScript can, but sandboxing is a completely separate topic. Preventing a programmer who works on my team from calling a method that I've named _private_method isn't all that interesting to me. If he sees the fact that I've named the method with a leading underscore, and he still feels the need to call it, so be it.
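To make the Python side concrete, here's a minimal sketch. The Account class and its attribute names are made up for illustration. A leading underscore is only a convention, and even a double-underscore name is merely mangled to _ClassName__name rather than hidden:

```python
class Account(object):
    def __init__(self):
        # Inside the class, __balance is stored as _Account__balance
        # (name mangling), which outside code can still reach.
        self.__balance = 42

    def _private_method(self):
        # The leading underscore is a hint to callers, nothing more.
        return "called anyway"


acct = Account()
print(acct._private_method())
print(acct._Account__balance)
```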

Ruby does provide private instance variables, constants, and private methods, but really, those are just suggestions.

You can use the same "inject a method" trick to get access to instance variables:

def obj.get_a
  @a
end

In no way am I criticizing Ruby for this behavior. As I said, I think it's a bad situation if you can't trust your team members. I just wanted to point out that in Ruby, the protection mechanisms are really just suggestions ;)

OpenDarwin has failed to achieve its goals in 4 years of operation, and moves further from achieving these goals as time goes on...The original notions of developing the Mac OS X and Darwin sources has not panned out. Availability of sources, interaction with Apple representatives, difficulty building and tracking sources, and a lack of interest from the community have all contributed to this.

I can't say I'm surprised. When it comes to playing fair in the open source world, I simply trust the Linux guys more than I trust Apple. Besides, Darwin isn't even the most interesting thing about OS X--Cocoa is. Tragically, it's closed source.

As you all know, I've been pondering operating systems lately. I just don't think people are going to tolerate Apple's walled garden / vendor lock-in forever. I don't get the sense that Vista is a huge success. Based on my attendance at Linux Expo for the last seven years in a row, Linux seems to be somewhat quiet these days, at least on the desktop side. It makes me wonder what's going to happen on the desktop.

Maybe the desktop is dead--killed by the Web and Google. Maybe the desktop is going to be reborn via the likes of Adobe AIR. I sure hope not. I don't need any more proprietary systems from Adobe! Maybe the desktop and the world-wide Web are both less important these days now that Facebook is functioning as a social operating system.

Saturday, August 11, 2007

Python has a wonderful interactive interpreter (i.e. shell). IPython is a third-party Python shell that's even nicer, but that's a topic for another post. It's fairly common to code in the shell until you have the code working correctly, and then copy-and-paste it into your program. Developing super-interactively is a great way to keep bugs at bay.

However, sometimes you need more setup before you can start coding. For instance, when writing a Web app in, say, Pylons, you might need an actual request and a database connection before you can start coding what you want. You might even need a form POST before you can start. Ideally, you'd be able to start the shell from the middle of your application at just the right spot. I'm pretty sure that someone out there knows how to get IPython to do the right thing, but I find using pdb, the Python debugger, really helpful for this purpose.

First of all, add the following wherever you want to break into the debugger: "import pdb; pdb.set_trace()". If you need more help with the debugger itself, type "?". Next, you'll want to see what variables you have to play with, so use "dir()". Let's say the "request.environ" object has what you want, but you're not sure under what key. Start by printing it: "p request.environ". If you need a nicer looking version, try "from pprint import pprint; pprint(request.environ)". Once you find the right key, say 'spam', print that: "pprint(request.environ['spam'])". Maybe spam is an object, and you need to know what methods it has: "pprint(dir(request.environ['spam']))".
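Here's a minimal sketch of where the breakpoint goes. FakeRequest, handle(), and the contents of environ are stand-ins I made up, not Pylons APIs, and the debug flag exists only so the example can run from start to finish without stopping at a prompt:

```python
import pdb
from pprint import pformat


class FakeRequest(object):
    """Hypothetical stand-in for a real Web framework request."""

    def __init__(self):
        self.environ = {"spam": ["eggs", "ham"], "REQUEST_METHOD": "POST"}


def handle(request, debug=False):
    if debug:
        # Execution stops here. At the (Pdb) prompt, try:
        #   dir()
        #   p request.environ
        #   from pprint import pprint; pprint(request.environ)
        #   pprint(dir(request.environ['spam']))
        pdb.set_trace()
    # Once an expression works in the debugger, paste it back in.
    return pformat(request.environ["spam"])


print(handle(FakeRequest()))
```

Call handle(FakeRequest(), debug=True) from a terminal to actually land in the debugger.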

In the Python shell, you can use "help(x)" to find out more about x if x is a module, class, or function. If x is an object, try "help(x.__class__)". In the debugger, help() tends to not work for a reason that someone smarter than me probably understands. However, you can still do things like "p x.__class__.__doc__".
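My understanding is that pdb treats "help" as one of its own commands, which would explain why "help(x)" doesn't reach Python; prefixing the line with "!" should make pdb run it as ordinary Python instead. The docstring trick needs no such workaround, since it's just an expression. A small sketch, where Spam is a made-up class:

```python
class Spam(object):
    """A made-up class with a docstring."""


x = Spam()

# In pdb this would be typed as "p x.__class__.__doc__".
print(x.__class__.__doc__)

# For ordinary classes, type(x) is the same object as x.__class__.
print(type(x).__doc__)
```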

The debugger isn't very good for writing new functions or multi-line statements, but it's great for evaluating simple expressions. Once you have the code you want, you can copy-and-paste it back into your program and repeat the process. Writing code in this way is really helpful so that you can deal with the details one at a time without trying to debug a larger program that might have more than one bug. That's not to say I don't make heavy use of print-statement debugging, but using pdb is great if you want to "look around."

Tuesday, August 07, 2007

As part of my day job, I've written a Rails-style database migration script. It lets you write migrations from one version of a schema to the next, so you can develop schemas iteratively and upgrade or downgrade them. Best of all, if an attempted upgrade fails, the script can back it out even if you're not using transactions. Of course, this is based on writing "up" and "down" routines--it's practical, not magical.

I'm releasing this code in the hope that others will find it useful. It's well-written, solid, and well-tested. This is the type of thing you could probably write in a day. I took four, and polished the heck out of it.

It uses SQLAlchemy to talk to the database. However, that doesn't mean you have to use SQLAlchemy. Personally, I like writing table create statements by hand. You can do either.

My database configuration is stored in a .ini file à la Paste / Pylons. Hence, the script takes a .ini file to retrieve the database configuration. If you don't use Pylons, but you still want to use my script, that's an easy change to make. Migrations are stored in Python modules named like "${yourpackage}.db.migration_${number}.py". Again, I use Pylons to figure out what "${yourpackage}" is, but that's easy enough to change.
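To illustrate the naming scheme, here's a rough sketch of how revision numbers could be pulled out of a list of filenames. This is not the code from my script: find_revisions() and the regular expression are made up for this example, and the filenames are hypothetical.

```python
import re

# Assumption: files are named migration_<digits>.py, e.g. migration_001.py.
MIGRATION_RE = re.compile(r"^migration_(\d+)\.py$")


def find_revisions(filenames):
    """Return the sorted revision numbers found in a list of filenames."""
    revisions = []
    for name in filenames:
        match = MIGRATION_RE.match(name)
        if match:
            revisions.append(int(match.group(1)))
    return sorted(revisions)


print(find_revisions(["__init__.py", "migration_002.py", "migration_001.py"]))
```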

The name of my Pylons app is "multicosmic", and the script is installed in my application. You'll need to change the name to match your app.

Start by creating directories and __init__.py files for "multicosmic/db" and "multicosmic/scripts".

First, there's a migrate script in "multicosmic/scripts/migrate.py":

#!/usr/bin/env python

"""This is a script to apply database migrations.

Run this script with the -h flag to get usage information.

Migration Modules
-----------------

Each migration is a module stored in
``${appname}/db/migration_${revision}.py`` where revision starts at 000
(i.e. an empty database). Each such module should have a module-level
global named migration containing a list of pairs of atoms. For
instance::

The up and down atoms may either be SQL strings, or they may be
functions that accept a SQLAlchemy connection.

Since I'm using SQLAlchemy, you might wonder why I'm writing actual SQL.
I like to use the SQLAlchemy ORM. However, when creating tables in
MySQL, there are so many fancy options that I find it easier to write
the SQL by hand.

Error Handling
--------------

* If something goes wrong when down migrating, just let the exception propagate.

* If something goes wrong when up migrating, complain, try to back it out, and then let the exception propagate. If backing it out fails, just let that exception propagate.

* Use transactions as appropriate. There are a lot of cases in MySQL where transactions aren't supported. Hence, backing things out is sometimes necessary. However, it's also possible that a transaction might rollback, and then the code to back things out runs anyway. It's best to make your down atoms idempotent. For instance, use "DROP TABLE IF EXISTS" rather than just "DROP TABLE".

I'm using SQLAlchemy, but that doesn't force you to use SQLAlchemy in
the rest of your app. I'm using Paste's configuration mechanism because
that's how my database configuration information is stored. Passing a
CONFIG.ini to the script meets the needs of Paste and Pylons users
perfectly. If you're not one of those users and you want to use my
script, it's easy to subclass it and do something differently.
Similarly, if you're not using Python 2.5, I'm happy to remove the
Python 2.5-isms. Let's talk!

"""

# Copyright: "Shannon -jj Behrens <jjinux@gmail.com>"
# License: I am contributing this code to the Pylons project under the same license as Pylons.

def find_desired_migrations(self):
    """Figure out which migrations need to be applied."""
    self.find_migration_range()
    self.desired_migrations = [
        self.migration_modules[i]
        for i in self.migration_range
    ]

def find_migration_range(self):
    """Figure out the range of the migrations that need to be applied."""
    if self.current_revision <= self.desired_revision:
        # Don't reapply the current revision. Do apply the
        # desired revision.
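The docstring above describes a migration as a list of (up, down) pairs of atoms and says that a failed upgrade gets backed out before the exception propagates. Here's a minimal sketch of that idea. It is not the code from my script: the users table, add_default_user(), run_atom(), and apply_up() are made up for this example, and it uses the standard library's sqlite3 instead of SQLAlchemy and MySQL so that it runs on its own.

```python
import sqlite3


def add_default_user(conn):
    # An atom may be a function that accepts a connection.
    conn.execute("INSERT INTO users (name) VALUES ('admin')")


# Hypothetical contents of a migration module: a list of (up, down) pairs.
migration = [
    ("CREATE TABLE users (name TEXT)", "DROP TABLE IF EXISTS users"),
    (add_default_user, "DELETE FROM users WHERE name = 'admin'"),
]


def run_atom(conn, atom):
    """Run one atom, which is either a SQL string or a callable."""
    if callable(atom):
        atom(conn)
    else:
        conn.execute(atom)


def apply_up(conn, migration):
    """Apply the up atoms; on failure, back out and re-raise."""
    applied = []
    try:
        for pair in migration:
            # Record the pair first so a half-finished step is backed
            # out too. This is why down atoms should be idempotent.
            applied.append(pair)
            run_atom(conn, pair[0])
    except Exception:
        for _up, down in reversed(applied):
            run_atom(conn, down)
        raise


conn = sqlite3.connect(":memory:")
apply_up(conn, migration)
print(conn.execute("SELECT name FROM users").fetchall())
```

If backing out fails, the second exception simply propagates, which matches the error handling rules in the docstring.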

Monday, August 06, 2007

I make heavy use of a nicely indented notes file and a TODO file. Until recently, I had never used an outline editor, even though my files were basically outlines. I saw my buddy, Alex Jacobson, using his outline editor, and I decided to try out the one for Vim. Within a couple hours, I was hooked!

Actually, there are several outline plugins for Vim, but I think that VimOutliner is the best.

It has nice syntax highlighting for the different levels.

It manages Vim's folding as you would expect.

It understands how to put a paragraph of text under a heading and how to automatically turn on line wrapping.

It supports checkboxes, and it's really smart about working with them.

It supports inter-document linking.

It has a nice menu, so you don't have to memorize the documentation before getting started.

Best of all, since it's a Vim plugin, it fits right in with my blazing-fast Vim editing skills.

Friday, August 03, 2007

For a long time, my goal has been to develop a higher-level, natively-compiled programming language, and then to develop a proof-of-concept kernel in it. Well, someone else beat me to the punch.

House is a proof of concept operating system written in Haskell. It has some simple graphics, a TCP/IP stack, etc. Naturally, it's just a research project, but achieving proof of concept was my goal too.

On that subject, I'm also keeping my eye on Microsoft's Singularity. It's a microkernel, and much of it is written in C#. Unlike most microkernels, the different components do not run in separate address spaces. The VM does protection in software, rather than hardware. I had been toying with this idea too, but my buddy Mike Cheponis informed me that VM/360 did it decades ago.

Is anyone other than me bummed that BeOS never took off? I'm sadly coming to the conclusion that Linux might not ever make it on the desktop. It's just not a priority. Too many great hackers use Linux on the server with Mac laptops. There's always hope that Haiku might recreate BeOS in a completely open source way, but it would have been a lot easier if Be had simply open sourced it in the first place.

In the meantime, SkyOS thinks that there is room for another easy-to-use, featureful, proprietary OS. Apple succeeded at this. BeOS failed. It's hard for me to get excited about a new proprietary OS. I'd sooner buy a Mac (although I still haven't been fully de-Stallmanized).

Well, now that House has shown that you can write a kernel in Haskell, I think I need a new goal. Maybe I'll go solve world hunger. I've heard there's a little squabble going on in the Middle East that could use some attention. Maybe I'll go write an entire operating system in modern assembly; oh wait, it's already been done ;)