Wednesday, May 17, 2006

In the foreword to the Wizard Book, Alan J. Perlis writes that "The programmer must seek both perfection of part and adequacy of collection." He goes on to say that

Every computer program is a model, hatched in the mind, of a real or mental process. These processes, arising from human experience or thought, are huge in number, intricate in detail, and at any time only partially understood. They are modeled to our permanent satisfaction rarely by our computer programs. Thus even though our programs are carefully handcrafted discrete collections of symbols, mosaics of interlocking functions, they continually evolve: we change them as our perception of the model deepens, enlarges, generalizes until the model ultimately attains a metastable place within still another model with which we struggle.

I think that this is a very nice description of what it's like to think like a programmer. Furthermore, it suggests the kind of role that blogging can play in the research process. Every time we post, we struggle to find a balance between getting it right and getting it written. If you put too little effort into a post, it comes across as lightweight, disposable. Too much effort and you eventually have something that you might as well send to a journal. The optimal blog post is timely enough to enter the flow of communication while the topic is still of interest, and substantive enough to travel. Most blog posts aren't optimal, of course, but that shouldn't stop us from trying.

So much for perfection of part. What about adequacy of collection? The advantage of having a research blog is that it serves as an archive of steps taken. Sometimes they seemed promising but went nowhere; sometimes an initial misstep turned out to be very productive. Over time, the blog as a whole becomes more focused, more refined, a better model for processes "arising from human experience or thought." That is to say that the process of blogging, much like programming, can be one of stepwise refinement.

Somewhere in Discovering, Root-Bernstein has an anecdote about a scientist who wrote the most significant research questions on a blackboard in the lab, so they would always be in front of people and could be constantly modified to reflect new understandings. Blogs can serve the same purpose, placing an evolving set of questions and models before the members of a virtual lab.

Thursday, May 11, 2006

Two interesting news items have been posted at Engadget recently about research being done at Kodak that may help to automate the digitization of historical photographs. First, new scanners have the ability to estimate the decade in which a print was made based on the paper, and may someday be able to recognize watermarks or handwriting on the back. Second, a 2004 patent makes use of the red-eye effect from flash photography to determine the subject's age.

Historians, curators and archivists already have a number of techniques for dating photographs (see Joe Nickell's Camera Clues for an accessible introduction to some of them). Digitization of any historical source, however, brings it into the realm of computation. I don't know exactly how the Kodak scanning software works, but unless it is doing some kind of physical analysis of the photographic paper itself (rather than the image of the paper) it should, in principle, work on a high-resolution TIFF scanned somewhere else. In other words, it might be possible to build a spider that sifts through online archives looking for photographic prints from the 1920s.
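
To make the spider idea a little more concrete, here is a minimal Python sketch of just the sifting step. Everything in it is an assumption for illustration: the Scan record, the prints_from_decade function and the lookup-table estimator are all made up, and the estimator stands in for whatever image analysis a real system (Kodak's or anyone else's) would actually perform. Crawling the archives and downloading the images are left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass(frozen=True)
class Scan:
    """A scanned print found in an online archive (hypothetical record type)."""
    url: str


def prints_from_decade(
    scans: Iterable[Scan],
    estimate_decade: Callable[[Scan], Optional[int]],
    decade: int,
) -> Iterator[Scan]:
    """Yield the scans whose estimated decade matches the one requested."""
    for scan in scans:
        # The estimator may return None when it cannot date a print,
        # in which case the scan is simply skipped.
        if estimate_decade(scan) == decade:
            yield scan


# Stand-in estimator: a lookup table of made-up results. A real estimator
# would analyze the pixels of the scanned image instead.
FAKE_ESTIMATES = {
    "https://example.org/a.tif": 1920,
    "https://example.org/b.tif": 1950,
}


def fake_estimate(scan: Scan) -> Optional[int]:
    return FAKE_ESTIMATES.get(scan.url)


scans = [Scan(url) for url in (
    "https://example.org/a.tif",
    "https://example.org/b.tif",
    "https://example.org/c.tif",
)]
matches = list(prints_from_decade(scans, fake_estimate, 1920))
print([scan.url for scan in matches])
```

Passing the estimator in as a function keeps the sifting logic separate from the dating technique, so a better estimator could be swapped in later without touching the rest.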

There is a lot of interest in biometrics right now, much of it geared toward present-day concerns with security and identity. The Kodak age-detection patent suggests, however, that we may see some spinoffs for historical research. To take another example, a research group at Georgia Tech is working on programs to recognize people from their gait. It doesn't seem far-fetched to imagine a system that uses biometric techniques to search through, say, old newsreel footage.
