During the question and answer section of the panel I recently spoke on at DCWeek 2012, one questioner asked the panel to describe an API that had “disappointed” us at some point. I replied: Twitter. Though he was angling for technical reasons – poor design, bad documentation, or insufficient abstraction – I had different reasons in mind.

Twitter’s Successful API

Twitter’s primary API is without a doubt a hallmark of good design. It is singularly focused on one particular task: broadcasting small messages from one account to all following accounts as close to real-time as possible. That extreme focus led to simplicity, and that simplicity means it is easy for developers to code and test their applications. Interactions with the API’s abstraction are straightforward and often self-documenting. Coupled with Twitter’s excellent documentation and sense of community, that meant developers in the early years were free to explore and experiment, leading to a plethora of interesting – and sometimes terrible – Twitter clients (including my own Java IRC bot JackBot).

Coincidentally, the explosion of smart phones, social networking, and always-on Internet connectivity meant Twitter’s raison d’être was also a means to explosive growth. The Fail Whale was an all-too-familiar sight during those early growing pains, but the same focus and simplicity that made it an easy API for developers to use also made it possible for Twitter to dramatically improve the implementation. Today, Twitter serves over 300 million messages daily – up several orders of magnitude from when I joined – yet our favorite marine mammal surfaces rarely.

Business Decisions

Twitter’s early business model is a familiar story. A cool idea formed the basis of a company, funded by venture capital and outside investment, with little thought given to how to turn a profit. Seeing itself in competition with the already-huge Facebook, the company made growing the user base its only real concern. For many years, Twitter continued to foster its community: in a symbiotic relationship with developers and users – who were often the same – Twitter expanded and modified the API, improved the implementation, and actively encouraged new developers to explore new and different ways of interacting with the Twitter systems. So important was this relationship that the term “tweet”, the concept of the re-tweet, and even Twitter’s trademarked blue bird logo all originated with third parties.

But the good times can’t roll forever; eventually the investors want a return, and the company began seeking a way to make money. Since Twitter saw itself as a social network, advertising was the obvious choice. But there was a problem: the company’s own policy and openness had made advertising difficult to implement. Here’s what I wrote in December 2009:

Twitter shows us the future of the Web. The user interface on Twitter’s home page is as technologically up-to-date as any of Google’s applications: it’s a full-on CSS-styled, HTML-structured, JavaScript-driven, AJAX-enhanced web application. And it looks just as lackluster as GMail or Google Calendar. But Twitter isn’t about HTML and CSS – it’s about data and the APIs to access and manipulate it.

More than 70% of users on Twitter post from third-party applications that aren’t controlled by Twitter. Some of those applications are other services – sites like TwitterFeed that syndicate information pulled from other places on the web (this blog, included). Others are robots like JackBot, my Java IRC bot which tweets the topics of conversation for a channel I frequent.

Advertisers purchase users’ attention, and if you can’t guarantee access to that attention, you can’t sell ads. But what third-party client is going to show ads on behalf of Twitter? Users – particularly the developers creating those third-party apps – don’t want to see ads if they can avoid it. And you won’t make much money selling ads to only 30% of your users (who are also likely the least savvy 30%). What’s a little blue birdie to do?

The chosen path was to limit – and perhaps eliminate entirely – third-party clients. The recent 100,000 limit on client tokens is an obvious technological step, and they are already completely cutting off access for some developers. Additionally, where technological restrictions are difficult, changes to the terms of service have placed legal restrictions on how clients may interact with the API, display tweets, and even in how they may interact with other similar services. (Twitter clients are not allowed to “intermingle” messages from Twitter’s API with messages from other services.) It seems likely that the screws will continue to tighten.

A Way Forward: Get On The Bus

Twitter has built the first ubiquitous, Internet-wide, scalable pub-sub messaging bus. Today that bus primarily carries human-language messages from person to person, but there are no technical limitations preventing its broader use. The system could be enhanced and expanded to provide additional features – security, reliability, burstiness, message volume, follower counts, to name just a few – and Twitter could then charge companies for access to those features. Industrial control and just-in-time manufacturing, stock quotes and financial data, and broadcast and media conglomerates would all benefit from a general-purpose, simple message exchange API.
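To make the idea concrete, here is a toy in-memory sketch of what such a general-purpose bus might look like to a developer. Every name here is invented for illustration; this bears no relation to Twitter’s real API or implementation.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/**
 * A toy in-memory pub-sub bus: named channels, broadcast delivery.
 * All names are hypothetical illustrations of the idea in the text.
 */
public class TinyBus {
    private final Map<String, List<Consumer<String>>> subscribers =
            new ConcurrentHashMap<>();

    /** Register a callback for all future messages on a channel. */
    public void subscribe(String channel, Consumer<String> callback) {
        subscribers.computeIfAbsent(
                channel, c -> new CopyOnWriteArrayList<>()).add(callback);
    }

    /** Broadcast a message to every current subscriber of a channel. */
    public void publish(String channel, String message) {
        for (Consumer<String> subscriber
                : subscribers.getOrDefault(channel, Collections.emptyList())) {
            subscriber.accept(message);
        }
    }
}
```

A real service would, of course, add durability, authentication, and delivery guarantees behind the same small surface – that small surface is the point.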

Such a generalized service would be far more useful to the world at large than just another mechanism for shoving ads in my face, and I would bet that the potential profits from becoming the de facto worldwide messaging bus would dwarf even the wildest projections for ad revenues. It wouldn’t be easy: highly available, super-scalable systems are fraught with difficulty – just ask Amazon – but Twitter is closer to it than anyone else, and their lead and mindshare would give them a huge network-effect advantage in the marketplace.

With this new model replacing the advertising money, third-party clients would no longer be an existential threat. Twitter could remove the pillow from the face of their ecosystem and breathe new life back into their slowly-suffocating community.

Will they take this path? I doubt it. The company’s actions in the past several months clearly telegraph their intentions. Twitter’s API teaches us an important lesson: no matter how well designed, documented, and supported a platform is, there will always be people behind it making business decisions. Those decisions can affect the usability of the API just as deeply as bad design, and often much more suddenly. Caveat programmer!

I had the opportunity to speak on a panel at DCWeek 2012 this past week: “Five Crucial APIs to Know About”. (I am not listed on the speakers page, as I was a rather last-minute addition.) Conversation ranged from what goes into making a good API – dogfooding, documentation, focus – to pitfalls to be aware of when building your business on an external API. It was a fun and informative discussion, and I walked away with plenty to chew on.

An API is all about two things: Abstraction and Interaction. It takes something messy, abstracts away some of the details, and then you, as a programmer, interact with that abstraction. That interaction causes the underlying code to do something (and hopefully makes your life easier). If you interact with it differently, you’ll get different results. Understanding an API, then, requires understanding both the abstraction and how you are meant to interact with it.

Now, DCWeek focuses primarily on the startup scene. As such, I expected that most of my fellow panelists would be focusing on web-exposed APIs. Sure enough, there was plenty of talk about Facebook, Twilio, Twitter, and a laundry list of other HTTP-accessible APIs. All of which are great! Note, though, that these APIs have one thing in common: they are all network-reliant. As such, they are built on a whole bunch of other APIs, but at the end of the day, they all route through one specific API (or a clone of it): Berkeley Sockets.

Why should you care about a 30-year-old API when you care about tweets and friends and phone calls? Stop for a moment and think about what those high-level APIs are built on: a network. Worse – the Internet. A series of tubes. Leaky, lossy, variable-bandwidth tubes. And it’s only getting worse – sometimes you’re on a high-bandwidth wifi connection; other times you’re on a crappy, intermittent cellular connection in a subway tunnel.

The user’s experience with a high-level network API is going to be directly impacted by socket options chosen several layers down – often just by default – and different experiences require different expectations from the network. Do you have a low-latency API that provides immediate user-interactive feedback in super-short bursts? Then you might want to learn about Nagle’s Algorithm and TCP_NODELAY. Does your app require a user to sit and stare at a throbber while you make a network call? You might want to consider adjusting your connection, send, and receive timeouts to provide more prompt feedback when the network fails.
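As a concrete illustration, here is a minimal Java sketch of tuning a socket for that first, latency-sensitive case. The five-second read timeout is an invented placeholder, not a recommendation from any particular API:

```java
import java.net.Socket;
import java.net.SocketException;

public class SocketTuning {
    /**
     * Tune a socket for a latency-sensitive, user-interactive protocol.
     * Call this before connect(); the timeout value is illustrative.
     */
    public static void tuneForInteractiveUse(Socket socket) throws SocketException {
        // Disable Nagle's algorithm (TCP_NODELAY) so small writes go out
        // immediately instead of being coalesced into larger segments.
        socket.setTcpNoDelay(true);
        // Fail fast on reads: surface a SocketTimeoutException after 5s
        // instead of letting the user stare at a throbber forever.
        socket.setSoTimeout(5000);
    }
}
```

You can bound connect time the same way, by passing a timeout to socket.connect(address, millis) rather than using the blocking constructor.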

And believe me: the network will fail. But how do you handle it? As programmers, we tend to focus on the so-called “happy path”, relegating failure handling to second-class status. Unfortunately, treating failure as unlikely is simply not acceptable in a world of ubiquitous networking and web services. Not all network failures are the same, and providing the best user experience requires understanding the differences between the various types of failures in the specific context of what you are attempting to accomplish.
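To make that distinction concrete, here is a hedged Java sketch of telling failure types apart. The class and messages are invented for illustration; the exception types are the standard java.net ones:

```java
import java.net.ConnectException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;

public class FailureKinds {
    /**
     * Map a low-level network failure to a user-facing message.
     * The messages are illustrative; the point is that each failure
     * type calls for a different response, not one generic "error".
     */
    public static String describe(Exception e) {
        if (e instanceof UnknownHostException) {
            // DNS failed: likely no connectivity at all. Retrying
            // immediately is pointless; wait for the network to return.
            return "No network connection";
        } else if (e instanceof ConnectException) {
            // The network is fine but the service refused us: the
            // service is probably down, not the user's link.
            return "Service unavailable, try again later";
        } else if (e instanceof SocketTimeoutException) {
            // Slow or lossy link (hello, subway tunnel): a retry
            // with backoff may well succeed.
            return "Network is slow, retrying";
        }
        return "Unexpected network error";
    }
}
```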

So take a moment and do some research. If you’re using a networked API that exposes network details, learn about them and tweak them for the specific task at hand. If you’re writing an API, consider how users will be accessing it, and provide them guidance on how to achieve the best possible experience over the network. The people using your apps will thank you.

One of the features in the Prism Webapp Bundle for Google Wave is a toaster pop-up notification of unread waves using the window.platform.showNotification() method. The third parameter is named aImageURI, and is described by the nsIPlatformGlue IDL as, “The URI of an image to use in alert. Can be null for no image.”

Which is great, except … what URI scheme and path does one use? Every example I could find always passed null for the image, so after giving up on the web I joined the Prism mailing list and posted a question. The first response was to use the inline data scheme, with a base64-encoded image. It was ugly, but it worked.
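For reference, the data scheme simply inlines the base64-encoded image bytes into the URI itself. The Prism call is JavaScript, but the construction is the same in any language; here is a small Java sketch (the helper name is mine, not part of any Prism API):

```java
import java.util.Base64;

public class DataUri {
    /**
     * Build an RFC 2397 data: URI that inlines a PNG as base64.
     * Hypothetical helper for illustration only.
     */
    public static String forPng(byte[] pngBytes) {
        return "data:image/png;base64,"
                + Base64.getEncoder().encodeToString(pngBytes);
    }
}
```

The resulting string can be passed anywhere a regular URI is accepted, at the cost of being roughly a third larger than the raw bytes – hence “ugly, but it worked”.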

I’ve created a webapp bundle that does just that. Unfortunately, such bundles at present only work with the stand-alone version of Prism. The Firefox add-on is really a better way to run Prism, but if you’re using it you’ll need to do a little manual mucking in your webapp profile to use this bundle.

Stand-Alone Bundle

So, if you just want the bundle, here you go. Note that I haven’t really tested it on the stand-alone version, so please let me know if something is broken.

Hack Your Webapp

As I said, if you’re using the add-on version, you’ll need to do a little manual hacking. After you create the webapp, as described in my earlier post, open up Explorer and navigate to your Prism webapp bundle cache. On Windows, this is in %APPDATA%\WebApps (something like C:\Users\Brian\AppData\Roaming\WebApps); on Linux, it is ~/.webapps. You should see your Google Wave webapp in that directory. Add the webapp.js script to that directory, and also add in images/google-wave-52x32.png. Now you should get a toaster pop-up and task bar notification when there are new waves.

It would be nice if Google were to add a <link rel="webapp"> to Wave, referencing an appropriate bundle. If anybody there sees this and cares to use my code as a crude starting point, I am releasing this code under an MIT license.

Windows Installer: As anyone who has done .msi development knows, you will never find a more wretched hive of scum and villainy.

Visual Studio 2005 valiantly tries to make things easier by offering “Setup and Deployment” projects. This thing magically binds together the build outputs of other projects and burps out a plausible .msi file. Hooray… wait a minute, something’s not quite right here. Yeah… it turns out that if you want anything but the barest minimum of shoving files and registry keys onto target machines, you’re going to have to do some post-processing, son. Fortunately, Microsoft provides a handy COM API for torturing the .msi SQL database until it agrees to do your bidding.

What? Oh, sorry…you didn’t know that an .msi was basically a demented relational database crammed into a file? Congratulations, now you can share my nightmares.

But I come here not to complain about the .msi file format, nor Visual Studio. The main course of today’s rant will be the installer engine itself, msiexec. Specifically Windows Installer 4, which led me on a merry chase today. I accidentally missed a dependency for one of my custom actions, and got the following lovely error:

The installer has encountered an unexpected error installing this package. This may indicate a problem with this package. The error code is 2869

I’ve been hacking on these beasties for a couple of years, so this was not my first dance with 2869. In fact, the internet is filled with stories about it. This isn’t actually an error about what went wrong during the install; it’s an error about [what went wrong when the installer was trying to tell you [what went wrong during the install] ]. This is what we call a masking error, meaning “Your installer is so broken, I can’t even tell you about it properly”. A detailed install log offers up:

DEBUG: Error 2869: The dialog ErrorDialog has the error style bit set, but is not an error dialog

Most forum threads about this error are from hapless end users trying to get their program download to work, and vendors supplying fixed versions. Everyone addresses the root cause – the error actually thrown first, which eventually leads to the 2869. Often it has to do with impersonation problems. Well and good, but I already knew what was wrong with my custom action. What I wanted, and couldn’t find anywhere, was someone who understood why the error reporting mechanism itself was failing. (Spoilers: I eventually found several right answers. It’s easier to find them in retrospect, once you know what’s wrong.)

What could be wrong with the ErrorDialog? This guy came right off the truck from Visual Studio, and my tweaks never touched it. Nevertheless, I spent about an hour poring over the documentation, trying to find any possible detail that was different between my .msi and the spec. It all checked out totally fine. But no matter what I tried, not an error dialog. Not an error dialog. My kingdom for an error dialog!

It’s such a pointed error, you see, and there are so many subtle requirements, I thought I must be missing something. Was it the phase of the moon or something about the feng shui orientation of my laptop? In casting about for a solution I happened upon this note. It talked about adding an entry to the Error table, which is advice I hadn’t seen before:

In order to see the actual error, open the MSI with ORCA and add the following entry to the “Error” table.

1001 | "Error[1]: [2]"

My logs never showed a 1001 error code, and a missing entry in the error table doesn’t have any relevance to the properties of the error dialog being correct. And yet, and yet… The page referred to 2869. With nothing to lose, I tried adding the entry. As if by magic, the error reporting immediately began working just fine. Total changes needed to the error dialog: zero. Total time wasted on this: one afternoon.

What happened? In this case not only is the 2869 masking the underlying error, but the windows installer engine itself was lying about the nature of the masking error, and as a side effect of the problem, hiding the real error code (1001) to boot! Why 2869 and not something like, “So listen, I see there’s no format string in the Error table for #1001… so regrettably I must now poop myself.”

I can totally imagine how it went down. The developer needs to implement a new error formatting behavior in version 4, but in the event that the .msi has broken error handling, he has to tell the user about it somehow. Adding a brand new error code would require changes up and down the source tree. It’s almost deadline already, and hey look: this error 2869 is pretty close, it’s about error dialogs not working. Surely anyone who gets that error will quickly understand what was meant. One line of code, and under the wire home free.

One of the new features in the BagIt Library will be multi-threading CPU-intensive bag processing operations, such as bag creation and verification. Modern processors are all multi-core, but because the current version of the BagIt Library is not utilizing those cores, bag operations take longer than they should. The new version of BIL should create and verify bags significantly faster than the old version. Of course, as we add CPUs, we shift the bottleneck to the hard disk and IO bus, but it’s an improvement nonetheless.
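The general shape of that speed-up is a simple fan-out/fan-in over a thread pool. The following Java sketch is my own illustration of the pattern, not the BagIt Library’s actual code; file contents are passed as byte arrays to keep the example self-contained:

```java
import java.security.MessageDigest;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelVerify {
    /**
     * Digest every payload file on a pool sized to the machine's cores.
     * Sketch of the fan-out/fan-in shape only.
     */
    public static Map<String, String> md5Digests(Map<String, byte[]> files)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        try {
            // Fan out: one hashing task per file.
            Map<String, Future<String>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, byte[]> entry : files.entrySet()) {
                byte[] contents = entry.getValue();
                pending.put(entry.getKey(), pool.submit(() -> {
                    // MessageDigest is not thread-safe, so each task
                    // creates its own instance rather than sharing one.
                    MessageDigest md5 = MessageDigest.getInstance("MD5");
                    StringBuilder hex = new StringBuilder();
                    for (byte b : md5.digest(contents)) {
                        hex.append(String.format("%02x", b));
                    }
                    return hex.toString();
                }));
            }
            // Fan in: collect results, blocking until each completes.
            Map<String, String> digests = new LinkedHashMap<>();
            for (Map.Entry<String, Future<String>> entry : pending.entrySet()) {
                digests.put(entry.getKey(), entry.getValue().get());
            }
            return digests;
        } finally {
            pool.shutdown();
        }
    }
}
```

Sizing the pool to availableProcessors() keeps every core busy without oversubscribing – until, as noted above, the disk becomes the bottleneck instead.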

Writing proper multi-threaded code is a tricky proposition, though. Threading is a notorious minefield of subtle errors and difficult-to-reproduce bugs. When we turned on multi-threading in our tests, we ran into some interesting issues with the Apache Commons VFS library we use to keep track of file locations. It turns out that VFS is not really designed to be thread-safe. Some recent list traffic seems to indicate that this might be fixed sometime in the future, but it’s certainly not the case now.

Now, we don’t want to lose VFS – it’s a huge boon. Its support for various serialization formats and virtual files makes modeling serialized and holey bags a lot easier. So we had to figure out how to make VFS work cleanly across multiple threads.

The FileSystemManager is the root of one’s access to the VFS API. It does a lot of caching internally, and the child objects coming from its methods often hold links back to each other via the FileSystemManager. If you can isolate a FileSystemManager object per-thread, then you should be good to go.
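One straightforward way to get that per-thread isolation is a ThreadLocal holding a lazily built manager. This generic sketch shows the pattern without depending on Commons VFS; for VFS you would supply a factory that constructs and init()s a new StandardFileSystemManager:

```java
import java.util.function.Supplier;

/**
 * Give each thread its own private instance of a non-thread-safe
 * resource (such as a VFS FileSystemManager). Sketch of the pattern,
 * kept generic so it stands alone.
 */
public class PerThread<T> {
    private final ThreadLocal<T> instance;

    public PerThread(Supplier<T> factory) {
        // Each thread that calls get() lazily builds its own copy;
        // instances are never shared, so no cross-thread cache races.
        this.instance = ThreadLocal.withInitial(factory);
    }

    public T get() {
        return instance.get();
    }
}
```

The cost is one manager (and one set of caches) per thread, which is usually a fine trade against the alternative of locking every VFS call.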

It turns out that ImageMagick is really quite good at reading, writing, re-arranging, and otherwise mucking with PDFs. Unfortunately, you need to know the proper incantation, which can take much trial and error to figure out. So, for my own future reference:

Split A PDF Into Parts

$ convert -quality 100 -density 300x300 multipage.pdf single%d.jpg

The quality parameter is the quality of the written JPEGs, and the density is the DPI (in this case, 300 DPI in both X and Y).

Join JPEG Parts Into A PDF

$ convert -adjoin file*.jpg doc.pdf

Rotate a PDF

$ convert -rotate 270 -density 300x300 -compress lzw in.pdf out.pdf

This assumes a TIFF-backed PDF. The density parameter is important because otherwise ImageMagick down-samples the image (for some reason). Adding in the compression option helps keep the overall size of the PDF smaller, with no loss in quality.

Now, if I can just figure out how to make future me remember to look here…

In the little free time that I have, I’ve been messing around with writing a .NET program to help with the large amount of photo metadata editing I want to do on all the photos I’ve been uploading to my Flickr photostream.

It’s been fun doing some .NET again. It’s easy to forget how nice a language C# is, especially with all of the fancy new features 2.0 brings. Last night, I came up with a fun little method of using iterators.

Here’s the scenario: I need to collect photo IDs for editing. As a simple solution, I have a very small dialog box that contains a text field, an “Add Another” button, a “Done” button, and a “Cancel” button. When the user clicks “Add Another,” I want to save the entered photo ID and re-display the dialog. If they click “Done,” the currently entered ID should be saved, and the dialog should vanish. Finally, if they click cancel, the current ID does not get saved, and the dialog disappears.

I have set up the DialogResult of the “Done” button to return DialogResult.OK, and “Add Another” to return DialogResult.Yes. Here’s the code in the UI controller responsible for coordinating this:
public IEnumerable GetPhotoIds()
{
AddPhotoDialog dialog = new AddPhotoDialog();

I recently had to do some mangling of a dump of my personal Subversion repository. Basically, I had to modify some paths and revision copy numbers before re-importing to a clean repository. However, the dump was in one huge 300 MiB file, making it really difficult to open for editing.

Normally, the solution would simply be to re-dump the repository using the --revision option to the svnadmin dump command. Unfortunately, in a flash of stupidity, I deleted my old repository before I had the new one working. So I wrote a little Perl script to split the dumpfile into separate files.
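The core of the idea is small enough to sketch. This is not the original Perl – a Java illustration instead, assuming the split points are the “Revision-number:” headers that svnadmin dump uses to delimit revisions (writing each chunk to its own file is left out for brevity):

```java
import java.util.ArrayList;
import java.util.List;

public class DumpSplitter {
    /**
     * Split the lines of a Subversion dump into chunks, starting a new
     * chunk at every "Revision-number:" header. The dump's leading
     * format header ends up in the first chunk, before revision 0.
     */
    public static List<List<String>> splitByRevision(List<String> lines) {
        List<List<String>> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String line : lines) {
            if (line.startsWith("Revision-number:") && !current.isEmpty()) {
                chunks.add(current); // close out the previous chunk
                current = new ArrayList<>();
            }
            current.add(line);
        }
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }
}
```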

Ah…short and sweet, just the way a Perl script should be. Normally, I would immediately delete any Perl that I might happen to write, as I think the language is too flexible to be properly maintained over any length of time. However, since I can’t find anything else like this out on Teh Intarweb, I figure I’ll leave it here for posterity.

Basically, direct links into Mantis would not work, since most of the Mantis pages redirect to the login page when a user has not yet authenticated. The fix was to modify the login page to detect basic authentication and redirect to the previously modified login script.