A little while ago I wrote about Cedar, a BDD-style testing framework for Objective-C. The responses I received nearly all went something along these lines: “That’s great! Too bad I can’t use it, since I’m writing an iPhone app.”

Hogwash.

I actually wrote Cedar specifically for testing iPhone OS projects we’re working on at Pivotal. To prove it, I’ve started a small public iPhone project that I’ve test-driven entirely with Cedar. You can get the project here (more on that in a bit); it should eventually allow you to log into Pivotal Tracker, see all the delivered stories in a given project, and accept or reject each one. At the moment it does little more than start up and display the Pivotal Chicken*, but it does contain Cedar specs that run on and off the device.

“How is this possible?” you ask. I’ve done two things to make this work:

I separated out all classes that don’t depend on UIKit into a target that builds a static library. The specs for this target run as a console app using the OS X runtime, so no need to worry about runtime support for blocks (assuming you’re running 10.6). Also no need to incur the overhead of starting the emulator every time you run tests. This is a pattern I started using ages ago to make automated testing easier on Win32 client applications, and it works great for all the mobile platforms I’ve worked on. Framework independence means faster tests, and faster tests mean happier programmers. I recommend doing this whether you’re interested in testing with Cedar or not.

Tests for the actual app, which does depend on UIKit and therefore must target the iPhone runtime, run on the emulator (or, in theory, a device) using the PLBlocks iPhone runtime for block support.

You’ll need to build Cedar, both the dynamic framework (Cedar.framework) and the static iPhone library (libCedar-iPhone.a), as well as the OCHamcrest and OCMock frameworks. Fix the references in the StoryAccepter project to point to these libraries on your system.

If you’re running Leopard you’ll need to install PLBlocks 1.0 for Leopard, and you’ll need to include the runtime and set the compiler for both spec targets; the PLBlocks page has excellent instructions. If you’re running Snow Leopard the project should already contain the runtime, so you’ll just need to download and install the compiler for PLBlocks 1.0.1.

Select the DomainSpec target, and make sure you’ve selected the appropriate Mac OS X runtime for your system. Build and run; you should see dots appear in the console window as the specs run.

Select the StoryAccepterSpec target, and make sure you’ve selected an iPhone runtime (if you want to try running on a device you’ll have to set up the provisioning, of course). Build and run; the emulator should start up and try to run the app, which will simply run the specs and then exit. You should see dots in the console window, as before.

All of this still has some rough spots, especially the UIKit-dependent specs, but even so I’ve found test driving with hierarchical describe blocks far more pleasant than using OCUnit. Some things I hope to improve on:

I imagine the app that runs the UIKit-dependent specs showing a graphical display of test results, perhaps similar to what GHUnit displays when run on the emulator or device.

The iPhone doesn’t allow dynamic libraries, and I haven’t found a way to use OCHamcrest or OCMock for UIKit-dependent specs. The folks at Carbon Five describe using these libraries in their tests on the device; I’m curious to know how they pull that off.

Once iPhone SDK 4.0 comes out with support for blocks this should all work without the need for the PLBlocks runtime. That won’t help iPad development for the foreseeable future, though.

As we’ve grown our mobile practice at Pivotal we’ve tried to apply to it the same principles and disciplines that have made our Rails practice successful. Often the one that we have the most difficulty translating is testing. In my experience the testing tools for Objective-C in particular are significantly wanting; there are some out there, but they’re hard to find, often hard to use, and occasionally defective in frustrating ways.

One of the things I miss most when testing Objective-C, Java, or C++ is the hierarchical structure for organizing tests that frameworks like RSpec or Jasmine provide. I find nested describes indispensable for managing orthogonal aspects of the classes under test, for handling preconditions, for eliminating redundant setup code, and for generally keeping my sanity. So, when I first heard about the addition of blocks to the GCC compiler for Objective-C, the first application that came to mind was testing.
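
For the unfamiliar, here’s roughly what that structure looks like in RSpec (the class and examples here are invented for illustration):

describe BankAccount do
  describe "#withdraw" do
    before(:each) { @account = BankAccount.new(:balance => 100) }

    describe "when the balance is sufficient" do
      it "debits the account" do
        @account.withdraw(25)
        @account.balance.should == 75
      end
    end

    describe "when the balance is insufficient" do
      it "refuses the withdrawal" do
        lambda { @account.withdraw(500) }.should raise_error(InsufficientFunds)
      end
    end
  end
end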

So, I wrote Cedar, a BDD-style framework for writing tests in Objective-C. The code is available here. Perhaps more importantly, Cedar is in its infancy so I’m interested in any suggestions and feedback. To that end, I created a public Tracker project for it here.

Unlike OCUnit, Cedar doesn’t magically run as part of the build. You have to create an executable target for your specs and run it. I did this because, among other reasons, I find looking through the build output for test logging and the like to be cumbersome. This may or may not have been a good choice.

Yes, those are C preprocessor macros surrounding the specs. Before you get out the torches and pitchforks, keep in mind that Objective-C, unlike Ruby or JavaScript, is a compiled language. This means that all imperative code must live in a function or method of some kind. To keep boilerplate from distracting from the specs themselves, I wrapped as much of it as possible in these macros. When expanded, the code looks like this:

Cedar has no matchers, other than the fail() method. Rather than reinvent the wheel I decided to support using the matchers from the Hamcrest library, available here. Note that you can only get the Objective-C port of Hamcrest by checking out the code from Subversion and building it yourself. I considered committing a pre-built version of the Hamcrest framework into the Cedar repository, but I’m not sure what the accepted approach is for including dependencies like that in Objective-C projects. Feedback welcome.

All of this obviously depends upon the support for blocks provided by the GCC compiler for Objective-C. Unfortunately, this means you can only use Cedar on a Mac. Far more unfortunately, it means you have to build your specs for a runtime that supports blocks, which at the moment means only the Mac OS X 10.6 runtime: neither the iPhone OS runtime (although 4.0 may) nor the Mac OS X 10.5 runtime supports blocks. However, all is not lost. Plausible Labs provides patched versions of the GCC compiler and runtimes for iPhone OS and Mac OS X 10.5. I built much of Cedar on a Leopard machine with the PLBlocks compiler; I haven’t tried building for iPhone OS yet, so I look forward to hearing about any experiences.

Don’t try to mix blocks with Objective-C++, at least not yet. I tried it for some time and ran into any number of internal compiler errors. Hopefully this will improve in the future. As some will astutely point out, I could have used the anonymous functions introduced by C++0x (and supported by GCC). Unfortunately (from Wikipedia):

If a closure object containing references to local variables is invoked after the innermost block scope of its creation, the behaviour is undefined.

There’s no need to provide a header file for your specs, since Cedar finds the specs by introspection.

As I’m sure will soon become entirely obvious, this is very much a minimum viable product for Cedar. You can create and nest describe blocks, create examples and beforeEach blocks, and that’s about it. I’m curious to see if people will use something like this; if they do, I’m hoping for plenty of feedback. I’m attached to basically nothing about the framework at the moment (including the name), so please send me a note or join the Tracker project if there’s something you’d like to see added, removed, or changed.

For the sake of examples consider the following class hierarchy: Athlete, Footballer, and Defender.

Implementing this with pseudo-classical inheritance would look something like this, assuming the implementation of Object.extend() from here (I’ve had bad luck with the __super attribute in the past, but let’s assume it works for the moment).

This creates relationships between constructor functions and prototypes that look something like this (if you’ll excuse the ASCII art):

This should look familiar; it’s exactly the pattern that Ruby uses for basic inheritance, with the prototype objects representing instances of Ruby’s Class object, and prototype attributes representing Ruby’s Object#class (horizontal) or Class#superclass (vertical) attributes. The lookup rules work the same way as in Ruby, too: start with methods on the instance (in Ruby these would be methods on the instance’s singleton class); if not found, look for a method on the instance’s class; if not found, look on that class’s superclass; repeat until found or you run out of classes/prototypes.
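
A quick Ruby illustration of that lookup order, using the same hierarchy (the methods are invented for the example):

class Athlete
  def train; "trains hard"; end
  def play;  "plays a sport"; end
end

class Footballer < Athlete
  def play; "kicks the ball"; end
end

defender = Footballer.new
def defender.play; "parks the bus"; end   # lives on defender's singleton class

defender.play        # => "parks the bus"  (found on the singleton class first)
defender.train       # => "trains hard"    (not on the singleton or Footballer;
                     #                      found on the superclass, Athlete)
Footballer.new.play  # => "kicks the ball" (found on the class)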

This looks comfortable and familiar, but unfortunately JavaScript and Ruby have a fundamental difference: Ruby is a class-based language and JavaScript is not. Object instances have instance-specific state — instance variables — which the instance methods use. However, while the instance variables are attributes of the instance, the instance methods live on the class/prototype object. This isn’t a problem in Ruby, because the interpreter knows what’s going on and gives the methods defined on the class access to the instance variables defined on the instance. JavaScript has no such capacity, so all instance variables in these pseudo-classes must be public attributes of the instance. This attempt to create object-oriented behavior by approximating classes actually ends up precluding encapsulation, one of the fundamental aspects of object-orientation.

A different approach to creating our classes could look like this (with a nod to Douglas Crockford). There are a few things I prefer about this approach:

Privates are private. Implementation methods such as #shoot and #feignInjury are hidden from the outside world. The same can be done for instance variables.

Object definitions have a clearly defined structure: instance variables at the top of the function definition, ending with the declaration and definition of self; public instance methods next, ending with the return of self; private instance methods at the end, where they belong.

Object definitions are nicely contained. In the first example the methods are defined at the top level, outside the constructor definition. In this second example everything that defines the class is nicely contained within a single function scope.

No dependence on, or pollution of the global namespace with, a method such as #extend.

I’ve heard a couple of concerns brought up about this style:

Methods are, necessarily, defined on the object rather than the prototype, which can lead to duplication and inefficiency. This is a fair point, but I have yet to see it cause a problem. I’d be interested to see actual numbers that show how much of a performance hit this causes, given a certain number of object creations. In the meantime, I haven’t noticed a performance problem with code written this way, so I’m inclined to prefer encapsulation over theoretical performance issues.

Objects defined this way do not properly set their constructor attribute upon creation. Some test methods (notably Jasmine’s any method) depend on this.

Closures can be hard to grok.

We’ve recently finished up a reasonably sized project largely using this functional style of object definition, and it worked quite well. I’m sure there’s some reason that it shouldn’t have that I’ve overlooked; I’m curious to hear what that is.

One of our clients recently presented a demo of a new web application we built for them to a potential customer. This customer happens to be a large corporation that enforces, as large corporations will, the use of IE6 on all of its corporate machines. When we found this out we feared the worst: many tortuous hours of work twisting the fabric of good sense and CSS standards to eke a sensibly rendered site out of the maw of the IE6 beast.

Imagine our relief and surprise when the head of corporate IT for the potential customer not only decided that the application was essential for the organization, but that to support it they would eliminate the use of IE6 on their corporate machines.

I recently had an opportunity to work on a relatively high-profile project with a crazy timeline. A coworker and I spoke with the client for the first time around lunchtime on a Friday, and the client needed a completed website, complete with a relatively sophisticated design (which had, at that moment, not yet been delivered by the designer) and a relatively sophisticated data model, by mid-day the coming Monday. That’s 72 hours to build and skin a fully functioning website.

I realize that “fully functioning” doesn’t tell you much about the scope of the project, but we live in a world of non-disclosure agreements. Suffice to say, it was a non-trivial amount of work.

Keeping in mind that the two of us had already worked 36 hours of our 40-hour work week on other projects, we agreed to take on the project. Between the two of us we worked about 60 hours that weekend, most of it solo, and had the site finished at start of business Monday morning. This might not seem particularly heroic (particularly to anyone who writes software for the game industry), but keep in mind that at Pivotal we believe strongly in the concept of sustainable pace; we really do work eight-hour days, five days a week. People working late at Pivotal is relatively unusual; people working weekends is almost unheard of.

The thing that struck me about this project was that I enjoyed it. I had a little rush of excitement when I agreed to give up my weekend and work late into the night as necessary. Even as I was doing it, bleary-eyed and mentally dull, I felt the high of accomplishment.

I spent the next several days trying to get my sleep schedule back to normal, and feeling generally tired and worn out. At the same time the two of us on the project spent a fair bit of our time, unsurprisingly, regretting some of the decisions we had made while working late into the night. Also during that week I came to something of a startling realization: I’m an addict.

I knew when we agreed to work on the project that the amount of sleep I would lose would make me miserable. I also knew that putting in a bunch of time over the weekend would mean starting work the next week without a day off, without time to do the non-work things I need to get done. More directly, while I was working on the project when I hadn’t taken a break all day and hadn’t slept the night before I knew my focus was poor, I knew my decision-making skills were compromised, and I knew I was prone to cutting corners I shouldn’t have cut; I consciously thought these things to myself at the time. And yet, I continued to work, without taking a break.

I don’t claim to be an expert on the psychology of addiction, but I do know that I’ve met a lot of people with drug and alcohol problems in the past. Lots of them talk about how they know that drinking/shooting up/huffing/whatever is hurting them, and hurting their friends and family, but they do it anyway. Because they’re addicted.

How many programmers work past the point of fatigue at which they can continue to make effective design decisions, to clearly think through scenarios, or to correctly differentiate between a reasonable trade-off and a regrettable hack? How many do so when they know their judgement is compromised, because they’re getting their fix of problem solving?

Proponents of “flow,” like Joel Spolsky (whose credentials for “writing great software” appear to be a summer internship at Microsoft and a silver tongue) will tell you this is a good thing; starting the flow of a great programmer takes so much effort you want it to go on for as long as possible. I’m not convinced; I know when I’m sharp and when I’m not, and I don’t even trust myself to stop the magic when I’m not doing my best work. And, this is a bit like an alcoholic explaining the health benefits of vodka.

Addicts most commonly manage their addictions by talking with other addicts. Maybe that’s an as yet unheralded benefit of pair programming: someone to cut you off when you’ve had one too many.

Nested attribute assignment is one of the recent additions to Rails that made a great deal of sense, and made a lot of people happy. Chances are you’ve either used nested attribute assignment by now, or you worked on an older project that really could have used it. If you haven’t yet, check it out and see what you think.

Unfortunately, not all is well in Railstown. Nested attribute assignment is slick, and the related implementation of #fields_for makes it even slicker, but #fields_for can cause you some headaches if you’re not careful. Possibly if you are careful as well.

Consider a standard example of where you might want nested attribute assignment:
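
A minimal sketch of what I mean (model and markup pared way down): a crazy cat lady and her many cats, edited in a single form.

class CatLady < ActiveRecord::Base
  has_many :cats
  accepts_nested_attributes_for :cats
end

And the edit form, with a table row per cat:

<% form_for @cat_lady do |form| %>
  <table>
    <tbody>
      <% form.fields_for :cats do |cat_form| %>
        <tr>
          <td><%= cat_form.text_field :name %></td>
        </tr>
      <% end %>
    </tbody>
  </table>
<% end %>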

This seems fine, but when you look in more detail you discover that #fields_for will emit a hidden <input /> element for each cat associated to our cat lady. It does this at the point where you make the #fields_for call, much like the way #form_for emits the <form /> element. Unfortunately, that means that #fields_for emits the <input /> element as a sibling of the <tr /> element for the related cat; and thus, as a direct child of the <tbody /> element. Oops. The HTML standard doesn’t allow <tbody /> elements to have <input /> elements as children.

Most browsers won’t complain about this, but Safari 4 will (and so, I’d guess, will any other WebKit-based browser, like Chrome). Safari not only complains, it helpfully moves the <input /> element to a valid position. So, instead of this (you’ll have to imagine a bit; the markdown renderer for this blog is actually modifying my invalid HTML example to try to make it valid):
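
<form action="/cat_ladies/1" method="post">
  <table>
    <tbody>
      <!-- sketch; ids, names, and values approximate -->
      <input name="cat_lady[cats_attributes][0][id]" type="hidden" value="1" />
      <tr>
        <td><input name="cat_lady[cats_attributes][0][name]" type="text" /></td>
      </tr>
    </tbody>
  </table>
</form>

you get something like this, with the hidden <input /> hoisted out of the table entirely:

<form action="/cat_ladies/1" method="post">
  <input name="cat_lady[cats_attributes][0][id]" type="hidden" value="1" />
  <table>
    <tbody>
      <tr>
        <td><input name="cat_lady[cats_attributes][0][name]" type="text" /></td>
      </tr>
    </tbody>
  </table>
</form>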

Seems innocuous enough, doesn’t it? After all, it’s still inside the form, so the browser will still submit the value along with everything else. However, the HTML that you sent to the browser is still invalid, and Safari still spits out the errors, which is probably not the best way to gain your users’ confidence. Also, any JavaScript you’ve written that depends on the DOM structure you lay out might fail, but only in some browsers (and not, for a change, only in IE!).

Now, someone will point out that you can solve this problem by not using tables. True, but that solution has two drawbacks: first, it’s entirely reasonable, even potentially very desirable, to use a table for this type of data; second, the hidden ID input will end up outside whatever container element you create for your nested model. This may not generate invalid HTML, but it may generate conceptually improper HTML. For instance, what if we change the above HTML to look like this:
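
<form action="/cat_ladies/1" method="post">
  <!-- sketch: the hidden ID lands outside the cat's own container -->
  <input name="cat_lady[cats_attributes][0][id]" type="hidden" value="1" />
  <div class="cat">
    <input name="cat_lady[cats_attributes][0][name]" type="text" />
  </div>
</form>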

It doesn’t take too much imagination in the drag-and-drop Web 2.1 world to come up with some form of DOM manipulation that will dissociate the cat div from its associated ID element. And, of course, if the server receives the nested cat attributes without an ID it will helpfully make a new cat model. We don’t want this; crazy cat lady has enough cats already.

So, what to do?

We knocked around some ideas, and the most reasonable seems to be to add the capability to manually insert the hidden ID field (and, potentially, the hidden _destroy field) to the form builder object created by #fields_for. So, the #fields_for block from the edit form above would look something like this:
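
<%# the option name here is hypothetical; the point is suppressing the automatic emission %>
<% form.fields_for :cats, :hidden_fields => false do |cat_form| %>
  <tr>
    <td>
      <%= cat_form.hidden_fields %>
      <%= cat_form.text_field :name %>
    </td>
  </tr>
<% end %>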

It’s also possible to automatically determine if the block for each nested model called the #hidden_fields method, which would obviate the need for the explicit option; I haven’t decided if I like that approach.

I’m open to suggestions for better fixes, or tweaks to this one. In any case, look for a Rails patch for this some time in the coming week.

One of my favorite computer games when I was growing up was Robot Odyssey; I imagine it will come as a surprise to no one that I was a nerdy kid. This article is a little bit of a tribute to that game, and the coolness of solving complex problems with a handful of simple concepts combined in clever ways.

Imagine you want to write a web-based game that involves robots. The robots in your game are a bit like the robots in Robot Odyssey: you program them with a list of simple instructions and when you turn them on they follow those instructions faithfully. Let’s say, for the sake of argument, that your robots can Walk Forward, Turn Left, Turn Right, Jump, and Beep.

You model each of the moves your Robot can make as an Action, and a Program as a list of Actions. Since each Robot can execute any number of Actions as part of its Program, and any number of Robots can execute Actions, you join Programs and Actions through the Instructions table. Like so:
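
# A sketch of the relevant models:
class Program < ActiveRecord::Base
  has_many :instructions
  has_many :actions, :through => :instructions
end

class Action < ActiveRecord::Base
  has_many :instructions
  has_many :programs, :through => :instructions
end

class Instruction < ActiveRecord::Base
  belongs_to :program
  belongs_to :action
end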

You build a whiz-bang interface for programming each Robot using JavaScript, which will PUT a collection of Action IDs to programs#update (each Robot creates a blank program on initialization, so no need for programs#create).

This seems pretty straightforward, doesn’t it? Unfortunately, it won’t work; the update will fail to reliably record the uploaded Actions. Here are some examples:
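
# given a program whose action_ids are currently [5]:
program.action_ids = [5, 5, 5]
program.action_ids   # => [5] -- the duplicates were silently dropped

# but given a program whose action_ids are currently [1, 4]:
program.action_ids = [5, 5, 5]
program.action_ids   # => [5, 5, 5] -- works, since 5 wasn't already there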

What happened here? The problem lies with the way ActiveRecord collection associations update themselves. Here is the interesting code (slightly paraphrased) from association_collection.rb:

# Replace this collection with +other_array+
# This will perform a diff and delete/add only records that have changed.
def replace(other_array)
  ...
  delete(@target.select { |v| !other_array.include?(v) })
  concat(other_array.select { |v| !@target.include?(v) })
end

As you can see (or you could just read the comment), this only adds an element to a collection if that element isn’t already in the collection (and it acts similarly for deletion). Sadly, it doesn’t take into account how many times an element appears. So, above, when I tried to set the action_ids to [5, 5, 5] it saw that the action_ids collection already contained that ID and moved on. If I had set the action_ids collection to [5, 5, 5] when it already contained [1, 4], the result would have been the expected [5, 5, 5].

Now, I’m on the fence with regard to whether I consider this a bug or just a somewhat inconsistent, but expected, behavior. To begin with, the fix is annoyingly nontrivial, and would potentially have a noticeable performance impact. Far more importantly, I’m not sure how often this might reasonably cause a real problem. In the case of your Robots game, you’d probably care quite a lot about what order your Robot executes its Actions in, so you’d likely have a position column, or something similar, on the instructions table. Given that, you’d probably update the Program by sending in a list of nested attributes which would create Instructions, each with the correct position and associated to the correct Action.

Even so, this is behavior worth knowing about. In the infinitude of possible update scenarios someone will want to update their has_many :through associations this way. I know I initially wrote code to do this as a temporary experiment, and ended up spending the rest of the day trying to figure out why my updates did the wrong thing a small percentage of the time.

A few days ago I finally discovered why rake db:migrate:redo consistently angers me nearly as much as watching Paula Deen deep fry the vegetable kingdom. As any devoted connoisseur of the db rake tasks in Rails knows, db:migrate:redo always leaves your schema.rb file in the wrong state. The reason, as mentioned in our standup blog, is that rake will only invoke a given task once in a particular run.

To trivially test this try running a single task twice:

rake db:rollback db:rollback

You’ll find that your database only rolls back one migration. Now, you can set the STEP environment variable when calling db:rollback, but this is, as I said, a trivial example. It gets worse.

Take a look at the implementation of the db:migrate:redo task. The part we’re interested in looks like this:

namespace :migrate do
  task :redo => :environment do
    ...
    Rake::Task["db:rollback"].invoke
    Rake::Task["db:migrate"].invoke
  end
end

That looks fine; db:migrate:redo just verifies that your new migration will properly run down and up without blowing up. Sweet.

Both db:migrate and db:rollback dump the schema after they run, as they should. If you were to migrate or roll back your database and not dump the schema, your schema would be in an invalid state. So, of course, you can see where this is going: when you run db:migrate:redo, the task performs the rollback, dumps the schema, performs the migrate, and then doesn’t dump the schema, because that task has already run. Boom: your schema is one migration behind, db:test:prepare loads the invalid schema into your test database, and all your tests fail (or, worse, pass inappropriately).

Now, I assumed this was a bug in Rake, and so I went on a little investigatory safari through the jungles of the Rake code to find it and kill it. I found the culprit, but invoking each task at most one time is, somewhat surprisingly, the expected behavior; it’s tested and everything. Now I can only wonder why. Why prevent invocation of a task more than once in a given rake run? The code contains unrelated guards against circular task dependencies, so that’s not it. Is this an example of overly-speculative defensive coding, or is there an actual use case for which this behavior is desirable? I’d like to hear from anyone who has written tasks that depend on this behavior, as well as anyone who (like me) considers this behavior unexpected and has run into problems because of it.

Assuming no one steps forward with a compelling reason that Rake should behave this way, I’d suggest that this be changed. I could see the value of it (perhaps as a performance optimization?) if rake tasks were guaranteed to not change the state of anything they operate on, or even were guaranteed to be idempotent; but neither is the case. This behavior severely limits the composability of tasks, since a task writer has to know which atomic tasks have run, and avoid any task that might try to run them again.

In the meantime, Rake provides a way to explicitly re-enable tasks that have run once, but it doesn’t seem to work. The db:schema:dump definition looks like this:

namespace :schema do
  task :dump => :environment do
    # Do dumpy stuff
    Rake::Task["db:schema:dump"].reenable
  end
end

That #reenable call is meant to tell the task “hey, task, you can run again.” I tried calling #reenable on the db:schema:dump task inside the db:migrate and db:rollback tasks as well, but without any luck.

This is a bug in Rails that quite likely affects you, but which you’ve even more likely never experienced. I’ve posted it here for the benefit of the small number of people who will run into this problem and turn to Google for help.

In short, if you use Mongrel app servers (this may affect Passenger as well, I don’t know), the first HTTP request to your Rails app after you restart your servers, or otherwise reload your environment, will have an empty HTTP body.

I say you’ve likely never experienced this because the majority of HTTP requests to your Rails app are likely GET requests, which always have empty HTTP bodies. After that first request everything will work just fine. Even if you’re unlucky enough to receive a POST or a PUT request containing a body immediately after restart, it will only fail once, which you could easily write off as an anomaly. You also won’t see this behavior in your development environment, or any environment in which you use Mongrel as a web server rather than just an app server.

If you’re interested in a patch for the bug, I’ve submitted one to Rails here.

The source of the problem lies in how ActionController initializes itself. In the actionpack gem you’ll find the lib/action_controller/cgi_ext.rb file, which does little more than load the three files in the cgi_ext directory:
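
require 'action_controller/cgi_ext/stdinput'
require 'action_controller/cgi_ext/query_extension'
require 'action_controller/cgi_ext/cookie'

The one that matters here is query_extension.rb, which replaces the CGI library’s initialize_query method with a version that leaves parameter parsing to Rails. Heavily paraphrased, the Rails override looks something like this:

# Rails' version (paraphrased): neuters CGI parameter parsing
def initialize_query
  @cookies = CGI::Cookie.parse(env_table['HTTP_COOKIE'] || env_table['COOKIE'])
  @params = {}
end

Compare that to the method it replaces in Ruby’s standard CGI library (also paraphrased):

def initialize_query
  @params = CGI.parse(
    case env_table['REQUEST_METHOD']
    when 'POST'
      stdinput.read(Integer(env_table['CONTENT_LENGTH'])) or ''  # <=====
    # ... GET, HEAD, and friends read from the query string instead ...
    end
  )
end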

The interesting line is the one I’ve marked with a comment rocket. Notice how it reads from stdinput; this leaves the read pointer at the end of the input stream. Now look back at the Rails override for this method, and notice how it does not read from stdinput, thus leaving the read pointer at the start of the input stream.

This is all fine and dandy as long as all of the ActionController code loads up and patches the CGI library properly. However, ActionController doesn’t load the cgi_ext.rb file (or its dependencies) until it references either the CgiRequest or CGIHandler class (both defined in cgi_process.rb, which requires cgi_ext.rb) as part of the first request, which is after the default Ruby CGI library has read the input stream containing the request body. ActionController then tries to read the request body assuming the read pointer is at the start of the stream. Oops. Subsequent requests work fine, because everything has now been loaded.

Finding the source of this bug took some doing (Chris Heisterkamp, aka “The Hammer,” and I tracked it down together), but the fix is easy. If you look at the patch you’ll see it’s simply a single require in action_controller.rb. You can achieve the same result by requiring ‘action_controller/cgi_ext’ in an initializer file in your app.

Like many problems, this one should go away in Rails 3. Rails has deprecated use of the CGI library, and the CGI extensions have already been removed from the Rails master branch. However, it’s a real problem now, and will remain so for at least some amount of time.

Imagine, if you will, that you’re a bookseller. You sell books. Big books, small books, serious books, silly books; if it’s got pages and a cover you’ll sell it. Times being what they are you’ve decided to harness the power of the intertubes to sell your books (a novel idea; ho ho ho). In fact, you’ve decided to build a website, and to expose an API with which your business partners can sell books through their websites. Huzzah.

As it turns out you’re an accomplished Rails developer as well as a thriving bibliophile, so you get to work. Fortunately, you thought ahead and already have information for all of your books in a database. Being well read as you are, you choose to make a RESTful Books resource to show off your books. Any customer can check out a book of their choosing by navigating their browser thusly:

http://www.amazzahn.com/books/1

Huzzah again. Sort of.

You’ve heard about this SEO thing, and you hate how ugly that URL is, so you override #to_param on your Book model to return a nice looking slug. Now that URL from above looks like this:

http://www.amazzahn.com/books/stickwick-stapers
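
The override itself is a one-liner; a sketch, assuming the slug lives in a url_slug column:

class Book < ActiveRecord::Base
  def to_param
    url_slug  # e.g. "stickwick-stapers", instead of the default id.to_s
  end
end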

You go about your business, quite pleased with yourself, until you receive a phone call from one of the business partners who use your API; it seems they can no longer look at books through your web service.

Here’s the problem: they’re using ActiveResource to consume your RESTful interface. To get a catalog of books they call Book#find(:all), which executes a books#index request. This returns some XML looking like this:
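
<books type="array">
  <book>
    <!-- element names approximate -->
    <id type="integer">1</id>
    <title>Stickwick Stapers</title>
    <author>Farles Wickens</author>
  </book>
  <!-- ... -->
</books>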

Now, if they’re interested in Stickwick Stapers by Farles Wickens they call Book#find(1), which returns a 404 error. Oops; of course it does, you’re not looking up books by their database ID any more, you’re looking them up by their URL slug. Your customer needs to call Book#find('stickwick-stapers').

Unfortunately, your book XML doesn’t include the URL slug, so your partners are in a bind. Back to work. You change the #to_xml method for your Book model to return something that looks like this:
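
<book>
  <id>stickwick-stapers</id>
  <title>Stickwick Stapers</title>
  <author>Farles Wickens</author>
</book>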

After all, the consumers of your API aren’t really interested in the database ID; or, they shouldn’t be. All is well again, until you get another phone call. It seems now your partners can no longer purchase books through your service.

You’ve exposed the Purchases resource for your partners who want to buy books. A purchase involves simply POSTing to this resource with the ID of the book you want to buy and a quantity (you handle payment offline using a complicated barter system). The POST body looks like this:
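
<purchase>
  <book_id>stickwick-stapers</book_id>
  <quantity type="integer">1</quantity>
</purchase>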

OOPS! ActiveRecord doesn’t expect the URL slug for the book; it wants the database ID.

Well, crap. This is a big problem, and one that has no particularly satisfying solution. Here are the candidates:

Send both the database ID and the URL slug in the API, and try to educate all of your API consumers about when to use one vs. the other. Get ready for some serious customer support time.

Override the #book_id= method in the Purchase model to expect a URL slug for the book. Unfortunately, the web site you developed, at great expense, has all sorts of drop-downs and the like stuffed chock full of book IDs. Changing all of that would be a significant expense, never mind the bugs guaranteed to creep in as developers consistently forget that #book_id= doesn’t actually take an ID.

Write a #book_slug= method on the Purchase model, and ask your API users to start using this method instead. Unfortunately, this means changing the web sites that they have developed, at great expense. You just cost them money, never mind the bugs guaranteed to creep in as developers consistently forget that the method to set the book ID is #book_slug=, not #book_id=.

Stop using those silly slugs and just go back to database IDs. Integers are really quite beautiful, aren’t they?

This little ditty is just an example of a fairly serious problem with Rails:

Sometimes we reference domain objects by their database ID (when creating associations), sometimes we reference domain objects by their URL representation (when finding objects in a controller), but in both cases we call the reference that we use the ID.

ActiveResource is an obvious example of the problem. It expects that the XML it receives for an object will have the <id> attribute, and it uses this attribute to build the URL for that object.

ActiveRecord, whether by intention or not, further enforces this fallacy with the unfortunate convenience that the default implementation of #to_param is simply id.to_s, and that #find_by_id will accept an integer, or a string, or even a string that starts with an integer[1]. So, oftentimes when a project chooses to start using something other than database IDs for URLs the code has a confusing mishmash of methods that use the two interchangeably. Have fun picking that apart.

So, what to do about it? The Rails conventions are largely set in stone, after all, it’s not likely the names of these references will ever change. But, we can be smarter about how we use them:

Stop using #find_by_id in controllers. After all, you’re more than likely not looking for anything there by the database ID. I like the find_by_param plugin as a nice little helper for this. It gives you the #find_by_param and #find_by_param! methods, which you should use in your controllers (there’s a short sketch just after this list). It also gives you methods for easily creating URL slugs, but you don’t need to use those until you want them.

Stop writing broken tests. Every time you pass an ID to a routing parameter in a functional test you’re testing a lie. Your tests will pass with the default ActiveRecord behavior, but if you ever decide to override #to_param (most likely after you’ve written about 700 tests like this), they’ll break. My experience dealing with just this problem on client projects was no small part of the reason I wrote the Wapcaplet plugin and this Rails patch.

Know what you mean and say what you mean. The fact that Rails got this wrong just means that you have to pay closer attention when referencing anything by ID.
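
Here’s that sketch for the first point, assuming the plugin’s finder takes the param value and raises when nothing matches (as the bang suggests):

class BooksController < ApplicationController
  def show
    # params[:id] carries the URL slug here, not a database ID
    @book = Book.find_by_param!(params[:id])
  end
end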

Let me know if you come up with any clever solutions.

[1] Rails will treat any string that starts with a database ID the same as the database ID itself in many cases:
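
Book.find(42)                      # finds the book with database ID 42
Book.find("42")                    # same book
Book.find("42-stickwick-stapers")  # same book again: the string is coerced
                                   # with to_i, which ignores everything after
                                   # the leading integer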

Years ago, after I finished college but before I started working professionally with software, I spent a couple years working as a paramedic. I learned a lot from that job, not least about interacting with people who really, really don’t want you in their lives.

One of the calls I remember most vividly happened around three in the morning, not long after schools had let out for the summer. A group of recently graduated high school girls had rolled their Ford Bronco on the highway. When we arrived an engine company was on scene, busily cutting the remains of the car into fun size Bronco strips. I followed the trajectory implied by the hole in the windshield and found my patient, the driver, on the pavement some distance from the car.

While I set about preparing to package her up for a quick trip to the hospital another engine company arrived. As I started my cursory physical exam the lieutenant rushed over and demanded I stop. To understand his reasoning you have to realize that the process of emergency medicine affords little or no dignity to trauma patients: life comes before limb; modesty comes well down the list.

So, what the fire lieutenant objected to was that I was cutting the clothes off a sixteen year old girl in the middle of the highway, directly illuminated by the halogen scene lights from our bus. He demanded that we package and transport the patient fully clothed. If you haven’t worked with firefighters, know this: they travel in packs. My partner was hanging IV bags and working the radio for another bus, so it was me, a supine patient, and five firefighters. They got what they wanted.

Of course, to protect trauma patients’ spines you have to package them fairly thoroughly. You basically strap them down to a six foot board so they can’t move, and once you finish it’s pretty much impossible to get their clothes off or do a half decent physical exam without jeopardizing their spinal cords. Which means, as the attending medic, when we got to the ED I was the one who had to explain to the trauma surgeon and ED physician why I was handing over a patient who could have had a piece of windshield glass the size of a grapefruit sticking into her kidney for all I knew. Not a shining career moment.

I remember that call not because the patient was badly hurt (just some broken bones; she was lucky), or because I made a huge difference in someone’s life, but because of the lieu’s argument for not doing the full, expected, physical exam. As his minions packaged my patient like gift-wrappers at Macy’s, and my partner made a break for the driver’s seat, he told me these exact words:

“She’s suffered enough trauma. She’ll be okay.”

Now, I knew at the time he was quite likely right (her chances of having escaped major injury were actually better than you might imagine), and it’s actually quite difficult to explain to patients why you have to cut their clothes off (try it some time), poke them, shock them, or tie them down. It’s very, very tempting to do the easy thing in these cases, and 99 out of 100 times if you avoid making the patient, or yourself, uncomfortable they do fine. But, one out of 100 times the patient dies from a ruptured spleen or flash pulmonary edema. That’s the worst case scenario; and that’s the only scenario medics really care about.

Now (finally on to my point, dear reader), I recently submitted a patch to Rails that I, and many of my colleagues, believe will help prevent invalid functional tests, and therefore prevent bugs. The response from the Rails core team:

The vast bulk of tests just pass the id in those places and they’ll work fine. Users overloading to_param are in the minority and we shouldn’t spam everyone else just to satisfy them.

Ignoring for a moment the somewhat unscientific characterization of the prevalence of #to_param overriding, the message is that the patch is potentially annoying for people with improperly written tests, and that most of those improperly written tests probably won’t cause problems. It’s more comfortable for the Rails team to let broken tests slide and hope all will be well than to bother developers over the cases in which broken tests lead to broken production.

I realize that comparing HTTP 500 responses to ruptured spleens might seem overly draconian. But, consider this: while the consequences of doing the wrong thing in software are much less dire compared to medicine, it’s so much easier to do the right thing. If you doubt this, drop me a line; I’ll be happy to nasally intubate you. I guarantee after that experience spending half an hour fixing your controller tests won’t seem so bad.

Yesterday I wrote about Wapcaplet, which is really little more than a Rails patch that didn’t get accepted, but that some of us think Rails actually quite needs. To that end I submitted a second patch, which does the same thing but, by default, outputs a warning rather than raising an exception. I also included some methods for modifying the behavior on ActionController::TestCase. Specifically, if you want to ensure your tests aren’t broken:

ActionController::TestCase.treat_parameter_type_warnings_as_errors

Or if you, like Pierre, don’t care:

ActionController::TestCase.ignore_parameter_type_warnings

I don’t know if these changes will make the behavior of the patch palatable enough for the core team to commit it. We’ll see. After creating the ticket I considered pulling the new behavior back into Wapcaplet; I’ve decided not to for a few reasons:

First and foremost, no one pays attention to warnings. I can’t count the times I’ve preached myself blue about eliminating compiler/interpreter warnings, to little or no effect. I recently broke the builds for several projects by deleting a method that had been deprecated for a year and a half, and which generated a fairly annoying deprecation warning on every build for every project that used it (keep in mind that at Pivotal projects build many times a day).

Any patch applied to Rails will affect every Rails project that upgrades. I believe people should fix their broken tests, but I accept that this change will break a lot of tests. I can accept warnings as a way to show people what may be broken without bringing the world down on their heads. Wapcaplet, on the other hand, is entirely opt-in; no need to handle users with kid gloves.

I believe that a test failure is the right behavior. We’re talking about broken tests, they should act that way.

Imagine for a moment that you run a big, important company. It’s important to you that your big, important company be successful at promoting, manufacturing, and distributing your big, important product, so you have decreed that the company must show a profit each and every quarter. In fact, your internal accounting software enforces this. For example:
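
# A sketch of the kind of test I mean; the controller and assertion are
# invented for the example. The integer -100 is the important part.
class QuarterlyReportsControllerTest < ActionController::TestCase
  def test_refuses_to_record_an_unprofitable_quarter
    post :create, :quarterly_report => { :net => -100 }
    assert_response :unprocessable_entity
  end
end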

Ignoring the somewhat misguided domain requirements, this test is wrong because it probably won’t fail when it should. It’s an example of a problem in Rails controller testing that bites everyone sooner or later.

The problem is that HTTP requests don’t send their parameters as integers, or booleans, or Date objects. No, HTTP request parameters are just big piles of strings. Rails does a good job of hiding the process of converting these strings into the integers and booleans and Date objects that make sense in your domain, but ActiveRecord handles that little bit of sleight of hand (using the column types from your database schema), not ActionController.

So, when you execute that test up above, the value of params[:quarterly_report][:net] will be the integer value -100, which is a value that the controller will never receive from a real HTTP request. This test fails to test a real case.

Now, if you try to use this value as an integer, either in the controller or by overriding #net= in the model, the test will still pass. But, as soon as you send a real request to the controller (hopefully not in production) you’ll find yourself on the business end of a 500 response. The test is broken.

In order to prevent this sort of brokenness I wrote a small patch to Rails (available here), which was summarily, and I will admit not unexpectedly, rejected. After that, I turned the patch into a tiny plugin called Wapcaplet (available here). It simply checks the parameters you pass to functional tests, and throws a friendly exception if you pass something with an inappropriate type. It accepts strings and instances of ActionController::TestUploadedFile in all cases. For parameters that Rails uses for routing it will also accept subclasses of ActiveRecord::Base.

Incidentally, notice that this will catch the pathological case, which seems to afflict every Rails developer, in which you pass #id rather than #to_param for a routing parameter.

In anticipation of the misguided comments that I know will come, no, it would not be better to have the plugin (or patch) call #to_s or #to_param on every incoming parameter to functional tests. Consider the example of boolean values:
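
true.to_s       # => "true"
true.to_param   # => "true"
# a real checkbox, though, submits "1" or "0"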

Neither #to_s nor #to_param returns a value likely to appear in a real HTTP request. It doesn’t take much imagination to come up with other examples of types that would not convert to meaningful request values. Worse, it takes only a little more imagination to come up with a scenario in which the implicit string conversion in the test would create subtly wrong behavior that would be an enormous pain to track down.

So, take Wapcaplet out for a spin, I hope it saves you some time. Special thanks to Parker for getting bit by this problem enough times to get angry and demand I fix it.