This is a belated follow-up to my previous couple of posts about taking FireWatir (or rather Jssh) further than simply accessing and manipulating the DOM. They followed this train of thought: Firefox extensions have very deep access to the internals of the browser, they are written in JavaScript, and Jssh is just an extension. So the code you get Jssh to execute runs in the extension’s context, and should be able to do anything an extension can do.

Extensions are written in JavaScript and can have user interfaces, so the UI must be scriptable. And that would be XUL (think Ghostbusters), the XML User Interface Language, according to Wikipedia. It’s an XML dialect, very much analogous to HTML, but focussed on the user interface, not the page.

And Firefox exposes a DOM to manipulate it. You can iterate over the open windows using Jssh’s getWindows() function and examine the location of each. If it happens to start with “chrome://” then the document property of that window is an XUL DOM.

Here’s a bit of script to automate the download save dialog, given the window from getWindows():

This finds the “save” radio button element, clicks it, then clicks the accept button. Looks just like scripting the HTML DOM, doesn’t it?
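As a minimal sketch of the kind of script described above, here is some Ruby that generates the JavaScript to hand to Jssh. Treat the details as assumptions: the element id "save", the use of getButton("accept") on the dialog element, and the helper name save_dialog_script are illustrative guesses, so check them against the dialog's XUL before relying on them.

```ruby
# Hypothetical helper: builds the JavaScript that drives the download dialog.
# window_expr is a JavaScript expression that evaluates to the dialog window,
# for example an entry taken from getWindows().
# ASSUMPTIONS: the radio button has the id 'save', and the dialog element
# offers getButton('accept'). Verify both against the dialog's XUL file.
def save_dialog_script(window_expr)
  [
    "var doc = #{window_expr}.document;",
    "doc.getElementById('save').click();",
    "var accept = doc.documentElement.getButton('accept');",
    "accept.disabled = false;", # the button is disabled until the window has focus
    "accept.click();"
  ].join(" ") + "\n" # everything on one line, with a single trailing newline
end

puts save_dialog_script("getWindows()[1]")
```

The window index in the usage line is made up. In practice you would pick the window whose location starts with “chrome://”.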

To see what a dialog is made of, go to the C:\Program Files\Mozilla Firefox\chrome folder. Make a copy of toolkit.jar, rename it to toolkit.zip and extract the contents. Go to content\mozapps\downloads and there you’ll find the downloads.xul file with its accompanying scripts and CSS. This is the content required to display the file download dialog. Open it in an editor to get an idea of what the object model will be like. And if you check the location of the file download dialog from the getWindows() function, it’s at chrome://mozapps/content/downloads – so it’s fairly easy to find the appropriate XUL file for a given location.

One very important point to make here: the script is sent as a single line, so there is only one carriage return. Jssh prints out a prompt for each carriage return, and you need to do a read_socket() for each one. Save yourself the trouble: send one line and read once. This took me a while to figure out.
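One way to keep a script readable in Ruby and still send it with a single carriage return is to flatten it just before sending. This is only a sketch: to_single_line is a made-up helper name, and it assumes the JavaScript ends every statement with a semicolon and contains no // line comments, since both would break once the lines are joined.

```ruby
# Hypothetical helper: collapses a multi-line JavaScript snippet into one line,
# so that Jssh sees exactly one carriage return and prints exactly one prompt.
# ASSUMPTION: statements end with semicolons and there are no // comments.
def to_single_line(js)
  js.lines.map(&:strip).reject(&:empty?).join(" ") + "\n"
end

script = <<~JS
  var doc = getWindows()[0].document;
  var title = doc.title;
JS

puts to_single_line(script)
```

The flattened string can then be written to the socket in one go, followed by a single read_socket() call.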

In fact, you could do a lot worse than taking a look at Ethan’s work on JsshObject which is some very nifty Ruby meta-programming that allows you to effectively write the JavaScript directly in the Ruby code. Very cool.

(The very observant among you might notice that I’m enabling the button before I click it. For some reason, the accept button is disabled unless the window has focus. I really can’t think of a good reason for this.)

OK, so the last post was a bit heavy. Er, detailed. The upshot is that FireWatir, a Ruby library for automating Firefox via the JSSh (JavaScript Shell) extension, can get access to the same APIs used by JavaScript Firefox extensions, allowing for some very low-level automation of Firefox. The example in that post showed how to send arbitrary extra headers when navigating to a page:

I thought it might be useful to show another example, without having to explain the background. If you’re interested in details, read the last post.

Right. Let’s automate Firefox’s preferences. Again, searching for what extensions do, they use the nsIPrefBranch interface to get and set preferences. We’ll do the same.

Instead of using the createInstance method to get the interface, we’re going to use getService (presumably because the preferences service object is already in memory, and we want to get a handle on that, rather than a new instance). In JavaScript:
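As a rough sketch of the idea, here is some Ruby that generates that JavaScript for a given key and value. The contract ID and the setBoolPref, setIntPref and setCharPref methods belong to the XPCOM preferences service, but the helper names set_pref_script and js_string are made up for illustration, and the MIME types in the usage line are just examples.

```ruby
# JavaScript expression that fetches the (already running) preferences service.
PREFS_JS = "Components.classes['@mozilla.org/preferences-service;1']" \
           ".getService(Components.interfaces.nsIPrefBranch)"

# Hypothetical helper: quotes a Ruby string as a JavaScript string literal.
def js_string(text)
  "'" + text.gsub("\\") { "\\\\" }.gsub("'") { "\\'" } + "'"
end

# Hypothetical helper: returns one line of JavaScript that sets a preference,
# picking the setter from the Ruby type of the value.
def set_pref_script(key, value)
  setter, literal =
    case value
    when true, false then ["setBoolPref", value.to_s]
    when Integer     then ["setIntPref", value.to_s]
    else                  ["setCharPref", js_string(value.to_s)]
    end
  "#{PREFS_JS}.#{setter}(#{js_string(key)}, #{literal});\n"
end

puts set_pref_script("browser.helperApps.neverAsk.saveToDisk", "application/pdf,text/csv")
```

Each generated script is a single line, so it costs only one prompt to read back from Jssh.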

These can be used to make sure that your test environment’s copy of Firefox is correctly set up before use. And if you’re wondering what to use for “key”, then take a browse through Firefox’s about:config (then Google it to see what it does). Here’s what I’m using at work:

browser.helperApps.neverAsk.saveToDisk. I didn’t even know this one existed, but it’s great. This is a comma-separated list of MIME types that will always get saved to disk, so again, I know that I won’t get stuck at the confirm download prompt. (This is my favourite – previously, we’ve had to set these up by hand by actually downloading an instance of the file. I’m very happy to be automating this.)

There’s no delicate way to put this, so I’m just going to have to go ahead and say it: at work, we write automated acceptance tests in a BDD style, using Cucumber and Ruby, and using FireWatir to automate Firefox.

Nothing controversial there, it’s just not the most thrilling of opening sentences.

This all started with the need to send a custom HTTP header to a web page we were testing. Now, we could write that using Ruby’s Net::HTTP module, but that would require also writing stuff to manage logins and cookies, and frankly, I’d rather let Firefox handle all that. It just needs to be convinced to send that extra header.

Now, FireWatir has a very interesting implementation. It uses an extension for Firefox called JSSh - a JavaScript Shell (which seems rather tricky to download – try the FireWatir download site). This extension starts listening on a socket, to which you can send raw JavaScript. JSSh evaluates it and returns the output. It exposes a fairly simple API that allows you to e.g. enumerate open windows, navigate to web pages, and access the DOM. FireWatir exercises this quite extensively.

Firefox really likes JavaScript. After all, most extensions are written in JavaScript.

And that right there is the key.

Firefox extensions are written in JavaScript, and they can do *anything* with the browser. So, I just have to figure out how an extension would add a header to a request. One quick Google later, and I’ve found nsIWebNavigation.

Firefox is written to be astonishingly componentised. It is implemented on a platform called XPCOM (cross-platform COM, very much like Microsoft’s COM) that exposes a huge amount of functionality. All of which is publicly accessible to those JavaScript extensions.

And nsIWebNavigation is an XPCOM interface all about navigating the web browser, and one method it exposes is loadURI(url, flags, referrer, postData, headers). This method allows me to add custom headers to a navigation request – exactly what I’m looking for. I now have two small problems – how do I get my hands on an instance of nsIWebNavigation, and how do I marshal up the data into the headers parameter?

The first question was answered through a bit of trial and error. If you telnet into JSSh (telnet localhost 9997) you can start investigating your surroundings. You’ve got an interactive shell from which you can run commands, such as getWindow(). But if you type getWindow without the parentheses, you get the JS listing of the function definition. One such listing led me to discover that the higher-level API that JSSh exposes for navigation simply calls into the webNavigation property of the browser variable. And this property is an instance of nsIWebNavigation.

The second problem was solved by Google. I need an instance of nsIInputStream to represent the headers. Turns out I can create an instance of nsIStringInputStream, set the data from a JS string and pass that to loadURI. All of which gives us the following JavaScript:

The first line looks like some heavy magic. I won’t claim to know exactly what’s going on here, but it’s safe to assume that it’s creating an instance of a named XPCOM class and returning the nsIStringInputStream interface implemented by that instance. The second line sets the data into the object, pushing in a string (the loadURI docs state that each header must be terminated by a carriage return/line feed pair) and the length of the string. The headers variable is then passed to loadURI, and ta-da! The browser navigates to the given URL, and the custom header is sent.

Of course, that’s just the JavaScript. We need to be able to use this from Ruby. I opened up the Firewatir::Firefox class and added:

Note that the carriage return/line feed escape characters and the quotes in setData have been escaped – we’re writing Ruby that is going to be writing JavaScript. And as the comment says – call it with an array of headers that you’d like to pass to the server.
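To make the escaping concrete, here is a minimal sketch of Ruby that writes that JavaScript. It is not FireWatir's own API: the name goto_with_headers_script is made up, the X-Test header in the usage line is a placeholder, and the sketch assumes the URL and headers contain no double quotes or backslashes. The flags argument of 0 corresponds to nsIWebNavigation's LOAD_FLAGS_NONE.

```ruby
# Hypothetical helper: builds one line of JavaScript that navigates to url
# and sends the given extra request headers along with it.
# ASSUMPTION: url and headers contain no double quotes or backslashes.
def goto_with_headers_script(url, headers)
  # Each header ends with a CR/LF pair. The backslashes are doubled so the
  # generated JavaScript contains the escapes \r\n rather than raw line breaks.
  data = headers.map { |header| header + "\\r\\n" }.join
  # Length of the string as JavaScript will see it: each \r\n is two characters.
  length = headers.sum { |header| header.bytesize + 2 }
  [
    "var headers = Components.classes['@mozilla.org/io/string-input-stream;1']" \
      ".createInstance(Components.interfaces.nsIStringInputStream);",
    "headers.setData(\"#{data}\", #{length});",
    "browser.webNavigation.loadURI(\"#{url}\", 0, null, null, headers);"
  ].join(" ") + "\n"
end

puts goto_with_headers_script("http://example.com/", ["X-Test: 1"])
```

The result is a single line with one trailing newline, so Jssh prints one prompt for it.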

So there we have it. FireWatir, via JSSh, has a much larger API available to it, thanks to XPCOM. This opens the door to some very interesting possibilities.

I’m aware this is quite an exposition-heavy post, so I’ll do an executive summary with a few more examples (would automating Firefox preferences be useful?) and then we’ll get really advanced.