Download and XML parsing with HotCocoa

08 Feb 2010

I’ve been working on
Rife, a Google Reader
client, over the last few days and have been digging my way through some more
HotCocoa mappings. I figured the best
way to remember some of this stuff is to write it down, so the following will
look at synchronous and asynchronous downloads and writing an XML parser in
HotCocoa/MacRuby.

So, what are we creating, you ask? Well, as I said, I’ve been playing with Google
Reader APIs and we’re going to do a synchronous request to Google for an
identifier token. Once we’ve successfully authenticated we’re going to make an
asynchronous request for our unread items. We’ll then parse the resulting XML
document and spit the titles out to the console.

As with any HotCocoa application, the easiest way to get started is to have the
system set up the shell of our application. We’ll use the hotcocoa command to
create our application, which I’m calling titles.

titania:example dj2$ hotcocoa titles

In order to authenticate to Google we’re going to need your username and
password. Since I’m going to do the output to the console for demonstration
purposes I’ll use the main application window to show the fields for username
and password and a save button.

If you run the application by executing macrake in the titles directory,
you should see the main application window. If you type something into the
username and password fields and press save, you should see something similar
to the following in your terminal.

DOAUTHtesttest

With our framework set up, let’s get to the interesting stuff. First up,
authenticating so we can retrieve our identifier from Google. We’re going to
make a synchronous request to retrieve the identifier and, if successful, call a
method to start retrieving our reading list.

If you add the above to your application, and add require 'cgi', you should
be able to run the program, put in your username and password and get a long
line of characters spit out on the terminal. Those characters are your Google
SID.

Let’s look a bit closer at what we’re doing in the authenticate method. We
start by grabbing the stringValue for the username and password fields.
Then, using these values, we build the query string needed for authentication.
This query string is used to build a URL object by calling
NSURL.URLWithString(query). With the URL in hand we can start building our
request object. This is done by calling
NSMutableURLRequest.requestWithURL(url). I’m using the mutable version of
the request as I want to add a few extra header values. These are both added
with addValue(value, forHTTPHeaderField:field).
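The query-string step is easy to try in plain Ruby, without any of the Cocoa
classes. Treat this as a rough sketch only: the endpoint, the parameter names
(service, Email, Passwd, source) and the build_auth_query helper are
assumptions made for illustration.

```ruby
require 'cgi'

# Assumed endpoint for the authentication request.
AUTH_ENDPOINT = 'https://www.google.com/accounts/ClientLogin'

# Hypothetical helper: builds the authentication URL string from the two
# field values. The parameter names are assumptions.
def build_auth_query(username, password)
  params = {
    'service' => 'reader',
    'Email'   => username,
    'Passwd'  => password,
    'source'  => 'titles-example'
  }
  # Escape every value so characters like & or spaces can't break the query.
  pairs = params.map { |key, value| "#{key}=#{CGI.escape(value)}" }
  "#{AUTH_ENDPOINT}?#{pairs.join('&')}"
end

puts build_auth_query('me@example.com', 'p&ss word')
```

The escaping is the reason for the require 'cgi': without it a password
containing & or a space would produce a broken query string.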

When we execute our request the system is going to want to put our response
object somewhere. In the Cocoa version the method accepts an
NSURLResponse **response parameter. In order to handle the response we need to
create a Pointer object, which is MacRuby’s way of handling these pointers to
objects. We want our pointer to point to an object so we use
Pointer.new("@").

With the response pointer set up we call
NSURLConnection.sendSynchronousRequest and provide our request and response
objects. I don’t care about the error, but if you do, you’d want to pass in
something similar to our response pointer. The request will return an NSData
object which we convert to a string using the initWithData initialization
method of NSString.

With the string in hand we try to extract our SID and, if successful, execute
the retrieve_reading_list method which just spits out the SID.
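Pulling the SID out of the response string needs nothing more than a regular
expression. Here is a minimal plain-Ruby sketch, assuming the response body is
a set of KEY=value lines with one of them starting with SID=. The
extract_sid helper and the sample body are made up for illustration.

```ruby
# Hypothetical helper: returns the SID value from a response body, or nil
# when no line starts with "SID=".
def extract_sid(body)
  # ^ anchors at the start of a line, so an "LSID=" line will not match.
  match = body.match(/^SID=(.+)$/)
  match && match[1].strip
end

# Made-up sample body in the assumed KEY=value format.
sample = "SID=abc123\nLSID=def456\nAuth=ghi789\n"

sid = extract_sid(sample)
puts sid ? "Got SID: #{sid}" : 'Authentication failed'
```

Returning nil on failure keeps the "if successful" check down to a simple
truthiness test before calling the next method.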

OK, cool, we’ve now got our authentication token and are ready to move onto the
asynchronous request to get our reading list.

Similar to the synchronous method we start by building our query string,
NSURL and NSMutableURLRequest. We’ve added a cookie to our request
object to hold the SID retrieved earlier from Google.

We fire the request by calling NSURLConnection.connectionWithRequest(request,
delegate:self). We specify ourselves as the delegate for the connection.
There are a few delegate methods we can implement to receive data and get
notified of request states. These are:

connectionDidFinishLoading(connection)

connection(connection, didReceiveResponse:response)

connection(connection, didReceiveData:data)

We’ll look at our implementation of each of these callbacks in turn. First, in
connectionDidFinishLoading(conn) we’re just printing out the data retrieved.
We need to convert the data, similar to what we did in the synchronous request,
from an NSData object to an NSString object.

In connection(conn, didReceiveResponse:response) we’re just checking to see
if we got a 200 response code from the server. In all other cases we print an
error.

The main work is done in connection(conn, didReceiveData:data) where we
create an NSMutableData object if needed and append any data received into
the mutable data object.
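The accumulate-then-finish pattern behind these callbacks can be shown in plain
Ruby. This is a stand-in, not the MacRuby code: the DownloadBuffer class and
its method names are hypothetical, and a String takes the place of the
NSMutableData object.

```ruby
# Plain-Ruby stand-in for the connection delegate. In MacRuby the chunks
# would be NSData objects and the buffer an NSMutableData.
class DownloadBuffer
  attr_reader :body

  def initialize
    @data = nil
    @body = nil
  end

  # Plays the role of connection(conn, didReceiveData:data). It may be
  # called many times, once per chunk.
  def did_receive_data(chunk)
    @data ||= String.new # create the buffer lazily, on the first chunk
    @data << chunk
  end

  # Plays the role of connectionDidFinishLoading(conn). Only now is the
  # body complete and safe to use.
  def did_finish_loading
    @body = @data.to_s
  end
end

buffer = DownloadBuffer.new
['<feed>', '<title>Hi</title>', '</feed>'].each do |chunk|
  buffer.did_receive_data(chunk)
end
buffer.did_finish_loading
puts buffer.body
```

The point to take away is that a single response can arrive in several pieces,
so nothing should be parsed or printed until the finish callback fires.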

Running the code at this point should dump the first two items in your reading
list to the console. The data will be a big mess of XML but we’ll look at
parsing that in the next step.

We’re finally getting into some HotCocoa-specific code with our XML parser.
HotCocoa defines a mapping wrapper around NSXMLParser and provides a set of
delegate methods. These delegates mean we don’t have to set our class as the
delegate and create a bunch of methods. They mean we can attach our code as
blocks on our XML object. All the better if you want to define a few parsers in
one class.

We start off by creating a HotCocoa.xml_parser. The parser accepts
NSData objects so we don’t need to convert our response data to a string. We
then set up eight callbacks. There are actually a bunch more callbacks that can
be hooked up and you should look at the xml_parser mapping code to see if
you need any of them. For our purposes, we only really care about eight.

The on_start_document, on_end_document and on_parse_error callbacks,
as you can probably guess, get called when we start parsing, when we finish
parsing and when we receive a parse error, respectively. We don’t really care
about start in this example, but I put it in anyway. When we’ve completed
parsing we send a notification and other application code can then listen for
this notification and do anything it needs. If we wanted we could store the
entries as they’re parsed and provide them to the :object key. This would
make those entries available to anyone that receives the notification.

If we receive either CDATA, with on_cdata, or text, with on_characters,
we append the content to our current element’s text. When we receive the open
tag of a new element, on_start_element, we reset our current element text
since we’ve started a new element. We can also take a look at the element’s
name, attributes, namespace and qualified name, if desired.

Finally, in on_end_element we print out the current element text if the
element we’re finishing has a name of title.

With all the callbacks configured we use xml.parse to start the parser. You
should, if you run this example, see the titles and authors of the first two
posts in your reading list. (The author name is also called title and I’m not
bothering to check that the parent element is entry before spitting it out.)
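If you want to play with the same callback flow outside of MacRuby, Ruby’s
REXML library has a stream listener that works in a similar event-driven way.
The sketch below is an analogue, not the HotCocoa mapping: the TitleListener
class and the sample feed are made up, and REXML’s tag_start, text, cdata and
tag_end stand in for on_start_element, on_characters, on_cdata and
on_end_element.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Stand-in for the HotCocoa callbacks, using REXML's stream listener.
class TitleListener
  include REXML::StreamListener
  attr_reader :titles

  def initialize
    @titles = []
    @text = ''
  end

  # Like on_start_element: a new element begins, so reset the text.
  def tag_start(_name, _attributes)
    @text = ''
  end

  # Like on_characters: append plain text to the current element's text.
  def text(content)
    @text += content
  end

  # Like on_cdata: CDATA content is appended the same way.
  def cdata(content)
    @text += content
  end

  # Like on_end_element: keep the text only when a title element closes.
  def tag_end(name)
    @titles << @text.strip if name == 'title'
  end
end

# Made-up sample feed with an entry title and a source title.
xml = <<~XML
  <feed>
    <entry>
      <title>First post</title>
      <source><title><![CDATA[Some Blog]]></title></source>
    </entry>
  </feed>
XML

listener = TitleListener.new
REXML::Document.parse_stream(xml, listener)
puts listener.titles
```

As in the HotCocoa version, every title element is collected no matter which
parent it sits under, so both the entry title and the source title come out.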

That’s it. You can now make synchronous and asynchronous requests for content
and parse any resulting XML.

One last thing before you go. Both of the requests we did above were GET
requests. You can do other types of requests using the same methods as above;
you just need a slightly different setup for the request. You can see a POST
request below.

The first few lines should look familiar from creating our asynchronous request
above. Since we’re going to be posting the data we use setHTTPMethod('POST')
to set up the request method. We’ve form-encoded the data so we set the
appropriate Content-Type and set the Content-Length. Note, we convert the
length to a string before sending it to setValue. Finally, we set the body of
the post with setHTTPBody. You need to convert the body string into an
NSData object, which we do with the dataUsingEncoding method. If you
don’t convert the body to NSData you’ll end up sending a nil body with
your post request.
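The form encoding and the two header values can be worked out in plain Ruby
before any Cocoa objects get involved. This is a small sketch with made-up
field names; only the body and headers are built here, not the request itself.

```ruby
require 'uri'

# Hypothetical form fields; the real names depend on the API being called.
fields = { 'title' => 'Hello World', 'tag' => 'a&b' }

# Form-encode the fields into a single body string.
body = URI.encode_www_form(fields)

headers = {
  'Content-Type'   => 'application/x-www-form-urlencoded',
  # Count bytes rather than characters, and pass the value as a string.
  'Content-Length' => body.bytesize.to_s
}

puts body
headers.each { |name, value| puts "#{name}: #{value}" }
```

Using bytesize rather than length matters once the body contains non-ASCII
characters, since Content-Length counts bytes.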