Ramblings on computers…

Tag Archives: libgdata

I’m at GUADEC! The conference has been great so far: a nice location, good organisation, interesting talks, and Czech beer. Thank you to the GNOME Foundation for sponsoring me.

I’ll be giving a talk tomorrow (Saturday, 11:35, room E112) about testing online services. It’ll be a short presentation about some work I’ve started recently on mocking web services so that unit tests for client code can be run offline. This is an area with quite a lot of potential, and what I’ve done so far has only just scratched the surface, so if you’ve got ideas about this kind of thing, please come along and we can have a good discussion.

Update: Thanks to all who attended the talk and gave their input. My slides are available online (and the source files are in the same directory).

From the Department of Bad Ways to Document APIs, referring to no service in particular:

Don’t refrain from changing and updating the specification after release. Consumers of your service will value API tweaks and improvements over stability.

Don't provide a changelog for the specification. Nobody reads them anyway, and they're such a hassle to maintain.

Similarly, don't notify consumers of changes to the specification by RSS or Atom, since nobody uses them anymore. They're all such avid fans of your service that they'll take the time to re-read the API specification every few weeks anyway.

Move the documentation around every few months and re-brand it. Shiny logos are cool, and maintaining redirects from old documentation locations is too hard.

Also don’t worry about moving your issue tracker around every few months, or closing all the old bugs every time. If nobody's touched a bug in months, it's probably been fixed, right?

Annotated API call examples are worth 1000 times more than boring, verbose, precise descriptions of constraints, error behaviour and motivations for the API design.

Version numbering is hard to get right, and consumers always use the highest-numbered version, so there's no need to annotate APIs with the version or date they were last changed or introduced.

On the other hand, some good ways to document APIs:

Allow anchor linking to documentation subsections.

Clearly mark deprecations and provide a deprecation schedule.

Provide a sandboxed API playground for testing API calls.

Provide worked client examples in multiple programming languages.

Turns out, documenting a large API which is expected to be consumed by thousands of people requires a lot of careful work.

Sarcasm aside, the last few years of working (on and off) on libgdata has made a number of things obvious about web APIs. Here are some ideas I’ve had for best practices for writing code which interacts with them. Some of these have made their way into libgdata; others would require an API break to implement. References to relevant examples of APIs in libgdata are given inline, but if something isn’t clear please leave a comment. As always, this list is probably incomplete and any additions or alterations to it would be appreciated.

Have a very general, flexible core API (example), and add a layer of specialisation on top of it (example). This allows client programs to use the general APIs to access new features in the web API if your library hasn’t yet caught up.

Use objects liberally. Objects can be extended with new properties without breaking API. Structs and function parameter lists cannot. Even if you end up creating objects with a single property (example), don’t create them as structs!

Don’t worry about CPU efficiency. The cost of creating objects or doing some ‘unnecessary’ extra processing to give your API more flexibility is nothing compared to the cost of a network round trip. Network round trips and memory consumption are the main costs.

As a corollary to the previous point, the API should be zero-copy for consumers. If possible, try to design the API so that programs using it won’t have to take copies of all the data they access, as this will end up doubling the memory consumption of the application unnecessarily. One way to do this (which is what libgdata does) is to make the objects returned by network requests effectively immutable — e.g. a query will return a new set of result objects each time it’s performed, rather than updating an existing set of them.

Always try to think one step ahead of the web API designers. This is part of making your API flexible: if you’re thinking about the directions the web API could go in and the features which could be added to it in future, you’ll be more prepared when the web API designers suddenly spring them on you. libgdata managed this with its authentication API, but didn’t manage it with the core feed/entry API.

Report bugs against the web API. In the case of libgdata, many of the bugs we reported have been ignored, but that’s not the point. By reporting bugs, you help other consumers of the web API, and give (a little) feedback to the web API designers as to how people are using, or expecting to use, the web API. (And also how broken it is.)

Make everything asynchronous (example). Absolutely everything which could result in a network request should be asynchronous, cancellable, and support returning errors (even if cancellation isn’t initially implemented and no errors are initially returned). This prevents having to break API in the future to make a method asynchronous. Methods which will result in network requests should be clearly separated from non-networking methods, e.g. by using a different naming scheme for them (my_object_request_property() versus my_object_get_property(), for example).

Design the API with batch processing in mind. Wherever possible, allow sets of objects to be passed to methods, rather than individual objects. If the web API doesn’t support batch processing, the method can just implement a loop internally. If it does, the use of batch processing allows for an order n reduction in network round trips. libgdata failed at this, having to tack batch operations onto the API as an afterthought (example). Fortunately (or perhaps unfortunately) it hasn’t been much of an issue because Google’s batch API never really went anywhere. Clients of libgdata have wanted to use batch functionality, however, and it would have been best implemented from the start.

Integrate concurrency control in the core of your API (example). Web APIs are interfaces to large distributed systems. As we’ve found with libgdata, concurrency control is important, both in managing conflicts between different clients (e.g. when concurrently modifying an object) — but also in managing conflicts between clients and internal server processes. For example, just after a client creates a document on Google Docs, the server will modify it to add missing metadata. These modifications (and the accompanying change in the object’s version number) are exposed to clients. Google’s APIs (and hence libgdata) implement optimistic concurrency control using HTTP ETags. All operations in libgdata take an ETag parameter. This works fairly well (ignoring the fact that some web API operations inexplicably don’t support ETags).

Don’t expose specifics of the web API in your API. Take a look at all the functionality exposed by the web API (and all the functionality you think might be added in future), then design an API for it without reference to the existing web API. Once you’re done, try to reconcile the two APIs to make sure yours is actually implementable. This means your API isn’t tied to some esoteric behaviour when the web API gets fixed. However, if done incorrectly this can backfire and leave your API unable to map to future changes in the web API. Your mileage may vary.

Testing is tricky. You want to test your code against the web API’s production servers, since that’s what it’ll be used against. However, this requires that the machine running the tests is connected to the Internet (which often isn’t the case). It also means your unit tests can (and will) spuriously fail due to transient network problems. The alternative is to test your code against an offline mock-up of the web API. This solves the issues above, but means that you won’t notice changes and incompatibilities in the web API as they’re introduced by the web API developers. libgdata has never managed to get this right. I suspect the best solution is to write unit tests which can be run against either a mock-up or the real web API. Automated regression testing would run the tests against the mock-up, but developers would also regularly manually run the tests against the real web API.

The Desktop Summit's over for another year. Berlin was great, the parties were good (thanks to Intel and Collabora!), and it was good to see everyone again — and also meet some new people. My thanks, as always, go to the GNOME Foundation for arranging and sponsoring my accommodation.

Unfortunately, the Summit wasn't all parties and sightseeing. We managed to belatedly push out folks 0.6, with Travis putting in far too much of his holiday time to tar it all up. Raúl somehow got away with sneaking off and doing interesting things with the Sugar people instead.

My libgdata BoF went down well, with a reasonable amount of interest in pushing libgdata's support for Google Documents forward, for use by both GNOME Documents and Zeitgeist. I'll get round to tidying up and pushing forward with the libgdata 0.12 roadmap in the near future.

I'm looking forward to being the lucky recipient of a load of GSoC patches to review for using GXml in libgdata. I love reviewing patches.

Relatedly, I've just released libgdata 0.9.0, which has sprouted support for OAuth 1.0 — so hopefully some GNOME Online Accounts goodness will soon make it into Evolution's Google Contacts and Calendar backends.

It turns out that stuffing my failure of an apricot liqueur full of spices has worked, and I now have a bottle of tasty festive spicy apricot vodka. Here's the recipe in full:

Ingredients

300g dried apricots

700ml vodka

1 large cinnamon stick

Rind of 1 orange

2 blades mace

4 cloves

8 juniper berries

~200g granulated sugar

Method

Submerge the dried apricots in boiling water and leave them overnight to re-inflate. I had to do this because I could only buy dried apricots, but I guess an equivalent mass of fresh apricots would work.

Split the apricots into halves and put them into a 1.5l plastic bottle (an empty water bottle is perfect) with a cinnamon stick and the vodka. It's better to put the apricots in first so that they don't splash.

Seal the bottle well and shake it.

Leave the bottle in a cool, dark place for around four weeks, shaking it every day. The precise amount of time you leave the mixture to infuse doesn’t really matter, as long as it’s more than about three weeks. Try to avoid opening the bottle during this time, as that’ll introduce contaminants.

After the mixture has infused for a suitable amount of time, filter it out into another bottle. This might take a day or longer, depending on your filter paper. Use fine filter paper (coffee filter papers worked well for me), and be prepared to use several filter papers, as they will get clogged up. Keep the apricots!

Once filtering is complete, add sugar to the filtered liquid to taste. 200g should be about right (I used 225g this time, and it was a little too sweet). Add the cinnamon stick again (or use a fresh one), and the orange rind, mace blades, cloves and juniper berries. Seal the bottle up again and leave it for at least another two weeks, shaking it daily.

After at least two weeks, filter it again. This should take a lot less time, as most of the fine sediment will have been removed during the first filtration.

Drink and enjoy! The recipe produces about 700ml of spiced apricot vodka. The apricots left over from filtration are nice in a pie, though they will have lost a lot of their flavour (more so than the raspberries).

I'm sure it would be quite possible to infuse the vodka with the apricots and spices at the same time, but I didn't try this, so I can't be sure. Another thing I probably should've done is added the sugar after infusing the vodka with the spices, as that would've allowed it to be added to taste. As it was, I added what I guessed was the right amount and fortunately it turned out well.

While not busy playing around with drinks, I've got around to finishing off and releasing libgdata 0.8.0. This vastly improves upload and downloads via streaming I/O, as well as fixing a lot of race conditions, leaks and cancellation problems. This release doesn't introduce much new functionality, but breaks a lot of API, hopefully for the better.

Earlier in the year, I experimented with making fruit liqueur at home; I made a bottle of raspberry vodka with great success. Seeking to repeat the success and add some variety, I recently tried making apricot vodka (with the intent to have it ready for Christmas and the New Year). Unfortunately, I failed. It turns out that apricots don't have enough flavour in them to overcome the taste of vodka, and so I've ended up with pale yellow coloured bleach. I'm now attempting to rescue it by ramming it full of spices.

I realise it's a little late now for anybody who's interested in making this for Christmas, but if anyone fancies making it some other time, here's the (successful, at least in my opinion) recipe for raspberry vodka. It takes three weeks or more to do. A good recipe for apricot vodka is left as an exercise for the reader.

Ingredients

675g fresh raspberries

~1tbsp lemon juice

700ml vodka

~250g granulated sugar

Method

Slightly crush the raspberries and put them into a 1.5l plastic bottle (an empty water bottle is perfect) with the lemon juice and vodka. It's better to put the raspberries in first so that they don't splash.

Seal the bottle well and shake it.

Leave the bottle in a cool, dark place for around four weeks, shaking it every day. The precise amount of time you leave the mixture to infuse doesn't really matter, as long as it's more than about three weeks. Try to avoid opening the bottle during this time, as that'll introduce contaminants.

After the mixture has infused for a suitable amount of time, filter it out into another bottle. This might take a day or longer, depending on your filter paper. Use fine filter paper (coffee filter papers worked well for me), and be prepared to use several filter papers, as they will get clogged up. Keep the raspberries!

Once filtering is complete, add sugar to the filtered liquid to taste. 250g makes it fairly sweet, so you could quite easily use less. If you're not sure, add it gradually, since it only takes a few tens of minutes (and some shaking) to dissolve.

Drink and enjoy! The recipe produces just under 1l of raspberry vodka. The raspberries left over from filtration are excellent when eaten with cream or ice cream.

I can't take much credit for this, as it's based on a recipe I found here: http://www.guntheranderson.com/liqueurs/raspberr.htm. I made some alterations, though, such as doubling the ratio of raspberries to vodka. Fortunately, they seem to have worked out well.

Having totally failed to blog much recently, I thought I'd post an update on libgdata, and the work I've been doing over the last few days on it.

I'm aiming to get libgdata 0.8.0 released in good time to be used in GNOME 3.0, since it contains a lot of fixes and API breaks. What will be new in 0.8? The major change is that libgdata will be fully asynchronous — every long-running operation now has an asynchronous API. As part of the work on this, I've changed the way uploads work, too, so that they're now conducted exclusively via streaming I/O. This means that the data being uploaded to PicasaWeb as a new photo (for example) can just as easily come from memory as it can from disk.

Other improvements include lots of XML escaping fixes, some memory leak fixes, and improvements to cancellation support. Cancellation support is one of the things left on the to-do list before 0.8.0 can be released. While I naïvely thought that cancellation support in libgdata had been fairly good, after poking around in the code for the last few days I've been shown to be very, very wrong. Cancellation doesn't work in many cases, and is racy in others. That will be fixed before 0.8.0

Funnily enough, all this work was prompted by the need to fix the test suite so that libgdata passed its tests on build.gnome.org. It turns out that due to a recent change to g_str_hash() in GLib master, libgdata's test suites started to fail, as they implicitly relied on the order of entries in a hash table when doing XML comparisons using strcmp(). Obviously, this wasn't a brilliant idea. I suppose this is a little word of warning, then: if things involving hash tables in your program suddenly break with GLib ? 2.27.4, it might be because you're (erroneously) relying on the order of entries in a hash table.

The changes to fix the test suite have been backported to libgdata's 0.6 and 0.7 branches, and are present in the new 0.6.6 release. I was a little over-eager with the 0.7 branch, and released 0.7.1 too early, so that doesn't have the fixes. They'll be in 0.7.2 though, whenever that comes out.