Month: May 2012

Further to my previous post: I’m used to thinking of iOS devices (iPhone, iPad) as undermining the PC. From that perspective, my response as a developer is partly skewed by frustration at seeing relatively open platforms replaced by more closed ones.

Thinking about an iOS device as an alternative to the games console—the classic successful closed-system consumer computing product—makes an interesting change. But it’s a perspective in which my response is also skewed, this time by general affection for Nintendo in particular.

The best-selling console worldwide at the moment is Microsoft’s Xbox 360. It has been around for roughly 7 years, unusually long in console terms, and has so far sold about 70 million units. (That’s perhaps 30 million in total behind the Nintendo Wii, which sold far more in earlier years but has now almost stopped selling. I believe that Xbox 360 sales are now also falling, though I can’t remember where I read that.)

Meanwhile, the iPad has been available for about 2 years and has so far sold… about 70 million units. An interesting coincidence.

Historically, it seems to have been the case that technically successful improvements to input devices in gaming (joystick, D-pad, motion controls, touch, motion tracking, arguably even the ability to provide your own CD as soundtrack in the original PlayStation) have prompted significant increases in popularity.

Meanwhile, improvements to output devices—most obviously 3D, but also things like resolution and frame rate increases—seem to appear incrementally and be largely ignored. (Anecdotally: whenever my kids play with a 3DS, the first thing they seem to do is switch off the 3D.)

The Wii, Xbox 360 and iPad have all carried improvements in input technology over earlier games devices, but as with any technology in gaming, their success depends entirely on their use in fun games. The initial success and later decline in the market of the Wii’s rather basic motion control is well documented (it’s all about Wii Sports, right?). Kinect, for the 360, has sold around 19 million units and is probably also slowing in sales: is the natural size of the market limited, or does it just lack worthwhile games?

So, what happens next?

I pretty much admitted in my previous post that I don’t know how you drive an Apple TV. I’ve never seen one in action.

I assume that a version with apps would need to be controlled from an external iOS device. (Apple execs have talked quite convincingly in the past about the disadvantages of a large vertical touchscreen.) I’m guessing that this logic might be one of the inspirations for Nintendo’s forthcoming Wii U, which looks quite like a Wii controlled by an external iPad-like controller.

It seems hard to imagine why many people would consider buying a dedicated games console when they can have a device like an Apple TV box that plays up-to-date games with minimal fuss, is regularly upgraded, and presumably is supported by major games companies because of the potentially huge market ahead of it.

But it all depends on the input device.

What sort of compelling big-screen games are made possible by a touchscreen controller? They can’t be the same as the current touchscreen games. Those won’t benefit from any extra distance between controller and screen.

I don’t think I believe that Apple would launch an interactive TV without some understanding of how games will work to their best advantage. Games are a big deal, both on the iPad and in existing home entertainment contexts. What don’t I know?

Rumours abound (nice example here from John Gruber) that Apple may be about to announce an updated Apple TV operating system with apps support, possibly integrated into an Apple-branded TV set rather than being available only as a separate box as at present.

(How would you control it? Through a separate iOS device like an iPad?)

This sounds potentially very bad for the traditional games console, a market that seems to be already waning.

If I could only have one secretive, obsessively proprietary company making integrated hardware and software products, with a history of approaching product design a bit differently from its competition, of favouring customer pleasure over technical advantage, and of treating third-party developers in unpredictable and capricious ways… I’d choose Nintendo rather than Apple.

But Nintendo don’t really seem to know what to do at the moment. A pity.

Although there was nothing very deep about this change or its causes, I found it interesting, not least because I had used a partly test-driven process to evolve the original API and I felt there might be a connection between the process and any resulting problems. Here are a few thoughts prompted by this change.

Passing the tests is not enough

Test-driven development is a satisfying and welcome prop. It allows you to reframe difficult questions of algorithm design in terms of easier questions about what an algorithm should produce.

But producing the right results in every test case you can think of is not enough. It’s possible to exercise almost the whole of your implementation in terms of static coverage, yet still have the wrong API.

In other words, it may be just as easy to overfit the API to the test cases as it is to overfit the test cases to the implementation.

Unit testing may be easier than API design

So, designing a good API is harder than writing tests for it. But to rephrase that more encouragingly: writing tests is easier than designing the API.

If, like me, you’re used to thinking of unit testing as requiring more effort than “just bunging together an API”, this should be a worthwhile corrective in both directions.

API design is harder than you think, but unit testing is easier. Having unit tests doesn’t make it any harder to change the API, either: maintaining tests during redesign is seldom difficult, and having tests helps to ensure the logic doesn’t get broken.

Types are not just annoying artifacts of the programming language

An unfortunate consequence of having worked with data representation systems like RDF mostly in the context of Web backends and scripting languages is a tendency to treat everything as “just a string”.

This is fine if your string has enough syntax to be able to distinguish types properly by parsing it—for example, if you represent RDF using Turtle and query it using SPARQL.

But if you break down your data model into individual node components while continuing to represent those as untyped strings, you’re going to be in trouble. You can’t get away without understanding, and somewhere making explicit, the underlying type model.

Predictability contributes to simplicity

A simpler API is not necessarily one that leads to fewer or shorter lines of code. It’s one that leads to less confusion and more certainty, and carrying around type information helps, just as precondition testing and fail-fast principles can.

It’s probably still wrong

I’ve effectively found and fixed a bug, one that happened to be in the API rather than the implementation. But there are probably still many remaining. I need a broader population of software using the library before I can be really confident that the API works.

Of course it’s not unusual to see significant API holes in 1.0 releases of a library, and to get them tightened up for 2.0. It’s not the end of the world. But it ought to be easier and cheaper to fix these things earlier rather than later.

Dataquay hasn’t seen a great deal of use yet. I’ve used it in a handful of personal projects that follow the same sort of model as the application it was first designed for, and that’s all.

But I’ve recently started to adapt it to a couple of programs whose RDF usage follows more traditional Linked Data usage patterns (Sonic Visualiser and Sonic Annotator), as a replacement for their fragile ad-hoc RDF C++ interfaces. And it became clear that in these contexts, the API wasn’t really working.

I can get away with changing the API now as Dataquay is still a lightly-used pre-1.0 library. But, for the benefit of the one or two other people out there who might be using it—what has changed, and why?

The main change

The rules for constructing Dataquay Uri, Node and Triple objects have been simplified, at the expense of making some common cases a little longer.

If you want to pass an RDF URI to a Dataquay function, you must now always use a Uri object, rather than a plain Qt string. And to create a Uri object you must have an absolute URI to pass to the Uri constructor, not a relative or namespaced one.

This means in practice that you’ll be calling store->expand() for all relative URIs.

(Note the magic word “a” still gets special treatment as an honorary absolute URI.)

Meanwhile, a bare string will be treated as an RDF literal, not a URI.
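To make the new rules concrete, here is a minimal, self-contained sketch. It is not Dataquay's code: it uses std::string in place of Qt's string class so that it compiles without Qt, and the ToyStore class, its addPrefix method, the demonstration function and the example URIs are all invented for illustration. Only the names Uri, Node, Triple and expand() come from the library as described above, and the special handling of "a" is left out.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Typed wrapper for a string that is already an absolute URI.
// The constructor is explicit, so a plain string never silently becomes a Uri.
class Uri {
public:
    explicit Uri(std::string absolute) : m_uri(std::move(absolute)) {}
    const std::string &str() const { return m_uri; }
private:
    std::string m_uri;
};

// A node carries its type with it: a Uri argument makes a URI node,
// a bare string makes a literal.
struct Node {
    enum Type { URI, Literal };
    Node(const Uri &u) : type(URI), value(u.str()) {}
    Node(const std::string &s) : type(Literal), value(s) {}
    Node(const char *s) : type(Literal), value(s) {}
    Type type;
    std::string value;
};

struct Triple {
    Triple(Node s, Node p, Node o)
        : subject(std::move(s)), predicate(std::move(p)), object(std::move(o)) {}
    Node subject, predicate, object;
};

// Hypothetical stand-in for a store: it knows the base URI and the prefixes.
class ToyStore {
public:
    explicit ToyStore(std::string base) : m_base(std::move(base)) {}

    void addPrefix(const std::string &prefix, const std::string &uri) {
        m_prefixes[prefix] = uri;
    }

    // Expand ":name" against the base URI and "prefix:name" against a
    // registered prefix. Unknown prefixes fail fast rather than guessing.
    Uri expand(const std::string &shortForm) const {
        const std::string::size_type colon = shortForm.find(':');
        if (colon == std::string::npos) {
            throw std::invalid_argument("no prefix in: " + shortForm);
        }
        if (colon == 0) {
            return Uri(m_base + shortForm.substr(1));
        }
        const auto it = m_prefixes.find(shortForm.substr(0, colon));
        if (it == m_prefixes.end()) {
            throw std::invalid_argument("unknown prefix in: " + shortForm);
        }
        return Uri(it->second + shortForm.substr(colon + 1));
    }

private:
    std::string m_base;
    std::map<std::string, std::string> m_prefixes;
};

void demonstrateNewRules() {
    ToyStore store("http://example.com/people/");
    store.addPrefix("profession", "http://example.com/profession/");

    // URIs are expanded explicitly; the bare string becomes a literal.
    const Triple t(store.expand(":bob"),
                   store.expand("profession:title"),
                   "Builder");

    assert(t.subject.type == Node::URI);
    assert(t.subject.value == "http://example.com/people/bob");
    assert(t.object.type == Node::Literal);
}
```

The call site is longer than the old string-only form, but each argument now says what it is, and nothing is left for the store to infer.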

Why?

Here’s how the original API evolved. (This bit will be of limited interest to most, but I’ll base a few short conclusions on it in a later post.)

An RDF triple is an ordered group of three nodes (subject, predicate and object), each of which may be a URI (i.e. an “identity” node), a literal (a “data” node) or a blank node (an anonymous node with no URI of its own).

A useful RDF triple is a bit more limited. The subject must be a URI or blank node, and the predicate can only realistically be a URI.

Given these limitations, I arrived at the original API through a test-driven approach. I went through a number of common use cases and tried to determine, and write a unit test for, the simplest API that could satisfy them. Then I wrote the code to implement that, and fixed the situations in which it couldn’t work.

So I first came up with expressions like this Triple constructor:

Triple t(":bob", "a", "profession:Builder");

Here :bob and profession:Builder are relative URIs; bob is relative to the store’s local base URI and Builder is relative to a namespace prefix called “profession:”. When the triple goes into a store, the store will need to expand them (presumably, it knows what the prefixes expand to).

This constructor syntax resembles the way a statement of this kind is encoded in Turtle, and it’s fairly easy to read.

It quickly runs into trouble, though. Because the object of a subject-predicate-object triple can be either a URI or a literal, profession:Builder is ambiguous; it could just be a string. So, I introduced a Uri class.[1]

Triple t(":bob", "a", Uri("profession:Builder"));

Now, the Uri object stores a URI string of some sort, but how do I know whether it’s a relative or an absolute URI? I need to make sure it’s an absolute URI already when it goes into the Uri constructor—and that means the store object must expand it, because the store is the thing that knows the URI prefixes.[2]

Triple t(":bob", "a", store->expand("profession:Builder"));

Now I can insist that the Uri class is only used for absolute URIs, not for relative ones. So the store knows which things it has to expand (strings), and which come ready-expanded (Uri objects).

Of course, if I want to construct a Triple whose subject is an absolute URI, I have another constructor to which I can pass a Uri instead of a string as the first argument.

This API got me through to v0.8, with a pile of unit tests and some serious use in a couple of applications, without complaint. Because it made for simple code and it clearly worked, I was happy enough with it.

What went wrong?

The logical problem is pretty obvious, but surprisingly hard to perceive when the tests pass and the code seems to work.

In the example

Triple t(":bob", "a", store->expand("profession:Builder"));

the first two arguments are just strings: they happen to contain relative URIs, but that seems OK because it’s clear from context that they aren’t allowed to be literals.

Unfortunately, just as in the Uri constructor example above, it isn’t clear that they can’t be absolute URIs. The store can’t really know whether it should expand them or not: it can only guess.
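The guess can be made concrete with a small sketch. The function below is hypothetical and is not how Dataquay decides anything: it treats a string as a prefixed name if the part before the first colon is a registered prefix, and as an absolute URI otherwise. The function names and the example prefixes are invented.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class Guess { RelativeToBase, Prefixed, Absolute };

// Guess what kind of URI string this is, given the prefixes a store knows.
// "scheme:rest" and "prefix:rest" have exactly the same shape, so the answer
// depends on what happens to be registered, not on the string itself.
Guess guessKind(const std::string &s, const std::set<std::string> &knownPrefixes) {
    const std::string::size_type colon = s.find(':');
    if (colon == std::string::npos || colon == 0) {
        // No colon, or a leading colon: assume it is relative to the base URI.
        return Guess::RelativeToBase;
    }
    if (knownPrefixes.count(s.substr(0, colon)) > 0) {
        return Guess::Prefixed;
    }
    return Guess::Absolute;
}

void demonstrateAmbiguity() {
    std::set<std::string> prefixes;
    prefixes.insert("profession");

    assert(guessKind(":bob", prefixes) == Guess::RelativeToBase);
    assert(guessKind("profession:Builder", prefixes) == Guess::Prefixed);
    assert(guessKind("mailto:bob@example.com", prefixes) == Guess::Absolute);

    // Register a prefix that collides with a URI scheme, and the same
    // string is now read differently.
    prefixes.insert("mailto");
    assert(guessKind("mailto:bob@example.com", prefixes) == Guess::Prefixed);
}
```

Nothing in the string tells the two cases apart, which is why the type has to travel with the value.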

More immediately troublesome for the developer, the API is inconsistent.

Only the third argument is forced to be a Uri object. The other two can be strings that happen to be treated as URIs when they show up at the store. Of course, the user will forget about that distinction and end up passing a string as the third argument as well, which doesn’t work.

This sort of thing is hard to remember, and puzzling when encountered for the first time independent of any documentation.

In limited unit tests, and in applications that mostly use the higher-level object mapping APIs, problems like these are not particularly apparent. Switch to a different usage pattern involving larger numbers of low-level queries, and then they become evident.

[1] Why invent a new Uri class, instead of using the existing QUrl?

Partly because it turns out that QUrl is surprisingly slow to construct. For a Uri all I need is a typed wrapper for an existing string. Early versions of the library did use QUrl, but profiling showed it to be a significant overhead. Also, I want to treat a URI, once expanded, as a simple opaque signifier rather than an object with a significant interface of its own.

So although Dataquay does use QUrl, it does so only for locations to be resolved and retrieved (network or filesystem locations). The lightweight Uri object is used for URIs within the graph.

[2] Dataquay’s Node and Triple objects represent the content of a theoretical node or triple, not the identity of any actual instance of a node or triple within a store.

A Triple, once constructed, might be inserted into any store—it doesn’t know which store it will be associated with. (This differs from some other RDF libraries and is one reason the class is called Triple rather than Statement: it’s just three things, not a part of a real graph.) So the store must handle expansions, not the node or triple.
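A sketch of what this means under the original string-based API: the same triple value can be handed to two stores and come out as different absolute URIs. PlainTriple, PrefixStore, their methods and the example URIs are invented for illustration, and std::string stands in for Qt's string class.

```cpp
#include <cassert>
#include <map>
#include <string>

// Just three things: no reference to any store.
struct PlainTriple {
    std::string subject, predicate, object;
};

class PrefixStore {
public:
    void addPrefix(const std::string &prefix, const std::string &uri) {
        m_prefixes[prefix] = uri;
    }

    // Expand "prefix:name" using this store's own prefix table;
    // anything else is returned unchanged.
    std::string expand(const std::string &s) const {
        const std::string::size_type colon = s.find(':');
        if (colon == std::string::npos) {
            return s;
        }
        const auto it = m_prefixes.find(s.substr(0, colon));
        if (it == m_prefixes.end()) {
            return s;
        }
        return it->second + s.substr(colon + 1);
    }

    // The store, not the triple, resolves the names: this returns an
    // expanded copy and leaves the original triple untouched.
    PlainTriple expanded(const PlainTriple &t) const {
        return PlainTriple{ expand(t.subject), expand(t.predicate), expand(t.object) };
    }

private:
    std::map<std::string, std::string> m_prefixes;
};

void demonstrateStoreExpansion() {
    const PlainTriple t{ "people:bob", "rel:knows", "people:alice" };

    PrefixStore a;
    a.addPrefix("people", "http://a.example/people/");
    a.addPrefix("rel", "http://a.example/rel/");

    PrefixStore b;
    b.addPrefix("people", "http://b.example/staff/");
    b.addPrefix("rel", "http://b.example/rel/");

    // One triple value, two stores, two different expansions.
    assert(a.expanded(t).subject == "http://a.example/people/bob");
    assert(b.expanded(t).subject == "http://b.example/staff/bob");
}
```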

Dataquay is my C++ library for RDF datastore management using the Qt toolkit.

It’s a library for people who happen to be writing C++ applications using Qt and who are interested in managing data that fit well into a subject-predicate-object graph model (as in the Linked Data paradigm, for example).

It uses Qt classes and coding style throughout, and includes an object mapper for store and recall of Qt’s property-based introspectable objects.

The library started out with an interest in exploring RDF as a representational model for data in a traditional document-based editing application that used Qt. In purpose therefore it has more in common with aspects of Core Data or Hibernate than with semantic data frameworks such as Soprano. That is:

This is the second time I’ve been forestalled in writing a positive note about Microsoft’s SkyDrive cloud storage and apps service, by going to the site and finding it isn’t actually working at all:

I hadn’t asked for Hotmail. This is just where the site redirected me when I tried to log in to SkyDrive on my phone.

I must say this is nicely fitting, in light of Microsoft’s recent attack on the unpredictability of Google Docs: “Different… better… completely gone…” Perhaps they decided it was time to get ahead in the race to “completely gone”.

It’s a pity, as I kind of liked SkyDrive. I evaluated Office365 for business purposes a year ago, but gave up on it when I found it included no way to download your files—perhaps that was intentional for purposes of corporate control, or perhaps it’s fixed now, but it doesn’t seem to have been an issue with the SkyDrive office apps. In many ways I prefer the interface to that of Google Docs, and I think of Microsoft as the underdog nowadays in a way that makes me (dangerously) more inclined to trust them. And in fact, I probably will continue to use SkyDrive for the odd thing.

But it’s clear now that Microsoft aren’t really all that great at keeping it running. I’m afraid, despite my liking for the service, that it does appear to be just a little bit pants.