I recently read an article about how negative assertions about something automatically get associated with the person who made them. For example, if you say negative things about your competitor's products, people will subconsciously link these negative sentiments directly with you. A psychology thing. So, my recent rants about the RDF spec mania at the W3C have already led to an all-time low karma level in the RDF community, and I'm trying hard to keep away from discussions about RDFa 1.1 or RDF Next Steps etc. to not make things worse. (Believe it or not, not all Germans enjoy being negative ;)

Now, why another post on this topic? ARC2's development is currently on hold as my long-time investor/girlfriend pulled the plug on it and (rightly so) wants me to focus on my commercial products. With ARC spreading, the maintenance costs are rising, too. There are some options around paid support, sponsorship and donations that I'm pondering, but for now the mails in my inbox are piling up, and one particular question people keep asking is whether ARC is going to support the upcoming SPARQL 1.1 or whether I'm going to boycott it and perhaps think that the W3C specs are preventing the semantic web from gaining momentum. Short answer (both times): To a certain extent, yes.

Funnily, this isn't so much a question of whether developers want to implement SPARQL 1.1, but rather whether they actually can implement it in an efficient way. SPARQL 1.1 standardizes a couple of much-needed features that we have had in ARC's proprietary SPARQL+ for a couple of years. Things like aggregates and full CRUD, which I managed to implement in a fast-enough way for my client projects. But when it comes to all the other features in SPARQL 1.1, the suggestions coming out of the "RDF 2.0" initiative, and the general growth of the stack, I do wonder if the RDF community is about to overbake its technology layer cake.

Not that any particular spec was bad or useless, but it is becoming increasingly hard for implementors to keep up. Who can honestly justify the investment in the layer cake if it takes a year to digest it, another year to implement a reasonable portion of it, and then a new spec obsoletes the expensive work? The main traction the Semantic Web effort is seeing happens around Linked Data, which uses only a fraction of the stack, and interestingly in a way non-compliant with other W3C recommendations such as OWL, because the latter doesn't provide the needed means for actual symbol linking (or didn't explain it well enough).

A central problem could be a lack of targeting, and a failure to spell out the target audience of a particular spec. 37signals once said that good software is opinionated. The RDF community is doing the exact opposite and seems to desperately try to please everyone. The groups follow the throw-it-out-and-see-what-sticks approach. And every new spec is thrown on the stack, with none of them having a helpful description for orientation. No one is taking the time to reduce confusion, to properly explain who is meant to implement the spec, who is meant to use the spec, and how the spec relates to other ones. Sure, new specs raise the market entry barrier and thus help the few early vendors to keep competition away. But if the market growth gets delayed this way, it may die, or at least an unnecessary number of startups will. (Siderean is one example; their products were amazing. Another one is Radar Networks, which suffered from management issues, but they might have survived if they had spent less money trying to implement an OWL engine for Twine.)

For the fun of it, here are some micro-summaries for RDF specs, how I as a web developer understand them:

RDF: "A schema-less key-value system that integrates with the web." (Oha!)

RSS 1.0: "Rich data streams." (This is the stuff the thought leaders then said would never be needed, and which now has to be squeezed inefficiently into Atom extensions. Deppen!)

OWL 1: "Dumbing down KR-style modeling and inference to the web coder level" (I really liked that approach, it attracted me to the SemWeb idea in the first place, even though I later discovered that RDF Schema is sufficient in many cases.)

GRDDL: "For HTML developers who are also XSLT nerds." (A failure, possibly because the target audience was too small, or because HTML creators didn't care for XML processing requirements. Or the chained processing of remote documents was simply too complex.)

OWL 2: "Made for the people who created it, and maybe AI students." (Never needed any of its features that I couldn't have more easily with simple SPARQL scripts. I think some people need and use it, though.)

RIF: "Even more features than OWL 2, and yet another syntax." Alternative summary (for a good ROFL): "Perfect for Facebook's Open Graph." (No use case here. Again, YMMV.)

SPARQL 1.1: "Getting on par with enterprise databases, at any cost." (A slap in the face of web developers. Too many features that are not implementable in any reasonable time, nor in its entirety, nor with user-satisfying performance. Profiles for feature subsets could still save it, though.)

Microdata: "RDF-in-HTML made easy for CMS developers and JavaScript coders." (Not sure if it'll succeed, but it works well for me.)

SKOS: "An interesting alternative to RDFS and OWL and a possible bridge to the Web 2.0 world." (Wish I had time to explore SKOS-centric app development, the potential could be huge.)

I still believe that the lower-end adoption issue could be solved by a set of smaller layer cakes, each baked for and marketed to a defined and well-understood target audience. If the W3C groups continue to add to the same cake, it's going to crumble apart sooner or later, and the higher layers are going to bury the foundations. Nobody is going to taste from it at all then.

And to answer the ARC-related question in more detail, too: The next step is collecting enough funds to test and release a PHP 5.3 E_STRICT version (Thanks so much to all donors so far, we'll get there!). SPARQL 1.1 compatibility will come, but only for those parts that can be mapped to relational DB functionality. The REST API is on my list, too. Empty graphs, don't think so (which app would need them?). Sub-queries, most probably not. Federated queries, sure, as soon as someone figures out how to do production-ready remote JOINs ;-)
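
To make "mapped to relational DB functionality" a bit more concrete, here is a minimal sketch of how a SPARQL 1.1 aggregate could translate to plain SQL. It assumes a hypothetical single-table triple layout in SQLite; this is not ARC's actual schema or code, and the names `build_store` and `count_by_object` are made up for illustration.

```python
import sqlite3


def build_store(triples):
    """Load (subject, predicate, object) tuples into an in-memory table.

    Assumption: one flat 'triple' table. Real stores usually normalize
    terms into ID tables; that is left out to keep the sketch short.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triple (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO triple VALUES (?, ?, ?)", triples)
    return conn


def count_by_object(conn, predicate):
    """Relational counterpart of a simple SPARQL 1.1 aggregate query:

        SELECT ?o (COUNT(?s) AS ?n)
        WHERE { ?s <predicate> ?o }
        GROUP BY ?o
        ORDER BY DESC(?n)

    The secondary sort on o only makes the row order deterministic.
    """
    sql = (
        "SELECT o, COUNT(s) AS n FROM triple "
        "WHERE p = ? GROUP BY o ORDER BY n DESC, o ASC"
    )
    return conn.execute(sql, (predicate,)).fetchall()


if __name__ == "__main__":
    store = build_store([
        ("ex:alice", "rdf:type", "foaf:Person"),
        ("ex:bob", "rdf:type", "foaf:Person"),
        ("ex:arc", "rdf:type", "doap:Project"),
        ("ex:alice", "foaf:name", "Alice"),
    ])
    # Counts subjects per class, most frequent class first.
    print(count_by_object(store, "rdf:type"))
```

Grouping and counting have direct SQL equivalents, so the database engine does the heavy lifting. Features without such a counterpart would have to be evaluated in application code, which is where the cost concerns come from.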

Update: This article has been called unfair and misleading, and I have to agree. I know that spec work is hard, that it's easy to complain from the sideline, and that frustration is part of compromise-driven specifications. Wake-up calls have to be a little louder to be heard, but I apologize for the toe-stepping. It is not directed against any person in particular.

Comments and Trackbacks

Bengee, thanks for an insightful post.

Please don't think that the whole community views your comments as negative. As someone that uses ARC2 on a daily basis, I wanted to take the opportunity to thank you for all the work you've put into it. It makes the lives of so many people that much better, and that's not said often enough.

Since you've had INSERT and DELETE in SPARQL+ for a long time, I think you're already ahead of an important part of the curve. I genuinely believe the sem web is on the cusp of something big with data wikis starting to emerge. I hope we can all share in it together!

Comment by Melvin Carvalho on 2010-09-17 14:03:21 UTC

Hi Bengee. Nice post... lots to think about. I agree with much of what you say, but as a SPARQL implementor (and a member of the WG), I was hoping you might elaborate on why you think 1.1 is a "slap in the face of web developers"? Most of the features in 1.1 are things that were already implemented in several projects (and that users have been requesting for a long time). My involvement in the WG should make you take this with a grain of salt, but I'd disagree that it's "not implementable in any reasonable time, nor in its entirety". Almost all of my actual implementation work occurs in my free time, and while being in the WG likely helps me track the changes to 1.1 more quickly than other developers, I've implemented almost all of the current 1.1 features without an overwhelming amount of effort. Are there specific features you have in mind that you think are particularly burdensome?

Greg, but you are way smarter than me. IIRC you managed to create a fully compliant SPARQL 1.0 implementation, too. I'm still more than 80 test cases away from that. I'm still looking for a way to support UNIONs in combination with efficient ORDER BY...