Posts tagged with: tagging

Still looking for a simple way to tag concrete resources (to-do items, people, locations) with personal concepts (e.g. "non-profit", "research", "semweb"), and also with other non-conceptual resources (clients, projects), I skimmed through the fresh SKOS Recommendation. I'm still a fan of SKOS and frequently wonder about semweb apps where the internal models are grounded in pluggable, personal(!) SKOS schemes, instead of coordination-intensive RDF Schemas or OWL ontologies. I don't know if such an approach could really work; I guess network effects benefit more from rather tightly defined relations and identifiers. Mainly just to have it written down somewhere (this is really not well thought out yet), here are some of the related entry points and considerations:

Tagging should be personal
While I like the idea of grounding tags in existing dictionaries such as DBPedia, tags seem to work best when they are as user-defined and informal as possible. Last year, I experimented with a tool that allowed me to tag things with other people's delicious tags. It just felt wrong. I wanted my "own" tags. (I think the latest Faviki release is a nice example of combining the best of both worlds.)

SKOS supports personal tags
Concepts in SKOS are sort-of scoped (or "namespaced"). If I describe a "Fun" concept, it is defined as seen by the creator of the concept URI, i.e. I can annotate it with ':Fun dct:creator <#me> ; dct:created "2009-08-19"' etc., even though the general idea of Fun was clearly not invented by me, and definitely before today.
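Written out in full, the annotation above might look like this in Turtle. This is only a sketch: the example.org base and namespace are placeholders, and the skos:prefLabel and the xsd:date datatype are my additions for illustration.

```turtle
@base <http://example.org/me> .                  # placeholder, so <#me> resolves
@prefix :     <http://example.org/concepts#> .   # hypothetical personal namespace
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# "Fun" as seen by me: the concept URI, not the idea, was created on that date.
:Fun a skos:Concept ;
    skos:prefLabel "Fun"@en ;
    dct:creator <#me> ;
    dct:created "2009-08-19"^^xsd:date .
```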

Tags should be safely portable
Thanks to URIs, SKOS concepts can be ported to other applications, and they can be grouped and organized in so-called concept schemes, i.e. I could have a "Waving" in a "Dance" concept scheme, and also in a "Netiquette" scheme.
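A rough Turtle sketch of the "Waving" example, reading it as two distinct concepts that happen to share a label. All URIs are placeholders, and using dct:title for the scheme names is an assumption on my part.

```turtle
@prefix :     <http://example.org/concepts#> .   # hypothetical personal namespace
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix dct:  <http://purl.org/dc/terms/> .

:DanceScheme      a skos:ConceptScheme ; dct:title "Dance"@en .
:NetiquetteScheme a skos:ConceptScheme ; dct:title "Netiquette"@en .

# Same label, different concepts, kept apart by their URIs and schemes.
:DanceWaving a skos:Concept ;
    skos:prefLabel "Waving"@en ;
    skos:inScheme :DanceScheme .

:NetWaving a skos:Concept ;
    skos:prefLabel "Waving"@en ;
    skos:inScheme :NetiquetteScheme .
```

If it really is one and the same concept, a single resource can also carry two skos:inScheme statements instead.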

There is a need to merge tag sets
If tags are used to organize all sorts of personal things, it should be possible to merge them into a unified model. Mainly for personal use ("personal world view"), but also for sharing with other people and linking to their views. This is again possible thanks to SKOS being based on RDF, URIs, and very loose semantics.
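One way such linking could look, assuming SKOS's own mapping properties are used for the job (I haven't settled on this; the namespaces and concept names are made up):

```turtle
@prefix my:    <http://example.org/me/concepts#> .      # hypothetical: my concepts
@prefix other: <http://example.net/alice/concepts#> .   # hypothetical: someone else's
@prefix skos:  <http://www.w3.org/2004/02/skos/core#> .

# My informal tag, loosely aligned with another person's view.
my:SemWeb a skos:Concept ;
    skos:prefLabel "semweb"@en ;
    skos:closeMatch other:SemanticWeb ;   # roughly the same idea
    skos:broadMatch other:Web .           # their concept is broader than mine
```

Because both tag sets are plain RDF, merging them is then just a matter of loading both graphs into one store.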

There is a need to tag real-world objects with concepts
This is partly obvious. Tags are a means to an end. But while they are already widely used to annotate document-like resources (web pages, photos, etc), I'd also like to tag things like my projects, people in my address book, and similar non-documents. From the SKOS Primer:
"While the SKOS vocabulary itself does not include a mechanism for associating an arbitrary resource with a skos:Concept, implementors can turn to other vocabularies"

So, whatever predicate URI we are going to use, it's not going to be provided by SKOS directly.

Maybe Dublin Core terms can link non-documents to concepts
This is a slightly controversial conclusion/assumption, given that DC terms are mainly associated with document metadata. But after exploring the DCMI website, I can't find any clear evidence that their terms can't be used more generally. Both the Usage Guide (thanks to Masahide for the pointer) and the Abstract Model actually support this thought. The Usage Guide mentions that "DC metadata can be applied to other resources as well" (but notes that the suitability may depend on the particular context at hand), and the Abstract Model states that the notion of a Dublin Core "resource" is equivalent to "Resource" as defined in RDF Schema, which can be anything, even including Literals. So, we can most probably use dct:subject or dct:relation to tag a project or person with a SKOS concept.
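As a sketch, tagging a project and a person with personal concepts via dct:subject could look like this. The example.org URIs and the foaf:Project/foaf:Person typing are illustrative assumptions.

```turtle
@prefix :     <http://example.org/concepts#> .   # hypothetical personal concepts
@prefix ex:   <http://example.org/things#> .     # hypothetical non-documents
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

:NonProfit a skos:Concept ; skos:prefLabel "non-profit"@en .
:Research  a skos:Concept ; skos:prefLabel "research"@en .

# A project tagged with two concepts.
ex:projectX a foaf:Project ;
    dct:subject :Research, :NonProfit .

# A person from the address book tagged with one.
ex:alice a foaf:Person ;
    dct:subject :Research .
```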

There is a need to associate concepts with real-world objects
If we organize our personal concept space with SKOS, we may also want to more formally specify our personal concepts, so that other applications or people can merge them with their tags. Therefore, we need a predicate that can relate concepts to non-concepts such as DBPedia identifiers. Such a mechanism could maybe also help with RDF's general problem of URI aliases. I could have a personal, canonical concept URI for a resource and use it as a container for the resource's various aliases. Again, SKOS does not provide a predicate for this use case, so we've got to look elsewhere.

Maybe Dublin Core terms can link concepts to real-world objects
Another possibly controversial conclusion, but again there is supporting text in the DCMI specs: "A value associated with the Dublin Core Subject property is a concept (a conceptual entity) or a physical object or person (a physical entity)". So, if the value of dc:subject can be a non-document, we can say things like ':Berlin a skos:Concept ; dct:subject dbpedia:Berlin'. This is very interesting because it could allow us to use dct:subject in both ways: for the tagging of things, and also for grounding tags. FOAF has a handy primaryTopic term, which could work in this context, too, but unfortunately, its scope is (currently) set to foaf:Document. DanBri also suggested the creation of a dedicated skos:it (or similar) predicate, which would be even better.
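Both directions side by side, as a sketch. The ex:officeMove to-do item and the example.org namespaces are made up for illustration.

```turtle
@prefix :        <http://example.org/concepts#> .   # hypothetical personal concepts
@prefix ex:      <http://example.org/things#> .     # hypothetical to-do items etc.
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix dct:     <http://purl.org/dc/terms/> .
@prefix dbpedia: <http://dbpedia.org/resource/> .

# Grounding: the personal concept points at a shared identifier.
:Berlin a skos:Concept ;
    skos:prefLabel "Berlin"@en ;
    dct:subject dbpedia:Berlin .

# Tagging: a thing points at the personal concept.
ex:officeMove dct:subject :Berlin .
```

A consumer that wants to get from the to-do item to DBPedia would follow two dct:subject hops, which is exactly the kind of overloading that makes this conclusion debatable.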

Sometimes I'd like to "tag" real-world objects with real-world objects
Don't know if tagging is still the right word here, but what I mean is a generic relation for arbitrary things in a common application context. Often, we can do better by specifying the exact relation between two resources, but in other cases a simple, maybe just temporary link is better than laziness leaving a resource completely un-annotated. Given the two DCMI-related findings above, we could maybe conclude that a predicate like dct:relation can also be used to relate a project to a person, or the other way round, without having to invent a new predicate.
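A minimal sketch of such a generic link, again with placeholder URIs:

```turtle
@prefix ex:   <http://example.org/things#> .   # hypothetical resources
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Project to person ...
ex:projectX a foaf:Project ;
    dct:relation ex:alice .

# ... or the other way round.
ex:alice a foaf:Person ;
    dct:relation ex:projectX .
```

The link says nothing about the kind of relation, so it can be replaced by a more specific predicate later on.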

Update: I just read the spec again, and I can't tag non-content with the CommonTag vocabulary. Too bad, ignore the last paragraph, please.

Sorry for raising my voice here, but some of us are really working hard to show that SemWeb technologies don't have to be complicated, and unfortunately, the new CommonTag effort seems to send exactly the opposite message.

Don't get me wrong, a widely used tagging ontology would be great. We do have 3 (or 4? 5?) tagging vocabularies already, but none really caught on, possibly because tagging is meant to be simple and the proposed solutions apparently weren't easy enough. CommonTag is promoted as being "simple" and "easy", but after looking at the examples in the QuickStart Guide, I'm not so sure:

The snippets are really off-putting (not only for non-RDFers). Do I really need multiple nested HTML nodes to create something as simple as a tag?

Couldn't the term names be more intuitive? What could a ctag:Tag be? The actual tag or an intermediate resource that is then, err, tagged? A person ctag:tagged a resource, right? Ah, no.

Why aren't the term names at least consistent? "ctag:taggingDate" follows noun-role, "ctag:tagged" is a dunno, "ctag:means" is a present-form verb, "ctag:isAbout" sort-of follows the hasPropertyOf anti-pattern.

The vocabulary introduces aliases for well-deployed terms such as rdfs:label and dct:created, which makes its use in practical settings expensive (it'll ease things on the author side, though).

To be a little more constructive: Using the vocabulary doesn't have to lead to the complicated markup seen in the examples. I'm sure they'll soon get better snippets from someone in the RDFa community. And apart from that, there is also a handy term in the RDF Schema which might just be what you are looking for: "ctag:isAbout". It lets you directly point from a resource (default is the page) to a Linked Data identifier (e.g. from DBPedia), without the need for all those intermediate nodes (which lead to triple bloat and slow down SPARQL queries). CommonTag-consuming apps will have to implement some form of inferencing to handle "isAbout", but as the term is in the spec, I assume they plan to.
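To make the difference concrete, here is the same tag in both forms as a Turtle sketch. The page URIs and the date are placeholders, and the namespace URI and ctag:label are written from memory of the spec, so please check them before copying.

```turtle
@prefix ctag:    <http://commontag.org/ns#> .        # assumed namespace URI
@prefix dbpedia: <http://dbpedia.org/resource/> .

# Full form: an intermediate Tag node sits between page and identifier.
<http://example.org/post/1> a ctag:TaggedContent ;
    ctag:tagged [
        a ctag:Tag ;
        ctag:label "semweb" ;
        ctag:means dbpedia:Semantic_Web ;
        ctag:taggingDate "2009-06-12"                # placeholder date
    ] .

# Shortcut: one triple, no intermediate node.
<http://example.org/post/2> ctag:isAbout dbpedia:Semantic_Web .
```

The full form needs five triples and a bnode where the shortcut needs one, at the price of losing the label and the date.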

Granular modeling of tags is apparently tricky, but shouldn't there be some sweet spot? Something a little more expressive than rel-tag but less complex than a fully spec'd Tag ontology? xFolk looks promising, or maybe the CommonTag group members could have agreed on formalizing and supporting "scoped rel-tag" (rel-tags with an optional RDFa "about" container). Most rel-tag-to-RDF converters have some form of scoping already anyway (because tags can apply to reviews, pages, vcards, etc.). That would have been a cool outcome after 1 year of stealth work.

I may well just be over-stressing the simplicity aspect here. Maybe CommonTag is "simple enough" for web publishers. There are some initial supporters, and for RDFers, the nested structures and bnodes will most probably be acceptable. So let's see how things evolve.

I personally think I'll have a closer look at ctag:isAbout. I'm still looking for an alternative to dc/dct:subject to tag arbitrary things with arbitrary identifiers, maybe CommonTag can provide it, although

<#me> ctag:isAbout dbpedia:Semantic_Web .

still doesn't sound right for a rich tag, and the domain is "ctag:TaggedContent" which sounds wrong for non-textual resources, too. (dct:relation is the best I could find so far for tagging things with things, but Dublin Core comes from a publishing context and is therefore often recommended for describing publications only.)