Since atom:category’s @scheme identifies
the means of interpreting @term[1],
using @scheme to indicate “this
atom:category is a tag” seems perfectly
reasonable to me. But maybe we should back up a bit. The more
general question is: how should we represent tags in Atom?
Tim makes the same assumption I’ve been making —
that atom:category is the natural and correct
element for tagging in Atom. While this seems obviously true
— I think of tagging as a particular form of
categorization — perhaps some other representation would
work better. Aristotle
Pagaltzis, for instance, proposed the use of
atom:link instead. So what’s to be
done?

What do we want in a tag representation?

Here are some properties that I think a great representation of
tags in Atom would have. Note that I doubt any solution could
manage to have all of them.

No elements or attributes outside of those present in RFC 4287 —
i.e., no extensions. Attribute values requiring
registration or standardization suffer somewhat on this
point.

Would (or at least could) provide both the
human-readable and normalized version of
the tag. Flickr (and many other sites) normalize tags like
“San Diego” to “sandiego” —
for example, see this photo
of mine.
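
A minimal sketch of this kind of normalization, in Python. The rule used here (lowercase, then drop everything that isn't an ASCII letter or digit) is an assumption that happens to reproduce the “San Diego” example; the exact rules any given site applies may well differ, and `normalize_tag` is a hypothetical name.

```python
import re


def normalize_tag(label: str) -> str:
    """Return a normalized form of a human-readable tag.

    Assumption: lowercase the label and keep only ASCII letters and
    digits. Real sites may handle punctuation or non-ASCII text
    differently.
    """
    return re.sub(r"[^a-z0-9]", "", label.lower())


print(normalize_tag("San Diego"))  # prints "sandiego"
```

A representation that carries both forms would keep “San Diego” as the human-readable version and “sandiego” as the normalized one.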

Would provide a dereferenceable URI to
something about the tag. In the typical blog
context, a blog post tagged “cat” should have a
link to a list of other posts on the same blog tagged
“cat.” It would be especially awesome if this
link were available in Atom processors unaware of this
tagging technique.

The URL structure of tags in the relevant system would be tag
space, and the Atom representation of a tag would
provide a dereferenceable URI to the tag space.

Using a tag space for your tags strikes me as nice in
several ways.
For one, it follows existing practice: Flickr and del.icio.us both do so, as do
several
others.
Secondly, tag space URIs are nicely hackable.
I often pull up photos of mine by just typing in
http://flickr.com/photos/hober/tags/foo,
where foo is some tag I vaguely remember placing
on the photo.
Operator
takes advantage of the ubiquity of tag spaces by offering to
look up tags that it finds on pages on a variety of
services:

[Screenshot: Operator’s handling of a blog post’s
“bzr” tag.]

Basically, tag spaces are what make tags truly portable
across the Web.

It should be possible for an Atom processor to know that
this is a tag and not some other thing, without
local knowledge of the site in question.

That is, it should be possible to distinguish between an
atom:category used as a tag and an
atom:category used for some other purpose. (The
same goes for any other element used to carry a tag.)

It should be possible for an Atom processor to extract the
(normalized) tag from the element in which it’s stored
without parsing some attribute value or element content
into pieces.

All things being equal, atom:category is the
preferred element to use, as tagging is a form of
categorization — it’s a
semantically-appropriate element for tagging.

Tim’s proposal seems to be primarily motivated by #6, and
the way his question is phrased strongly implies the importance
of #8 as well.

Possible representations

Here are various possible ways to represent tags in Atom, and
how they fare against the above list:

This is how I store tags in my blog backend — as
atom:category elements with the
@scheme http://tess.oconnor.cx/tags/. The
@term is the normalized tag, while the
@label is the human-readable version (if
different than @term).

I treat this specific @scheme as a tag space
— concatenating @scheme with
@term produces a dereferenceable URI to a page
listing posts with the tag.
On this model, the mapping from atom:category
to the rel-tag microformat
seems quite natural:

In order to produce a dereferenceable URI to posts with this
tag (the sort of URI point 4 wants), an Atom processor would
have to somehow know that concatenating @scheme
with @term is something it might want to do
— there’s no explicit indication of that here.
That being said, such practice is already common enough in
the Atom world that several people have assumed this to be
the standard way to use atom:category —
for examples, see these posts
on atom-syntax.

Without specific knowledge of my scheme, an Atom processor
has no way of knowing that I’d like it to treat this
atom:category as a tag, so this technique fails
point 6.
A global @scheme seems to be required for
tag-in-atom:category to pass point 6.

While this doesn’t introduce any extension elements or
attributes, the urn:tag namespace would require
standardization (which, admittedly, is underway).
This makes point 1 somewhat arguable.

This technique just doesn’t have any interesting,
dereferenceable URIs (points 4 and 5), so it completely
relies on the Atom processor to come up with some, and it
doesn’t give the entry’s author the opportunity
to signal which tag space (if any) he’d prefer.

<category term="foo" label="Foo" />

This technique is from Henry
Story’s comment
on Tim’s post: “a
category is a tag with a namespace. So
don’t put a namespace (@scheme)
in if you want a tag…” (emphasis mine).

This is basically technique 1, minus the tag space in
@scheme. This strategy scores well on points 1,
2, 3, 6 (arguable), 7, and 8.

As with the previous technique, this loses on 4 and 5 by not
containing any interesting, dereferenceable URIs.

Point 6 is contentious as Atom processors are under no
obligation to treat categories without schemes as being tags
— AFAICT, an
Atom processor would be perfectly conformant to assume
categories without schemes to be within some default scheme
of theirs. This is especially troublesome in the APP case —
I’d expect APP servers to do all sorts of crazy things
with such atom:category elements.

The rest of Henry’s comment implies that he
doesn’t think this technique has much to offer
over technique 1:

As I see it a category is a tag with a namespace. So
don’t put a namespace (scheme) in if you want a tag,
but you may as well put the scheme
in, since people can always treat your category
as a tag (by not querying on the scheme).

<link href="http://tess.oconnor.cx/tags/foo" rel="tag" title="Foo" />

This technique, proposed
by Aristotle, has the
nice property of being directly analogous to how the rel-tag
microformat is marked up in HTML.

It scores well on points 1 (arguable), 2, 4, and 6.

While it doesn’t introduce any extension elements or
attributes, it requires IANA registration of the
“tag” link relation per §7.1
of RFC 4287. So point 1 is debatable.

It only half-loses on points 3 and 5 — on the one
hand, an Atom processor that doesn’t know about the
“tag” link relation wouldn’t know what
this thing is, so it wouldn’t know how to find the tag
space and tag in @href.
On the other hand, I imagine the IANA registration for
“tag” could specify the same @href
parsing rules as the rel-tag microformat, thus providing
rel-tag-aware Atom processors the ability to extract the tag
space URI and the tag from @href.
Atom processors unaware of this link relation could and
presumably would display this link to the user, so all is
not lost in the fallback case.

It loses on point 7 for the same reasons outlined in the
previous paragraph — extracting the tag space and tag
requires knowledge of rel-tag’s @href
parsing rules.
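
To make that parsing step concrete, here is a small Python sketch. It assumes the rel-tag convention that the tag is the last segment of the URL path, and treats everything before that segment as the tag space; `split_rel_tag_href` is a hypothetical helper, and edge cases (query strings, fragments, “+” in tags) are deliberately ignored.

```python
from urllib.parse import unquote, urlsplit


def split_rel_tag_href(href):
    """Split a rel-tag style URL into (tag_space, tag).

    Assumption: the tag is the last path segment, percent-decoded, and
    the tag space is everything up to and including the slash before it.
    """
    parts = urlsplit(href)
    path = parts.path.rstrip("/")          # tolerate a trailing slash
    head, _, last = path.rpartition("/")   # "/tags/foo" -> "/tags", "foo"
    tag_space = "{}://{}{}/".format(parts.scheme, parts.netloc, head)
    return tag_space, unquote(last)


print(split_rel_tag_href("http://tess.oconnor.cx/tags/foo"))
# prints: ('http://tess.oconnor.cx/tags/', 'foo')
```

This is exactly the kind of attribute-value parsing that point 7 hopes to avoid: the processor has to know the rule before it can recover either piece.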

This loses on point 8 — atom:link
doesn’t pack the semantic punch of
atom:category for representing tags. Though
given the rel-tag microformat I don’t think this is
that big of a deal.

This suffers on point 6 for the same reason as technique 1,
and on point 7 for the same reason as techniques 4 and 5.

The techniques which fare best on point 6 — which appears
to be the itch Tim’s trying to scratch — are 2, 3,
4, and 5. I’m guessing he’d eliminate technique 4 as
it doesn’t use atom:category. That leaves
2, 3, and 5 for Tim.
Now, to me, principles 4 and 5 are more important than 6, so
I’m more inclined to support techniques 1, 4, or
6. Err.
These are completely different sets of solutions.
I think it’ll help to see how actual behavior in the wild
matches up with these possible techniques.

Observed behavior in the wild

Vox uses something like
technique 1, though with a per-tag specific, non-tag-space
scheme. For example, consider the two tags on this
post of mine: “meta” and
“placeholder.” This is how Vox’s feeds
represent them:

Joe Gregorio corrected me
in the comments — he occasionally uses the rel-tag
microformat.
In fairness, none of the entries appearing in his feed when I
wrote this were tagged.
Joe didn’t specify how he puts tags in his feed; I
imagine he stores them as part of each entry’s
atom:content.
Which reminds me, I didn’t list that as one of the
options above.

Granted, the plural of anecdote is not data, but it does look
like deployed usage favors technique 1, or something resembling
it.

So how do we deal with technique 1’s failure to adhere to
principle 6?
Maybe we shouldn’t care.
Lenny’s comment
on Tim’s post struck a chord with me:

Besides, tags hardly ever mean the same thing to two people,
so why should they have the same scheme? If some application
really thinks that <category
scheme="http://example.org/farmer/tag/"
term="apple"/> means the same thing as
<category scheme="http://example.org/geek/tag/"
term="apple"/>, it can just drop the scheme.

The frustrating bit of representing tags in Atom boils down to
the difference between the intentional type “tag”
and the representation type atom:category.[1]
Lenny’s comment reveals a way out: if you want to treat an
atom:category as a tag, just go ahead and do so
— ignore @scheme.
Essentially, TAG-EQUAL-P should only compare
@term, whereas CATEGORY-EQUAL-P should
compare @scheme and @term.
Which one you call depends on your purposes, and insofar as
tagging goes, actual world usage implies that it’s only
the @term that’s important.
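
The two predicates might be sketched in Python like this. The names `tag_equal_p` and `category_equal_p` simply mirror the ones above, the sample elements are the ones from Lenny’s comment, and plain string comparison of attribute values is an assumption (no IRI normalization is attempted).

```python
import xml.etree.ElementTree as ET

FARMER = '<category scheme="http://example.org/farmer/tag/" term="apple"/>'
GEEK = '<category scheme="http://example.org/geek/tag/" term="apple"/>'


def _scheme_and_term(category_xml):
    """Return the (scheme, term) attribute pair of a category element."""
    el = ET.fromstring(category_xml)
    return el.get("scheme"), el.get("term")


def tag_equal_p(a, b):
    """Tags are equal when their @term values match; @scheme is ignored."""
    return _scheme_and_term(a)[1] == _scheme_and_term(b)[1]


def category_equal_p(a, b):
    """Categories are equal only when both @scheme and @term match."""
    return _scheme_and_term(a) == _scheme_and_term(b)


print(tag_equal_p(FARMER, GEEK))       # prints: True
print(category_equal_p(FARMER, GEEK))  # prints: False
```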

Of course, atom:category elements are useful for
many more things than tagging.
But when representing tags in atom:category
elements, using @scheme as a tag
space and @term as the
tag seems like the best compromise to me.

Comments

Aristotle: I agree that #4 and #7 are contradictory aims -- I said upfront that I didn't think it would be possible for a representation of tags to satisfy all of these criteria. Also agreed on #6 -- I included it as it seemed to be the motivating concern behind Tim's post.