Goodbye time, datetime, and pubdate. Hello data and value.

Please note that since this was written, <time>, datetime and (possibly) pubdate have been reinstated, and made more powerful. Doctor Bruce has the low-down in his blogpost The best of <time>s. We preserve this merely to show our grandchildren that we played a role in the Time Wars.

We’ve come a long way in the HTML5 specification’s steady march towards ratification and implementation. The WHATWG’s energy has recently been more on post-HTML5 features that are being added to “HTML The Living Standard”, plus tidying up HTML5 for Last Call. However we’re still not past losing (or gaining) an element, with last week seeing the removal of <time> and the addition of <data>.

TL;DR — it looks like <time> will remain, probably with more permissive datetimes, and <data> will also remain, but it’ll take a little while before the dust settles.

<time> was originally added to allow dates and times to be machine readable, via the datetime attribute. This gives us human-readable content (“yesterday”) plus hidden machine-readable content (“2011-11-02”) with no accessibility problems. It allows for e.g. browsers to offer to localise dates. The pubdate attribute indicating an article’s date of publication was added for HTML to Atom conversion (also removed from HTML5 in this change), and would make it easy for search engines to sort by date. Having permitted dates and times specified in HTML5 (a subset of ISO 8601) allows a validator to check a datetime value is valid.

<time> has been one of the easier elements to understand for authors, as it’s semantically obvious. By comparison the microformats Class-Value pattern for datetimes is clunky.

HTML5 <time> element:

<time class="dtstart" datetime="2011-10-05T09:00Z">9am on October 5</time>
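For comparison, here is a rough sketch of the microformats value-class pattern for the same event. The exact markup is an assumption based on the pattern's value-title form, so check the microformats wiki before relying on it.

```html
<!-- Sketch of the microformats value-class pattern (value-title form).
     The machine-readable value sits in the title attribute of an
     otherwise empty child element. -->
<span class="dtstart">
  <span class="value-title" title="2011-10-05T09:00Z"> </span>
  9am on October 5
</span>
```

Two nested elements and a repurposed title attribute do the job that a single datetime attribute does on <time>.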

<time> has been pretty widely used for weblog article publication dates, and has made it into WordPress and Drupal plus being used by Google for search results.

The issues raised about <time> by authors were mainly that it didn’t do everything: it didn’t cover ancient and vague times, time durations, and there was no “last updated” attribute equivalent to pubdate. The other problem is there are a bunch of other less common but similar kinds of data that would also benefit from being machine readable and validatable, such as weights and prices. Minting a new element for each one would (arguably) be a lot of work, so Ian Hickson has added a generic element for these use cases instead — the <data> element, with a required value attribute.

The data element represents its contents, along with a machine-readable form of those contents in the value attribute.

The value attribute contains the machine-readable equivalent of the element’s content. The <data> element can be used as-is as an element equivalent of data-* for marking up private data for scripts (although without the dataset API). It can also be used in conjunction with microdata vocabularies (and potentially microformats), in which case the format of the value attribute is specified by the vocabulary.
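As a minimal sketch of the first use, here is <data> carrying values for a page's own scripts. The products, weights, and prices are made up for illustration.

```html
<!-- Hypothetical product list: the visible text is for people,
     the value attribute is for the page's own scripts. -->
<ul>
  <li>Espresso beans, <data value="0.25">250 g</data>, <data value="6.5">£6.50</data></li>
  <li>Filter beans, <data value="1">1 kg</data>, <data value="19">£19</data></li>
</ul>
```

Nothing here tells a third party that 0.25 is a mass in kilograms or that 6.5 is a price in pounds. That meaning only exists once a vocabulary defines it.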

This is because while <time> is semantically obvious, <data> is seen as an equivalent to <div> or <span>. However, “semantics for the sake of it” isn’t enough to justify being in the spec, despite the benefits. Another reason for the dismay is many people have had trouble pushing to use HTML5, and having an element removed gives fuel to anyone arguing HTML5 isn’t suitable for production.

While Google would no doubt love everyone to start using schema.org vocabularies, it’s a big increase in complexity. Adding <time datetime="…" pubdate> is fairly straightforward — learning and implementing microdata plus an appropriate schema.org vocabulary … not so much. Because of this fewer people will implement machine-readable article published dates.
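To make the difference concrete, here is a sketch of the same published date marked up both ways. The article content is invented, and http://schema.org/BlogPosting is just one vocabulary type that could apply.

```html
<!-- Before: HTML5 <time> with pubdate -->
<article>
  <h1>Hello world</h1>
  <p>Posted <time datetime="2011-11-03" pubdate>today</time></p>
</article>

<!-- After: <data> plus microdata and a schema.org vocabulary -->
<article itemscope itemtype="http://schema.org/BlogPosting">
  <h1 itemprop="headline">Hello world</h1>
  <p>Posted <data itemprop="datePublished" value="2011-11-03">today</data></p>
</article>
```

The second version needs the author to know three microdata attributes (itemscope, itemtype, itemprop) and to pick the right type and property names from the vocabulary.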

To make matters worse, Google’s Rich Snippets Testing Tool (so presumably Google Search too) understandably does not yet know about <data>. This means if you use <data> to replace <time> now, Google will only see the human-readable text. <data itemprop="datePublished" value="2011-11-03">today</data> is interpreted as datePublished = today, not datePublished = 2011-11-03.

Also, now that specifying a datetime is not part of HTML5 we (presumably) can no longer validate datetime values using the HTML5 validator. Instead our only option is currently doing the two-step with Google’s Rich Snippets Testing Tool. Ironically as schema.org defined dates using ISO 8601, the imprecise dates and durations requested for <time> are now valid for datePublished, even though pubdate is the one usage of datetime everyone agreed on.

This conflicts with microdata, where the values of some elements are their URLs rather than their content. For example, currently <img itemprop="photo" src="http://oli.jp/photo.jpg"> gives the microdata output of photo = "http://oli.jp/photo.jpg". Adding value to the mix means there’d be two machine-readable values, so authors would need to know which elements couldn’t accept value.

This means the HTML5 specification has to specify each approved type value. While not as much work for implementers as a new element for each type of data, it’s still a bunch of work if the browsers actually do anything with that knowledge (like auto-converting type="money" into your currency). If type is required it also limits <data> to the types that are defined.
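A sketch of what a typed <data> element might look like in markup is below. The type attribute on <data> and the values "datetime" and "money" are hypothetical. They come from this discussion, not from any specification.

```html
<!-- Hypothetical: <data> has no type attribute in the spec. -->
<p>Doors open <data type="datetime" value="2011-11-03T19:30Z">this evening</data>.</p>
<p>Tickets cost <data type="money" value="12.00">twelve pounds</data>.</p>
```

Even in this small example the money value raises a question (which currency?) that the specification would have to answer for each type it approves.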

The easiest way to make everyone happy is to keep <time> in addition to adding <data>. However, the downside is that we’d have two confusingly similar but not always interchangeable ways to mark up datetimes, potentially with different rules on what’s a valid datetime. For example, we’d need to mark up an article’s published date and updated date using different syntax. Special cases and exceptions make things harder to teach and learn.

While our private conversation between doctors about this has tended towards the WTF end of the spectrum, I’m personally up in the air about it. Despite the easy response (“bring back <time>!”), this is one of those thorny problems where there’s no simple right answer. WHATWG is performing a delicate balancing act: pragmatically adding only features that have a lot of value, and removing any that don’t make the grade. In this case Hixie decided the cons of <time> (and of removing it from HTML5) outweighed the pros, and <data> is the result.

The one thing that bothers me about Hixie’s argument is that while datetimes are similar to other types of data that <data> now lets us mark up, they’re orders of magnitude more common on the web. Regardless of how it’s marked up, almost all weblog posts have the published date, and the majority of sites have a copyright date in the footer. In these use cases <time> was perfect, and definitely covered the 80%.

In my ideal world I’d like <time> to return, with the addition of a “pubupdate” attribute, and for all dates and times that fall inside HTML5’s definition to use <time>. For datetimes that <time> currently doesn’t cover, and for general use, we’d have <data>. Then again, I’m not sure I’d want to try teaching such intricacies to someone.

What do you think about this? While you’re welcome to just jump on the hogpile, I’d be interested to hear people consider all the pros and cons, and try to come up with a better (or less problematic) solution.

As a data point: I’ve developed one HTML5 website (i.e. using HTML5 elements), from a cursory read of an HTML5 book, including the time element.

It was fairly easy. I’m going to re-read the book and review the site as I’m sure the section-type elements could do with improvement, however, using time for the published articles was a no-brainer, and slotted in very easily.

The data element does not seem nearly so easy or obvious to use.

Also, I don’t understand how one value attribute on such an element is that useful? If anything, shouldn’t we add an ‘updated’ attribute to time that takes a date-time value?

Also(2), the time element nicely solved the accessibility issues with the microformats datetime-design-pattern, it doesn’t appear that data will solve that as nicely?

Now to find out which data are times, an agent has to consult a proprietary schema on some external website? How do I mark up a time that isn’t part of a schema? How do I mark up a time if I don’t use microdata? Note that microdata aren’t part of HTML5, they are only one of the vocabularies that can be used with HTML5. So HTML5 shouldn’t require them.

Clearly, with this decision, the overwhelming consensus, not only from web heavyweights such as Zeldman, Dr. Bruce, Eric Meyer, and Jeremy Keith but the larger web author community is that this is a bad idea, a horrible idea, a genuine WTF idea. Yet today it is seemingly being imposed on the web, and *you* dear reader, by one person, King-for-Life of the WHAT WG Mr. Ian Hickson. Prop up a dictator, and this is what you get.

I am an unabashed fan of the W3C. Yes, the W3C moves slower (an issue/problem that the W3C is grappling with), but the W3C *does* weigh all points of view and arrives at their decisions using a consensus based process. If the larger web-authoring community grasped (embraced?) the fact that building consensus takes time and discussion and real work, then perhaps you could better understand why the W3C doesn’t turn on a dime. Supporting a consensus based process has a cost, yes, but the alternative cost is having one person impose (or try) his decision on you, with no real recourse. Welcome to the world of the WHAT WG.

It’s not too late. The Mighty Steve Faulkner has filed a revert request against the W3C spec, a request that will likely be agreed to, especially given the overwhelming evidence and sentiment that this was an abuse of editing power by Hixie.

But here’s my challenge to you: add your voice to the W3C process, and express your outrage, frustration or disappointment about this move to the W3C. It’s simple, it’s easy, it’s significantly more democratic, and it’s consensus-based. Send an email to public-html-comments@w3.org and speak up. Lend your support to Steve’s revert request.

And next time you are sitting around, slagging off the W3C as being a bunch of old out-of-touch gray-beards, think back to this day, this move, and contemplate what the real value of the W3C is to you: it lets *you* have some ownership of the Open Web, and doesn’t hand off all your rights to one not-always-perfect editor.

Personally, I’m all for the idea of using a type attribute. However, I don’t see why this would cause a problem. If the type attribute is absent, then the element would work as it is defined now. If is used, then it would work exactly as the element used to.

The way I see it is that the <data> element could work like the <input> element, where changing the type attribute makes it behave like a completely different element.

Well, that’s what you’d expect from a “living standard”. Living things change. Who’s about to decide how we have to live? If we want to use a common language, we also have to accept that at some point there has to be someone to decide things that we don’t approve. If we want “our” HTML, we have to create it all by ourselves and ship the proprietary browser with it.
Of course yelling might change things, too, so I would not condemn it. But maybe there should be an “on hold” phase for every new decision, so we, the users, can give some feedback to them, the browsers.
This might lower the waves a bit.

You say, “browser makers in general are for this change, whereas authors are in general against it” but provide no reference. I’m not doubting you, but out of interest is that documented? I understand Opera does support <time>.

Regarding the last ideas of a value and type attributes: this is already possible by using RDFa. (Those names cannot be used though, as there are elements which already use them for different purposes.)

RDFa defines two attributes, content and datatype, for this very purpose. They are applicable on any element. Combined with property you can then use any property URI (such as the Dublin Core terms dc:issued, dc:created, dc:updated and so on), and any datatype URI (commonly xsd:date or xsd:dateTime). Fully decentralized regarding which vocabulary you use, and precise regarding the datatype of the value.
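As a rough sketch of how that looks in markup (RDFa 1.0 style, with the prefixes declared by hand; the sentence and date are invented for illustration):

```html
<!-- property names the relation, content carries the machine-readable
     value, and datatype says how to interpret that value. -->
<p xmlns:dc="http://purl.org/dc/terms/"
   xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
  Published <span property="dc:issued" content="2011-11-03" datatype="xsd:date">today</span>.
</p>
```

The human-readable text ("today") stays as the element's content, while the literal "2011-11-03" typed as xsd:date is what an RDFa processor would be expected to extract.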

(Regarding capturing links, RDFa uses rel for that. There is some debate going on now regarding making the mechanism “smarter” in 1.1, for those not understanding the difference to property. Suffice to say, currently RDFa can capture both a rel+link and a property+literal on the same element.)

This has been a W3C Recommendation since October 2008.

The RDFa 1.1 working group is currently working hard to accommodate any ideas to further ease the use of RDFa in HTML5. (Also note that in RDFa 1.1, common prefixes like xsd: and dc: will be predefined.)

In general, I think that the semantics of document structure and the semantics of the content (message) are quite different. It’s a bit like the difference between sentences of words in a specific language and the various categories of writing (formal, lyrics, scientific and so on).

This difference can be captured by using RDFa with all kinds of vocabularies, enabling a means for expressing semantics orthogonal to the structure of a web page document (or application). Kind of like using semantic marker pens of different colors to enrich a paper article.

I’m disappointed for two reasons: the first is simply that time is immediately useful, one of the simplest HTML5 elements, and aptly solves minor but daily real-world problems – an example of what I’d hope HTML5 is about, making the web a better place. As has also been noted, the problems seem overblown: I work with ancient, ranged and fuzzy dates every day and there’s a repeated answer to the questions on the wiki: “Follow ISO-8601. They were not unaware of this need.”

My bigger cause for dismay was the arbitrary rush: just a simple fait accompli without a comprehensible justification. As has been frequently noted, the reasoning could apply to any semantic element: why and not ? The idea that something useful can disappear arbitrarily without much discussion or a compelling emergency makes me think that the spec is going to end up needing to be forked by less mercurial maintainers, which would be a rather unfortunate distraction. I can’t imagine this episode won’t increase hesitation about adopting HTML5 heavily: who knows what’s next?

(I’ve made some minor copyedits and clarifications to the article, just FYI)

@AlastairC — value alone is useful for your own scripts. For something like an article’s published or updated date, you’d need to use a microdata vocabulary or similar eg itemprop="datePublished". Regarding accessibility, <data> is equivalent to <time> — users get the element’s content, and the machine value is hidden in an attribute.

@boris — why would you want to do any of those things (find all the datetimes on a page etc)? Or to phrase it another way, what benefit do you want to get from marking up datetimes? There’s no requirement to use microdata vocabularies with <data>, but not doing so means you’d be using it like data-*.

@Michael C. Harris — this is inferred by the lack of dissension from implementers in the bug, including Opera who’ve actually implemented <time>.

@Chris Adams — dates and times in HTML5 were a strictly defined subset of ISO 8601, so using e.g. vague dates in datetime was never valid (not sure I understood you though). Also, the bug was open for 3+ months, during which <time> was marked as “at risk”, so I don’t think the change was rushed (unless you only heard about it now :)

I’m guessing you wrote something like “why <time> and not <article>”. <article> and other structural elements have benefits beyond semantics, such as the outline algorithm, ARIA role equivalence (accessibility), and being easier to author. <time> also has accessibility and ease of authoring benefits, but disadvantages too.

Final comment: In “Timeless” Jeremy Keith writes in more detail about the priority of constituencies. I suspect there may now be another level in this — the web platform itself. I think Hixie made this change for the long-term benefit of HTML, and I don’t think it’s cut and dried (I could personally argue for either side). If you want to change Hixie’s mind, providing real-world use cases as Ben Buchanan is doing would be the way to do it.

Using <data itemprop="datePublished" > for describing date and time is confusing and overly complicated, and generally feels like using the wrong tool for the job. <time> is already there, widely used (I think Bruce pointed out that WordPress blogs use it in the default template, for example) and feels like the right tool for the job.

I don’t think the arguments put forward for removing <time> are justified. Surely <time> (in the majority of uses) will be for modern dates that ideally should be machine and human readable. It is more likely to be used than say pricing or other data ideas.

I would bring back <time> as so much is based on time on the Internet as it’s a giant publishing platform. I think it deserves its own tag. However it should be made broader in scope to cover ancient dates etc. I can’t see any reason why the date range should be limited.

<data> should also be kept for marking up other types of information. For example measurements are not always as important as a date – they often exist only in their own context such as the dimensions of a room. I would add a type attribute to <data> that covered a large range of types. Think of the massive differences in what the type attribute does on <input> tags. I would like to visit a site and have the measurements shown in metric for me by the browser, or some easy way to convert. I want weights (well mass really, just everyone uses the wrong word!) on cooking sites to be converted to metric too, or my mum might want imperial. This sort of automation would be amazing and make building sites easier, we wouldn’t need cookies or logins to remember people’s user preferences. I’m making an estate agent website where automatic conversion would be great for room sizes.

I also think <data> should accept microdata vocabs for more advanced marking up. The type attribute would cover the basics like dimensions, weights and temperatures that can be understood by themselves, but for something more complicated, where several bits of information work together to create the entire context, such as events or the other stuff here: http://schema.org/docs/full.html, then schemas should be used.

This way we have granular control over what we need – simple markup for simple data and complete markup for complex data.

Oli: my point regarding ISO8601 was simply that it seemed that most of the questions could be addressed by allowing a more complete subset of the spec.

I will definitely admit to not hearing about the issue until now, but it appears that was fairly common among people who were affected, suggesting that communication is an area for improvement, particularly for backwards-incompatible changes. It certainly didn’t feel like feedback mattered on e.g. the rapidly rebutted claim that nobody used <time>.

What’s often missing in discussions like this (though not necessarily this one!) is semantic context – who are element semantics actually for? They’re for machine readers (bots, ATs..) not humans. The time element may have more obvious human-comprehensible semantics, but since humans never get to see them it makes no difference at all – as far as humans are concerned, it wouldn’t matter if we still built table based layouts.

brothercake: all of those clients are running on behalf of humans, however, who presumably benefit from making them more reliable. In this case there’s also the obvious possibility of human-visible benefits such as a browser being able to display using the user’s preferred formatting conventions.

And next time you are sitting around, slagging off the W3C as being a bunch of old out-of-touch gray-beards, think back to this day, this move, and contemplate what the real value of the W3C is to you: it lets *you* have some ownership of the Open Web, and doesn’t hand off all your rights to one not-always-perfect editor.

On Thursday November 3rd the chairs of the W3C HTML WG instructed the editor of the W3C specification to revert this change and reinstate <time> back into the spec. Whether or not this will also be reflected in the WHAT WG spec remains to be seen, but in the consensus-driven process that is the W3C, the process won the day.

I dislike having both <time> AND <data> intensely. It feels hacky and underthought, like having <link> be the way you include a separate stylesheet instead of <style src="blah"/>. We can and should do better than that now.

So I like the compromise of adding lots of types to the <data> element, which browsers can eventually treat differently and helpfully.

But I wonder if supporting lots of types of <data> is really so much easier for the browsers than supporting lots of types of data-elements. None of them are doing fancy conversions with <time> yet, anyhow. Would adding <currency>, <distance>, <mass>, etc really be that much harder than adding all of these as types to <data>?

I like the simplicity of using <time>. I like the ideal of supporting all kinds of data more. But I really wouldn’t mind having a whole slew of data-elements, all of them just as easy to use as <time>. If that’s outrageous, then I think <time> should be sacrificed to make way for a cleaner, more useful future full of all kinds of <data>.

I disagree with your criticism of using the type attribute. The spec doesn’t need to specify every value, just a handful of important ones like “datetime”. Then authors can use any other type they feel like and it would just fall back to acting like a span tag (similar to how using unsupported input types falls back to a text box).

Also just wondering, how have Opera “supported the time element”? What special thing do they do with it?

Glad to hear that <time> will be back in the spec as of November the 8th. From a semantic point of view data never appeared to make any sense to me, as surely everything in a web page is data of some form.

Of course, a shorthand for this in the form of <time property="dc:issued" datetime="2011-11-11">today</time> would be nice, so the return of <time> is good. We’re currently discussing how RDFa 1.1 will interpret @datetime (e.g. its datatype).

Include, exclude, include again… reads like someone has to make up his (or her) mind.

As long as major browsers still fail to support CSS 2.1 to the fullest, and as long as not every SVG and ECMAScript feature is supported… who cares for specs anyway? Been there, used them, and ended up shouting at browser developing companies.

Same with html5. From my origins, I once was an xhtml fanboy. Then html5 came around the bend and I saw filesizes shrink – which I still think is a good thing. Next you know all those and tags show up, which started to give me a bad deja-vu of eye-burning and tags from the 90s. Are we going back to our caves? Sure looks like it.

Why not simply give us usable (read: css stylable) forms, support for what the specs promised us “back in the days” while we’re at it?

Meanwhile I’m convinced we don’t need new specs to keep us busy, we need new browsers that do what they should be doing. If I would release software that needs “fixes” and “workarounds”, people would call it “trash”. Now tell me, what do you call browsers that don’t fully support CSS2.1? By “major browser” league, even you are losing in that section. And now that CSS3 is coming in, there’s more room for them to mess things up.

Let’s just hope html5 won’t end up where html2 went… it was nice while we were at it, but when html4 came around the bend, we all went “yippie” because browser developers said: “we can solve all problems by using that one”! Yeah, right. IE6 sound familiar? LOL

And what about today? General support for HTML4 and CSS2.1 is around 82% while HTML5 and CSS3 readiness is lower than 40%. That’s nothing a shiny, orange logo can cover up. But hey, browser vendors say it’s gonna be the coolest thing ever, so why not trust them?

And while we’re at it… let’s talk about the fact that in a not-too-far future, html6 is really gonna rock by “dumping all the useless tags”? ;)

No thanks. I’m stepping off the train before I fall back on supporting Netscape again. [EOF]

The real benefit of time will come when browsers actually implement rendering it. Only then will we see whether we need a stricter interpretation, i.e. whether there may be a difference between the machine-readable attribute and the text.

I think we all need to know what a compliant browser will do with:

Now

My understanding is that at the moment “Now” should be rendered, which would contradict the supplied data. If things remain as they are I don’t see this being helpful for authors (unexpected results) or browser makers (thankless task). I would like to keep the time element and see it extended with useful formatting options such as long, short, etc. Maybe it should lose enclosed text altogether to avoid any confusion and allow for some dressing up?

The more I read this site (the best I have found on HTML5 so far) the more I am inclined to believe that many of the new elements are geared specifically towards blogs or other sites where the structure and content varies little from page to page.

My view is that we should have both <time> and <data>, using each in the most appropriate context. While I think data is fine for complex date/timestamp information, it seems to be a ridiculously cumbersome approach for situations such as returning a time selected from a form – a likely scenario on an app or ecommerce site.

“Time” is about as semantic as you can get. Why have a more confusing element that needs to be qualified with a type? Browser vendors will most likely implement it differently (at least to start with) and I dare not even think what Microsoft might end up implementing….

I tested my website with the W3C validator and the result is 11 errors, 10 of which are related to pubdate. It would be more if the page contained more than 10 posts.
Hmm, that’s too bad. I think the only solution is for WordPress to release an update specific to this problem.

OK, we’re now heading into 2013 and this article was written in 2011. Has there been any development on the use of <time> vs <data>? I for one have always used <time>, and using <data> really isn’t an issue to me in regards to additional time spent in development. The whole reason I would use it is that it’s semantically correct “at this time”, but if Google doesn’t pick up the correct value for the datePublished, that pretty much is a killer for me.

My site is not a blog; it collects lots of ancient books, and all I know about the publication date of each book is the year. For example, if a book’s release year is 1917 and I use <time>1917 </time> on the page, can Google identify it? Or is it just meaningless?