# [01:42] <Hixie> " student is making a timeline of important events in Apple's history. As he reads Wikipedia entries on the topic, he clicks on dates and selects "add to timeline", which causes an entry to be added to his timeline"

# [01:43] <Hixie> assuming that the timeline is implemented by a third-party app or web site

# [01:43] <Hixie> and assuming that the browser doesn't have built-in knowledge of this app or site

# [14:26] <Philip`> Hmm, I meant something more like "Validation badges are a bit useless given that people use them for invalid files"

# [14:26] <Philip`> If they were a guarantee of validity, they'd be reasonably useful, because they'd save you the effort of running the validator when you want to see that a page is valid

# [14:28] <jgraham> Philip`: Since you can't rely on people to do that, I maintain that my restatement is equivalent to the original

# [14:30] <Philip`> jgraham: I assumed you were reading my statement as more like "Validation badges are a bit useless in the specific cases in which people use them for invalid files", and your restatement was not equivalent to, and more correct than, the original

# [14:32] <Philip`> (so I was clarifying that my statement was indeed intended to be equivalent to your restatement)

# [15:49] * Philip` reads a book on network routing design, and finds a bit saying "This design looks good on paper until you try to implement it. It's very difficult to justify all the AS numbers from American Registry for Internet Numbers (ARIN) or other Regional Internet Registries (RIRs) required for such a network. AS numbers are a finite resource that is quickly being depleted."

# [15:49] <Philip`> It's both funny and sad that arbitrary integers are a scarce resource

# [15:49] * jgraham decides to pick up a few spare when he is next out

# [15:52] <Philip`> (and just like NAT is a 'solution' to IPv4 scarcity, the book suggests using private AS numbers and then doing some ugly tricks to hide them and look like a single globally-unique AS number to the rest of the world)

# [15:54] <Philip`> (and just like IPv6 is a better solution, there's a new extension for 4-byte AS numbers (which is more than anybody will ever need), with approximately zero adoption so far)

# [15:56] <Philip`> (I suppose the lesson is that when you design a protocol with a globally-unique bounded number, decide how many bits you'll need if the protocol is wildly successful and used by the entirety of human civilisation throughout our galaxy, and then double it)

# [15:56] <jgraham> The other solution might be to not design protocols that rely on globally unique bounded numbers (but that may be impractical)
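
The arithmetic behind the AS-number exchange above can be sketched roughly as follows. This is only an illustration of the "size it for wild success, then double it" rule of thumb; the function names are hypothetical and come from neither the book nor any routing software.

```python
def id_space(bits: int) -> int:
    """Number of distinct identifiers a field of `bits` bits can hold."""
    return 2 ** bits


def bits_needed(identifiers: int) -> int:
    """Smallest field width, in bits, that can number `identifiers` distinct things."""
    return max(1, (identifiers - 1).bit_length())


def recommended_bits(optimistic_estimate: int) -> int:
    """Rule of thumb from the chat: size for wild success, then double the width."""
    return 2 * bits_needed(optimistic_estimate)


print(id_space(16))                    # 2-byte AS numbers: 65536
print(id_space(32))                    # 4-byte AS numbers: 4294967296
print(recommended_bits(id_space(16)))  # doubling a 16-bit estimate: 32
```

Doubling the width squares the number of available identifiers, which is why the jump from 2-byte to 4-byte AS numbers is so large.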

# [19:39] <tantek> takkaria, and regarding biases, Hixie tends to have a bias for productive pragmatism, and microformats certainly have a stated bias for productive pragmatism, so it should not be much of a surprise that some overlap in opinions is the result.

# [19:40] <takkaria> I have no side in this debate, and some of the accusations being bandied around are just mad

# [19:42] <tantek> takkaria, while some of the accusations could certainly be taken as "mad", they do serve two useful functions, 1. the above-noted prioritization enabler, and 2. sometimes even "mad" accusations may actually be a sign of some other issue (other than the one made in a "mad" way) that is worthy of consideration.

# [19:43] <takkaria> but I think it's a bit extreme to say that Hixie gave you vetting powers over the use cases list

# [19:44] <tantek> I believe Hixie has made this point (I don't have a citation) that sometimes even trolls (or more precisely, input of a trolling nature) can be helpful with the issues that they unintentionally point out.

# [19:45] <tantek> takkaria, such a statement ("gave ... vetting power over") is both extreme, and has the flaw of being a deliberate exaggeration, likely to construct the basis of a strawman argument. http://en.wikipedia.org/wiki/Straw_man (logical fallacy)

# [19:47] <takkaria> indeed, I do have something of a grasp of argumentative logic and fallacies. :)

# [19:47] <tantek> the only logical steps to take are to: 1. point out the strawman fallacy (and any other logical fallacies being made in defense of positions that are typically indefensible with evidence) and 2. (de)prioritize/filter input from such sources accordingly over time.

# [19:54] <tantek> Hixie, it is not that weird, in that the paranoid are quite skilled at inserting however many layers of abstraction / meaning are necessary in order to twist a quote into justifying/bolstering their claim.

# [19:54] <tantek> Of course, if the very intent of the paranoid is to push for positions advocating for increasing layers of abstraction / meaning, then resultant accusations exhibit a level of meta-irony that itself has value along the axis of humor.

# [19:58] <Hixie> i think it's important to distinguish between certain people who have clearly just been spurned (like whoever does lastweek, and probably shelley) and people who are taking what those people say as gospel (like a lot of other people)

# [19:58] <Hixie> the latter group aren't "paranoid", they've just been misled

# [20:04] <Hixie> of course the most useful thing to work out here would be why those who got spurned feel like they got spurned

# [20:05] <tantek> Hixie, not sure if that is necessarily "useful" per se, if by "useful" you mean productive, and if by productive, we mean the next highest quality/quantity yield next-actions you could take, or even close. econ / diminishing-returns etc.

# [20:06] <Hixie> i think finding out why people feel spurned is important from a moral perspective and from a pragmatic "stop making people run away" perspective

# [20:06] <tantek> sometimes you just have to let different ideas compete in the market, and let some succeed, while others struggle, wither, and are abandoned.

# [20:07] <Hixie> yeah, but as i don't want mine to wither (since that would mean my life was mostly wasted), i'd rather figure out why it is people are abandoning it, and fix that problem

# [20:09] <Hixie> speaking of which, i spoke to yet another group of people working on microdata stuff yesterday, who have neither a horse in the mf space nor the rdfa space, and heard yet again the same thing -- "we want to have a generic syntax so that people can mark up stuff that is then exposed to an API, without hard-coding specific class values into the parser"

# [20:09] <tantek> Hixie, lest we forget history, it was the browser makers (and you and I included, at the time) who felt "spurned" by W3C staff and advocates for XHTML2+XForms+SVG+namespaces+(insert X-* specs here) at the Web Applications and Compound Documents workshop in 2004-04 in San Jose. http://www.w3.org/2004/04/webapps-cdf-ws/minutes-20040601.html

# [20:09] <Hixie> apparently they are considering using a subset of rdfa instead

# [20:10] <Hixie> though their faces when they said that indicated that they felt that was more of a situation they were being forced into than a good thing
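
A rough sketch of what "a generic syntax ... without hard-coding specific class values into the parser" could look like. Everything here is an assumption for illustration: the `data-prop` attribute and the `GenericPropertyParser` class are hypothetical and are not RDFa, microformats, or anything proposed in the chat. The sketch also assumes each marked element contains only text.

```python
from html.parser import HTMLParser


class GenericPropertyParser(HTMLParser):
    """Collect (property, text) pairs from any element carrying a marker attribute.

    The parser knows nothing about specific vocabularies: property names come
    from the markup itself, not from a list hard-coded here.
    """

    def __init__(self, attr_name="data-prop"):  # hypothetical attribute name
        super().__init__()
        self.attr_name = attr_name
        self.current = None      # property name of the element just opened, if any
        self.properties = []     # collected (name, text) pairs

    def handle_starttag(self, tag, attrs):
        self.current = dict(attrs).get(self.attr_name)

    def handle_data(self, data):
        if self.current is not None:
            self.properties.append((self.current, data.strip()))
            self.current = None

    def handle_endtag(self, tag):
        self.current = None


markup = (
    '<p><span data-prop="name">WWDC</span> on '
    '<span data-prop="date">2009-06-08</span></p>'
)
parser = GenericPropertyParser()
parser.feed(markup)
parser.close()
print(parser.properties)
```

The same parser works unchanged for any vocabulary an author invents, which is the property the group was asking for; the cost, as the surrounding discussion notes, is that vocabulary-specific processing still has to be written on top.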

# [20:12] <tantek> Hixie, the point is that both WHATWG and microformats efforts were created because you and I both thought it was worth the time and effort to explore alternatives to what the W3C was saying was "the right thing".

# [20:15] <Philip`> I think we should remove all the data from web pages, and use the saved bandwidth for adding more colours

# [20:16] <Hixie> i've spoken to multiple groups during this research, from big companies like yahoo to small companies like manu's, from individuals doing front-line work like livebrum.co.uk to researchers like tbl, and many of these are actively investing time and money into implementing solutions that do both mf and rdfa, and with the exception of people actively working on one or the other, they are complaining about both.

# [20:16] * tantek wonders if we should plan a party for 2009-06-01 - the 5 year anniversary of the Great Web Semantics Schism.

# [20:16] <Hixie> many of these are in fact shipping (e.g. yahoo, shipping support for both mf and rdfa, livebrum.co.uk, shipping mf support)

# [20:19] <Hixie> just because it evolved from POSH doesn't mean it's solving all the relevant problems people are having

# [20:21] <tantek> Philip` - anything less than schism would be politically correct watering down. Nothing short of a schism happened that day almost five years ago.

# [20:21] <tantek> When all the browser vendors and professional web designers/developers found themselves agreeing, and also disagreeing with (most of) the W3C staff and X-* markup language advocates - it became quite clear to everyone in the room that a schism had occurred.

# [20:22] <Hixie> the big indicator was that Microsoft and Sun were in agreement

# [20:25] <tantek> Hixie - I'll politely remind you that microformats both never deemed to solve "all" problems people are having (leaving out "relevant" as different people have different measures for that), and even made that a non-goal: http://microformats.org/wiki/microformats#microformats_are

# [20:25] <Hixie> just because you explicitly set out to not solve all the problems people want solved doesn't mean people will like it :-)

# [20:26] <Hixie> the point is everyone i speak to who is using microformats complains that it doesn't provide the flexibility they want and they end up implementing other things too (rdfa, or a subset thereof, typically)

# [20:26] <Hixie> in my mind, that tells me there is a problem i need fixing in html5 with a solution that isn't just "use microformats"

# [20:27] <Hixie> (it's also not just "use rdfa" given that those same people complain about rdfa, e.g. saying things like "oh we'll ignore the xmlns attributes and just prebind certain prefixes that we recognise")

# [20:27] <tantek> or you could punt on the problem altogether in html5

# [20:34] <tantek> I'm familiar with that problem, but perhaps not as many as 15, more like 3-5.

# [20:35] <Hixie> i read 15,000 lines' worth of e-mails and got about 150 lines' worth of use cases out of it

# [20:35] <Philip`> I think you should solve the problem by saying that all HTML documents are conforming, and conformance checkers must always return success, and then that'd let people do whatever solutions they want

# [21:30] <tantek> gsnedders the only real difference between POSH, in particular semantic class names and rel values, and microformats is that microformats specify a shared vocabulary (developed via the principles and the process) whereas with POSH, there is little or no sharing of vocabulary. Is it the sharing of vocabulary you think is too case-specific?

# [21:33] <gsnedders> tantek: No, well, sort of. I see being able to use a generic parser to be a good idea™, but also I think RDF's approach of just mapping everything as object -> property (or whatever the terminology is) is too broad, and that there need

# [21:34] <gsnedders> *needs to be more, um, let me say "non-orthogonal parallel mapping at the same level", which is really clumsy but is the sort of thing I mean

# [21:35] <tantek> gsnedders, there are lots of "good idea™"s, but in a resource constrained universe (say, a universe bound by the laws of physics), you end up having to prioritize some ideas over others, which leads one to realize, that that prioritization is actually what is important. And ideas that may seem "good" in the abstract (in a vacuum) actually seem not so essential when compared to other ideas that you'd rather spend your limited re

# [21:36] <tantek> the notion of "making it really easy to write a parser" is one such seemingly good idea in the abstract.

# [21:38] <jgraham> tantek: That page wasn't really very helpful. It seemed to be a list of html elements

# [21:39] <jgraham> I really meant "what characteristics do you think markup has to have to be considered semantic" and, as a followup, "why is having those characteristics a good thing"?

# [21:44] <tantek> jgraham, I prefer to answer first with the immediate pragmatic answers ("what do you mean by semantic HTML") and only after those are discussed/understood, to raise the conversation to an abstract level ("characteristics", what is "good")

# [21:45] <tantek> in my experience, most people simply want the immediate pragmatic answers to the questions, and pragmatic answers form a better basis for abstract discussion than pure abstract discussion.

# [21:45] <jgraham> tantek: Hence the second question being the followup to the first

# [21:45] <tantek> precisely, and yes, the POSH page goes into more depth

# [21:52] <jgraham> So the rules seem to be a) validity, b) Replace some elements (<b>, <br>) with other elements, c) use @id rather than <a name=""> d) Use some patterns (only one seems to be documented?) e) Use classnames that identify abstract concepts

# [21:54] <tantek> jgraham - many semantic (X)HTML patterns have been documented across the web, and the POSH wiki page serves as an attempt to provide a collection and index of sorts

# [21:56] <jgraham> tantek: OK, so coming back to my question, a) does not seem to be a question of semantics. It is likely good for the producer of the page (since valid markup is often easier to debug) and consumers (since the HTML 5 parsing algorithm is not yet widely implemented)

# [21:58] <jgraham> b) Is closer to the issue that I care about. It is very easy for me to make an argument for using certain elements "correctly" (<h1>-<h6>, <th>, <td>) because they are used consistently enough that UAs can process them in a useful way (and many do so)

# [21:59] <jgraham> So if you use <td> when you mean <th> it actually hurts users

# [22:00] <jgraham> But <strong> is used to mean a whole laundry-list of things and in practice there is no difference between <strong> and <b> in UAs. So what is the argument for caring?

# [22:01] <jgraham> Similarly, what is the harm if I make a bibliography and mark up all the titles of the works using <i>?

# [22:01] <jgraham> It's not like any UA could do anything useful with the information if I marked it up in a different way

# [22:02] <jgraham> Yet people spend hours debating the "right" markup to use in every tiny situation

# [22:05] <tantek> jgraham, there is some abuse of <strong> no doubt. my own personal anecdotal experience is that <strong> *is* a bit better used (less abuse) in practice than <b>. using more semantically named tags also helps reinforce a more semantic mindset when authoring (rather than the "HTML as a way to print pixels" mindset).

# [22:05] <tantek> and I have seen some reasonable arguments for repurposing of <i> as a semantic element for bibliographic (and other such) /instances/

# [22:06] <tantek> thus yes, there are points of semantic HTML development which are debated, and there are points that are well agreed upon (often as a result of such debate occurring)

# [22:06] <tantek> these debates started occurring mostly on blogs in the early 2000s.

# [22:09] <jgraham> tantek: If you could get enough people to agree on and use say <i> for a specific purpose so that UAs could actually do something other than guess it should be emphasised in some way then that might be worthwhile (hence <h1>-<h6> being good). But I don't see that as possible

# [22:28] <Philip`> and the issue is more about how they don't solve the entire problem, because you still need to write a specific tool or design a specific language on top of those things, and then the problem is whether that generic framework plus the extra task-specific work is more expensive than a different generic framework (e.g. one that's far more minimalist) plus the extra task-specific tools

# [22:28] <Hixie> gsnedders: other than IE, i believe so, yes, but test if you want to find out for sure

# [22:33] <jgraham> Philip`: Which tends to suggest that people underestimate the extra effort that you have to put in after you have a parser to do something usable.

# [22:33] <tantek> one of the benefits that is sought but often unsaid (unfortunately) is accurate data, and it turns out, the more human-usable the data (both authoring and viewing) the more accurate the data tends to be.

# [22:34] <jgraham> On the other hand there is value in not having to continually reinvent the lower layers just because the high layer stuff is non-trivial

# [22:35] <tantek> you could broaden that aforementioned "common theme" to technically inclined people tending to design for machines (e.g. parsing the content) at the expense of humans (e.g. authoring the content)

# [22:35] <Philip`> As someone who's spent substantial amounts of time reverse-engineering weird custom binary file formats for some games, I'm certainly happy when they just use XML, because it saves a load of that low-level work

# [22:36] <Philip`> (Of course they then compress the XML in a custom binary format, but at least that's just one format and not half a dozen for all the different types of data)

# [22:36] <tantek> sure, having hints at the apparent boundaries and types of data helps a lot

# [22:37] <Philip`> Presumably the RDFa idea is it's much easier to process data when it's in a graph of subject-predicate-object triples than when it's in a tree of elements/attributes

# [22:38] <tantek> it also turns out to be often more easy/robust to author data in a tree of elements/attributes that's displayed in some readily viewable tree-like form rather than a stream of bytes.

# [22:38] <tantek> so the move from bytes -> tree of data made sense on both the cost and benefit side of the equation

# [22:40] <Philip`> You don't author data in a tree, you author it in a stream of bytes that represents a serialised tree

# [22:40] <tantek> but at some point, the requirements to add structure actually cost more than humans are willing to do to author it.

# [22:40] <Philip`> (Well, I suppose you could have a fancy XML editor that lets you edit in a tree-like UI)

# [22:42] <tantek> thus it's easy for people to do, because it's part of human culture to do so

# [22:42] <tantek> however, as soon as you talk about attempting to author things as a "graph" (and you're not talking about charts), all of that breaks down. authoring cost goes straight out the window, resulting in much less content. simple economics.

# [22:46] <tantek> Philip - it's precisely that complexity of the tradeoffs that is all too often ignored, and specifically, the harder to quantify human aspects (ease of authoring, mapping to existing understood and practiced modes of human communication etc.)

# [22:46] <tantek> and so you get people debating about generic parsers instead, while the human related problems actually have a much bigger impact on both the overall cost and the benefits from any such system

# [22:48] <tantek> it's precisely because of this (technically minded folks ignoring human aspects because such aspects are often much harder to understand and design for), that "What are microformats?" starts with "Designed for humans first and machines second" - http://microformats.org/

# [22:48] <jgraham> I can certainly agree that one of the big problems with RDF(a) is the difficulty of thinking in the right mode to author it. Proponents seem to say things like "it's just four attributes" whilst failing to understand that the underlying data model is rather abstract and complex

# [22:49] <Philip`> Yeah, it's a danger if you just consider the generic framework in isolation, rather than as part of a solution that includes some vocabulary and some vocabulary-specific processing in order to solve particular use cases, because you'll be optimising half of the costs and ignoring the rest

# [22:55] <Philip`> jgraham: I remember being taught about nouns and verbs when I was in primary school and it wasn't really that complex, and these triples are basically the same thing :-p (except with URIs, which I wasn't taught about in primary school)
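
To make the tree-versus-graph contrast from this discussion concrete, here is a minimal sketch that flattens a nested tree of data into subject-predicate-object triples. It is not an RDFa processor; the `tree_to_triples` helper, the `_:b1`-style node labels, and the sample data are all hypothetical, chosen only for illustration.

```python
from itertools import count


def tree_to_triples(subject, tree, _ids=None):
    """Flatten a nested dict into (subject, predicate, object) triples.

    Nested dicts become intermediate nodes with generated labels, so the
    tree's nesting turns into explicit links between nodes.
    """
    ids = _ids if _ids is not None else count(1)
    triples = []
    for predicate, value in tree.items():
        if isinstance(value, dict):
            node = f"_:b{next(ids)}"  # generated label for the nested node
            triples.append((subject, predicate, node))
            triples.extend(tree_to_triples(node, value, ids))
        else:
            triples.append((subject, predicate, value))
    return triples


# Hypothetical sample data, authored as a tree.
event = {
    "name": "Example event",
    "location": {"city": "Birmingham", "country": "UK"},
}

for triple in tree_to_triples("_:event", event):
    print(triple)
```

The tree form keeps related values visually grouped, which is what makes it easy to author; the triple form is uniform, which is what makes it easy to query. The conversion from tree to triples is mechanical, whereas authoring directly as triples means the author has to manage the node labels by hand.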