Ian Hickson wrote:
> How is this different from what HTML4 did? HTML4 said "this is what is
> valid, and everything else should work too". And the browsers by and large
> did this, in an interoperable fashion (at great cost, and in a manner that
> made it very hard to enter the market). How does this differ from HTML5's
> approach, other than HTML5 making competition easier?
HTML 4 allowed parsers to define their own error recovery. HTML 5
requires specific error recovery.
>> it very much raises the bar for implementing parsers
>
> This is demonstrably false, in that there are more interoperable HTML5
> parsers today, before the spec is even finished, than there have ever been
> interoperable HTML4 parsers. Even for valid documents of each.
Greater than zero (or perhaps one--can there be a single interoperable
parser?) is not a very high bar to hurdle.
> Absolutely. XML's approach has utterly failed on the Web (q.v. the
> universal feed parser for RSS and Atom). It would be amateurish of us to
> keep following this model after what we have learnt over the past ten
> years. We have a responsibility to the Web to do better.
There are reasons for that, mostly due to mistakes the W3C made in the
development of HTML. They pushed a syntax change without compensating
features to make the syntax changes worthwhile to implementers and
users. HTML 5 makes the opposite mistake: it's only pushing features
with no syntax changes. This seems likely to cause other problems.
> Also, I think it's pushing the truth a bit to say that draconian error
> handling is a core value of XML. The XML working group was quite split on
> the issue. [1]
They were split, but draconian error handling won.
>> It makes the spec far harder to understand and implement.
>
> Half of the error handling is almost implicit, in that the algorithm that
> says what you have to do just handles all cases without needing to be
> explicit. So that's not harder to understand. The other half might be
> somewhat more involved than ignoring error cases, but, well, tough. We're
> not making toast here, we're trying to define one of the most important
> platforms that humanity has ever used. If it's a little harder to
> understand, sobeit.
Straw man. I am not suggesting that one ignore error cases. I am simply
suggesting that one might wish to report them and indicate them as such,
rather than defining them out of existence. HTML 5 error handling is
much harder to implement than draconian error handling that refuses to
parse or display malformed documents. Is the additional difficulty worth
it? I'm not sure. Is the HTML 5 spec actually clear and unambiguous
enough to achieve that goal? Maybe, but I've learned to be cautious
about such ambitious goals.
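
To make the contrast concrete, here is a minimal sketch, assuming Python's
standard library. xml.etree.ElementTree stands in for a draconian parser that
reports the error and refuses to go on, and html.parser.HTMLParser stands in
for a lenient one. HTMLParser is only a tolerant tokenizer, not an
implementation of the HTML 5 parsing algorithm, and the names MALFORMED,
parse_draconian, TagCollector, and parse_lenient are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Mis-nested markup: <b> is never closed before </p>.
MALFORMED = "<p>unclosed <b>bold</p>"


def parse_draconian(markup):
    """Draconian style: return (tree, None), or (None, message) on any error."""
    try:
        return ET.fromstring(markup), None
    except ET.ParseError as err:
        # The error is reported to the caller; no tree is produced.
        return None, str(err)


class TagCollector(HTMLParser):
    """Lenient style: record whatever tags are seen, well-formed or not."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def parse_lenient(markup):
    """Feed the markup to the tolerant tokenizer and return its tag events."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.events


if __name__ == "__main__":
    tree, error = parse_draconian(MALFORMED)
    print("draconian:", "rejected" if tree is None else "accepted", error)
    print("lenient:  ", parse_lenient(MALFORMED))
```

The draconian path is only a few lines because it stops at the first error.
The lenient path here merely passes the mis-nested tags along; deciding what
tree to build from them, identically in every implementation, is the hard part
that a specified recovery algorithm has to cover.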
--
Elliotte Rusty Harold elharo@metalab.unc.edu
Refactoring HTML Just Published!
http://www.amazon.com/exec/obidos/ISBN=0321503635/ref=nosim/cafeaulaitA