Wrapping an anchor around a block-level element in XHTML doesn't validate. It might be well formed in HTML5.

Just a slight correction. "Well formed" is an XML concept; you could apply it to XHTML5 but not HTML5. HTML5 speaks of "conformance," so what XML calls well-formedness would be conformance there, and possibly "conformal" would be the equivalent term for "well formed."

The W3C DOM Core module defines how to access, read and manipulate an XML document. Well-formed HTML documents are XML documents, so these methods and properties can be used to completely rewrite any HTML page, if you so wish.

Though HTML documents are XML documents, they have a number of special features that the average XML document doesn't have. The W3C DOM HTML module defines these special cases and how to deal with them.

So you're saying you can only consider things well-formed in xhtml and not html?

Weeeellll......

OK. I didn't intend to start a trivia discussion, but that's a fair question.

You can *talk* about well-formedness in HTML 2-4 and in all versions of XHTML, but not in HTML 1 or 5. Why?

Because I oversimplified things just a bit, partly because I didn't want to bog things down with discussions of things other than the difference between HTML5/XHTML5, but I see now that was a mistake on my part.

I said "well-formedness was an XML concept," when, if I were to be absolutely and universally precise I should have said, "well-formedness is a concept XML inherited from SGML." Since there was a wholehearted attempt to make HTML a subset of SGML in versions 2-4, you *can* apply those concepts to those versions of HTML in a futile way (futile, because no browser depended upon well-formed HTML).

The post I was replying to used the term in an HTML5 context, which is why I answered the way I did: because XML has the concept of well-formedness, XHTML5 can be well-formed, but HTML5 itself has no concept of well-formedness, only of conformance, so the term shouldn't be used there.

But I think your PPK quotes are out of context. Specifically the second: Peter-Paul Koch most assuredly knows that not all HTML documents are XML documents. I think the context for that quote assumes "well-formed" instead of plain HTML.

In fact, the preceding requirements are the only ones that you must satisfy to make your HTML files well-formed XML. It doesn't matter which browser's HTML extensions you use or whether you "abuse" HTML tags or not. XML is a truly liberal language; it makes you a creator of your own universe whose rules you're unlikely to break simply because it's you who establishes them.

That page doesn't contradict a single word I wrote. It speaks of well-formedness as an xml concept and applies it to an older version of HTML. In short, it says the same thing I said.

If you want to extend the practice to HTML5, then you're simply writing "bilingual" XHTML5 (sorry, I'm blanking on the official term for it, but I'm after the flavor of HTML5 that can be served with either XML or HTML MIME types). Well-formed still doesn't apply to vanilla HTML5.

It is a widespread misconception that well-formedness is a concept that can be applied to HTML or SGML, but that is not technically correct. The SGML and HTML specs do not define such a concept. (Try searching for "well-formed" in http://www.w3.org/TR/html4/html40.txt )

Originally Posted by Arlen

I said "well-formedness was an XML concept," when, if I were to be absolutely and universally precise I should have said, "well-formedness is a concept XML inherited from SGML."

No. It's a concept that XML invented. (It was invented in order to support DTDless parsing.)

In SGML, a document instance is either valid (according to its DTD and SGML declaration), or it is not.

A well-formed document conforms to the XML syntax rules; e.g. if a start-tag (< >) appears without a corresponding end-tag (</>), it is not well-formed. A document not well-formed is not in XML; a conforming parser is disallowed from processing it.
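To make the quoted rule concrete, here is a minimal sketch in Python, assuming the standard library's `xml.etree.ElementTree` (which checks syntax only and does not validate against a DTD). The helper name `is_well_formed` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the XML parser accepts the markup without a syntax error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # The parser stops at the first well-formedness error.
        return False


print(is_well_formed("<p>closed paragraph</p>"))   # True
print(is_well_formed("<p>unclosed paragraph"))     # False: start-tag with no end-tag
print(is_well_formed("<p>line break<br></p>"))     # False: <br> is never closed
print(is_well_formed("<p>line break<br/></p>"))    # True: empty-element syntax
```

Note that this only tells you whether the markup is well-formed XML; it says nothing about whether it is valid XHTML.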

Also, I have validated some of my XHTML served as text/html (which is HTML) in an XML validator and it passed.

I have yet to see documentation stating that I am incorrect in my belief.

I didn't say you were incorrect in your belief, I was just asking a few questions. They were not rhetorical, by the way; I'm still curious.

I agree that the XML spec says that any stream of bytes that matches the syntax for XML is XML, even if that stream of bytes is labeled as being something else. This is because the XML spec does not discuss the transport layer at all.
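As a rough illustration of that point, the sketch below hands an XML parser a stream of bytes with no label at all. The parser never sees a MIME type, so it accepts the bytes purely on syntax. The payload and the helper name `title_of` are invented for the example; the parser is Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical payload: XHTML bytes as they might arrive labeled text/html.
# The label is not part of the bytes, so the XML parser cannot know about it.
payload = (
    b'<html xmlns="http://www.w3.org/1999/xhtml">'
    b"<head><title>Demo</title></head>"
    b"<body><p>Hello<br/>world</p></body>"
    b"</html>"
)


def title_of(data: bytes) -> str:
    """Parse the bytes as XML and return the text of the XHTML <title>."""
    root = ET.fromstring(data)
    title = root.find(f"{{{XHTML_NS}}}head/{{{XHTML_NS}}}title")
    return title.text if title is not None else ""


print(title_of(payload))
```

Whether a browser should treat those same bytes as XML is a separate question, which is where the transport layer comes in.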

The requirements of the HTTP transport layer are given in RFC 3023, which says which content you should treat as XML. text/html is not part of it. The RFC for text/html says that

Originally Posted by RFC 2854

Implementors of text/html interpreters must be prepared to be "bug-compatible" with popular browsers in order to work with many HTML documents available the Internet.

(See -- it says you should be bug-compatible! It doesn't even say you should use an SGML parser for text/html.)
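For a feel of what a forgiving text/html-style parser looks like in practice, here is a small sketch using Python's standard `html.parser.HTMLParser`. This is an assumption-laden stand-in: it is a lenient tokenizer, not an implementation of any browser's behavior or of the HTML5 parsing rules. The class `TagCollector` and the function `start_tags_in` are hypothetical names.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record every start tag the lenient parser reports."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def start_tags_in(markup: str) -> list:
    """Feed markup to the lenient parser and return the start tags it saw."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags


# Nothing here is closed, yet no exception is raised.
print(start_tags_in("<p>unclosed<br><b>bold"))  # ['p', 'br', 'b']
```

An XML parser would reject that input outright; the lenient parser just reports what it finds and carries on.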

(RFC 2854 will probably be obsoleted by a new RFC that makes this statement clearer by saying something like "Implementors of text/html interpreters must follow the HTML5 parsing rules.")
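Going back to the RFC 3023 point, a consumer that decides by label rather than by sniffing bytes might look roughly like this. The list of types is my reading of RFC 3023 (its registered XML media types plus the "+xml" suffix convention), and `is_xml_media_type` is a hypothetical helper, so treat it as a sketch rather than a complete implementation.

```python
# Media types RFC 3023 registers for XML (assumed complete for this sketch).
XML_MEDIA_TYPES = {
    "text/xml",
    "application/xml",
    "text/xml-external-parsed-entity",
    "application/xml-external-parsed-entity",
    "application/xml-dtd",
}


def is_xml_media_type(content_type: str) -> bool:
    """Decide from a Content-Type header value whether to use an XML parser."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in XML_MEDIA_TYPES or media_type.endswith("+xml")


print(is_xml_media_type("application/xhtml+xml; charset=utf-8"))  # True
print(is_xml_media_type("text/html; charset=utf-8"))              # False
```

So the same XHTML bytes get an XML parser under application/xhtml+xml and an HTML parser under text/html, regardless of how well-formed they happen to be.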