Would it be a major piece of work to get Tidy to recognise whitespace within
tags? Is that valid HTML that Tidy should be trying to cope with, or is it
badly written HTML?
Badly written HTML.
End-tag ::= '</' name white-space* '>'
Start-tag ::= '<' name (white-space+ attribute-name white-space*
('=' white-space* attribute-value)?)*
white-space* '>'
White space characters, including line breaks, are allowed before the
closing '>', but NOT after the opening '<' or '</'.
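
As a rough illustration, the two productions can be transcribed into
regular expressions. This is only a sketch in Python: the grammar does not
spell out name, attribute-name, or attribute-value, so the NAME and
ATTR_VALUE patterns below are my own assumptions, and is_start_tag and
is_end_tag are hypothetical helper names, not anything Tidy provides.

```python
import re

# Assumed lexical definitions; the grammar above leaves these unspecified.
NAME = r"[A-Za-z][A-Za-z0-9.:_-]*"
WS = r"[ \t\r\n]"
ATTR_VALUE = r"(?:\"[^\"]*\"|'[^']*'|[^\s\"'>]+)"

# End-tag ::= '</' name white-space* '>'
END_TAG = re.compile(rf"</{NAME}{WS}*>")

# Start-tag ::= '<' name (white-space+ attribute-name white-space*
#               ('=' white-space* attribute-value)?)* white-space* '>'
START_TAG = re.compile(
    rf"<{NAME}(?:{WS}+{NAME}{WS}*(?:={WS}*{ATTR_VALUE})?)*{WS}*>"
)


def is_start_tag(text):
    """True if the whole string matches the Start-tag production."""
    return START_TAG.fullmatch(text) is not None


def is_end_tag(text):
    """True if the whole string matches the End-tag production."""
    return END_TAG.fullmatch(text) is not None
```

With these patterns, "<em >" and "</i >" match, while "< em >" and
"</ i >" do not, because nothing in either production permits white space
between the opening delimiter and the name.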
Strictly speaking, in SGML, if you have a '</' that is not followed by
a letter, it is character data. If you have a '<' that is not followed
by a letter, slash, bang, or question mark, that's data.
So
Here < I > stand and < em > can do other </ i >.
is all data with not a tag in sight.
(This isn't the full truth, but it's right for the SGML features that
HTML selects.)
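
The recognition rule can be sketched in a few lines of Python. This
covers only the simplified rule as stated here, not full SGML delimiter
recognition, and markup_starts is a hypothetical helper name of my own.

```python
import re

# '<' opens markup only when followed by a letter, '!' or '?';
# '</' opens markup only when followed by a letter.
# Any other '<' is treated as character data.
MARKUP_OPEN = re.compile(r"<(?:[A-Za-z!?]|/[A-Za-z])")


def markup_starts(text):
    """Return the offsets of every '<' that begins markup under the rule."""
    return [m.start() for m in MARKUP_OPEN.finditer(text)]


sample = "Here < I > stand and < em > can do other </ i >."
print(markup_starts(sample))               # no markup found in the sample
print(markup_starts("Here <I>stand</I>"))  # offsets of '<I>' and '</I>'
```

For the sample sentence the list comes back empty, which is the "all
data" reading; remove the spaces after '<' and '</' and the same function
reports the tag positions.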
In XHTML, a '<' that is not followed by a letter, slash, bang, or question
mark, or a '</' that is not followed by a letter, is flat-out illegal.