6. Characters, Words, and Paragraphs

An HTML user agent should present the body of an HTML document as a
collection of typeset paragraphs and preformatted text. Except for
preformatted elements (<PRE>, <XMP>, <LISTING>, <TEXTAREA>), each
block structuring element is treated as a paragraph: the data
characters in its content and in the content of its descendant
elements are concatenated, and the result is split into words
separated by space, tab, or record end characters (and perhaps hyphen
characters). The sequence of words is then typeset as a paragraph by
breaking it into lines.
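
The procedure described above can be sketched roughly as follows. This
is only an illustration, not a normative algorithm: the function names
split_words and break_lines, the width parameter, and the greedy
line-filling strategy are all assumptions made for the example. The
input is assumed to be the data characters already extracted from a
block element and its descendants, and the optional splitting at
hyphen characters is not attempted.

```python
import re

def split_words(chunks):
    """Concatenate data characters, then split them into words.

    `chunks` is a hypothetical list of strings holding the data
    characters of an element and its descendants, in document order.
    Words are separated by space, tab, or record end (CR/LF) characters.
    """
    text = "".join(chunks)
    return [w for w in re.split(r"[ \t\r\n]+", text) if w]

def break_lines(words, width=40):
    """Greedy line breaking: fill each line up to `width` columns.

    A word longer than `width` is placed on a line of its own.
    """
    lines, current = [], ""
    for word in words:
        if not current:
            current = word
        elif len(current) + 1 + len(word) <= width:
            current += " " + word
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines
```

For instance, break_lines(split_words(chunks), width=72) would yield
the lines of one typeset paragraph. A real user agent is free to use
any line-breaking method, including proportional fonts and
justification.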