Top 3 HTML Straw Man Arguments

In the sense used here, a "Straw Man Argument" is a
misrepresentation of the opposing view, set up in such a way that it is
easy to demolish. This "set-up" is meant to bring the opponent's position
into disrepute, in the hope of avoiding having to address the real arguments.

There are several Straw Man Arguments that appear on the WWW
authoring newsgroups so often that they are no longer funny.

Before we start counting, let's examine the myth of the "HTML Purist".
No-one has captured a real live HTML Purist who genuinely expresses
the views attributed to him - or her - in the Straw Man
Arguments, so we can't put one "on the stand" and question them about
their beliefs.
Nevertheless, the myth of the "HTML Purist" is one that has persisted
since quite early in the life of the WWW:
although the details have changed, the underlying principles remain.
Well, I am one of the people accused of being an HTML Purist, so, in
default of finding a genuine witness, you're now getting my views on the
matter.

The HTML Purist stands accused of wanting authors to use only the
HTML "Lowest Common Denominator"[c] of all
available browsers, and of wanting HTML used only in such a way
that it "looks the same on all browsers".

Response:

The second claim is particularly absurd,
since HTML purists want HTML documents to be accessible to
speaking browsers and to indexing
robots as well, where they self-evidently don't "look
the same" as on mainstream browsers.

The HTML language is designed,
from the start, to represent the logical structure of the content
(emphasis, blockquote, cite etc.) which can be depicted in different
ways according to the situation in which it's being rendered -
graphics display, with or without auto image loading; character mode
display; speaking machine; etc. - last but by no means least the
indexing robot, without which your page is effectively becalmed on the
vastness of the WWW without hope of rescue.

"HTML Purists" look to the browser developers to develop
increasingly attractive and novel ways to present the content[d]. They consider that the HTML author's job
is primarily to create, organise and mark up their content, whatever it
may be, such that each one of a whole range of browsers can present the
content to the best of its ability, covering between them a
whole range of different presentation situations. They were frustrated
that the browsers' inherent ability to present well-structured HTML
seemed to have stagnated somewhere around the heyday of NCSA Mosaic (and
even gone backwards in some respects), while the browser makers added
more and yet more glitzy effects that had little or nothing to do with
the quality of their HTML rendering. Newcomers then approached HTML as if
it were a DTP (desktop publishing)
page design language, missing the point of the flexibility
that was designed into HTML from the start.

So vendors added more and more presentation-specific hacks to HTML,
resulting in the awful compromise that was HTML/3.2.
The upshot was that the HTML Purists were frustrated at the poverty of
design exhibited by browsers, and at browsers' relative inability to
adapt to different browsing environments and user choices, while at the
same time the "DTP designer" crowd were frustrated at HTML's inability to
guarantee the precise appearance that they had convinced themselves
they needed; and nobody felt that they were doing a good job on the WWW.
So sad.

Meantime, ingenious authors were trying to make good the
browsers' lack of user functionality by devising additional client-side
gadgets based on, typically, JavaScript. That might not be such a bad
idea in an otherwise hopeless situation, but it does mean that every web
site seems to have a new and incompatible collection of user interface
gadgets, needing to be learned afresh as each site is
visited, instead of having the tools present right there in the browser
and learned once.
And in all too many cases these
ingenious authors contrive to make their web sites impossible to navigate by
conventional means, making them entirely dependent on their own
JavaScript gadgets, and thus unusable for readers who may not
be willing, or may not be permitted for security reasons, to execute
client-side scripts.
(A hint for browser users: fight back with bookmarklets,
snippets of JavaScript which you execute under your
own control.)

But HTML 4 reversed the trend by deprecating presentation-specific
attributes in HTML, and re-establishing the long-standing principle
of structural markup, with presentation proposals delegated
to stylesheets.
On the browser front too, we have seen improvements, with browsers
increasingly doing a good job of appealing to their users -
whereas previously there were rumours that browsers were being designed
primarily to deliver uncomplaining readers to wealthy advertisers.

This "HTML Purist" encourages you, as author,
to specify presentation as much as you find
appropriate, provided your message still makes sense when the
presentation details fail[e].

It simply isn't true to claim that HTML purists tell you to avoid
all newly-defined markups. Often the newly-defined markups have useful
fallback behaviours, and so those browsers that implement the new markup
get the benefit in terms of improved presentation, while those that
don't can still access your content or message.

Conclusion:

HTML Purists do not for a moment claim
that appearance isn't important - far from it: some of them
indeed have long been campaigning for better stylesheets support
in browsers.
They are happy for you to improve the
appearance of your pages for your favourite browsing situation(s) -
just so long as they are capable of graceful fallback in
other browsing situations, without impairing the content of
your message[g].

The HTML Purist stands accused of wanting everyone to see HTML
documents with black text on a mid-grey background, just like Netscape's
browser, and NCSA Mosaic before it, had always done when installed
straight out of the box.

Response:

HTML Purists consider that every reader
has the right to choose how they read their text. That means, the
freedom to choose the author's proposals, if any, or to reject them if
the reader finds them inappropriate. The problem with printed books, for
instance, is that everyone has to make do with the same font size, the
same paper colour..., irrespective of their choices or abilities. One of
the benefits of the WWW is that readers are relieved of this problem,
for example if they suffer from poor eyesight or colour-blindness.

Nevertheless, pages can be made more visually interesting, to those
readers that have no problems with it. The principle is to find ways
that work well, for those readers who can and wish to take advantage of
them, without impairing the results for others. HTML markup techniques
can (and, in recognition of the reality of access on the WWW, "should")
be assessed for their ability to fall back gracefully, when viewed in a
wide range of browsing situations and settings. To take a concrete
example: the HTML/3.2 BODY color attributes could be used effectively,
so long as the author specified all of the
color attributes, since browsers typically allow the reader to
insist on their own color configuration if they need to;
in CSS there is a similar best-practice principle of specifying
explicit colours either for both text and background,
or for neither, at any given specificity.
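
To make the both-or-neither principle concrete, here is a minimal sketch in Python of a checker that flags style rules which set a text colour without a background, or the other way round. It is an illustration only: the function name unpaired_colour_rules and the sample stylesheet are made up for this example, the parsing is deliberately naive (no comments, no nested blocks), and checking rule by rule is only a rough stand-in for "at any given specificity".

```python
import re


def unpaired_colour_rules(css):
    """Return the selectors of rules that set only one of text colour / background."""
    flagged = []
    # Naive parse: "selector { declarations }" with no comments or nested blocks.
    for selector, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        # Collect the property names declared in this rule.
        props = {
            decl.split(":", 1)[0].strip().lower()
            for decl in body.split(";")
            if ":" in decl
        }
        has_text = "color" in props
        has_background = bool(props & {"background", "background-color"})
        # Both or neither is fine; exactly one is the risky case.
        if has_text != has_background:
            flagged.append(selector.strip())
    return flagged


# Hypothetical stylesheet, purely for illustration.
sample = """
body { color: #000; background: #fff; }
h1 { color: navy; }
p { margin: 1em; }
"""

print(unpaired_colour_rules(sample))  # ['h1']
```

The h1 rule is the one to worry about: a reader whose own background happens to be dark blue would get navy headings on it, because the author proposed only half of a colour scheme.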

"HTML purists" can point to the mention of style sheets
already in the HTML 2.0 specification, and to the fact that one of the
earliest web browsers used a stylesheet to govern its presentation.
They can explain that, when used
appropriately, style sheets are the most reliable way to propose a
specific presentation for those readers able to take advantage, without
the risk of inadvertently impairing the presentation for those who
cannot. And they wonder why the browser makers took so very long to
get started on implementing them, and why it has taken so much
time and effort to get them deployed.

Finally, most of the "HTML Purists" known to me do not actually like
mid-grey backgrounds, and are quite baffled that this remained the
installation default for so long.
Of course, they support every
reader's democratic right to select a mid-grey background, if they
happen to be one of those few who really do prefer it!

Conclusion:

HTML purists have no objection to you
specifying a colour scheme if you wish, so long as you do it in
a way which can fall-back gracefully when circumstances are
against it.
HTML purists equally
defend an author's right to specify no colour scheme, although
many of them find it curious how long the installation default remained
mid-grey in spite of so many complaints about it.

Disclaimer: This document consists almost entirely of boring text.
If you think that proves anything, there's this bridge you
might be interested in buying...

Response:

The ambitious aim of the WWW was to be (and I quote) "the embodiment of human
knowledge". Not surprisingly, that encompasses more than
just "boring text": photos, diagrams, audio, video, client-side
scripting[h],
VRML, and more.
Long before Netscape introduced their scripting language (and then
suddenly confused everyone by renaming it JavaScript), there was an
experimental browser which offered
client-side scripting
(and, by the way, was another early browser which had a form of stylesheets).
HTML Purists often refer back to this early browser, Viola: so what was
that again about them wanting "only boring text"?

The HTML
purist calls attention, however, to the fact that text is the one medium
that can be made accessible to all browsing situations: it can be
presented on a character-mode display or on a graphic display, it can
be printed out for later consideration, it can be input to a Brailler,
or fed to a speaking machine for a vehicle driver, a telephone caller
or a blind reader. By comparison, images cannot be
read out over the telephone, nor are they very accessible to a
blind reader, nor can they be indexed by robots in any content-related
fashion; audio cannot be perceived without appropriate equipment, nor by
a deaf reader, nor is it always convenient (in a library, for
example). In short, text is the most widely accessible
of all the media, as well as being the most reliable
way to get your material indexed by the web robot services.

HTML Purists recognize that readers may access your pages in more
than one way: from the multimedia station at home over dial-up; from the
now-quite-old PC their company has at the office; from the top-end
workstation in the design office; from WebTV and similar TV-based
appliances; from a laptop with an inadequate
display, dialling up over an overpriced hotel phone line;
from a palmtop/cellphone combo; from the
public library... And the trend is clearly towards ever
more diverse browsing situations.
If you repel these readers, keep them pointlessly
waiting, or stick a load of unusable junk onto their display, without
justifiable reason as far as they can see, what's the
chance that they'll bother with your pages even when they're in the
situation that fits your demands? Instead, they might be so annoyed as
to complain about you to their friends and colleagues, something that
you would do well to avoid.

On the other hand, if you welcome them with the basic information
that they need, they'll be realistic enough to understand that your
other excitements (VRML, whatever) were inaccessible to them for
justifiable reasons, and might want to revisit later when they're in a
position to enjoy them.

Conclusion:

In recognition of realities, HTML Purists encourage
you to make full use of all appropriate media for
your purpose, but to ensure that the core of your message
be available as well-marked-up text, so that it can be perceived by
any reader in any reasonable browsing situation.

Those pages with meaningful text will be the ones that count at the
robot indexers. Those who welcomed the indexing robot with 'helpful'
messages like "get a proper browser", "your browser does not support
frames" etc. are right there on the record: you surely don't want to
join them?
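
As a rough illustration of what a text-only visitor such as an indexing robot is left with, here is a minimal Python sketch using the standard library's html.parser. It is not how any particular robot works: the class TextOnly, the function visible_text and the sample pages are all invented for this example, and the only "fallback" it understands is the alt text of an image.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Keep only what a text-only reader gets: character data plus image alt text."""

    def __init__(self):
        super().__init__()
        self.pieces = []

    def handle_starttag(self, tag, attrs):
        # An image contributes nothing unless the author supplied alt text.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.pieces.append(alt)

    def handle_data(self, data):
        self.pieces.append(data)


def visible_text(html):
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so the result reads as one line.
    return " ".join(" ".join(parser.pieces).split())


# Hypothetical pages, purely for illustration.
image_only = '<h1><img src="banner.gif"></h1>'
with_alt = '<h1><img src="banner.gif" alt="Acme Widgets price list"></h1>'
frames = (
    '<frameset cols="20%,80%"><frame src="nav.html"><frame src="main.html">'
    "<noframes>Your browser does not support frames</noframes></frameset>"
)

print(repr(visible_text(image_only)))  # ''
print(repr(visible_text(with_alt)))    # 'Acme Widgets price list'
print(repr(visible_text(frames)))      # 'Your browser does not support frames'
```

The first page offers such a visitor nothing at all, and the third offers only the unhelpful message, which is then all there is to put on the record.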

Summing up

The HTML "Purists" are really "Pragmatists". They have seen what happens
when over-ambitious pages collapse in a heap, in browsing situations
that are just a little outside of what the author expected. They have
some familiarity with the HTML specifications and drafts, and the actual
browsers that are out there, and they have developed some idea of how to
enhance their pages for a high-end browsing situation without causing
the page to become inaccessible to other situations.

They would, indeed, love to be able to rely on the
full range of the many useful HTML constructs that have been drafted
over the years, but they recognize that it is unrealistic to do so. They
are authoring for the World Wide Web, for the readers that are out
there, with their various abilities, using the variety of
browser/versions that readers use; they are not authoring exclusively
for some vendor-narrow or 21-inch-screen-24-bit-color readership. They
understand how to use optional enhancements, in ways that do not harm
those unable to use those enhancements; and try to avoid relying
on a feature without which the content may make no sense, or
become misleading. They make their authoring decisions accordingly.

Pedantic notes

[b]
There never was an "HTML/1" - the original HTML
didn't have a number at the time, and HTML 2.0 was the first
version to be numbered and codified.

[c]
Lowest Common Denominator is another of
those terms borrowed from a specialist field (in this case
of course mathematics), where they have
a precise technical meaning, and misused to mean something
quite different, just for the sake of sounding impressive.
To get the sense that's intended in common parlance, the correct
mathematical term would be "Highest Common Factor".

[d]
Most browsers have really been very primitive, lacking even
the most obvious user-friendly refinements such as Overviews;
multiple windows into a corpus of
knowledge (e.g. a user-oriented "frame" view, rather than
the crude and limited author-oriented "frames" "design" introduced
by Netscape); user-driven access to footnotes and asides; well-designed
print formatting; etc., all based on fitting one, well-authored
well-structured HTML document to each of many different browsing
situations and user choices, optionally with hints taken from
an author-provided style sheet.

"Purists" consider that the HTML author's job is primarily to marshal
their content, and mark it up honestly according to its logical structure.
This can then be presented beautifully, in those browsing situations that
are capable of it, by supplying appropriate stylesheet(s), without
harming the accessibility of the content to more-unusual browsing
situations.
Many newcomers seem blissfully unaware of what could be possible, and
simply take it for granted that HTML is nothing more than another
DTP facility for them to spend their
time and effort "designing" page layouts, one at a time, when good
stylesheet design would leverage its designer's ability across a
whole corpus of work, and good browser design could leverage
the browser designer's expertise across the whole WWW.

Purists do not for a moment decry good graphic design, indeed
they welcome it when used in genuinely effective ways. All too
often, though, on the WWW, "graphic design" is applied in misguided
ways that only work in a limited range of browsing situations, and
are actively hostile to effective browsing in other situations.

[e]
Don't say "click on the green text" - you have
no idea whether the text is green for every reader.
In fact, don't say "click on" at all; make the actual purpose
of the link into the link's active text,
leaving the sense to flow naturally.
As an old but still relevant Web Style Guide at the W3C
said, "Don't mention the mechanics".
There are some good reasons for this, quite apart from it being good
style: some browsers and indexers can give a summary
of the links on a page, and it really isn't helpful when the summary
reads something like this:

  Click here
  Click here
  and here

without any indication in the active text of the purpose of each link.
You say you've never seen such a useful feature in a browser?
Well, Microsoft provided this as an add-on
for IE5, for example; and recent versions of Opera (e.g. 7.01)
include such a feature as standard, making it possible to show
a link list in at least two different ways: on the "links panel",
or by means of the "View" > "Links in Frame" menu.
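
For anyone who wants to see the effect without such a browser to hand, a link list is easy to approximate. The following is a minimal sketch in Python using the standard library's html.parser; it is not how IE5 or Opera build their lists, and the names LinkLister and link_summary, the sample markup and the file names in it are invented for the illustration.

```python
from html.parser import HTMLParser


class LinkLister(HTMLParser):
    """Collect the active text of each <a href> element, as a link summary would."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # text pieces of the link being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href"):
            self._current = []

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Normalise whitespace inside the link text.
            self.links.append(" ".join("".join(self._current).split()))
            self._current = None


def link_summary(html):
    lister = LinkLister()
    lister.feed(html)
    lister.close()
    return lister.links


# Hypothetical markup, purely for illustration.
vague = (
    '<p><a href="guide.html">Click here</a> for the guide, '
    'and <a href="faq.html">here</a> for the FAQ.</p>'
)
clear = (
    '<p>See the <a href="guide.html">authoring guide</a> '
    'and the <a href="faq.html">FAQ</a>.</p>'
)

print(link_summary(vague))  # ['Click here', 'here']
print(link_summary(clear))  # ['authoring guide', 'FAQ']
```

Both paragraphs point at the same two documents; only the second produces a list that means anything when read on its own.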

See also the next footnote, for a specific problem
of this kind.

[f]
For dealing with adverse viewing situations, browsers
typically offer the user the ability to override the
author's colour specifications.
However, Netscape versions up to and including
Netscape 4 were notorious for allowing the author's
font color specification to take precedence over the
user's "use my colors" option: thus, if the author's text colour
happens to match the user's choice of background, the text disappears.
It's kind-of poetic that font color
was at its most risky on the
browsers from the very vendor that introduced this extension.
Style sheets, even in the face of buggy implementations,
are a safer way to handle colour proposals.

[g]
...and if your material is of the kind where that genuinely isn't
possible (calligraphy, say) then you really should be asking yourself
whether HTML is the appropriate medium for your material.
There's nothing wrong, in appropriate cases, with using HTML merely
as a thin deposit of "hyperglue" to paste together the various media
that you use for your graphic designs or whatever, but this isn't what
I'm talking about when I'm seriously discussing the "authoring of
HTML for the WWW".

[h]
There's a curious superstition around that the "purist" insistence
on syntactically valid HTML means that client-side scripting is ruled
out. This is entirely
wrong, but the propagation of factually-incorrect
myths and superstitions seems to be an
essential part of the "anti-purist" approach.

[?]
If you aren't using a CSS-aware browser, that's no problem:
this page has been designed to work with or
without, although naturally I think it's better with the style sheet.
By now, you will probably have worked out that these numerous
footnotes were intended as a bit of an academic-type joke.