
After all the discussion of the World Wide Web and getting organized, with
so much text to read and so many concepts to digest, you're probably
wondering when you're actually going to get to write a Web page. That is,
after all, why you bought the book. Wait no longer! Today, you get to create
your very first (albeit brief) Web page and learn about HTML, the language for
writing Web pages. Along the way, you'll learn about the following:

What HTML is and why you have to use it

What you can and cannot do when you design HTML pages

What HTML tags are and how to use them

How you can use style sheets to control the look and feel of your
pages

What HTML IsAnd What It Isn't

Take note of just one more thing before you dive into actually writing Web
pages. You should know what HTML is, what it can do, and most importantly what
it can't do.

HTML stands for Hypertext Markup Language. HTML is based on the
Standard Generalized Markup Language (SGML), a much larger
document-processing system. To write HTML pages, you won't need to know a
whole lot about SGML. However, knowing that one of the main features of SGML is
that it describes the general structure of the content inside documents,
rather than its actual appearance on the page or onscreen, does help. This
concept might be a bit foreign to you if you're used to working with WYSIWYG
(What You See Is What You Get) editors, so let's go over the information
carefully.

HTML Describes the Structure of a Page

HTML, by virtue of its SGML heritage, is a language for describing the structure
of a document, not its actual presentation. The idea here is that most documents
have common elementsfor example, titles, paragraphs, and lists. Before
you start writing, therefore, you can identify and define the set of elements
in that document and give them appropriate names (see Figure
3.1).

If you've worked with word processing programs that use style sheets
(such as Microsoft Word) or paragraph catalogs (such as FrameMaker), you've
done something similar; each section of text conforms to one of a set of styles
that are predefined before you start working.

HTML defines a set of common styles for Web pages: headings, paragraphs,
lists, and tables. It also defines character styles such as boldface and code
examples. These styles are indicated inside HTML documents using tags.
Each tag has a specific name and is set off from the content of the document
using a notation that I'll get into a bit later.
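
As a rough preview of what such named elements look like in practice, here is a minimal sketch of a short page marked up only by structure. The tag notation itself is explained later in the chapter, and the page content (a made-up reading list) is purely illustrative rather than an example from this book.

```html
<html>
<head>
<!-- The title names the document; it is not part of the page body -->
<title>My Reading List</title>
</head>
<body>
<!-- A heading: says "this is a top-level heading", not how big it is -->
<h1>My Reading List</h1>
<!-- A paragraph of ordinary text -->
<p>These are the books I plan to read this summer.</p>
<!-- An unordered list with three items -->
<ul>
<li>A novel</li>
<li>A biography</li>
<li>A book about HTML</li>
</ul>
</body>
</html>
```

Notice that nothing in this markup mentions fonts, sizes, or spacing; each tag only names what kind of element the text is.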

HTML Does Not Describe Page Layout

When you're working with a word processor or page layout program, styles
are not just named elements of a page; they also include formatting
information such as the font size and style, indentation, underlining, and so
on. So, when you write some text that's supposed to be a heading, you can
apply the Heading style to it, and the program automatically formats that
paragraph for you in the correct style.

HTML doesn't go this far. For the most part, HTML doesn't say
anything about how a page looks when it's viewed. HTML tags just indicate
that an element is a heading or a list; they say nothing about how that heading
or list is to be formatted. Think back to the magazine example and the layout
person who formats your article: deciding how big the heading should be and
what font it should be in is the layout person's job. The only thing you have
to worry about is marking which section is supposed to be a heading.

NOTE

Although HTML doesn't say much about how a page looks when it's
viewed, cascading style sheets (CSS) enable you to apply advanced formatting to
HTML tags. Many changes in HTML 4.0 favor the use of CSS. And XHTML, which
is the current version of HTML as of this writing, drops almost all tags that
are associated with formatting (in its Strict form) in favor of Cascading
Style Sheets. I'll talk about both
XHTML and CSS later today.
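
To make that division of labor concrete, here is a minimal sketch of a style sheet attached to a page. The HTML marks the heading as a heading; the CSS rule, kept separately inside the style element, suggests how headings should look. The font, size, and color values are arbitrary choices made for illustration only.

```html
<html>
<head>
<title>Style Sheet Sketch</title>
<style type="text/css">
/* Presentation lives here: every h1 on the page gets these settings */
h1 {
  font-family: Georgia, serif;
  font-size: 200%;
  color: navy;
}
</style>
</head>
<body>
<!-- Structure lives here: the tag says only that this is a heading -->
<h1>Today's Specials</h1>
<p>The heading above is styled by the rule in the style sheet.</p>
</body>
</html>
```

If the style rule is removed, the page still works; the browser simply falls back to its own default heading style.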

Web browsers, in addition to providing the networking functions to retrieve
pages from the Web, double as HTML formatters. When you read an HTML page into a
browser such as Netscape or Internet Explorer, the browser interprets, or
parses, the HTML tags and formats the text and images on the screen. The
browser has mappings between the names of page elements and actual styles on the
screen; for example, headings might be in a larger font than the text on the
rest of the page. The browser also wraps all the text so that it fits into the
current width of the window.

Different browsers running on diverse platforms might have various style
mappings for each page element. Some browsers might use different font styles
than others. For example, a browser on a desktop computer might display italics
as italics, whereas a handheld device or mobile phone that has no italic fonts
might use reverse text or underlining instead. Or it might put a heading in all
capital letters instead of a larger font.

What this means to you as a Web page designer is that the pages you create
with HTML might look radically different from system to system and from browser
to browser. The actual information and links inside those pages are still there,
but the onscreen appearance changes. You can design a Web page so that it looks
perfect on your computer system, but when someone else reads it on a different
system, it might look entirely different (and it might very well be entirely
unreadable).

NOTE

In practice, most HTML tags are rendered in a fairly standard manner, on
desktop computers at least. When the earliest browsers were written, somebody
decided that links would be underlined and blue, visited links would be purple,
and emphasized text would appear in italics. They also made similar decisions
about every other tag. Since then, pretty much every browser maker has followed
that convention to a greater or lesser degree. These conventions blurred the
line separating structure from presentation, but in truth it still exists, even
if it's not obvious.

Why It Works This Way

If you're used to writing and designing documents that will wind up
printed on paper, this concept might seem almost perverse. No control over the
layout of a page? The whole design can vary depending on where the page is
viewed? This is awful! Why on earth would a system work like this?

Remember in Day 1, "The World of the World Wide Web," when I
mentioned that one of the cool things about the Web is that it's
cross-platform and that Web pages can be viewed on any computer system, on any
size screen, with any graphics display? If the final goal of Web publishing is
for your pages to be readable by anyone in the world, you can't count on
your readers having the same computer systems, the same size screens, the same
number of colors, or the same fonts that you have. The Web takes into account
all these differences and enables all browsers and all computer systems to be on
equal ground.

The Web, as a design medium, is not a new form of paper. The Web is an
entirely different medium, with its own constraints and goals that are very
different from working with paper. The most important rules of Web page design,
as I'll keep harping on throughout this book, are the following:

Do design your pages so that they work in most browsers.

Do focus on clear, well-structured content that's easy to read
and understand.

Don't design your pages based on what they look like on your
computer system and on your browser.

Throughout this book, I'll show you examples of HTML code and what it
looks like when displayed. Where browsers display code very differently,
I'll give you a comparison of how a snippet of code looks in two very
different browsers. Through these examples, you'll get an idea of how
different the same page can look from browser to browser.

NOTE

Although this rule of designing by structure and not by appearance is the way
to produce good HTML, when you surf the Web, you might be surprised that the
vast majority of Web sites seem to have been designed with appearance in
mind, usually appearance in a particular browser such as Microsoft Internet
Explorer. Don't be swayed by these designs. If you stick to the rules I
suggest, in the end, your Web pages and Web sites will be even more successful
simply because more people can easily read and use them.

HTML Is a Markup Language

HTML is a markup language. Writing in a markup language means that you
start with the text of your page and add special tags around words and
paragraphs. The tags indicate the different parts of the page and produce
different effects in the browser. You'll learn more about tags and how
they're used in the next section.
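
Here is a minimal sketch of that process, using a made-up sentence. The comment at the top shows the plain text you start with, and the markup below it shows the same text after tags have been added around the paragraph and around individual words. The p, strong, and em tags are standard HTML; the text itself is only an illustration.

```html
<!-- Plain text you start with:
     Remember to back up your files. Do it today, not tomorrow. -->

<!-- The same text with tags added around the paragraph and around words -->
<p>Remember to <strong>back up</strong> your files.
Do it <em>today</em>, not tomorrow.</p>
```

Each tag in this sketch comes as a pair: an opening tag such as <p> and a closing tag such as </p>, with the content it describes in between.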

HTML has a defined set of tags you can use. You can't make up your own
tags to create new appearances or features. And just to make sure that things
are really confusing, various browsers support different sets of tags. To
further explain this, take a brief look at the history of HTML.

A Brief History of HTML Tags

The base set of HTML tags, the lowest common denominator, is referred to as
HTML 2.0. HTML 2.0 is the old standard for HTML (its written specification was
published by the IETF as RFC 1866) and the set of tags that all browsers
must support. In the next few days, you'll primarily learn to use tags that
were first introduced in HTML 2.0.

The HTML 3.2 specification was developed in early 1996. Several software
vendors, including IBM, Microsoft, Netscape Communications Corporation, Novell,
SoftQuad, Spyglass, and Sun Microsystems, joined with the W3C to develop this
specification. The primary additions in HTML 3.2 included tables, applets,
and text flow around images. HTML 3.2 also provided full
backward-compatibility with the existing HTML 2.0 standard.

NOTE

The enhancements introduced in HTML 3.2 are covered later in this book.
You'll learn more about tables in Day 8, "Tables." Day 12,
"Multimedia: Adding Sounds, Videos, and More," tells you how to use
Java applets.

HTML 4.0, first introduced in 1997, incorporated many new features that gave
you greater control than HTML 2.0 and 3.2 in how you designed your pages. As
with HTML 3.2, the W3C maintains the HTML 4.0 standard.

Framesets (originally introduced in Netscape 2.0) and floating frames
(originally introduced in Internet Explorer 3.0) became an official part of the
HTML 4.0 specification. Framesets are discussed in more detail in Day 15,
"Working with Frames and Linked Windows." HTML 4.0 also brought additional
improvements to table formatting and rendering. By far, however, the most
important change in HTML 4.0 was its increased integration with style
sheets.

NOTE

If you're interested in how HTML development works and exactly
what's going on at the W3C, check out the pages for HTML at the
Consortium's site at
http://www.w3.org/pub/WWW/MarkUp/.

In addition to the tags defined by the various levels of HTML, individual
browser companies also implement browser-specific extensions to HTML. Netscape
and Microsoft are particularly guilty of creating extensions, and they offer
many new features unique to their browsers.

Confused yet? You're not alone. Even Web designers with years of
experience and hundreds of pages under their belts have to struggle with the
problem of which set of tags to choose, striking a balance between wide support
for a design (using HTML 3.2- and 2.0-level tags) and more flexibility in
layout but less consistency across browsers (HTML 4.0 or specific browser
extensions). Keeping track of all this information can be really confusing.
Throughout this book, as I introduce each tag, I'll let you know which
version of HTML the tag belongs to, how widely supported it is, and how to use
it to best effect in a wide variety of browsers.