Intro to HTML5

Most of the web standards curriculum is based on the last stable version of HTML — HTML 4.01. The HTML 4.01 spec was completed in 1999, over 10 years ago as of the time of this writing! But unless you’ve been hiding under a rock for the last year or so, you’ll be well aware that there is a new version of HTML in production — HTML5!

So why have we been teaching you HTML 4.01 in spite of this?

HTML5 is a really good thing for web developers and designers, because it:

Is mostly backwards compatible with what’s already there: you don’t need to learn completely new languages to use HTML5. The new markup features work in the same way as the old ones (although the semantics of some elements have been changed), and the new APIs are based on mostly the same JavaScript/DOM that developers have been programming in for years.

Adds powerful new features to HTML that were previously only available on the Web using plugin technologies like Flash, or with complex JavaScript and hacks. Form validation and video are prime examples.

Is better suited to writing dynamic applications than previous HTML versions (HTML was originally designed for creating static documents).

What does HTML5 mean to me?

To start with, let’s answer that question you’ve had circling round your head since you started reading this article — why did we teach you most of the web standards curriculum using HTML 4.01, if HTML5 is on the horizon?

First of all, when the WSC was first published in 2008, HTML5 was a lot more in flux than it is now, and we didn’t want to teach you something that would likely be changed at a later date.

Second, and more importantly, HTML5 is backwards compatible — in practical terms, this means that all the stuff inside HTML 4.01 is also in HTML5. So by learning HTML 4.01, you are also learning a large chunk of HTML5.

Some parts of HTML5 are implemented in a stable enough fashion across browsers to be used safely even on a production site (as always, you will have to make a judgement call depending on your site’s target audience and features). Plus if a feature is not supported in certain browsers, you can work around this.

To give you a short concluding answer, HTML5 is the future of the Web, and a large part of your future as a web designer or developer. I’d recommend that you start learning HTML5 as soon as you are ready — many of the new features will make your existing work a lot easier, and it will future proof your knowledge.

HTML5 features

HTML5 contains many new features to make HTML much more powerful and suitable for building Web applications. In the list below I’ve summarized the main ones you really should know about.

Some of the features listed below are NOT actually part of the HTML5 spec itself, but are defined in closely related specs. They are still valid parts of the new movement towards modern web applications, and useful for you to know about.

New semantic elements: As you will already know, semantics are very important in HTML, and we should always use the appropriate element for the job. In HTML 4.01 we have a problem: yes, there are many elements for defining specific features such as tables, lists, headings, etc., but there are also many common web page features that have no element to define them. Think of site headers, footers, navigation menus, etc. Up until now we have defined these using <div id="xxx"></div>, which we can understand, but machines can’t, plus different web developers will use different IDs and classes. Fortunately, HTML5 comes with new semantic elements such as <nav>, <header>, <footer> and <article>.

New form features: HTML 4.01 already allows us to create usable, accessible web forms, but some common form features are more fiddly than they should be, and require hacking to implement. HTML5 provides a standardized, simple way to implement features such as date pickers, sliders and client-side validation.

Native video and audio: For years, video and audio on the Web have been done using Flash, generally speaking. In fact, the reason Flash became so popular around the dawn of the 21st century is that open standards failed to provide a cross-browser compatible mechanism for implementing such things, with different browsers implementing different competing ways of doing the same thing (eg <object> and <embed>) and thereby making the whole process really complicated. Flash provided a high quality, easy way of making video work cross-browser.

HTML5 includes <video> and <audio> elements for implementing native video and audio players really easily with nothing but open standards, and it also includes an API to allow you to easily implement custom player controls.

Canvas drawing API: The <canvas> element and associated API allow you to define an area of the page to draw on, and use JavaScript commands to draw lines, shapes and text, import and manipulate graphics and video, export in different image formats, and a whole lot more.

Geolocation: The Geolocation spec (again, not a part of the HTML5 spec) defines an API that allows a web application to easily access any location data that has been made available, for example by a device’s GPS capabilities. This allows you to add all kinds of useful location-aware features to your applications, for example highlighting content that is more relevant to your location.
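As a rough sketch of how several of these features look in practice, the page below combines the new form controls, a native video player, a small canvas drawing and a Geolocation lookup. The file name clip.webm, the form's action URL, the field names and the element ids are placeholders assumed for illustration; they do not come from the original article.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>HTML5 feature sketch</title>
</head>
<body>
  <!-- New form features: validation, a date picker and a slider -->
  <form action="/signup" method="post">
    <label>Email <input type="email" name="email" required></label>
    <label>Start date <input type="date" name="start"></label>
    <label>Volume <input type="range" name="volume" min="0" max="10" value="5"></label>
    <button type="submit">Send</button>
  </form>

  <!-- Native video: the inner text is fallback for browsers without <video> -->
  <video src="clip.webm" controls width="320">
    Your browser does not support the video element.
  </video>

  <!-- Canvas: a drawing surface scripted with JavaScript -->
  <canvas id="demo" width="200" height="100">Canvas is not supported.</canvas>

  <p id="where">Location not requested yet.</p>

  <script>
    // Draw a rectangle and some text, but only if canvas is available
    var canvas = document.getElementById('demo');
    if (canvas.getContext) {
      var ctx = canvas.getContext('2d');
      ctx.fillStyle = '#336699';
      ctx.fillRect(10, 10, 80, 50);
      ctx.fillText('Hello canvas', 100, 40);
    }

    // Ask for the user's position, but only if the Geolocation API exists
    if (navigator.geolocation) {
      navigator.geolocation.getCurrentPosition(function (position) {
        document.getElementById('where').textContent =
          'Latitude ' + position.coords.latitude +
          ', longitude ' + position.coords.longitude;
      });
    }
  </script>
</body>
</html>
```

Browsers that do not recognise the new input types generally fall back to a plain text field, which is one reason these controls can be introduced gradually.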

Introducing HTML5 structural elements

HTML4 already has a lot of semantic elements to allow you to clearly define the different features of a web page, like forms, lists, paragraphs, tables, etc. However, it does have its shortcomings. We still rely heavily on <div> and <span> elements with different id and class attributes to define various other features, such as navigation menus, headers, footers, main content, alert boxes, sidebars, etc. Something like <div id="header"> works in terms of developers and designers knowing what it is for, and being able to use CSS and JavaScript to apply custom styles and behaviour to make it understandable to end users.

But it could be so much better. There are still problems with this kind of set up:

Humans can tell the different content apart, but machines can’t — the browser doesn’t see the different divs as header, footer, etc. It sees them as different divs. Wouldn’t it be more useful if browsers and screen readers were able to explicitly identify say, the navigation menu so a visually impaired user could find it more easily, or the different news items on a bunch of blogs so they could be easily syndicated in an RSS feed without any extra programming?

Even if you do use extra code to solve some of these problems, you can still only do it reliably for your web sites, as different web developers will use different class and ID names, especially when you consider the international audience — different web developers in different countries will use different languages to write their class and id names.

It therefore makes a lot of sense to define a consistent set of elements for everyone to use for these common structural blocks that appear on so many web sites. The new HTML5 elements we will cover are:

<header>: Used to contain the header of a site.

<footer>: Contains the footer of a site.

<nav>: Contains the navigation functionality for the page.

<article>: Contains a standalone piece of content that would make sense if syndicated as an RSS item, for example a news item.

<section>: Used to either group different articles into different purposes or subjects, or to define the different sections of a single article.

<time>: Used for marking up times and dates.

<aside>: Defines a block of content that is related to the main content around it, but not central to the flow of it.

<hgroup>: Used to wrap more than one heading if you only want it to count as a single heading in the page’s heading structure.

<figure> and <figcaption>: Used to encapsulate a figure as a single item, and contain a caption for the figure, respectively.
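To show how the elements in this list might fit together, here is a hedged sketch of a page body. All headings, link targets, dates and text are invented for illustration, and the nesting shown is only one reasonable arrangement.

```html
<body>
  <header>
    <h1>Example Company</h1>
    <nav>
      <ul>
        <li><a href="/">Home</a></li>
        <li><a href="/news">News</a></li>
      </ul>
    </nav>
  </header>

  <article>
    <h2>New widget released</h2>
    <p>Posted on <time datetime="2010-05-01">1st May 2010</time></p>

    <section>
      <h3>What it does</h3>
      <p>A short description of the widget.</p>
    </section>

    <aside>
      <p>Quick fact: this is our tenth widget.</p>
    </aside>
  </article>

  <footer>
    <p>Copyright and contact details go here.</p>
  </footer>
</body>
```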

Why isn’t there a <content> element?

While this may seem like a glaring omission, it really isn’t. The main content will be the top level block of content that isn’t the <header>, <nav> or <footer>, and depending on your particular circumstance, it might make more sense to mark the content up using an <article>, a <section>, or even a <div>.

Presenting an example HTML5 page

Some meta-differences
The first thing is that the doctype is much simpler than in older versions of HTML:

<!DOCTYPE html>

The creators of HTML5 chose the shortest possible doctype string for this purpose. After all, why should you, the developer, be expected to remember a huge great long string containing multiple URLs, when in reality the doctype is only there to put the browser into standards mode (as opposed to quirks mode)?
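For comparison, the sketch below places the new doctype at the top of a minimal document and quotes the HTML 4.01 Strict doctype in a comment. The title and body text are placeholders.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!--
    The HTML 4.01 Strict equivalent of the first line would be:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
      "http://www.w3.org/TR/html4/strict.dtd">
  -->
  <meta charset="utf-8">
  <title>Minimal HTML5 page</title>
</head>
<body>
  <p>Hello, HTML5.</p>
</body>
</html>
```

The comparison comment sits inside the head rather than above the doctype, because some older browsers drop into quirks mode when anything precedes the doctype.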

The purpose of the <header> element is to wrap the section of content that forms the header of the page, usually containing a company logo/graphic, main page title, etc.
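The listing that the next section discusses is not reproduced in this copy of the article. A sketch consistent with its description, a header whose only content is an <hgroup> wrapping two headings, might look like this (the heading text is invented):

```html
<header>
  <hgroup>
    <!-- Top level heading: the one that counts in the heading hierarchy -->
    <h1>Example Company</h1>
    <!-- Subtitle / tag line -->
    <h2>Widgets for every occasion</h2>
  </hgroup>
</header>
```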

<hgroup>
You’ll notice that in the above code, the only contents of my header are an <hgroup> element, wrapping two headings. What I want to do here is specify the document’s top level heading, plus a subtitle/tag line. I only want the top level heading to count in the document heading hierarchy, and that’s exactly what <hgroup> does: it causes a group of headings to only count as a single heading for the purposes of the document structure. You’ll find out more about how heading hierarchies work in HTML5 in the HTML5 outlines and the HTML5 heading algorithm section, below.

<footer>
You’ll see this code:

<footer>
<h3 id="copyright">Copyright and attribution</h3>
</footer>

<footer> should be used to contain your site’s footer content — if you look at the bottom of a number of your favourite sites, you’ll see that footers are used to contain a variety of things, from copyright notices and contact details, to accessibility statements, licensing information and various other secondary links.
Note: You are not restricted to one header and footer per page — you could have a page containing multiple articles, and have a header and footer per article.

The <nav> element is for marking up the navigation links or other constructs (eg a search form) that will take you to different pages of the current site, or different areas of the current page. Other links, such as sponsored links, do not count. You can of course include headings and other structuring elements inside the <nav>, but it’s not compulsory.
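A small sketch of a <nav> block along these lines, with an optional heading, links to other pages of the site and a link to another area of the current page. The URLs and the #contact fragment are hypothetical.

```html
<nav>
  <h2>Site navigation</h2>
  <ul>
    <li><a href="/">Home</a></li>
    <li><a href="/products">Products</a></li>
    <!-- A link to a different area of the current page also counts -->
    <li><a href="#contact">Contact details</a></li>
  </ul>
</nav>
```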

<aside>
We have the following:

<aside>
<table>

<!-- lots of quick facts inside here -->

</table>
</aside>

The <aside> element is for marking up pieces of content that are related to the main content, but don’t fit directly into the main flow. For example we can have a bunch of quick fire facts and statistics about the company, which wouldn’t work so well shoehorned into the main content. Other suitable candidates for <aside> elements include lists of links to external related content, background information, pull quotes, and sidebars.

<figure> and <figcaption>
The dynamic duo of <figure> and <figcaption> have been created to solve a very specific set of problems. For a start, doesn’t it always feel a bit semantically dubious and unclean to mark up an image and its caption as two paragraphs, or a definition list pair, or something else? And second, what do you do when you want a figure to consist of an image, or two images, or two images and some text? <figure> is on hand to wrap around all the content you want to comprise a single figure, whether it is text, images, SVG, videos, or whatever. <figcaption> is then nested inside the <figure> element, and contains the descriptive caption for that figure.
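As an illustration, here is a figure made up of two images that share a single caption. The image file names and alt text are assumptions for the example.

```html
<figure>
  <img src="office-front.jpg" alt="The front entrance of the office">
  <img src="office-rear.jpg" alt="The rear car park of the office">
  <!-- One caption describing the whole figure -->
  <figcaption>The company office, seen from the front and the rear.</figcaption>
</figure>
```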

Conversely, the date inside the datetime attribute is an ISO standard (see W3C Tip: Use international date format (ISO) for more information) machine readable date, so you get the best of both worlds. You can also add a time onto the end of the ISO standard, like so:
<time datetime="1989-03-13T13:00">One o’clock in the afternoon, on the 13th of March 1989</time>
You can also add a timezone adjustment, so for example to make the last example Pacific Standard Time, you would append the offset from UTC:

<time datetime="1989-03-13T13:00-08:00">One o'clock in the afternoon, on the 13th of March 1989</time>

<article> and <section>
Now we turn our attention to probably the two most misunderstood elements in HTML5: <article> and <section>. When you first meet them, the difference might appear unclear, but it really isn’t so bad.

Basically, the <article> element is for standalone pieces of content that would make sense outside the context of the current page, and could be syndicated nicely. Such pieces of content include blog posts, a video and its transcript, a news story, or a single part of a serial story.

The <section> element, on the other hand, is for breaking the content of a page into different functions or subject areas, or breaking an article or story up into different sections. So for example:
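The original listing is not reproduced in this copy, so the following is only a sketch with invented headings. It shows both uses: a <section> grouping articles by subject, and <section> elements dividing a single article into parts.

```html
<!-- A section grouping several standalone articles under one subject -->
<section id="news">
  <h2>News</h2>

  <article>
    <h3>New widget released</h3>
    <p>A short news item that would make sense on its own in a feed.</p>
  </article>

  <article>
    <h3>Office move completed</h3>

    <!-- Sections breaking one article into parts -->
    <section>
      <h4>Background</h4>
      <p>Why the move was needed.</p>
    </section>
    <section>
      <h4>What changes for customers</h4>
      <p>The new address and opening hours.</p>
    </section>
  </article>
</section>
```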

Where does that leave <div>?

So, with all these great new elements to use on our pages, the days of the humble <div> are numbered, surely? NO. In fact, the <div> still has a perfectly valid use. You should use it when there is no other more suitable element available for grouping an area of content, which will often be when you are purely using an element to group content together for styling/visual purposes. For example:

#wrapper {
  background-color: #ffffff;
  width: 800px;
  margin: 0 auto;
}
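The rule above would apply to markup along these lines, where the <div> carries no meaning of its own and exists only so the page content can be given a fixed width and centred. The content inside the wrapper is a placeholder.

```html
<body>
  <!-- Purely presentational grouping: a hook for the #wrapper styles -->
  <div id="wrapper">
    <header>
      <h1>Example Company</h1>
    </header>
    <article>
      <h2>Main story</h2>
      <p>Page content goes here.</p>
    </article>
    <footer>
      <p>Copyright and attribution</p>
    </footer>
  </div>
</body>
```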

<mark>
The <mark> element is for highlighting terms of current relevance, or highlighting parts of content that you just want to draw attention to, but not change the semantic meaning of. It’s like when you are going through a printed article and highlighting lines important to you with a highlighter pen. So for example, you might want to use this element to mark up lines in a wiki that need to be given editorial attention, or to highlight instances of a search term that the user has just searched for on a page, and then give them appropriate styling in your CSS.
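A short sketch of the search-term case, assuming the user has just searched for the word "canvas". The highlight colour is an arbitrary choice.

```html
<style>
  /* Give highlighted terms a highlighter-pen look */
  mark {
    background-color: #ffff99;
    color: inherit;
  }
</style>

<p>Search results for "canvas":</p>
<p>The <mark>canvas</mark> element defines an area of the page to draw on.
   You script the <mark>canvas</mark> using JavaScript.</p>
```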

How to get it working in older browsers

Older browsers: always the bane of our very existence when trying to get to grips with using shiny new toys on the Web! In fact, the problem here is all browsers – no browsers currently recognise and support these new HTML5 structural elements, as such. But never fear, you can still get them working across browsers today with the minimum of effort.

First of all, if you put an unknown element into a web page, by default the browser will just treat it like a <span>, ie, an anonymous inline element. Most of the HTML5 elements we have looked at in this article are supposed to behave like block elements, therefore the easiest way to make them behave properly in older browsers is by setting them to display:block; in your CSS:
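A minimal sketch of such a rule, written inside a <style> element so that it can be pasted into the head of a page. The selector list covers the block-level elements discussed in this article; <time> and <mark> are left out because they are meant to stay inline.

```html
<style>
  /* Make unrecognised HTML5 structural elements behave as blocks */
  article, aside, figcaption, figure,
  footer, header, hgroup, nav, section {
    display: block;
  }
</style>
```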

This solves all your problems for all browsers except one. Have a guess which one? … Yup, amazing isn’t it, that IE should prove to be trickier than the other browsers, and refuse to style elements it doesn’t recognise? The fix for IE is illogical, but fortunately pretty simple. For each HTML5 element you are using, you need to insert a line of JavaScript into the head of your document, like so:
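The sketch below shows the general shape of this fix, using one document.createElement call per element. The list of elements is an assumption based on those covered in this article; include only the ones your page actually uses.

```html
<head>
  <meta charset="utf-8">
  <title>Page using HTML5 structural elements</title>
  <script>
    // Creating each element once is enough for older versions of IE
    // to start applying CSS to it; the created elements are not used.
    document.createElement('article');
    document.createElement('aside');
    document.createElement('figcaption');
    document.createElement('figure');
    document.createElement('footer');
    document.createElement('header');
    document.createElement('hgroup');
    document.createElement('nav');
    document.createElement('section');
  </script>
</head>
```

The script has to run in the head, before the browser parses the elements in the body.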

IE will now magically apply styles to those elements. It is a pain having to use JavaScript to make your CSS work, but hey, at least we have a way forward? Why does this work exactly? No-one I’ve talked to actually knows. There is also a problem with these styles STILL not being carried through to the printer when you try to print HTML5 documents from IE.
