Understanding HTML5 Content Models

Earlier this week we looked at the new text-level and structural semantic elements HTML5 provides. Today I want to continue and talk about content models in HTML5, specifically the new outline algorithm for creating hierarchy.

Note: Unfortunately no browser or user agent ever implemented the HTML5 document outline, and it’s unlikely any ever will. You might prefer a series I wrote more recently about HTML5 and the document outline. Here’s the series wrap up, which links to all the other posts in the series.

The content model this post is concerned with is sectioning content: the new structural tags described in my previous post.

Currently to create a hierarchical outline of our content we use a set of h1–h6 tags. They work for the most part, but can break down at times. Consider the following:

<h1>Web Design</h1>
<p>Some general info about web design</p>
  <h2>Layout</h2>
  <p>Info about layouts</p>
    <h3>Grids</h3>
    <p>Info about grids</p>
  <h2>Typography</h2>
  <p>Info about typography</p>
  <h2>Color</h2>
  <p>Info about color</p>
  <h2>Design Principles</h2>
  <ul>
    <li>List of</li>
    <li>several different</li>
    <li>design principles</li>
  </ul>
<p>Where in the outline does this paragraph belong?</p>

The above would produce the following outline based on the headings.

web design
  layout
    grids
  typography
  color
  design principles

In general each paragraph below a heading belongs to that heading’s section of the outline, but does it have to?

Where in the outline does the very last paragraph belong? Is it under the Design Principles or does it belong under Web Design?

You can tell my intention based on the indentation, but a machine isn’t going to see that with the whitespace stripped and there’s no reason the code needed to be indented the way it is above.

Visually that last paragraph will look just like the one above it as well. Reading the page, you wouldn’t really know which section it belongs to.

HTML5 helps solve the problem above.

Sectioning Content Model

The first tool html5 provides is the section tag we discussed last time. Using the section element we can rewrite the above as

<h1>Web Design</h1>
<p>Some general info about web design</p>
<section>
  <h2>Layout</h2>
  <p>Info about layouts</p>
  <h3>Grids</h3>
  <p>Info about grids</p>
  <h2>Typography</h2>
  <p>Info about typography</p>
  <h2>Color</h2>
  <p>Info about color</p>
  <h2>Design Principles</h2>
  <ul>
    <li>List of</li>
    <li>several different</li>
    <li>design principles</li>
  </ul>
</section>
<p>Where in the outline does this paragraph belong?</p>

Once again the outline produced is the same as we saw above, but now it’s much clearer where the last paragraph belongs. We can do better though. Let’s mix in the header element and better define the different sections of the document.

<h1>Web Design</h1>
<p>Some general info about web design</p>
<section>
  <header><h2>Layout</h2></header>
  <p>Info about layouts</p>
  <section>
    <header><h3>Grids</h3></header>
    <p>Info about grids</p>
  </section>
</section>
<section>
  <header><h2>Typography</h2></header>
  <p>Info about typography</p>
</section>
<section>
  <header><h2>Color</h2></header>
  <p>Info about color</p>
</section>
<section>
  <header><h2>Design Principles</h2></header>
  <ul>
    <li>List of</li>
    <li>several different</li>
    <li>design principles</li>
  </ul>
</section>
<p>Where in the outline does this paragraph belong?</p>

Once again the above html produces the same outline. So far not much is really new other than the addition of some new tags. We could have done the same thing by using divs instead of section and header.

So where’s the new stuff?

HTML5 Outline Algorithm

In html5 each sectioning element has its own self-contained outline. What that means is we can start each section with an h1 tag and the algorithm will figure out the overall outline.

<h1>Web Design</h1>
<p>Some general info about web design</p>
<section>
  <header><h1>Layout</h1></header>
  <p>Info about layouts</p>
  <section>
    <header><h1>Grids</h1></header>
    <p>Info about grids</p>
  </section>
</section>
<section>
  <header><h1>Typography</h1></header>
  <p>Info about typography</p>
</section>
<section>
  <header><h1>Color</h1></header>
  <p>Info about color</p>
</section>
<section>
  <header><h1>Design Principles</h1></header>
  <ul>
    <li>List of</li>
    <li>several different</li>
    <li>design principles</li>
  </ul>
</section>
<p>Where in the outline does this paragraph belong?</p>

Believe it or not the above html where every heading is an h1 still produces the same outline in html5.

web design
  layout
    grids
  typography
  color
  design principles

Under HTML 4, where only the heading level counts, the outline would be flat, with every heading at the top level:

web design
layout
grids
typography
color
design principles

Quite a difference. It might seem somewhat strange to have every heading be an h1 tag, but it does have advantages. You won’t have to keep track of your overall hierarchy, only the hierarchy within a section.

Maybe not such a big deal with a single document, but it does allow our content to be more modular and portable, which we’ll get to momentarily.

Other Sectioning Elements

Above I mentioned that the sectioning content model covers the structural tags we talked about last time. It’s not only the section tag that creates its own self-contained outline.

The article, aside, and nav tags do the same. The header and footer tags don’t, as they aren’t sectioning content.

It wouldn’t have been appropriate here, but had I used article tags instead of section tags in the code above, the same outline would have been produced.
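As a rough sketch of how the other sectioning elements behave, here’s a small made-up page (the headings, ids, and text are placeholders, not taken from a real site). Under the HTML5 outline algorithm as specified, each nav, article, and aside should open its own nested section:

```html
<!-- Hypothetical page: every sectioning element starts its own outline section -->
<body>
  <h1>Web Design</h1>
  <nav>
    <h1>Site navigation</h1>
    <ul>
      <li><a href="#layout">Layout</a></li>
    </ul>
  </nav>
  <article id="layout">
    <h1>Layout</h1>
    <p>Info about layouts</p>
    <aside>
      <!-- Nested inside the article, so it sits one level below Layout -->
      <h1>Related reading</h1>
      <p>Links to other posts about layout</p>
    </aside>
  </article>
</body>
```

The outline I’d expect from the algorithm is Web Design at the top, Site navigation and Layout one level down, and Related reading nested under Layout.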

The hgroup Element

Note: The W3C removed the hgroup element from the html5 spec in the spring of 2013 citing little real world use. It’s no longer recommended for use.

Sometimes you may want to use headings so you can better show and style visual hierarchy, but you don’t want the heading to be part of the document outline.

hgroup allows us to do just that. For example say you have the following markup:

<hgroup>
  <h1>Main heading</h1>
  <h2>Tagline</h2>
</hgroup>

Only the h1 above would be included in the document outline. The h2 wouldn’t be included. Only the highest-ranked heading in the group, regardless of how many headings it contains, would be included in the outline.

The hgroup element can only contain h1–h6 tags and it’s meant to be used for subtitles, alternative titles, and tag lines.

Do we need hgroup? The above could have been coded as:

<h1>Main heading</h1>
<p class="tagline">Tagline</p>

This would produce the same outline and allow for the same visual styles. However, hgroup arguably adds more semantic meaning and doesn’t need an extra class.

In addition to using hgroup to hide some headings from the document outline, there are a few elements whose contents are by default invisible to the document outline. These are called sectioning roots and include:

blockquote

fieldset

td

Even if you use headings inside the above elements those headings won’t be part of the document outline under html5.
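A minimal sketch of a sectioning root in practice, using placeholder headings and text of my own. The heading inside the blockquote belongs to the blockquote’s own outline and shouldn’t appear in the outline of the surrounding section:

```html
<section>
  <h1>Design Principles</h1>
  <p>Some info about design principles</p>
  <blockquote>
    <!-- This heading stays inside the blockquote's own outline -->
    <h1>A heading inside a quote</h1>
    <p>Quoted text</p>
  </blockquote>
  <p>More info about design principles</p>
</section>
```

Under the HTML5 algorithm the outline for this fragment would contain only Design Principles.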

Modular Content

The new outline algorithm helps us create content that is more modular. The idea of not needing to keep track of your hierarchy might not seem like such a big deal until you consider what happens when you move a piece of content around.

For example, many blogs display the title and a short excerpt of several posts on the main blog page. In the individual posts the title would be marked up with an h1. On the main blog page you might have an h1 for the page and then have each of the blog post titles as an h2.

With the new outline algorithm you can move the post titles back and forth with the same h1 heading and let the outline algorithm figure out the hierarchy.

This makes any section of content more portable as we can mix it in with other content without worrying that it might break the hierarchy of the page.
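Here’s a hedged sketch of what that might look like on the main blog page. The page title, post titles, and excerpts are invented for illustration. Each article keeps the same h1 it would have on its own page:

```html
<!-- Hypothetical main blog page: h1-based articles dropped under the page heading -->
<body>
  <h1>Blog</h1>
  <article>
    <h1>First post title</h1>
    <p>Short excerpt from the first post</p>
  </article>
  <article>
    <h1>Second post title</h1>
    <p>Short excerpt from the second post</p>
  </article>
</body>
```

Under the HTML5 algorithm both post titles would sit one level below Blog in the outline, even though all three headings are h1s.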

While you’ll probably never need to, you can now also structure a document with more than 6 levels. In principle we can create any number of levels using the same h1–h6 elements in nested sections.

Scoped Styles

Being able to move content around from document to document creates a new problem: the styles that get applied to that content.

Our modular content will inherit the styles of the parent document, which may not be what we want. HTML5 offers a solution with the boolean scoped attribute, which can be applied to a style element placed inside the content it should style.
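The original example isn’t reproduced here, so what follows is only a sketch of how scoped was specified to work; the colors, fonts, and text are my own placeholders. My understanding is that the scoped attribute was later dropped from the HTML specification and that current browsers ignore it, so treat this as an illustration rather than something to ship:

```html
<article>
  <!-- As specified, scoped limited these rules to the style element's parent (this article) -->
  <style scoped>
    h1 {
      color: #900;
      font-family: Georgia, serif;
    }
  </style>
  <h1>Article heading</h1>
  <p>Article text</p>
</article>
```

A browser that doesn’t understand scoped will apply the h1 rule to the whole page, which is one more reason to be cautious with it.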

With a scoped style element inside an article, the article’s h1 gets the scoped styles regardless of where the article is displayed. This allows us to easily move not only content, but also the styles associated with that content.

Browser Support

In order to use the new semantic elements we defined those elements in our stylesheet as display: block to ensure they won’t break our layouts. We should now add the hgroup element.
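For reference, the rule might look something like this. The exact list of elements is an assumption on my part, so match it to whatever your stylesheet already declares:

```html
<style>
  /* Older browsers treat unknown elements as inline, so make them block-level */
  article, aside, footer, header, hgroup, nav, section {
    display: block;
  }
</style>
```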

Browsers don’t support the new outline algorithm yet. The good news is you don’t have to use an h1 to start each new section. You can continue to use h2 and h3 tags inside sections to produce the outline you want.

We’ll lose the portability benefits until browsers are supporting the new algorithm, but we can start preparing for when they do offer support.

For now it’s probably better to stick with using headings as you always have, though it is safe to enclose your headings in the new semantic elements.

Summary

HTML5’s sectioning content model gives us greater control over the hierarchy of our documents. The new outline algorithm provides for an unlimited number of heading levels and helps make our content more modular and portable.

At the moment there’s limited browser support for the HTML5 outline algorithm, but we can still prepare for it while using h1–h6 tags as we do now.

We won’t be able to take advantage of some of the benefits the new outline algorithm will give us, but we can prepare our documents for when browser support is more robust.

It will probably feel a little strange to mark up a document with multiple h1 tags and leave it to the browser to sort out the hierarchy, but hopefully you can see the advantages in such an approach.

Comments

Scoped style isn’t something I can see myself getting into as I prefer to do all my styling in the css file for a clean style/content separation. You can use some simple selectors like ‘article h1’ instead of scoped style to maintain the look you want.

A good tool to see how a page’s HTML5 outline looks: http://gsnedders.html5.org/outliner/
One note is that if you use the nav element without a heading, the outline will return an Untitled Section where the nav is.

I think it has some limited use cases. It seems mainly for when your content is going to be embedded on another site and you want the styles to go along with it. For example, say you code a badge for your site and want to keep the branding. You might not be able to count on the other site owner styling things to your liking.

It could also make it easier to move some content within your site without having to deal with styling them for different places. Probably not the thing most of us will be excited about, but useful in some cases.

Thanks for the link. I thought I linked to the outliner in this post or maybe I did in one of the other html5 posts. I know I meant to link to it in one of them.

When you want something to be a heading. Nesting sections won’t create headings inside. Unfortunately user-agents didn’t adopt any of the new HTML5 document outline so none of this is in use anywhere and it doesn’t look like it will be.