The figure & figcaption elements

In traditional printed material like books and magazines, an image, chart, or code example would be accompanied by a caption. Before now, we didn’t have a way of semantically marking up this sort of content directly in our HTML, instead resorting to CSS class names. HTML5 hopes to solve that problem by introducing the <figure> and <figcaption> elements. Let’s explore!

The <figure> element

The <figure> element is intended to be used in conjunction with the <figcaption> element to mark up diagrams, illustrations, photos, and code examples (among other things). The spec says this about <figure>:

The figure element represents a unit of content, optionally with a caption, that is self-contained, that is typically referenced as a single unit from the main flow of the document, and that can be moved away from the main flow of the document without affecting the document’s meaning.

The <figcaption> element

The <figcaption> element has been the subject of much debate. The spec initially wanted to repurpose <legend> rather than introduce a new element. Other suggestions included <label>, <caption>, <p> or the <h1>–<h6> elements. <legend> was changed, so we then used a combination of <dt> and <dd> inside <figure> at Jeremy’s suggestion. Most of these suggestions failed since there was no backwards compatibility for styling with CSS.

The <figcaption> element is optional and can appear before or after the content within the <figure>. Only one <figcaption> element may be nested within a <figure>, although the <figure> element itself may contain multiple other child elements (e.g., <img> or <code>).

Using <figure> and <figcaption>

So we’ve seen what the spec says about these elements. How do we use them? Let’s look at some examples.
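A minimal sketch of the kinds of usage described above. The file names and caption wording are placeholders, not taken from any real page:

```html
<!-- An image in a figure with an explanatory caption -->
<figure>
  <img src="macaque.jpg" alt="Macaque in the trees">
  <figcaption>Placeholder caption: where the photo was taken and who took it</figcaption>
</figure>

<!-- Several images nested in one figure, sharing a single caption -->
<figure>
  <img src="photo-1.jpg" alt="Placeholder description of the first photo">
  <img src="photo-2.jpg" alt="Placeholder description of the second photo">
  <figcaption>Placeholder caption covering both photos</figcaption>
</figure>

<!-- A code example; here the caption comes before the content -->
<figure>
  <figcaption>Placeholder caption for the code listing</figcaption>
  <pre><code>&lt;p&gt;Hello, world&lt;/p&gt;</code></pre>
</figure>
```

In each case there is a single <figcaption>, placed either first or last inside its <figure>.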

Differences between <figure> and <aside>

We covered <aside> in an earlier article, but it is important to note the difference between the two. You should choose between <aside> and <figure> by asking yourself if the content is essential to understanding the section:

If the content is simply related and not essential, use <aside>.

If the content is essential but its position in the flow of content isn’t important, use <figure>.

Having said that, if its position relates to previous and subsequent content, use a more appropriate element — e.g., a <div>, a plain old <img>, a <blockquote>, or possibly even <canvas>, depending on its content.
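As a rough illustration of that choice, the sketch below puts both elements in one hypothetical article. The headings, file name, and text are invented for the example:

```html
<article>
  <h1>Placeholder article heading</h1>
  <p>Main text that refers to the chart in Figure 1.</p>

  <!-- Essential to the article, but it could be moved (say, to an
       appendix) without changing the meaning: a figure -->
  <figure id="figure-1">
    <img src="chart.png" alt="Placeholder description of what the chart shows">
    <figcaption>Figure 1: placeholder caption</figcaption>
  </figure>

  <!-- Related to the article but not essential to it: an aside -->
  <aside>
    <p>Placeholder background note on a related topic.</p>
  </aside>
</article>
```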

Don’t stop there!

No need to constrain your <figure>s to images and code examples. Other content suitable for use in <figure> includes audio, video, charts (perhaps using <canvas> or <svg>), poems, or tables of statistics.
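For instance, a video or a table of statistics could be marked up along these lines. This is only a sketch: the file name and the table values are placeholders.

```html
<!-- A video with a caption -->
<figure>
  <video src="demo.mp4" controls>
    Fallback text for browsers without video support.
  </video>
  <figcaption>Placeholder caption describing the video</figcaption>
</figure>

<!-- A table of statistics with the caption placed first -->
<figure>
  <figcaption>Placeholder caption for the table (values are made up)</figcaption>
  <table>
    <tr><th>Item</th><th>Count</th></tr>
    <tr><td>Example A</td><td>1</td></tr>
    <tr><td>Example B</td><td>2</td></tr>
  </table>
</figure>
```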

It may not always be appropriate to use the <figure> element, though. For example, a graphic banner should not be marked up with <figure>. Instead, simply use the <img> element.

Summary

As we’ve illustrated in this article, there are a lot of possibilities for the <figure> element. Just remember to make sure it’s the most appropriate element for the job. But you already do that for all your markup, right? :)

Before, I resorted to using a div->image with a div->caption nested inside to express this kind of structure. The figure element will remove the need for the somewhat silly image class. Definite improvement for quotes too.

It does impose some extra difficulties with wysiwyg editors and such, as I would suggest using the figure element for every image that could have a caption (future-proof coding and such!).

I would think, from an accessibility point of view, that for the second example you would leave the alt attribute blank because it’s essentially providing the same content as the figcaption – no need for people listening to the page, or browsing without images, to get the same piece of content twice.

For what it’s worth, I disagree with John Faulds. The caption should provide information augmenting the photograph. The alt should provide equivalent content to the photograph. For photographs like the second example, appropriate alt text is a description of what one can see in the photograph. They really shouldn’t amount to the same text, and indeed don’t in the actual example given.

Regardless of whether the two in that example are actually the same, I’d still argue that the content of the alt attribute is redundant; the only piece of info contained in the alt that’s not in the caption is that it is in the trees. So if it’s not an issue of accessibility, it is one about good copywriting.

But I also disagree that a caption’s sole purpose is to augment, rather it should describe first and then augment if required. I can think of lots of examples where the caption is simply describing, e.g. a headshot with the person’s name underneath – no augmentation there, and I believe the majority of captions won’t include any augmentation.

Completely agree with John Faulds. The alt should be filled if the redundancy between figcaption and alt can be avoided. There are plenty of examples where the captions describe a figure and where this textual information is required for readers/viewers, even those with a regular browser (I mean, not a screen reader). In those cases, the alt is often redundant with the (fig)caption.

Alohci is right when he/she says that a visual description of the image is to be provided within the alt. This macaque is in a tree. Is he hiding in the leaves? Looking at a female? Eating sandwiches? Which information must be provided in an alt attribute is a matter of scope, of overall meaning.

I’ve been brought up nice, so don’t like to over-burden a user with “too much accessibility”. So I don’t think that having alt plus a caption that are pretty much the same helps anyone.

It seems to me that the question should be “what’s better for the user?” and the answer is “it depends”, because those pesky blind users will insist on not being an homogenous mass.

So some blind users will be fine to know that figure X is a picture of a monkey, or the CEO of Blammo Corp. Others will want to know whether the monkey is juvenile or mature, and what colour hair the CEO has.

I favour brevity and reducing repetition. So, a picture of the CEO of Blammo Corp leveraging a synergy in a news piece *probably* doesn’t need alt text to tell us that he’s 55 years old, with brown hair, a pinstripe suit.

But I might be a horrible fascist, chuckling as I place my foot on the throat of visually-impaired children; I lose track these days.

One thing to keep in mind is that <figcaption> maps to ARIA’s aria-labelledby. The alt attribute maps to ARIA’s aria-describedby. While it’s sometimes a tricky distinction, an image’s alt text should describe the image (what it looks like, what you can see in it), whereas <figcaption> (and the image’s title attribute) should entitle the image (what it is, which is influenced by why you’re including it).

I agree. The spec says “The intent is that replacing every image with the text of its alt attribute not change the meaning of the page,” and that’s a good rule of thumb.

Here’s what I’ve got from the spec: if the image has a function, describe this function. If it provides useful information, then make sure that info is also accessible. If the image is purely decoration use alt="". If the image is described adequately in <figcaption> (such that adding alt text wouldn’t convey extra info, or would be repetitive), then alt="" is fine there too. But above all else think about how the page would read with images turned off, and make sure it reads well.
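Those rules of thumb might translate into markup roughly like this. It is a sketch of this comment's reading of the spec, which other comments in this thread dispute; the file names and link target are placeholders.

```html
<!-- The image has a function (a link home): describe the function -->
<a href="/"><img src="logo.png" alt="Home page"></a>

<!-- Purely decorative image: empty alt -->
<img src="divider.png" alt="">

<!-- Informative image: alt says what can be seen, the caption adds context -->
<figure>
  <img src="macaque.jpg" alt="Macaque in the trees">
  <figcaption>Placeholder caption: where the photo was taken and who took it</figcaption>
</figure>

<!-- The caption already says all that the alt text would: empty alt,
     on the reading given in this comment -->
<figure>
  <img src="ceo.jpg" alt="">
  <figcaption>The CEO of Blammo Corp</figcaption>
</figure>
```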

Finally remember there’s no such thing as perfect accessibility, so just do the best you can.

If the image is described adequately in <figcaption> (such that adding alt text wouldn’t convey extra info, or would be repetitive), then alt="" is fine there too.

My first thought. But apparently, alt="" carries an implied ARIA role="presentation" and, if the figure needs a caption, it presumably isn’t presentational. So it should have no alt attribute at all, apparently.

@Bruce. The problem with omitting the alt attribute altogether is, as I understand it, that AT will attempt to “repair” the missing alt text, which really isn’t what is wanted if the image is described adequately elsewhere on the page.

I’m not wholly confident about WAI-ARIA roles yet, but couldn’t one do <img src="…" alt="" role="img"> to override the implied role="presentation"?

I noticed in your examples that the figures are preceded by paragraphs that introduce them. For example the line:

Nesting multiple images within one <figure> element with a single caption:

is then followed by a <figure> element. And yet the spec says that the <figure> is to be used for content that can be moved away into an appendix. Does the use of <figure> in this context meet that criterion?

@Christian — yes, as presumably the removed figure would be replaced by a link if done automatically. Also notice the word “can” — it is indeed content that could be moved. Ideally we’d be all literary-like and have (see Figure 1-12) and stuff, but hopefully you’ll forgive us for our lack of highfalootin’ ;-)

Question… I’m working through Bruce and Remy’s book, and I’ve come to this wonderful new structure, but it won’t seem to work. My Visual Studio environment is allowing for and giving feedback on the figure tag, but not the figcaption element (which to me is the cool part of the whole thing). Chrome has been rendering all of the other things in the book delightfully, but not this. Suggestions?

@Jenny: check your source in Chrome (View > Developer > View Source) and make sure that figcaption is there. If it is, try adding figcaption {display: block;}, or adding figcaption to the handful of HTML5 elements you declare as display: block. However, I suspect that Visual Studio is the problem, so try a quick test in Notepad, and if so check for an update for Visual Studio, or ask when they will add it.
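A minimal test page along those lines might look like the sketch below. The list of elements in the selector and the conditional-comment script are assumptions about a typical setup of the time, not something taken from the book; the script simply creates the elements once so that old versions of Internet Explorer will style them, which is the idea behind the shiv.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>figcaption display test</title>
<!--[if lt IE 9]>
<script>
  // Old IE only styles unknown elements after they have been created once
  document.createElement('figure');
  document.createElement('figcaption');
</script>
<![endif]-->
<style>
  /* Browsers that do not know these elements treat them as inline,
     so declare block display explicitly */
  article, aside, figure, figcaption, footer, header, nav, section {
    display: block;
  }
</style>
</head>
<body>
<figure>
  <img src="photo.jpg" alt="Placeholder description of the photo">
  <figcaption>Placeholder caption</figcaption>
</figure>
</body>
</html>
```

Saving this from a plain text editor and opening it in the browser helps separate an editor problem from a rendering problem.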

@Jenny, seems odd that you can use some new elements but not others. Personally I don’t use Visual Studio, but while the IntelliSense might not work, so long as you’re using Remy’s shiv every browser will ‘understand’ the new elements.

It was a good idea to add the element to the display:block section in my css, and I made sure to add the Sharp Shiv. Now it does show the text of the figcaption, though a bit disappointingly. Upon taking a closer look at the photo that Bruce’s mum snapped, I see that there is some css3 blingery involved. It looks good. Must be nice to be good looking AND talented!

I’m happy using figure in the context of your examples (images, code etc) but would I be pushing it to mark up links to documents where the front cover is the image and the book title is the caption (with one or both as links)? I ask because we have content authors who use tables to achieve a “gallery” of publications and I really want to eliminate this behaviour by incorporating it into a page edit screen. I’m pretty sure using a group of figures is fine (enclosed by <article> perhaps?) and would allow for a flexible CMS template gallery section (with regards to final content).

Figure is intended to be used for images/quotes/code samples etc that are referenced from a main article but can be moved away in the source.

To answer your question, I wouldn’t really be using it for the page you describe (sounds similar to Amazon if I’ve interpreted correctly) each book could be within an article sure but I would just have a heading & image within that. E.g.

My specific use case here is author-generated content with a list of closely-related documents for download. Traditionally, the author adds the documents to the “resource gallery” (to use a CMS term) and then links to them in an unordered list (if you’re lucky). Otherwise it’s a paragraph or line breaks per link. If the author has images of the front pages (the document is a pamphlet or report) they’ll create a table to layout the images and links which even if it’s a very simple structure can still cause problems for a screen reader (remembering that authors are not crafting raw HTML).

I was thinking it would be clever to capture what downloads are associated to the content in the page edit screen and then get the template to put those documents into a consistent markup structure with the front page as an image. Happy with using <article> for this but wanted to explore “etc” from your flowchart (got it from @media, thx!) and this article.

Currently, with HTML 4.x/XHTML 1.x and CSS 2.x, in order to have a caption be right justified with an image (like the way your ‘Submit Comment’ button lines up with the right side of the ‘Enter Comment’ box), the following conditions need to be met:

1. The image&caption or containing box have to have a width property declared in a style sheet or inline.

With the html5 ‘figure’ tag and figcaption tag is there any inherent capability in these tags that would allow image&caption alignment without having to meet criteria 1?

If you have multiple sized images that you want centered in a containing element but want the caption to be right (or left) justified with the varying image sizes will you STILL have to have a width declared?

Or a better question may be:
Is the only inherent value of these new tags their allowance of more semantic mark up?
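The elements themselves carry no special layout behaviour, but one CSS approach that avoids declaring a width is to give the figure table display, so that it shrinks to fit the image and the caption takes its width from the figure. The sketch below assumes that approach; the file name and caption are placeholders.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Caption aligned to image width</title>
<style>
  /* A table-displayed box shrinks to fit its content, so the figure
     becomes as wide as the image without a declared width */
  figure {
    display: table;
    margin: 0 auto; /* centre the figure in its container */
  }
  /* The caption box takes its width from the table box */
  figcaption {
    display: table-caption;
    caption-side: bottom;
    text-align: right;
  }
</style>
</head>
<body>
<figure>
  <img src="photo.jpg" alt="Placeholder description of the photo">
  <figcaption>Placeholder caption, right-aligned with the image edge</figcaption>
</figure>
</body>
</html>
```

Because the width comes from each image, the same rules should work for images of different sizes.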

Could figure and figcaption be correctly used to display a post thumbnail and the title and/or excerpt as the figcaption. For instance a portfolio page whereby an individual item would look something like the following:

@Shawn — sure, that’s valid (ref my reply to Nathan above), but if the images are your main content I think in this case not using <figure> is better. Remember the “can be moved away from the main flow of the document without affecting the document’s meaning” bit in the spec quote at the start of this article.

For me <figure> is appropriate if a mobile browser could replace your <figure>s with a “See figure 3” etc link (to save on bandwidth), and the page still works.

I love that this code works. However, my issue is this: I am placing images in a row and I cannot for the life of me figure out how to give each image its own caption without misplacing the order of the images. Every time I try to add a caption to the other images in the row, after adding the caption to the first image, the next image will pop below the first :(
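Assuming the cause is that each <figure> is displayed as a block and so starts on a new line, one possible fix is to display the figures as inline blocks. The class name and file names below are hypothetical.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Figures in a row</title>
<style>
  /* Inline-block figures sit side by side, each keeping its own caption */
  .gallery figure {
    display: inline-block;
    vertical-align: top;
    margin: 0 1em 1em 0;
  }
  .gallery figcaption {
    text-align: center;
  }
</style>
</head>
<body>
<div class="gallery">
  <figure>
    <img src="photo-1.jpg" alt="Placeholder description of the first photo">
    <figcaption>Caption for the first photo</figcaption>
  </figure>
  <figure>
    <img src="photo-2.jpg" alt="Placeholder description of the second photo">
    <figcaption>Caption for the second photo</figcaption>
  </figure>
</div>
</body>
</html>
```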

Thanks for the reply, Bruce. I agree that styling may be easier with the link inside. Probably you’re right and the link should be inside the figure element in such cases.
I thought it over and came to the conclusion that the images in my gallery are the main content, so maybe it’s not correct to use figure at all. Maybe it should be just:

@Sergey – You’re probably right that you don’t need figure at all, but note that of your two options using figure, the first (link inside figure) is invalid in HTML5. figcaption (not figcapture) must be the direct child of figure.

I like the figure and figcaption, if for one reason: it lets me get the description underneath the figure. However, I cannot find a way (without using CSS) to get the text to run around the figure on the left or right. Any ideas?

Yes, the <figure> element in HTML5 is great, but it seems that few sites and blogs are using it. Here in Brazil (where I live), for example, few if any blogs and websites use it, and I think it should be used more. I always use this tag on my blogs and projects, trying to make the site as semantic as possible.

Well, the clearest example for me is simply thinking about math books and miscellaneous papers: in those, equations, theorems, code snippets and graphs, among others, are clearly defined and referenced several times throughout the document.

@Bobby: Generally speaking, “yes”, you can put most elements inside a figure element. But of course, you should do your best to respect the rules regarding what type of content is suitable to be put inside a figure element, as described in the spec:

The figure element represents some flow content, optionally with a caption, that is self-contained (like a complete sentence) and is typically referenced as a single unit from the main flow of the document.

So basically I am building an image gallery; each image has its own figure tag and social widget. I was about to put the social widget inside figcaption, but I’m sure a social widget doesn’t belong in figcaption.

so apparently the figcaption element (if present) has to be the first or the last child of the figure element.

That would make your example invalid.

Anyhow, the spec. doesn’t say anything about UI controls (like your ‘social widgets’). It’s actually a very good question. Personally I feel that the requirement of the content model is a bit too strict (why only first or last?), and I don’t see any major reason not to allow UI controls if appropriate, like in your case.

Bobby: Just let me add this: the way the spec. is written at the moment, I think you should not add UI controls at all if you want to follow the spec. strictly. After all, as you say, it is semantically dubious to put them in the caption. But it is equally dubious to put them outside the caption, because

The figcaption element represents a caption or legend for the rest of the contents of the figcaption element’s parent figure element, if any.

so then the caption would also describe the UI controls…

Personally, I am very much into “hypertext documents” as opposed to “hypertext applications”, so I kind of like the strict semantics. Still, HTML5 is not only about documents, and even in documents, I can see the benefit of putting UI controls in a figure (show fullscreen, copy to clipboard, etc.), BUT I would prefer these to be features integrated in the browser.

Anyhow, your example shows there might actually be an occasional need to add custom UI controls to a figure. To me, a new element like figcontrols would make sense, but I don’t think that will happen. So the question remains: what should you do? It is a good question, but I don’t think there is a perfect answer according to the current spec.

I find the concept of an image being ‘essential’ very vague. When is a picture essential?
The more I think about it, the more I get confused. Isn’t a picture always essential, except when it is a filler or presentational (in which case it should be placed in a CSS background, not an img)?
Example: when writing a blogpost about a home project, placing a picture of the end result, is that an essential picture?
I can guess the picture of a t-shirt on a webshop site is essential, but what about a picture of a usb-cable on the same site? Is it really essential to show a usb cable (most people know what it looks like and is not really essential to know what the product looks like). But if a t-shirt is essential and a usb-cable is not, then using figure seems arbitrary from a design point of view (think about standard product html templates). Then do I need one template for products with essential pictures and one without?
I am getting a headache.

I know this is an old post and that much of discussion has passed but I wanted to chime in. John Faulds is wrong in a major way.

To apply his rationale to someone with sight, the image is not necessary since we have the caption. That is 100% incorrect. The alt attribute and the figure should not have the same content. The alt attribute is almost always poorly written for screen readers, providing bad UX. In example #2 it reads “Macaque in the trees.” It should read something like, “A Macaque monkey resting in the trees looking down through the foliage.” It should be very descriptive. It should be the auditory equivalent of the image. Then the caption gives you the context in which the image should be placed. In example two it gives us the location, something neither the sighted nor the blind could tell, and the photographer.

Don’t discount the blind just because the example, and just about 100% of web sites, use alt incorrectly.

You introduced your figures with text like ‘An image within a <figure> element with an explanatory caption:’.

This is my preferred way of introducing pictures, tables and lists, because it demonstrates the basic principle of context before content, so that the reader knows what to look for in the content, rather than being left up to their own predilections.

I think that the table and list elements should allow a p element as an introduction, as can be done with the figure element.

For a picture, the figcaption element presents an opportunity to put something like a quote, for a portrait, or some other aspect of the subject.