Rich Media

The history of the web is punctuated with technological improvements. One of the earliest additions to HTML was the img element, which fundamentally altered the web. Then, the introduction of JavaScript allowed the web to become a more dynamic environment. Later, the proliferation of Ajax made the web a viable option for full-fledged applications.

Web standards have advanced so much that it’s now possible to build almost anything using HTML, CSS, and JavaScript— almost anything.

There are some gaps in the web standards palette. If you want to publish text and images, HTML and CSS are all you need. But if you want to publish audio or video, you’ll need to use a plug-in technology such as Flash or Silverlight.

“Plug-in” is an accurate term for these technologies—they help to fill the holes on the web. They make it relatively easy to get games, films, and music online. But these technologies are not open. They are not created by the community. They are under the control of individual companies.

Flash is a powerful technology, but using it sometimes feels like a devil’s bargain. We gain the ability to publish rich media on the web, but in doing so, we lose some of our independence.

HTML5 is filling in the gaps. As such, it is in direct competition with proprietary technologies like Flash and Silverlight. But instead of requiring a plug-in, the rich media elements in HTML5 are native to the browser.

Canvas

When the Mosaic browser added the ability to embed images within web pages, it gave the web a turbo boost. But images have remained static ever since. You can create animated gifs. You can use JavaScript to update an image’s styles. You can generate an image dynamically on the server. But once an image has been served up to a browser, its contents cannot be updated.

The canvas element is an environment for creating dynamic images.

The element itself is very simple. All you specify within the opening tag are the dimensions:

<canvas id="my-first-canvas" width="360" height="240">
</canvas>

If you put anything between the opening and closing tags, only browsers that don’t support canvas will see it (fig 3.01):

fig 3.01: Users without canvas support will see the image of a cute puppy.

All the hard work is done in JavaScript. First of all, you’ll need to reference the canvas element and its context. The word “context” here simply means an API. For now, the only context is two-dimensional:

The 2D API offers a lot of the same tools that you find in a graphics program like Illustrator: strokes, fills, gradients, shadows, shapes, and Bézier curves. The difference is that, instead of using a Graphical User Interface, you have to specify everything using JavaScript.

Dancing about architecture: drawing with code

This is how you specify that the stroke color should be red:

context.strokeStyle = '#990000';

Now anything you draw will have a red outline. For example, if you want to draw a rectangle, use this syntax:

strokeRect ( left, top, width, height )

If you want to draw a rectangle that’s 100 by 50 pixels in size, positioned 20 pixels from the left and 30 pixels from the top of the canvas element, you’d write this (fig 3.02):

context.strokeRect(20,30,100,50);

fig 3.02: A rectangle, drawn with canvas.

That’s one very simple example. The 2D API provides lots of methods: fillStyle, fillRect, lineWidth, shadowColor and many more.

In theory, any image that can be created in a program like Illustrator can be created in the canvas element. In practice, doing so would be laborious and could result in excessively long JavaScript. Besides, that isn’t really the point of canvas.

Canvas. Huh! What is it good for?

It’s all well and good using JavaScript and canvas to create images on the fly, but unless you’re a hardcore masochist, what’s the point?

The real power of canvas is that its contents can be updated at any moment, drawing new content based on the actions of the user. This ability to respond to user-triggered events makes it possible to create tools and games that would have previously required a plug-in technology such as Flash.

One of the first flagship demonstrations of the power of canvas came from Mozilla Labs. The Bespin application (https://bespin.mozilla.com) is a code editor that runs in the browser (fig 3.03).

fig 3.03: The Bespin application, built with canvas.

It is very powerful. It is very impressive. It is also a perfect example of what not to do with canvas.

Access denied

A code editor, by its nature, handles text. The Bespin code editor handles text within the canvas element—except that it isn’t really text anymore; it’s a series of shapes that look like text.

Every document on the web can be described with a Document Object Model. This DOM can have many different nodes, the most important of which are element nodes, text nodes, and attributes. Those three building blocks are enough to put together just about any document you can imagine. The canvas element has no DOM. The content drawn within canvas cannot be represented as a tree of nodes.

Screen readers and other assistive technology rely on having access to a Document Object Model to make sense of a document. No DOM, no access.

The lack of accessibility in canvas is a big problem for HTML5. Fortunately there are some very smart people working together as a task force to come up with solutions (http://www.w3.org/WAI/PF/html-task-force).

Canvas accessibility is an important issue and I don’t want any proposed solutions to be rushed. At the same time, I don’t want canvas to hold up the rest of the HTML5 spec.

Clever canvas

Until the lack of accessibility is addressed, it might seem as though canvas is off-limits to web designers. But it ain’t necessarily so.

Whenever I use JavaScript on a website, I use it as an enhancement. Visitors who don’t have JavaScript still have access to all the content, but the experience might not be quite as dynamic as in a JavaScript-capable environment. This multi-tiered approach, called Unobtrusive JavaScript, can also be applied to canvas. Instead of using canvas to create content, use it to recycle existing content.

Suppose you have a table filled with data. You might want to illustrate the trends in the data using a graph. If the data is static, you can generate an image of a graph—using the Google Chart API, for example. If the data is editable, updating in response to user-triggered events, then canvas is a good tool for generating the changing graph. Crucially, the content represented within the canvas element is already accessible in the pre-existing table element.

There is another option. Canvas isn’t the only API for generating dynamic images. SVG, Scalable Vector Graphics, is an XML format that can describe the same kind of shapes as canvas. Because XML is a text-based data format, the contents of SVG are theoretically available to screen readers.

In practice, SVG hasn’t captured the imagination of developers in the same way that canvas has. Even though canvas is the new kid on the block, it already enjoys excellent browser support. Safari, Firefox, Opera, and Chrome support canvas. There’s even a JavaScript library that adds canvas support to Internet Explorer (http://code.google.com/p/explorercanvas/).

Given its mantras of “pave the cowpaths,” and “don’t reinvent the wheel,” it might seem odd that the WHATWG would advocate canvas in HTML5 when SVG already exists. As is so often the case, the HTML5 specification is really just documenting what browsers already do. The canvas element wasn’t dreamt up for HTML5; it was created by Apple and implemented in Safari. Other browser makers saw what Apple was doing, liked what they saw, and copied it.

It sounds somewhat haphazard, but this is often where our web standards come from. Microsoft, for example, created the XMLHttpRequest object for Internet Explorer 5 at the end of the 20th century. A decade later, every browser supports this feature and it’s now a working draft in last call at the W3C.

In the Darwinian world of web browsers, canvas is spreading far and wide. If it can adapt for accessibility, its survival is ensured.

Audio

The first website I ever made was a showcase for my band. I wanted visitors to the site to be able to listen to the band’s songs. That prompted my journey into the underworld to investigate the many formats and media players competing or my attention: QuickTime, Windows Media Player, Real Audio—I spent far too much time worrying about relative market share and cross-platform compatibility.

In the intervening years, the MP3 format has won the battle for ubiquity. But providing visitors with an easy way to listen to a sound file still requires a proprietary technology. The Flash player has won that battle.

Now HTML5 is stepping into the ring in an attempt to take on the reigning champion.

Embedding an audio file in an HTML5 document is simple:

<audio src="witchitalineman.mp3">
</audio>

That’s a little too simple. You probably want to be a bit more specific about what the audio should do.

Suppose there’s an evil bastard out there who hates the web and all who sail her. This person probably doesn’t care that it’s incredibly rude and stupid to embed an audio file that plays automatically. Thanks to the autoplay attribute, such malevolent ambitions can be realized:

<audio src="witchitalineman.mp3" autoplay>
</audio>

If you ever use the autoplay attribute in this way, I will hunt you down.

Notice that the autoplay attribute doesn’t have a value. This is known as a Boolean attribute, named for that grand Cork mathematician George Boole.

Computer logic is based entirely on Boolean logic: an electric current is either flowing or it isn’t; a binary value is either one or zero; the result of a computation is either true or false.

Don’t confuse Boolean attributes with Boolean values. You’d be forgiven for thinking that a Boolean attribute would take the values “true” or “false.” Actually, it’s the very existence of the attribute that is Boolean in nature: either the attribute is included or it isn’t. Even if you give the attribute a value, it will have no effect. Writing autoplay="false" or autoplay="no thanks" is the same as writing autoplay.

If you are using XHTML syntax, you can write autoplay="autoplay". This is brought to you by the Department of Redundancy Department.

When an auto-playing audio file isn’t evil enough, you can inflict even more misery by having the audio loop forever. Another Boolean attribute, called loop, fulfills this dastardly plan:

<audio src="witchitalineman.mp3" autoplay loop>
</audio>

Using the loop attribute in combination with the autoplay attribute in this way will renew my determination to hunt you down.

Out of control

The audio element can be used for good as well as evil. Giving users control over the playback of an audio file is a sensible idea that is easily accomplished using the Boolean attribute controls:

<audio src="witchitalineman.mp3" controls>
</audio>

The presence of the controls attribute prompts the browser to provide native controls for playing and pausing the audio, as well as adjusting the volume (fig 3.05).

If you’re not happy with the browser’s native controls, you can create your own. Using JavaScript, you can interact with the Audio API, which gives you access to methods such as play and pause and properties such as volume. Here’s a quick ’n’ dirty example using button elements and nasty inline event handlers (fig 3.06):

Buffering

At one point, the HTML5 spec included another Boolean attribute for the audio element. The autobuffer attribute was more polite and thoughtful than the nasty autoplay attribute. It provided a way for authors to inform the browser that—although the audio file shouldn’t play automatically—it will probably be played at some point, so the browser should start pre-loading the file in the background.

This would have been a useful attribute, but unfortunately Safari went a step further. It preloaded audio files regardless of whether or not the autobuffer attribute was present. Remember that because autobuffer was a Boolean attribute, there was no way to tell Safari not to preload the audio: autobuffer="false" was the same as autobuffer="true" or any other value (https://bugs.webkit.org/show_bug.cgi?id=25267).

The autobuffer attribute has now been replaced with the preload attribute. This isn’t a Boolean attribute. It can take three possible values: none, auto, and metadata. Using preload="none", you can now explicitly tell browsers not to pre-load the audio:

<audio src="witchitalineman.mp3" controls preload="none">
</audio>

If you only have one audio element on a page, you might want to use preload="auto", but the more audio elements you have, the more your visitors’ bandwidth is going to get hammered by excessive preloading.

You play to-may-to, I play to-mah-to

The audio element appears to be nigh-on perfect. Surely there must be a catch somewhere? There is.

The problem with the audio element isn’t in the specification. The problem lies with audio formats.

Although the MP3 format has become ubiquitous, it is not an open format. Because the format is patent-encumbered, technologies can’t decode MP3 files without paying the patent piper. That’s fine for corporations like Apple or Adobe, but it’s not so easy for smaller companies or open-source groups. Hence, Safari will happily play back MP3 files while Firefox will not.

There are other audio formats out there. The Vorbis codec— usually delivered as an .ogg file—isn’t crippled by any patents. Firefox supports Ogg Vorbis—but Safari doesn’t.

Fortunately, there’s a way to use the audio element without having to make a Sophie’s Choice between file formats. Instead of using the src attribute in the opening <audio> tag, you can specify multiple file formats using the source element:

A browser that can play back Ogg Vorbis files will look no further than the first source element. A browser that can play MP3 files but not Ogg Vorbis files will skip over the first source element and play the file in the second source element.

You can help the browsers by providing the mime types for each source file:

The source element is a standalone—or “void”—element, so if you are using XHTML syntax, remember to include a trailing slash at the end of each <source /> tag.

Falling back

The ability to specify multiple source elements is very useful. But there are some browsers that don’t support the audio element at all yet. Can you guess which browser I might be talking about?

Internet Explorer and its ilk need to be spoon-fed audio files the old-fashioned way, via Flash. The content model of the audio element supports this. Anything between the opening and closing <audio> tags that isn’t a source element will be exposed to browsers that don’t understand the audio element:

The transcript will only be visible to browsers that don’t support the audio element. Marking up the non-audio content in that way isn’t going to help a deaf user with a good browser. Besides, so-called accessibility content is often very useful for everyone, so why hide it?

Video

If browser-native audio is exciting, the prospect of browser-native video has web designers salivating in anticipation. As bandwidth has increased, video content has grown increasingly popular. The Flash plug-in is currently the technology of choice for displaying video on the web. HTML5 could change that.

The video element works just like the audio element. It has the optional autoplay, loop, and preload attributes. You can specify the location of the video file by either using the src attribute on the video element or by using source elements nested within the opening and closing <video> tags. You can let the browser take care of providing a user interface with the controls attribute or you can script your own controls.

The main difference between audio and video content is that movies, by their nature, will take up more room on the screen, so you’ll probably want to provide dimensions:

<video src="movie.mp4" controls width="360" height="240">
</video>

You can choose a representative image for the video and tell the browser to display it using the poster attribute (fig 3.07):

fig 3.07: This placeholder image is displayed using the poster attribute.

The battleground of competing video formats is even bloodier than that of audio. Some of the big players are MP4—which is patent-encumbered—and Theora Video, which isn’t. Once again, you’ll need to provide alternate encodings and fallback content:

The authors of the HTML5 specification had originally hoped to specify a baseline level of format support. Alas, the browser makers could not agree on a single format.

Going native

The ability to embed video natively in web pages could be the most exciting addition to HTML since the introduction of the img element. Big players like Google have not been shy in expressing their enthusiasm. You can get a taste for what they have planned for YouTube at http://youtube.com/HTML5.

One of the problems with relying on a plug-in for rich media is that plug-in content is sandboxed from the rest of the document. Having native rich media elements in HTML means that they play nicely with the other browser technologies— JavaScript and CSS.

The video element is not only scriptable, it is also styleable (fig 3.08).

fig 3.08: The video element, styled.

Try doing that to a plug-in.

Audio and video are welcome additions to HTML5, but the web isn’t a broadcast medium—it’s interactive. Forms are the oldest and most powerful way of enabling interaction. In Chapter 4, we’ll take a look at how forms are getting an upgrade in HTML5.