HTML5 Audio and JavaScript Control

HTML5 introduces an elegant new audio tag, and the specification includes simple built-in audio controls that let pages play audio
without plugin or script support. On this page we'll explore integrating these new HTML5 audio features with JavaScript to create complex soundscapes.

Updates

October 31, 2010: Matched the latest version of the HTML5 spec (preload instead of autobuffer)

A Note About Encoding

Both the audio and video tags have exposed a rather unholy mess with regard to software patents. This leads to a ridiculous situation: Mozilla will not implement MP4 since it's
not free and the patent owners haven't convinced anybody that they won't at some point charge the hell out of MP4, while Apple and Google have licensed MP4 but will
not implement Vorbis and Theora (Mozilla's choice), since they are not convinced that these formats are really as free as claimed.
So both sides think the other is dealing with a patent troll. Wonderful.

For the examples on this page we are therefore using WAV audio clips, since all current browser implementations support that format. For short sound clips the
uncompressed nature of WAV is not a major problem.

Simple JavaScript Control

An audio tag represents a single audio channel in the browser's sound implementation. Once the sound is started, the channel is blocked until the audio track is done playing. To test this, click the demo link for this section repeatedly: after the first click, no more sounds will be played until the five-second sample is done (this is true for Safari 4.0.4 and Firefox 3.6).

This is not a big problem if we only want to play a long piece of music, but the limitation creates headaches for game developers who want to play multiple, repeated and overlapping audio effects.
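
The single-tag setup described in this section can be sketched roughly as follows. This is a minimal illustration, not the script behind the demo: the function name playSingleSound, the id "audiotag1" and the file path in the comment are all assumptions made for the example. The optional second argument exists only so the function can be tried with a stand-in object outside a browser.

```javascript
// Hypothetical helper; the id and file name below are illustrative
// assumptions, not taken from the demo page.
//
// Markup assumed to exist in the page:
//   <audio id="audiotag1" src="audio/sample.wav" preload="auto"></audio>
//   <a href="#" onclick="playSingleSound('audiotag1'); return false;">Play</a>

function playSingleSound(id, doc) {
  // In a browser, fall back to the global document.
  doc = doc || document;
  var tag = doc.getElementById(id);
  if (!tag) {
    return false; // no audio tag with that id in the page
  }
  // play() starts playback. Calling it again while the clip is still
  // running does not start a second, overlapping copy.
  tag.play();
  return true;
}
```

Because every click goes to the same audio tag, the clip cannot overlap itself, which is exactly the limitation described above.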

Rotating Audio Channels

Here is one solution to overcome the single-channel limitation of the audio tag: Use multiple rotating audio channels and assign new sounds to currently unused channels. Click the links above
rapidly to test this.

In the example above we use 10 channels (generated audio objects) and whenever the user clicks another sound to play, the script finds an inactive (and therefore unblocked) channel and then loads and plays the selected sound.

Each of the sounds is preloaded with an HTML audio tag that is never actually used to play the sound. The preload="auto" attribute suggests to the browser that it load all of the sounds when the page loads (whether it does so depends on available space and general user preferences in the browser), instead of when a
sound is played for the first time through one of the generated audio channels.
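
The preload tags can be written directly into the page as markup of the form <audio id="..." src="..." preload="auto"></audio>. As a rough sketch, the same tags could also be generated from script. The function name addPreloadTags, the shape of the sounds list and the optional doc argument (used only to try the code outside a browser) are assumptions made for this example.

```javascript
// Hypothetical helper: creates one hidden audio tag per sound so the
// browser can fetch the files up front. Entries in `sounds` are assumed
// to look like { id: 'sound1', src: 'audio/sound1.wav' }.
function addPreloadTags(sounds, doc) {
  doc = doc || document;
  var tags = [];
  for (var i = 0; i < sounds.length; i++) {
    var tag = doc.createElement('audio');
    tag.id = sounds[i].id;
    tag.preload = 'auto'; // a hint only; the browser may ignore it
    tag.src = sounds[i].src;
    // Without the controls attribute the tag is not displayed.
    doc.body.appendChild(tag);
    tags.push(tag);
  }
  return tags;
}
```

These tags serve only as a cache and as a lookup for each clip's src and duration; playback happens on the generated channels.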

The script checks whether each channel is done playing its previous sound. The audio object has an "ended" property, but since that property is "false"
for newly created objects that have not played anything yet (which is correct, but inconvenient), I've decided to keep track of the expected end time for each sound channel instead.
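
Putting the pieces described above together, a rotating channel pool might look something like the sketch below. It is an illustration under stated assumptions rather than the script used on this page: the names createChannelPool, createChannel, now and finishedAt are invented for the example, and the two optional arguments exist only so the logic can be exercised with stand-in objects outside a browser.

```javascript
// A sketch of a rotating channel pool. By default each channel is a
// generated Audio object and the clock is Date.now().
function createChannelPool(size, createChannel, now) {
  createChannel = createChannel || function () { return new Audio(); };
  now = now || function () { return Date.now(); };

  var channels = [];
  for (var i = 0; i < size; i++) {
    // finishedAt = -1 marks a channel that has never played anything,
    // so it counts as free (unlike the "ended" property, which is false).
    channels.push({ audio: createChannel(), finishedAt: -1 });
  }

  // Plays the clip held by a preloaded audio tag (sourceTag) on the first
  // channel whose expected end time has passed. Returns the index of the
  // channel used, or -1 when every channel is still busy.
  function play(sourceTag) {
    var t = now();
    // duration is NaN until the browser has loaded the clip's metadata;
    // treat that as 0 so the channel is not blocked indefinitely.
    var seconds = isFinite(sourceTag.duration) ? sourceTag.duration : 0;
    for (var i = 0; i < channels.length; i++) {
      if (channels[i].finishedAt < t) {
        // duration is in seconds, timestamps are in milliseconds.
        channels[i].finishedAt = t + seconds * 1000;
        channels[i].audio.src = sourceTag.src;
        channels[i].audio.load();
        channels[i].audio.play();
        return i;
      }
    }
    return -1;
  }

  return { play: play, channels: channels };
}
```

In a page this could be used as var pool = createChannelPool(10); followed by pool.play(document.getElementById('sound1')); in a click handler, where 'sound1' is a hypothetical id of one of the preload tags. When all channels are busy the request is simply dropped; queueing it instead would be a different design choice.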

This solution can easily be incorporated into complex animations and games. If all sounds are preloaded with audio tags, this
function is able to play any sound at any moment, even multiple instances of the same sound effect. The 10-channel limit in the
example above is arbitrary and it will take some experimentation to find the true limits of how many parallel audio objects can be
generated without performance issues.