What Planet Is This?

Now that we understand the basics of playing audio with QuickTime,
let's think about what else we'd need to provide a more complete
player application to end users.

One of the most obvious needs for a modern player is the ability to
present metadata about the current song: information such as the title,
the artist's name, what album it's from, etc. Practically any player
puts this information front and center in the GUI.

There are different schemes for different audio formats, since some
were designed to contain metadata and others weren't. MP3s, for
example, weren't designed with these needs in mind -- arguably the only
"metadata" per se is a copyright bit in the MPEG
frame header. However, the ID3
standard was cleverly developed as a means of attaching metadata to
MP3 files by defining a format that could be placed inside of an MP3 file but
outside of the individual media frames. Typically, this information
is simply placed at the beginning of an MP3 file, before its first
MPEG frame.
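To make the "placed at the beginning" idea concrete, here is a minimal sketch that checks a byte array for a leading tag and works out how many bytes to skip before the first MPEG frame. It assumes the ID3v2 layout (a 10-byte header starting with "ID3" whose last four bytes hold the size, seven bits per byte) and ignores the optional footer; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class Id3HeaderSketch {
    /** Returns the byte length of a leading ID3v2 tag (header included), or 0 if none is found. */
    static int id3v2TagLength(byte[] file) {
        if (file.length < 10) {
            return 0;
        }
        String magic = new String(file, 0, 3, StandardCharsets.US_ASCII);
        if (!magic.equals("ID3")) {
            return 0;
        }
        // Bytes 6..9 hold the tag size as a "synchsafe" integer:
        // only the low 7 bits of each byte are used.
        int size = 0;
        for (int i = 6; i < 10; i++) {
            size = (size << 7) | (file[i] & 0x7F);
        }
        // The stored size does not count the 10-byte header itself.
        return 10 + size;
    }
}
```

A caller would read the start of the file, call id3v2TagLength(), and treat the returned offset as the likely position of the first MPEG frame.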

When we open an MP3 file in QuickTime, we're really
importing it, changing it into a QuickTime movie in memory. In
the course of doing this, the ID3 data is parsed and placed in the
movie's structure. If you recall from an earlier article on the QuickTime file format, QuickTime movies are represented
both in memory and on disk as a tree of "atoms." These
atoms can either contain data or other atoms, but not both. Typically,
the top level of a self-contained movie file will contain an
mdat atom to hold the media samples and a
moov atom, which defines the movie's structure. The
moov contains one or more trak atoms, and
also a handy atom called udta, short for "user
data."
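As a rough illustration of the atom layout, the following sketch lists the atoms that sit back to back in a byte range. It assumes the simple case only: a four-byte big-endian size (which includes the eight-byte header) followed by a four-byte type. Extended 64-bit sizes are not handled, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class AtomWalkSketch {
    /** Lists "type:size" for each atom found back to back in buf[start, end). */
    static List<String> listAtoms(byte[] buf, int start, int end) {
        List<String> found = new ArrayList<>();
        ByteBuffer bb = ByteBuffer.wrap(buf); // ByteBuffer is big-endian by default
        int pos = start;
        while (pos + 8 <= end) {
            int size = bb.getInt(pos);
            // Stop on anything that isn't a plain atom fitting inside the range.
            if (size < 8 || size > end - pos) {
                break;
            }
            String type = new String(buf, pos + 4, 4, StandardCharsets.ISO_8859_1);
            found.add(type + ":" + size);
            pos += size;
        }
        return found;
    }
}
```

To descend into a container such as moov, you would call listAtoms() again on the range just past its eight-byte header, up to the end of that atom.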

Once we know the atom type as a four-character code, getting the
atom's contents from the Movie is pretty straightforward.
We get a UserData object with
Movie.getUserData(), and then find our atom and retrieve
its contents with UserData.getTextAsString(). This
method takes three arguments: an int for the requested
atom type, an index that indicates our interest in the
index-th instance of the given type (note that
multiple atoms of the same type are legal, and also that this call is
one-based, not zero-based), and finally an "international region
tag" that takes one of the lang... constants from
quicktime.io.IOConstants (langUnspecified is
a useful wildcard value here).
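Since the atom type is passed as an int, it helps to see how a four-character code maps to one. This is a small sketch with hypothetical helper names: it packs four single-byte characters into an int, first character in the high byte, and unpacks them again.

```java
class FourCharCodeSketch {
    /** Packs four single-byte characters into an int, first character in the high byte. */
    static int toInt(String code) {
        if (code.length() != 4) {
            throw new IllegalArgumentException("need exactly 4 characters");
        }
        int value = 0;
        for (int i = 0; i < 4; i++) {
            char c = code.charAt(i);
            if (c > 0xFF) {
                throw new IllegalArgumentException("not a single-byte character: " + c);
            }
            value = (value << 8) | c;
        }
        return value;
    }

    /** Unpacks an int back into its four characters. */
    static String fromInt(int code) {
        char[] chars = new char[4];
        for (int i = 0; i < 4; i++) {
            chars[i] = (char) ((code >>> (24 - 8 * i)) & 0xFF);
        }
        return new String(chars);
    }
}
```

With these helpers, "udta" becomes 0x75647461, and a code whose first character is the copyright sign (byte 0xA9) simply comes out as a negative int.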

This article's sample application, QTBebop, contains a
MetadataJTable with a setMovie method that
retrieves all of the defined metadata entries and turns them into the
model of a Swing JTable. It defines all of the constants
from Movies.h in an array called TAG_NAMES
and looks for matches in a UserData object like this:

After this section, the foundTags and
foundValues are converted into a two-dimensional array
and passed to a DefaultTableModel constructor.
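The lookup-and-collect pattern can be sketched without QuickTime on the classpath. This is not the QTBebop source: TextSource is a hypothetical stand-in for the UserData.getTextAsString() call, and the types and labels arrays stand in for TAG_NAMES. Only the foundTags and foundValues names come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

class TagLookupSketch {
    /** Hypothetical stand-in for the QTJ lookup; throws when the type is absent. */
    interface TextSource {
        String getText(int type, int index) throws Exception;
    }

    /** Tries every known type and returns {label, value} rows for the ones found. */
    static String[][] collect(TextSource source, int[] types, String[] labels) {
        List<String> foundTags = new ArrayList<>();
        List<String> foundValues = new ArrayList<>();
        for (int i = 0; i < types.length; i++) {
            try {
                // Index 1 asks for the first instance; the call is one-based.
                String value = source.getText(types[i], 1);
                foundTags.add(labels[i]);
                foundValues.add(value);
            } catch (Exception e) {
                // Squashed on purpose: this type simply isn't in the user data.
            }
        }
        // Flatten the two lists into the two-dimensional shape a table model wants.
        String[][] rows = new String[foundTags.size()][2];
        for (int i = 0; i < rows.length; i++) {
            rows[i][0] = foundTags.get(i);
            rows[i][1] = foundValues.get(i);
        }
        return rows;
    }
}
```

The returned rows could then be handed to something like new DefaultTableModel(rows, new String[] {"Tag", "Value"}), where the column names are my own choice.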

Notice the squashed catch block. If a given type is
not found, QuickTime throws a QTException. For our current
purposes, we do nothing, because this exception simply means that one of the
many possible metadata atom types wasn't found in the user data. Reporting a
missing atom with an error code may make sense in C, but in Java, using
exceptions to control program flow is considered something of a worst practice:
the exception isn't really signaling an error state, yet we still pay the
expense of building a stack trace that won't be used. From a purely
Java point of view, it would be nice if QTJ had something like a
UserData.hasType(int) method, so we could check for an
atom without the performance hit of building a throwaway stack trace
if it isn't there.

That said, the MetadataJTable does its job, and works
fairly quickly. Figure 1 shows an example of the table, running
against an MP3 I ripped from my CD collection:

So how does iTunes support Unicode ID3 tags? Presumably, it has its
own ID3 library, which makes sense, considering that it needs to both read
and write ID3 data. So while QuickTime gives us easy ID3 tag parsing,
the lack of support for international character sets might make you consider
using another library for tag parsing, or rolling your own.

Bad Dog, No Biscuit

Since we know that QuickTime is used to play the AAC files
supported by iTunes 4 and
sold by the iTunes Music Store, we'd want and expect it to be able to handle metadata from those files, too.

In fact, since the M4A format for user-ripped AACs
and the M4P for Apple-DRM'ed songs are both in the
MPEG-4 file format, which itself was adapted from the QuickTime file
format, we might reasonably expect that their metadata tags are
already in the user-data atom, arranged in the same way that ID3 tags
are parsed.

Yeah, we might expect that ... but we'd be wrong.

The metadata is still in the movie's user data, but in a much
different and apparently undocumented format. So we have to examine
it by hand. (Sigh ... This kind of thing is why I keep HexEdit on my dock.)

These iTunes-ripped files have an atom in the user data called
meta. Its contents look like valid atoms, but aren't,
since the first four bytes, which should be the size of the first
child atom, are 0x00000000. Maybe that's meant to throw
off QuickTime file parsers. Interestingly, a set of valid atoms begins
after that, with four bytes of size and a four-byte type, just
as we'd expect.

meta has a child called ilst, which in
turn has children named with the tag-name constants that we saw before. We
can't use UserData.getTextAsString() to get values from these
atoms because we're now two levels below the user data, and besides, we're not through with
undocumented oddities yet. In this AAC world, these atoms seem not to
contain data directly, but rather a child atom called data, which
contains eight junk bytes (perhaps flags) and then, finally, the data
for the tag.

MetadataJTable also handles this kind of metadata.
Its strategy in setMovie(), which kicks off a parse, is
to look in the user data for the meta atom. If that atom is absent,
it assumes the movie is an ID3-tagged MP3 and uses the
previously described code. If it finds meta, then it
looks for an ilst atom. If that succeeds, it starts
looking for atoms named by TAG_NAMES. When one is found,
it jumps ahead 24 bytes (to skip the tag atom's size and type, the child's size and "data" type, and the 8
junk bytes) and reads the value.
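The byte arithmetic is easier to follow in code. This sketch works on a plain byte array holding the contents of the user data, rather than on QTJ objects, and it differs from the strategy above in one respect: it reads every child of ilst instead of looking only for the TAG_NAMES entries. Decoding the value as UTF-8 is an assumption on my part, and all class and method names are hypothetical. The atom() and concat() helpers exist only to build sample data.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class IlstSketch {
    /** Returns the offset of the first atom of the given type in buf[start, end), or -1. */
    static int find(byte[] buf, int start, int end, String type) {
        ByteBuffer bb = ByteBuffer.wrap(buf);
        int pos = start;
        while (pos + 8 <= end) {
            int size = bb.getInt(pos);
            if (size < 8 || size > end - pos) return -1;
            String found = new String(buf, pos + 4, 4, StandardCharsets.ISO_8859_1);
            if (type.equals(found)) return pos;
            pos += size;
        }
        return -1;
    }

    /** Reads tag values from the contents of a udta atom (udta's own header excluded). */
    static Map<String, String> readTags(byte[] udta) {
        Map<String, String> tags = new LinkedHashMap<>();
        ByteBuffer bb = ByteBuffer.wrap(udta);
        int meta = find(udta, 0, udta.length, "meta");
        if (meta < 0) return tags; // no meta atom: not this kind of file
        int metaEnd = meta + bb.getInt(meta);
        // Skip meta's 8-byte header plus the four zero bytes before its children.
        int ilst = find(udta, meta + 12, metaEnd, "ilst");
        if (ilst < 0) return tags;
        int ilstEnd = ilst + bb.getInt(ilst);
        int pos = ilst + 8;
        while (pos + 24 <= ilstEnd) {
            int size = bb.getInt(pos);
            if (size < 24 || size > ilstEnd - pos) break;
            String name = new String(udta, pos + 4, 4, StandardCharsets.ISO_8859_1);
            // 24 = tag size + tag type + child size + "data" + 8 unexplained bytes.
            String value = new String(udta, pos + 24, size - 24, StandardCharsets.UTF_8);
            tags.put(name, value);
            pos += size;
        }
        return tags;
    }

    /** Builds an atom (size, type, payload) for sample data. */
    static byte[] atom(String type, byte[] payload) {
        ByteBuffer bb = ByteBuffer.allocate(8 + payload.length);
        bb.putInt(8 + payload.length);
        bb.put(type.getBytes(StandardCharsets.ISO_8859_1));
        bb.put(payload);
        return bb.array();
    }

    /** Joins two byte arrays. */
    static byte[] concat(byte[] a, byte[] b) {
        byte[] out = new byte[a.length + b.length];
        System.arraycopy(a, 0, out, 0, a.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }
}
```

Building a tiny sample with atom() and concat() (a meta atom holding four zero bytes and an ilst with one tag) and passing it to readTags() is a quick way to check the offsets without a real M4A file.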

An example of parsing a song purchased from the iTunes Music Store
is shown in Figure 2.

Figure 2. Parsed M4P metadata

You Make Me Cool

Surprisingly, everything we've done so far is in the main QuickTime
API and is not strictly limited to audio content. Again, this speaks to QT's
worldview that anything it reads in is a movie. Still, there are cool
features that are specific to audio that we get at by retrieving a
"handler" for the low-level audio data.

One thing we might want to provide for an audio player is a visual
representation of the sound. On a home stereo or professional
recording or mixing equipment, this would be represented as level
meters that show the intensity of various frequency bands at an instant
in time. In iTunes, these values are used to distort the
visualizations and express the sound data in a visually pleasing
way.

We can get these levels from QuickTime by first getting an
AudioMediaHandler, which provides methods for
getting and setting balance and for metering audio levels. It's
interesting to note that AudioMediaHandler is an interface, implemented by
SoundMediaHandler, StreamMediaHandler, and
MPEGMediaHandler. The first is used for audio files and
sound tracks within normal QuickTime movies, and the second for streaming
data. The third reflects the long-annoying fact that QuickTime
sees multiplexed MPEG-1 files not as separate audio and video tracks
but as a single opaque media type, which makes extracting sound and
video from MPEG-1 quite difficult. Fortunately, MPEG-4 files read in
as normal QuickTime movies, with separate video and audio tracks.

But how do we get an AudioMediaHandler? Again, it's
helpful to state things in terms of QuickTime's view of the world:

Notice that once again a QuickTime get-by-index call,
Movie.getTrack() in this case, uses indices that start at 1,
not 0.

Now that we have the AudioMediaHandler, we can set
balance, adjust bass and treble, and monitor sound levels. The first two are
trivial. For the third, we need to pass in a structure representing
which sets of frequencies, or "bands," we want to monitor.
We do this with a MediaEQSpectrumBands object, which
wraps the desired bands. For the QTBebop sample application, I've
used the bands shown by iTunes' graphic equalizer, represented by the
array EQ_LEVELS. So setting up for monitoring looks like this:

To get the levels, we call getSoundEqualizerBandLevels(), passing in the number of
bands that we set up in the first place (e.g., EQ_LEVELS.length). This returns an
int array, with values from 0 to 255. The QTBebop sample app uses a javax.swing.Timer to call this method every 100 milliseconds and redraw an offscreen java.awt.Graphics buffer with rectangles of a height proportional to the returned level values -- in other words, the rectangle gets 0 height if the level is zero, and is the height of the buffer when the level is 255.
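The scaling from level to rectangle height is a one-line proportion. Here is a minimal sketch of it, separate from any drawing code; the class and method names are hypothetical, and the clamp is a defensive addition of mine rather than something the API is known to need.

```java
class LevelMeterSketch {
    /** Maps levels in 0..255 to bar heights in 0..bufferHeight pixels. */
    static int[] barHeights(int[] levels, int bufferHeight) {
        int[] heights = new int[levels.length];
        for (int i = 0; i < levels.length; i++) {
            // Clamp defensively in case a value falls outside 0..255.
            int level = Math.max(0, Math.min(255, levels[i]));
            // Integer proportion: 0 maps to 0, 255 maps to the full buffer height.
            heights[i] = level * bufferHeight / 255;
        }
        return heights;
    }
}
```

In a timer callback, each height h would typically be drawn with something like g.fillRect(x, bufferHeight - h, barWidth, h), so that the bars grow upward from the bottom edge of the buffer.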

The resulting application is shown in Figure 3.

Figure 3. The QTBebop application, with level meter

Author's Note: When run on Mac OS X with Java 1.4.1, the scrubber bar has repaint problems when a file is opened but is not yet playing. It does not have problems on OS X's Java 1.3.1 or on Windows, so this may be a version-specific bug, and has been filed appropriately. You can look in the sample code for the many workarounds
I tried to get the scrubber repainted correctly.

See You, Space Cowboy

Obviously, our sample application could benefit from a graphical
upgrade to make the bars more attractive -- perhaps spacing between
bars, LED-like blocks of color, use of red and yellow regions in the
upper part of each level, or a "sticky" line that represents
each band's peak level over the last second. Adding
balance and bass/treble controls would also be an easy
improvement.
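The "sticky" peak line is the least obvious of these ideas, so here is one possible sketch of it. It keeps the last few readings per band and reports the largest one; with a timer firing every 100 milliseconds, a window of 10 ticks covers roughly the last second. The class is hypothetical and not part of QTBebop, and it assumes every reading has the same number of bands as the constructor was given.

```java
class PeakHoldSketch {
    private final int[][] history; // [tick][band], used as a ring buffer
    private int next = 0;          // slot the next reading will overwrite
    private int filled = 0;        // how many slots hold real readings so far

    PeakHoldSketch(int bands, int windowTicks) {
        history = new int[windowTicks][bands];
    }

    /** Records one reading per timer tick and returns each band's peak over the window. */
    int[] update(int[] levels) {
        history[next] = levels.clone();
        next = (next + 1) % history.length;
        filled = Math.min(filled + 1, history.length);

        int[] peaks = new int[levels.length];
        for (int t = 0; t < filled; t++) {
            for (int b = 0; b < peaks.length; b++) {
                peaks[b] = Math.max(peaks[b], history[t][b]);
            }
        }
        return peaks;
    }
}
```

Each tick, the bar would be drawn from the current level and a thin line drawn at the height of the returned peak.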

A more significant feature to add would be support for audio
streams. As covered much earlier in this series, you can create a Movie from a URL by creating a DataRef from the URL string, which you
then pass to the static Movie.fromDataRef() method. In
terms of playable URLs, QuickTime can play RTSP-streamed content, of
course, and can handle Shoutcast-style HTTP-streamed
audio by changing the URL's http: protocol to the pseudo-protocol icy:, as detailed in the QuickTime 6 documentation.
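The URL rewrite itself is only a prefix swap. A minimal sketch, with a hypothetical helper name, that leaves non-HTTP URLs such as rtsp: untouched:

```java
class IcyUrlSketch {
    /** Replaces a leading "http:" with "icy:"; any other URL is returned unchanged. */
    static String toIcy(String url) {
        String prefix = "http:";
        // Compare case-insensitively, since URL schemes are not case-sensitive.
        if (url.regionMatches(true, 0, prefix, 0, prefix.length())) {
            return "icy:" + url.substring(prefix.length());
        }
        return url;
    }
}
```

The resulting string would then be used to build the DataRef in place of the original URL.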