Digital Audio Jitter and Image Stabilizer Binoculars

I recently bought a pair of Canon Image Stabilizer binoculars for wildlife viewing. Image Stabilizer technology, widely used in digital cameras and camcorders, optically eliminates the apparent image movement caused by hand shake. The technology is ideally suited to binoculars, where hand shake degrades images as seen through even the world’s finest conventional binoculars. You can instantly judge the effect of the Image Stabilizer by looking through the Canon binoculars and then pressing the button that engages the Image Stabilizer. As though by magic, the image instantly becomes motionless.

The subjective impact of this feature cannot be overstated. Fine details in the object under observation suddenly become apparent. The fatigue that quickly sets in with conventional binoculars is non-existent. But beyond these factors, the Image Stabilizer technology fundamentally transforms the experience because it involves the viewer so much more deeply in the subject under observation. There’s a greater depth of appreciation for what one sees, a sense of relaxed engagement, and of immersion in the subject that simply doesn’t occur with conventional binoculars.

It’s interesting that pressing the Image Stabilizer button results in no more information being presented to the brain. Rather, the massive benefit is conferred because the brain isn’t trying to reconstruct a stable image from the jittery image presented through conventional binoculars.

So what do Image Stabilizer binoculars have to do with digital audio?

There’s a parallel between clock jitter in digital audio and the image we see through conventional binoculars. Jitter is timing variation in the clock that controls when the digital-to-analog converter (DAC) chip turns each digital sample into an analog signal. In an 8x-oversampling CD player, this conversion occurs 352,800 times per second (the CD’s 44,100Hz sampling rate multiplied by eight). If the clock that controls this conversion isn’t perfectly precise in its timing, the samples are put back together into an analog waveform in a non-uniform manner. Some samples will be too close together, and some too far apart, resulting in a slight irregularity in the reconstructed analog waveform.
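To make the mechanism concrete, here is a minimal Python sketch of the idea, not a model of any real DAC. It assumes random (Gaussian) clock jitter and a pure sine-wave test tone; the function name `jitter_error_rms`, the 10kHz tone, and the 1ns and 10ns jitter figures are illustrative choices of mine rather than measurements. The sketch computes the conversion rate quoted above, then estimates how far the reconstructed waveform strays from the ideal one when each sample lands slightly early or late.

```python
import math
import random

CD_SAMPLE_RATE_HZ = 44_100   # CD sampling rate
OVERSAMPLING = 8             # 8x oversampling, as in the example above
conversion_rate_hz = CD_SAMPLE_RATE_HZ * OVERSAMPLING  # conversions per second


def jitter_error_rms(tone_hz, jitter_rms_s, n_samples=20_000, seed=0):
    """RMS amplitude error for a unit-amplitude sine when each conversion
    instant is displaced by Gaussian timing error (an assumed jitter model)."""
    rng = random.Random(seed)
    period = 1.0 / conversion_rate_hz
    sum_sq = 0.0
    for n in range(n_samples):
        t_ideal = n * period
        # The clock edge arrives slightly early or late.
        t_actual = t_ideal + rng.gauss(0.0, jitter_rms_s)
        ideal = math.sin(2 * math.pi * tone_hz * t_ideal)
        actual = math.sin(2 * math.pi * tone_hz * t_actual)
        sum_sq += (actual - ideal) ** 2
    return math.sqrt(sum_sq / n_samples)


if __name__ == "__main__":
    print("conversions per second:", conversion_rate_hz)
    for jitter in (0.0, 1e-9, 10e-9):  # hypothetical jitter levels, in seconds
        err = jitter_error_rms(10_000, jitter)
        print(f"jitter {jitter:.0e} s -> RMS error {err:.3e}")
```

In this simplified model the error grows in proportion to both the amount of jitter and the frequency of the tone, because the timing error matters most where the waveform is changing fastest. Real-world jitter is often correlated with the signal or the power supply rather than purely random, so treat this only as an illustration of why timing precision affects the analog output.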

The audible effects of clock jitter include a glassy sound overlaying timbres; a metallic-sounding treble; a reduced sense of space, depth, and soundstage layering; a softening of the bass; and an overall uninvolving presentation.

I heard this most dramatically when I reviewed Esoteric’s G-0Rb, a $16,000 rubidium-based external clock. That’s right: The G-0Rb is an atomic clock in your equipment rack whose sole purpose is to provide a precise clock for the digital-to-analog conversion process. With the push of a button, I was able to compare the conventional clock in the Esoteric P-03/D-03 combination with the rubidium-generated clock. Engaging the G-0Rb brought the soundstage into sharp focus, revealed the size and character of the hall through better resolution of low-level spatial cues, made instrumental timbres sound more natural and “organic,” and resulted in a wholesale increase in involvement in the musical performance.

Just as the image-stabilized view through the Canon binoculars contains no more information than the unstabilized view, a digital audio system with jitter conveys no less information than a system with virtually no jitter.¹ But the difference is that the information presented to the brain through the Canon binoculars and through a low-jitter digital audio system is coherent. That is, the visual image and the auditory input are consonant with nature. The brain expects to receive images that aren’t jumping around, and analog waveforms that don’t contain jitter-induced micro-irregularities in their shapes. Millions of years of evolution have hardwired our brains to process sensory stimuli that exist in nature. When those stimuli contain distortions that don’t exist in the real world, the brain spends lots of processing power trying to decipher the sensory input into a coherent picture. With the brain thus occupied, little horsepower is left for appreciating the meaning of the sensory input, the very reason we pursue the sensory input in the first place.

(This wonderful analogy between Canon’s Image Stabilizer binoculars and clock jitter was conceived by the brilliant Michael “Pflash” Pflaumer, co-inventor of High-Definition Compatible Digital (HDCD) and principal author of the Berkeley Audio Design Alpha DAC, who mentioned it to me during a chance encounter at the most recent CES.)

1. That’s why computer data or even audio files can be transmitted and recovered with no degradation. The problem occurs when we convert those data into analog waveforms that are analyzed by the brain.