Does anyone else have any additional actual published papers on this topic of the audibility of jitter or listening tests?

What journals cover this topic, if any?

My focus is to understand:
a) given a synthetic jitter profile, is it audible using DBT?
b) given a real jitter environment, is it audible using DBT?
c) can you DBT the difference between toslink and coax?

I have been looking into the basis of the audiophile belief that toslink is broken due to too much jitter and the implicit belief of the audibility of very small amounts of jitter. Given the intensity of the belief, perhaps where there is smoke, there is fire, even though this belief makes no sense to me based on my understanding of jitter and its impacts in this application space.

As far as I understand, jitter impacts two things related to audio reproduction:
a) at the level of the synchronous bit transport, it influences bit error rates
b) at the level of the DAC, it causes errors in the reconstruction of the waveform. The outcome of this is essentially higher noise and distortion, i.e. you just get a bump in the noise floor related to the waveform being reconstructed (it's correlated to what is being reconstructed, which is a wrinkle) and possibly spectral aliases being created.

Detailed studies of this understanding are also appreciated with actual waveforms & spectrum views. I also want to be clear I understand the techniques to dejitter a clock, including reclocking and most importantly, buffering. The issue is about impact, not about repair or avoidance.

I don't want any more audiophile "received wisdom" on the issue of jitter, I have received lots of that.

Papers without audibility studies set the threshold far lower. For example, Dunn's 1992 AES paper claims an audibility threshold of an astonishing 20 ps at 20 kHz, based on his 1991 paper "Considerations for Interfacing Digital Audio Equipment to the Standards AES3, AES5, AES11, Proceedings of the 10th International AES Conference, 1991" (paper not yet found online). As another data point, "A Digital Discourse, Dr. Malcolm Hawksford; HiFi News & Record Review Feb, April, June, Aug, 1990" claims a peak jitter threshold of 400 ps (also cited by Stereophile). I have not found the actual article yet, just citations and quotes.

Is Dunn's audibility curve an analytic derivation, or an audibility study? Dunn's curve of audibility is widely quoted. Anyone have a copy of this paper?

Others cited, but not yet found (I hesitate to pay the $20 AES paper fee) include "Eric Benjamin and Benjamin Gannon, "Theoretical and Audible Effects of Jitter on Digital Audio Quality", Preprint 4826 of the 105th AES Convention, San Francisco, September 1998" and "The Effects of Sampling Clock Jitter on Nyquist Sampling Analog-to-Digital Converters, and on Oversampling Delta-Sigma ADCs, 87th Convention of the Audio Engineering Society, October, 1989" (also cited by Stereophile).

There appears to be tremendous discussion of jitter measurement, but little understanding of what it actually means. A lot of this appears to me to be very old work and at best analytic, not audibility based. None of it considers modern techniques to break the end-to-end synchronous clocking paradigm, although some of them hint at what is now common practice in the telecommunications & Internet space.

Journals that publish in this area or references to further studies would be welcome.

(Zster, thanks for the reference to the Lavry overview paper. It is a useful overview document to help explain the impact of jitter in a well written way, and review some of the more modern methods to dejitter signals.)

A small group of experts on this subject (some from this forum) is currently investigating the audibility of jitter. The idea is to develop a jitter simulation application to enable testing (listening) without a low-jitter DAC. Alternatively an ultra-low jitter DAC and ADC are available for DBT. There's no strict time schedule for the project. I'll post updates in this thread.

If anybody is interested in doing a listening test in France, I can provide a jitter modulator JM1 from Prism and a QSC ABX comparator for DBT. The manuals are available here. We have to verify that the setup and ABX box don't add any jitter (I can get jitter measuring tools). It's not as practical as a software simulation but could be a complementary approach.

Is this the wrong place to ask "What IS Jitter and How does it affect people with digital music?" And I feel a little stupid asking this question having an EE background. The most simplistic description of Jitter is, "A time-based error where a digital clock is not quite accurate."

Now I've never really known whether typically when people say that, they're referring to an instance where a clock signal was held higher (or lower) longer than it was supposed to, or if a clock is operating at 44.2kHz instead of 44.1kHz.

Most of what I read about jitter issues involves ADCs, which although important, isn't all THAT important for people who prefer to listen to their music (as opposed to folks interested in recording it).

For digital transport, jitter can be a serious killer, but it has to be pretty damn serious before it affects this layer. Basically your data lines and clock lines have to be out of sync. By a lot. If people want to simulate this, I'd recommend writing a program to either randomly delete samples or randomly rewrite a sample to the sample immediately preceding it. My guess is that this happens as often as cosmic ray memory errors.
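The kind of program suggested above could look something like the following. This is only a minimal sketch: the function name `corrupt_samples`, the `error_rate` parameter and the two mode names are made up for illustration, and it operates on a plain list of sample values rather than on a real audio file.

```python
import random

def corrupt_samples(samples, error_rate, mode="repeat", seed=None):
    """Return a copy of `samples` with random transport-style errors.

    mode="delete": the affected sample is dropped from the stream.
    mode="repeat": the affected sample is overwritten with the sample
                   immediately preceding it in the original stream.
    `error_rate` is the per-sample probability of an error (hypothetical
    parameter; pick something tiny to mimic a rare event).
    """
    if mode not in ("delete", "repeat"):
        raise ValueError("mode must be 'delete' or 'repeat'")
    rng = random.Random(seed)
    out = []
    for i, s in enumerate(samples):
        # The first sample has no predecessor, so it is never touched.
        if i > 0 and rng.random() < error_rate:
            if mode == "delete":
                continue
            s = samples[i - 1]
        out.append(s)
    return out

# Example: a ramp of 20 samples with a deliberately high error rate.
ramp = list(range(20))
print(corrupt_samples(ramp, 0.2, mode="repeat", seed=1))
print(corrupt_samples(ramp, 0.2, mode="delete", seed=1))
```

To listen to the result, the sample list would have to be read from and written back to an audio file, which is left out here.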

Since most applications for people with digital music are in the playback, only the DACs are relevant, and the question must be raised, "What are typical settling times for DACs?" But I think it also raises another question. Is Jitter at all relevant to people who don't know what the settling time of their DAC chip is?

Audibility threshold for timing jitter, for 'golden eared' listeners in a two-alternative forced-choice paradigm using their preferred listening environment and samples: 250 ns.

I've read this and found it largely reflects the findings of many friends. Wonder if this can be repeated exactly with a different group?

Single number jitter specifications don't say much if you don't know the jitter spectrum. If the effects of jitter are audible at all, it will be interesting to know which part(s) of the spectrum is/are responsible. Measurements of real world equipment indicate that the jitter spectrum can't be assumed to be flat. Jitter simulation should be able to reproduce this behavior.
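To illustrate why a single number is not enough, here is a rough sketch that builds two jitter sequences with the same RMS value but very different spectra: one spectrally white, one a single sinusoid. The sample rate, length, 1 ns RMS target and 1 kHz jitter frequency are arbitrary assumptions, not values taken from any measurement.

```python
import cmath
import math
import random

FS = 48_000.0       # sample rate in Hz (assumed)
N = 4_800           # 0.1 s of samples, so 1 kHz falls exactly on a DFT bin
RMS_TARGET = 1e-9   # 1 ns RMS jitter for both sequences (assumed)

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

def white_jitter(rms_target, seed=0):
    """Gaussian timing errors, independent from sample to sample."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, rms_target) for _ in range(N)]

def sine_jitter(rms_target, freq):
    """Sinusoidal timing error with the requested RMS value."""
    peak = rms_target * math.sqrt(2)
    return [peak * math.sin(2 * math.pi * freq * n / FS) for n in range(N)]

def power_fraction_at(x, freq):
    """Fraction of the total power found in the DFT bins at +/- freq."""
    acc = sum(v * cmath.exp(-2j * math.pi * freq * n / FS)
              for n, v in enumerate(x))
    bin_power = 2 * abs(acc) ** 2 / len(x) ** 2
    total_power = sum(v * v for v in x) / len(x)
    return bin_power / total_power

white = white_jitter(RMS_TARGET)
tone = sine_jitter(RMS_TARGET, 1_000.0)
print("RMS (white, sine):", rms(white), rms(tone))
print("power fraction at 1 kHz (white):", power_fraction_at(white, 1_000.0))
print("power fraction at 1 kHz (sine): ", power_fraction_at(tone, 1_000.0))
```

Both sequences would carry the same "1 ns RMS" specification, yet one spreads its power over the whole band while the other puts essentially all of it at one frequency.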

If the sampling clock for the DAC (whatever rate it has) has a high jitter (and I guess this thread is about what is "high"), the sampling is not equidistant in time anymore, thus the sampling theorem is violated. This has an effect on the spectrum and the reconstructed signal. Question is, when is it audible...? The effect of jitter being so high that actually data is lost is not part of the discussion.
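The effect on the spectrum can be sketched numerically. The example below evaluates a sine tone at sampling instants that are shifted by a small sinusoidal timing error and then measures the level of the sideband that appears next to the tone. All the numbers (48 kHz rate, 10 kHz tone, 1 kHz jitter frequency, 1 ns peak jitter) are assumptions chosen for illustration, and the model ignores everything else a real converter does.

```python
import cmath
import math

FS = 48_000.0     # sample rate in Hz (assumed)
N = 4_800         # 0.1 s of samples, DFT bins are 10 Hz apart
F_SIG = 10_000.0  # test tone in Hz (assumed)
F_JIT = 1_000.0   # jitter frequency in Hz (assumed)
J_PEAK = 1e-9     # peak timing error in seconds (assumed)

def jittered_tone(j_peak):
    """Full-scale sine evaluated at non-equidistant sampling instants."""
    out = []
    for n in range(N):
        t = n / FS
        t_actual = t + j_peak * math.sin(2 * math.pi * F_JIT * t)
        out.append(math.sin(2 * math.pi * F_SIG * t_actual))
    return out

def level_db(samples, freq):
    """Amplitude at `freq` relative to a full-scale sine (single-bin DFT)."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq * n / FS)
              for n, s in enumerate(samples))
    amp = 2 * abs(acc) / len(samples)
    return 20 * math.log10(max(amp, 1e-30))

clean = jittered_tone(0.0)
dirty = jittered_tone(J_PEAK)

# Small-angle phase-modulation estimate for each sideband: pi * f * J_peak.
predicted = 20 * math.log10(math.pi * F_SIG * J_PEAK)

print("sideband at 11 kHz, no jitter:  %.1f dB" % level_db(clean, F_SIG + F_JIT))
print("sideband at 11 kHz, with jitter: %.1f dB" % level_db(dirty, F_SIG + F_JIT))
print("small-angle prediction:          %.1f dB" % predicted)
```

With sinusoidal jitter the error shows up as discrete sidebands at the tone frequency plus and minus the jitter frequency, which is why the jitter spectrum matters for the audibility question.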

250ns and 400ps are not contradictory. One is saying "we've tested it subjectively - at around 250ns it starts to become audible". The other is saying "from first principles, if you keep it below 400ps, for the most sensitive possible signals, it will have a smaller impact on the signal than the limits of the system itself (i.e. the sample rate/bandwidth and bitdepth)".

These are two different approaches to audio engineering. One says "we can make it as bad as we want as long as no one can hear it". The other says "we will make it as good as we can to the point where this part can never be the limiting factor".

The real world has to sit between the two. You can't engineer something so that nothing is the limiting factor! You have to have some understanding of human ears to know when to stop improving everything (or be permanently depressed that nothing is good enough).

Conversely, you can't make everything "as bad as it can be before it causes an audible problem" because if you chain all these separate things together you can be fairly sure that you will have an audible problem at the end!

I like the idea of an experiment, but I don't see how a typical HA public test can work or be valid. We're all listening with unknown levels of jitter.

It would be like testing the audibility of -120dB of noise while we're all listening with soundcards which add noise at somewhere between -108dB and -60dB. Not hearing the -120dB of noise through these sound cards proves nothing. The same is true of jitter.

The authors of the paper above, Dr. Ashihara & Dr. Kiryu, answered questions (1, 2) raised by Steve N. at Empirical Audio. These excerpts are from hciman77 (Jim), who received the answers via email:

QUOTE (hciman77 @ Dec 20 2006, 08:02)

Dear Jim,

Thank you for the e-mail. I suppose that you read our paper titled 'Detection threshold for distortions due to jitter on digital audio' (http://www.jstage.jst.go.jp/article/ast/26/1/50/_pdf). Before this paper was published in the Acoustical Science and Technology, we had published another paper 'The maximum permissible size and detection threshold of time jitter on digital audio.' Unfortunately, it was written in Japanese.

In our first experiment, which was reported in the Japanese paper, we used a fixed listening condition and fixed materials. All of 14 participants were university students without any special training. The audio system that we used consisted of the following equipment.

They costed about $10,000. I don't know if they belong to high-end or not.

All participants could distinguish between sounds with and without time jitter when the jitter size was 9216 ns. A few could when it was 1152 ns. No one could when it was as small as 576 ns.

There was a question, however, if the result would depend on the listening environments and the skill of the listeners. That is why we carried on the second experiment. This second experiment is reported in the paper, the one that you probably read.

Listeners in the second experiment were all professionals, audio engineers, recording/mixing engineers, musicians, etc... Sound materials were selected by the listeners so that each listener could use his (her) familiar materials. The experimenter (we) visited the listeners' studios or listening rooms so that we could use listeners' own DAC, amplifiers, loudspeakers and headphones. The system configurations, therefore, varied among listeners. They were mostly mid-end or above, I suppose.

As you can find in the paper, some listeners could distinguish the sounds when time jitter was 500 ns. It could not be detected, however, when the jitter was as small as 250 ns.

In both experiments, there was considerable difference in listeners' performance. I don't know, however, if it was because of their audio experience. We had expected much better performance in the second experiment because the listeners were professionals and they could use their favorite environments and materials.

Our conclusion up to now is that the normal hearing listeners' detection threshold for time jitter in program materials is several hundred ns.

I appreciate that you are interested in our paper. Thank you for asking questions.

Best wish

--ASHIHARA Kaoru

...

QUOTE (hciman77 @ Dec 25 2006, 07:16)

Hello Jim,

Additional comments came from my co-worker, Dr. Kiryu. Prior to the second experiment, we had sent the materials with time jitter of several amounts to some of the participants. They could, therefore, train themselves with the materials. In fact, one listener told us that he could detect time jitter of several ns. However, in the experiment that was a strict double-blind test, his score was much worse as written in the paper. Recently, the materials were sent to Dr. Kiryu's friend who is an audiophile. This man said that the sufficient training made it possible to detect time jitter of 150 ns.

I want to add some more. We had considered about the maximum permissible jitter in audio package media. When random time jitter does not cause any distortions larger than 1/2 LSB (Least Significant Bit), it does not degrade the quality of sounds because the distortions in this case are smaller than the quantization noise level. When there is time jitter, the maximum distortion occurs where the slope of the waveform is its maximum. The size of distortion can be obtained by multiplying the slope by jitter size. Does it make sense to you? My English may be awkward sometimes. Please make it up for with your imagination. Anyway, if you can find the maximum slope (inclination?) in the waveforms of the sound materials, you can estimate the maximum permissible jitter size. We estimated the maximum permissible jitter size by checking the maximum slope in the music waveforms in many CDs. The values varied considerably between 182 ps and 2567 ps. This means that in certain materials, time jitter has to be smaller than 182 ps to guarantee a 16-bit resolution.

In our research, the material that was most susceptible to jitter was a music played by a music box and it contained a lot of high-frequency components. One hundred eighty-two ps correspond to the maximum permissible jitter in a pure tone of about 13.3 kHz. If a 20 kHz pure tone has to be reproduced with a resolution of 16 bit, the maximum permissible jitter size is about 121 ps. By the way, I do not know any loudspeakers or headphones that have linearity corresponds to a 16 bit resolution. After all, I won't believe someone who says that he can detect time jitter of 100 ps or less in CDs.
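The slope-times-jitter estimate described in the e-mail can be checked with a few lines of arithmetic. The sketch below assumes a full-scale pure tone whose peak amplitude is 2^(bits-1) LSB, so its maximum slope is 2*pi*f times that peak; this full-scale assumption is mine, although it does land on the 121 ps and 182 ps figures quoted above. The function name is made up for illustration.

```python
import math

def max_permissible_jitter(freq_hz, bits=16):
    """Largest timing error (seconds) that keeps the amplitude error of a
    full-scale sine at `freq_hz` within 1/2 LSB.

    Assumes the sine peaks at 2**(bits - 1) LSB, so its steepest slope is
    2 * pi * freq_hz * 2**(bits - 1) LSB per second.
    """
    peak_lsb = 2 ** (bits - 1)
    max_slope = 2 * math.pi * freq_hz * peak_lsb   # LSB per second
    return 0.5 / max_slope                          # error = slope * jitter

for f in (13_300.0, 20_000.0):
    print("%.1f kHz: %.1f ps" % (f / 1e3, max_permissible_jitter(f) * 1e12))
```

The same function could be pointed at the steepest slope actually found in a recording instead of a full-scale sine, which is what the authors describe doing with CDs.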

I visited the site of Headfi.org and found your heated discussion on this topic.

We had considered about the maximum permissible jitter in audio package media. When random time jitter does not cause any distortions larger than 1/2 LSB (Least Significant Bit), it does not degrade the quality of sounds because the distortions in this case are smaller than the quantization noise level.

The above statement is incorrect. The quantization noise of a digital audio system does not fully mask all distortion that is at a lower level. This is a common misconception that ignores masking theory.

For example, a 3 kHz tone can be heard as much as 30 dB below the level of a white noise signal if the level of the 3 kHz tone exceeds the threshold of hearing.

Our ears are very good at detecting tones at levels that are below the level of the ambient noise.

This does not mean that the results of their paper are necessarily incorrect, it just shows that they ignored masking theory when speculating what distortion levels may be audible.

Let me say it again - just to be very clear:
1) The -96 dB FS quantization noise of a 16-bit CD system does not magically mask all distortion and music that falls below -96 dB FS.
2) 3 kHz tones can be heard in a 16-bit TPDF system down to levels as low as about -126 dB FS if system gain is turned up enough to allow the tone to be reproduced at a level that is sufficiently above the threshold of hearing.

John, the post that you quote refers specifically to RANDOM jitter. How can this possibly produce anything remotely resembling a 3 kHz tone?

Thanks for the correction - I failed to note that their paper was confined to random jitter. The distortion caused by this random jitter should be much less audible than the distortion caused by sinusoidal jitter.

To reach audibility, the distortion caused by random jitter may need to be 20 to 30 dB higher than the distortion caused by sinusoidal jitter.

The "random jitter" used in this experiment is frequency limited by the Nyquist theorem. Consequently, the jitter-induced distortion will have nearly the same spectral shape as the jitter. If the spectrum of the band-limited random jitter is white, we should expect the spectrum of the jitter-induced distortion to be nearly white. TPDF dither noise will be very effective at masking this spectrally-white jitter-induced distortion. If the jitter-induced distortion is the same amplitude as 16-bit TPDF dither noise, the system noise level will increase by 3 dB. If the jitter-induced distortion is 6 dB lower than the 16-bit TPDF noise, system noise will increase by 1 dB. In this experiment, the jitter-induced distortion is simply a white noise signal that gets added to the system noise.

Note: Use RMS noise summing equations to calculate resulting noise.
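As a quick check of the two figures above, here is a small sketch of RMS (power) summing for uncorrelated noise sources. The function name is hypothetical; the only assumption is that the jitter-induced noise and the dither noise are uncorrelated, so their powers add.

```python
import math

def noise_rise_db(added_rel_db):
    """Increase in total noise level (dB) when an uncorrelated noise source
    `added_rel_db` dB relative to the existing noise floor is added to it.
    Powers add: total = 1 + 10**(added_rel_db / 10), relative to the floor.
    """
    return 10 * math.log10(1 + 10 ** (added_rel_db / 10))

print("equal level:  +%.2f dB" % noise_rise_db(0.0))
print("6 dB lower:   +%.2f dB" % noise_rise_db(-6.0))
```

The first case comes out at about 3 dB and the second at about 1 dB, matching the numbers given in the post.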

Digital audio transmission systems tend to generate jitter at very specific frequencies. The spectrum of the code-induced jitter at the end of a S/PDIF cable is much closer to sinusoidal than random. Spectrally white random jitter is not likely to occur in the real world. Jitter composed of one or two dominant sinusoidal frequencies is much more common. In my opinion it is more important to investigate the audibility thresholds for sinusoidal jitter.

Obviously the investigation of random jitter is a good first step as it requires far fewer tests than an investigation of sinusoidal jitter. With random jitter we have one variable - amplitude. An investigation of sinusoidal jitter would require two variables - amplitude and frequency. Many tests would be required to plot the audibility curves.

We should be able to estimate the audibility of sinusoidal jitter-induced distortion using masking theory. Has anyone published these calculations?

Are there any good papers on the audibility of sinusoidal jitter?

Julian Dunn calculated the audibility threshold of jitter-induced sidebands produced by sinusoidal jitter. He took masking effects into account when calculating audibility. His calculations are based upon a peak playback level of 120 dB SPL and he assumes that un-masked sidebands become audible at 0 dB SPL.

Peak playback levels are usually lower than 120 dB SPL, and audibility thresholds will usually be slightly higher than 0 dB SPL (due to ambient noise), so his jitter-audibility plot is a worst-case audibility plot. These results could be scaled for other playback levels and ambient noise levels.

See section 3.3 for an explanation, and figure 9 for a plot of "maximum inaudible jitter amplitude" vs frequency.

In summary of Julian Dunn's calculations:
1 us at jitter frequencies below 200 Hz should be inaudible
1 ns at a jitter-frequency of 600 Hz should be inaudible
100 ps at a jitter-frequency of about 3 kHz should be inaudible
20 ps at a jitter-frequency of 20 kHz should be inaudible

A detailed paper on the derivation of these numbers can be found here:

His calculations are based upon a peak playback level of 120dB SPL and he assumes that un-masked sidebands become audible at 0 dB SPL.

Can I explain this in suitably scientific language?

It's taking the Michael.

So you need less than 20ps jitter - otherwise, when you play back 20kHz sine wave at 120dB SPL, the jitter-induced noise might have a total power equivalent to 0dB SPL.
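A back-of-envelope version of that statement, as a sketch only: for sinusoidal jitter the error behaves like phase modulation, and for small timing errors each sideband sits at roughly pi * f * J relative to the tone, where f is the tone frequency and J the peak jitter. Treating the 20 ps as a peak value is an assumption on my part (the paper may define it differently), and no masking is modelled here.

```python
import math

def sideband_level_db(signal_hz, jitter_peak_s):
    """Level of each jitter sideband relative to the tone, in dB, using the
    small-angle phase-modulation approximation pi * f * J_peak."""
    return 20 * math.log10(math.pi * signal_hz * jitter_peak_s)

PLAYBACK_PEAK_SPL = 120.0   # dB SPL, the playback level quoted above

rel = sideband_level_db(20_000.0, 20e-12)
print("sideband relative to tone: %.1f dB" % rel)
print("sideband at playback:      %.1f dB SPL" % (PLAYBACK_PEAK_SPL + rel))
```

Under these assumptions the sideband of a 20 kHz tone played at 120 dB SPL lands within a few dB of 0 dB SPL, which is the situation described above.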

The first fallacy here is the idea that the human threshold of hearing remains at 0 dB while a human is listening to 20 Hz at 120 dB. IOW, there is a presumption that threshold shifts never happen, even in the presence of 120 dB sounds.

The second fallacy is that there would ever be a natural sound that is a 120 dB 20 Hz pure tone with all other sounds 120 dB down.

The third fallacy is that anybody actually listens to reproduced sound in a context where the listening environment's residual noise is at 0 dB or below, other than as part of a lab experiment.

I don't disagree with any of these comments. Julian Dunn never claimed that jitter becomes audible at these levels, just that we could guarantee that it would be inaudible if held below these levels.

In my post, I suggested that we could scale Julian Dunn's calculations to reasonable listening situations. I was not implying that these levels were typical, or that full-scale 20 kHz tones are found in any music recordings.

With a little more work, we could also adjust Julian's calculations for an audio input spectrum that is more typical of music.

What must be understood is that random jitter is also an extreme case that is also NOT typical of real audio hardware. For this reason, the random jitter audibility test results are as unrealistic as Julian's graph. I suspect realistic thresholds for typical "real-world" jitter spectra will fall somewhere in between these extreme cases.

Both papers provide good resources for finding a more realistic answer.

What must be understood is that random jitter is also an extreme case that is also NOT typical of real audio hardware. For this reason, the random jitter audibility test results are as unrealistic as Julian's graph.

I agree that random jitter is one of the extreme cases. It should in general be the hardest to hear because it lacks tonality. Its cause is likely to be irreducible thermal noise.

IME most jitter traces back to the environment or the process. Power supply frequencies are common. Jitter at the same rate as data blocks is not uncommon. Then there are the cases where the signal jitters itself.

QUOTE

I suspect realistic thresholds for typical "real-world" jitter spectra will fall somewhere in between these extreme cases.

So far I see none of the cases where jitter is really hard to discern, just different cases that are extremely small.

When all possible forms of masking are considered, the amount of jitter that can be masked can be huge. I'm not sure that I can support those cases being used to calculate an average of extreme cases.

We hear very little about people hearing jitter during LP playback, yet there is jitter in LP playback that is commonly less than 60 dB down.

Like, WOW, man!

LOL!

QUOTE

I keep pointing out to people that .55555_Hz jitter isn't that bad.

Unfortunately .55555 Hz isn't the only kind of jitter that LPs have. Anything that causes the record to be other than perfectly flat creates FM distortion. Then there is the FM distortion that is inherent in offset arms. It FM modulates everything with whatever bass is present. LPs also pick up whatever jitter the analog tape had, which includes capstan and scrape flutter that can go up into the 100s of Hz.

Needless to say, the high end reviewers seem to have cultivated selective deafness to the many audible flaws that are inherent in the LP format.

[1] yes - and it is quite well-known.
[2] yes - done by Dolby.
[3] objective measurements only - no need to meet TOS 8.
[4] yes - and we have discussed it on this thread. Also there is an addendum from the authors.
[5] not referenced above - no need to meet TOS 8.
[6] no - written in 1970, well before David Clark's test in 1982.