The dynamic range of a selection of music depends on both how the time-varying loudness of the music is estimated and the timescale over which that loudness is evaluated. I propose a numerical method of estimating dynamic range that addresses both dependencies, using a modified ITU-R BS.1770 loudness filter and three moving windows to estimate loudness across three different timescales. The goal is to measure and compare dynamic range more accurately between different music genres, and between different masterings and processing techniques for the same music.

Dynamic range = range between 50th and 97.7th percentile, for each timescale

I've been kicking this around for almost a year, but I finally broke down and wrote the thing for real in an afternoon last November (it's been extensively tuned since then). The recent discussions about dynamic range have forced my hand, because so many important things were touched upon, and really, you can think of pfpf as an extremely elaborate reply to that topic.

This is a better way to measure dynamic range, for the following reasons:

It measures dynamic range as a ratio of loudnesses. Peak-to-average cannot claim this (it is fundamentally a comparison of two different units). ReplayGain comparisons cannot claim this.

It uses a real loudness model (flawed though it is) as the basis for loudness estimation. Waveform comparisons (especially for loudness-war-related discussions) are fundamentally flawed for this reason - what you get out of Audacity has a relatively tenuous connection to real perceived loudness.

Dynamic range is estimated across three different timescales - 3000 ms, 200 ms, and 10 ms - and the scales are fully decorrelated from one another. So pfpf can tell the difference between a quiet passage that has a loud transient and a loud passage that has a sudden pause. The timescales are configurable.
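
To make the timescale idea concrete, here's a rough Python sketch (not pfpf's actual code). It measures level over windows of 3000 ms, 200 ms, and 10 ms. Two big simplifications, both my assumptions for illustration: a plain mean-square level stands in for the modified BS.1770 loudness filter, and the windows are non-overlapping blocks rather than true moving windows. It also makes no attempt at the decorrelation between scales. The names `windowed_levels_db` and `TIMESCALES_MS` are made up for this example.

```python
import math

# Window lengths from the post; in pfpf these are configurable.
TIMESCALES_MS = (3000, 200, 10)

def windowed_levels_db(samples, rate, window_ms):
    """Mean-square level in dB for consecutive, non-overlapping windows.

    Assumption: plain mean-square is used here in place of a real
    loudness filter, so these are signal levels, not perceived loudness.
    """
    n = max(1, int(rate * window_ms / 1000))
    levels = []
    for start in range(0, len(samples) - n + 1, n):
        block = samples[start:start + n]
        mean_sq = sum(x * x for x in block) / n
        # Digital silence has no finite dB value.
        levels.append(10 * math.log10(mean_sq) if mean_sq > 0 else float("-inf"))
    return levels

def levels_by_timescale(samples, rate, timescales_ms=TIMESCALES_MS):
    """Map each window length (ms) to its list of per-window levels."""
    return {ms: windowed_levels_db(samples, rate, ms) for ms in timescales_ms}
```

Each timescale gives its own list of levels, so each one gets its own dynamic range number afterwards.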

It uses a percentile approach on a histogram for estimating dynamic range, instead of min/max/avg. This makes the technique much more resilient to differences in mastering and medium; pops and ticks should not affect results, nor should small bits of digital silence, like in greynol's Tool example. (Yes, greynol, you can distinguish ppp from fff now.) The percentiles are configurable.
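
Here's a minimal sketch of the percentile idea, again in Python and again not the real implementation. It bins per-window levels (in dB) into a histogram and reads off the 50th and 97.7th percentiles. The 0.5 dB bin width and the function names are assumptions I picked for the example, and the levels must be finite (mask silence out first).

```python
import math

def percentile_from_histogram(levels_db, pct, bin_db=0.5):
    """Centre of the histogram bin where the cumulative count reaches pct percent."""
    bins = {}
    for lv in levels_db:
        idx = math.floor(lv / bin_db)
        bins[idx] = bins.get(idx, 0) + 1
    target = pct / 100 * len(levels_db)
    running = 0
    for idx in sorted(bins):
        running += bins[idx]
        if running >= target:
            return (idx + 0.5) * bin_db
    raise ValueError("no levels given")

def dynamic_range_db(levels_db, low_pct=50.0, high_pct=97.7):
    """Spread in dB between the high and low percentiles."""
    high = percentile_from_histogram(levels_db, high_pct)
    low = percentile_from_histogram(levels_db, low_pct)
    return high - low

# Made-up data: quiet and loud passages, plus a few 0 dB pops.
clean = [-30.0] * 500 + [-12.0] * 490
with_pops = clean + [0.0] * 10
```

With this made-up data, `dynamic_range_db` gives 18 dB both with and without the pops, while max minus min jumps from 18 dB to 30 dB once the pops are added.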

Background noise (when no music is playing) can be masked with a fixed threshold, so that silence won't pile up on one side of the histogram and distort the numbers, and the results should be invariant to any extra silence padding before/after the music (this should make CD/vinyl comparisons a lot easier). The threshold is configurable.
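
The masking step can be sketched like this in Python. The -70 dB default and the name `mask_silence` are placeholders I made up for illustration, not pfpf's actual threshold or API. The median is just a simple stand-in for the 50th percentile.

```python
import statistics

def mask_silence(levels_db, threshold_db=-70.0):
    """Drop every window whose level is at or below the threshold."""
    return [lv for lv in levels_db if lv > threshold_db]

# Made-up levels: five windows of music, padded with background noise
# before and digital silence after.
music = [-30.0, -25.0, -20.0, -15.0, -10.0]
padded = [-90.0] * 20 + music + [float("-inf")] * 20

median_masked = statistics.median(mask_silence(padded))
median_unmasked = statistics.median(padded)
```

After masking, the median is the same as for the music alone, no matter how much padding is added. Without masking, the median lands in the padding.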

Please read the paper, download the app and try it for yourself. Let me know what you think.