13 January 2008

The dynamic range of a selection of music depends on both the estimate of the music's time-varying loudness and the timescale used for loudness evaluation. I propose a numerical method of estimating dynamic range that satisfies those dependencies using a modified ITU-R BS.1770 loudness filter and three moving windows to estimate loudness across three different timescales. The goal is to more accurately measure and compare dynamic range between different music genres, and between different masterings and processing techniques for the same music.

Summary of algorithm:

Using the pfpf application

Basic instructions: Run the program and click the folder icon at the top to select a WAV file. Press the "Run analysis" button. The file is scanned for instantaneous loudness (progress is shown by the progress bar), and then a histogram operation is performed to calculate the dynamic range. The output is displayed at the bottom. Other tabs display plots of instantaneous loudness and the histograms.

Interpreting the results:

Long-term dynamic range - loudness changes across multiple seconds, or across multiple measures of a piece of music. Wide swings in orchestration and sustained loud/quiet passages increase this number. Dynamic range compression, in any form, decreases this number. Typical values range from 16 dB for extremely dynamic orchestral and experimental music to 1-2 dB for pop/rock singles.

Screenshots

Advanced configuration

These options affect the computation of the dynamic range; when they are modified, the results should always include the new configuration. The "Output" string was created for this purpose.

Thresholds: If the instantaneous loudness drops below the threshold associated with a time scale, the loudness for that time scale (and for any shorter time scale) is set to NaN and ignored in subsequent dynamic range calculations. This prevents silence between music (assumed to be below the listening noise floor and therefore inaudible) from affecting the results. Silence skews the histogram results so as to artificially compress dynamic range across all time scales. Its loudness also varies considerably between formats (notably vinyl vs. CD), and masking it aids in making an accurate comparison of formats.
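
The masking rule above can be sketched in a few lines of Python. This is only an illustration, not pfpf's LabVIEW code: the scale names, the dictionary layout, and the assumption that all three loudness series share one time grid are mine, and the threshold values in the usage note are made up.

```python
import math

# Hypothetical scale names, ordered from shortest to longest time scale.
SCALES = ("short", "medium", "long")

def mask_silence(loudness, thresholds):
    """Return a copy of `loudness` with sub-threshold samples set to NaN.

    loudness:   dict of scale name -> list of loudness values in dB,
                all sampled on the same time grid (an assumption).
    thresholds: dict of scale name -> threshold in dB.
    """
    masked = {name: list(values) for name, values in loudness.items()}
    for i in range(len(loudness[SCALES[0]])):
        for k, name in enumerate(SCALES):
            if loudness[name][i] < thresholds[name]:
                # Mask this scale and every shorter one at this instant.
                for shorter in SCALES[: k + 1]:
                    masked[shorter][i] = math.nan
    return masked
```

For example, with a threshold of -60 dB on every scale, a medium-scale value of -70 dB masks both the medium and the short value at that instant, while the long value is kept.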

Time scales: Controls the RMS window size (in seconds) for each time scale.
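
As a rough illustration of what an RMS window per time scale means, the sketch below computes the mean square of consecutive short blocks and then slides a longer window over those blocks. It is a simplification under my own assumptions: the function names are hypothetical, the loudness weighting filter is left out entirely, and the window hop of one short block is a guess rather than pfpf's documented behavior.

```python
import math

def block_mean_squares(samples, sample_rate, block_seconds=0.01):
    """Mean square of each consecutive, non-overlapping short block."""
    size = max(1, int(round(block_seconds * sample_rate)))
    return [
        sum(x * x for x in samples[i:i + size]) / size
        for i in range(0, len(samples) - size + 1, size)
    ]

def moving_rms_db(mean_squares, blocks_per_window):
    """RMS level in dB over a sliding window of short blocks."""
    out = []
    for end in range(blocks_per_window, len(mean_squares) + 1):
        window = mean_squares[end - blocks_per_window:end]
        ms = sum(window) / blocks_per_window
        # Digital silence has no finite level in dB.
        out.append(10.0 * math.log10(ms) if ms > 0 else -math.inf)
    return out
```

With 0.01 s short blocks, a 0.2 s window corresponds to `blocks_per_window=20` and a 3 s window to `blocks_per_window=300`.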

Percentiles: By default, dynamic range is calculated as the loudness range between the 50th and 97.7th percentiles of the loudness histogram at each time scale. These percentile levels may be adjusted.
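
The percentile calculation can be approximated as follows. Note the substitution: pfpf works from histograms, whereas this sketch sorts the values and interpolates linearly between them, which should give similar but not identical numbers. The function names are hypothetical; NaN (masked) and infinite values are dropped before the percentiles are taken.

```python
import math

def percentile(sorted_values, pct):
    """Linearly interpolated percentile of an ascending list (pct in 0..100)."""
    if not sorted_values:
        raise ValueError("no valid loudness values")
    pos = (len(sorted_values) - 1) * pct / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def dynamic_range_db(loudness_db, low_pct=50.0, high_pct=97.7):
    """Loudness range in dB between two percentiles, ignoring masked values."""
    valid = sorted(v for v in loudness_db if math.isfinite(v))
    return percentile(valid, high_pct) - percentile(valid, low_pct)
```

Calling `dynamic_range_db` once per time scale, on the loudness series for that scale, would give one figure per scale.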

Application License

The pfpf application is free for non-commercial use. Do not redistribute it. Source code is available upon request (requires LabVIEW 8.2 or above and the Digital Filter Design toolkit).

Contact Info

Message me (Axon) on HydrogenAudio, or comment below.

Known Issues

It is important to take the results with a grain of salt. Transient loudness estimation is a topic of ongoing research, and no truly accurate method has yet been agreed on. pfpf currently uses a moving-window modification to Leq(RLB), but in the future, a more elaborate loudness estimator, like HEIMDAL, might be used.

DC removal is applied to each short block (0.01 seconds of signal by default) as it is read, and the short blocks are then composed into the larger medium/long (0.2/3 s) blocks. The end result is that the signal effectively receives a 100 Hz highpass before analysis, removing most bass information. This is not expected to be a big deal because of the relatively small contribution that LFE provides to loudness models.
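
To see why per-block DC removal behaves like a highpass, the sketch below subtracts the mean from each 0.01 s block of a test tone and compares RMS before and after. It is a stand-alone illustration with hypothetical helper names, not pfpf's code, and the 48 kHz sample rate in the usage note is an arbitrary choice.

```python
import math

def remove_dc_per_block(samples, block_size):
    """Subtract each block's own mean from the samples in that block."""
    out = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        mean = sum(block) / len(block)
        out.extend(x - mean for x in block)
    return out

def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def tone(freq_hz, sample_rate, seconds):
    """Unit-amplitude sine test tone."""
    n = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
```

At 48 kHz a 0.01 s block is 480 samples. A 1 kHz tone fits ten whole cycles into each block, so its block mean is essentially zero and it passes unchanged. A 50 Hz tone covers only half a cycle per block, so most of its energy sits in the block mean and is subtracted away.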

Histogram computation is not factored into the progress bar, so there is a noticeable pause between the completion of the progress bar and the display of results.

Beware of falling code. Parameters may not be well tested for failure cases or obviously incorrect inputs.