A new form of audio measurement is poised to invade North American
control rooms and studios: loudness-based metering.

It is one of the most fundamental changes to occur in audio metering
since the introduction of peak metering nearly two decades ago. It is
also a major departure from both the signal peak metering so common
today and the historic VU meter, a change that some believe will
resolve long-standing issues with irregular program levels and
listener annoyance.

Peak-reading meters are designed to indicate the potential for signal
peak overload, but they are poor indicators of optimal audio level.
For that, engineers use their ears and gain controls to mix content
at the appropriate levels. As discussed below, loudness metering is a
more sensible alternative for characterizing the dynamically
compressed and often mismatched program levels in today’s audio.

Fortunately, a great deal of work has been done on loudness
measurement by dedicated engineers on working groups at the
Radiocommunication Sector of the International Telecommunication
Union (ITU) and the European Broadcasting Union. Their research over
many years led to the development of an algorithm for a better meter,
one that measures program loudness in a way similar to human hearing.
This algorithm is defined by ITU Broadcast Systems Recommendation
BS.1770-3. (A PDF can be downloaded from the ITU.)

The design of the loudness meter is illustrated in Fig. 1, showing a
simplified diagram of the ITU device. The left and right channel
audio are passed through separate “K-weighting filters” having a
frequency response indicated by the blue curve. The content below 100
Hz is rolled off by a high-pass filter, while content between 100 Hz
and nearly 1 kHz is passed normally. From 1 kHz to 2 kHz, the filter
increases gain, and then shelves the gain for frequencies above
approximately 2 kHz. The signal from each filter is converted to
mean-square amplitude before being summed. Multichannel meters
perform the same steps with the center and surround channels, with a
slight difference in gain compared to the left and right channels.
The low-frequency effects channel is not included, as it contributes
little to the sense of program loudness.
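
The signal chain described above can be sketched in a few lines of
Python. This is a minimal illustration, not a reference
implementation: it assumes 48 kHz audio, uses the filter coefficients
and the –0.691 dB offset as published in BS.1770 for that sample
rate, and omits gating. The function names and the test-tone helper
are hypothetical.

```python
import math

# K-weighting biquad coefficients from ITU-R BS.1770, valid for 48 kHz only.
# Each tuple is (b0, b1, b2, a1, a2).
SHELF = (1.53512485958697, -2.69169618940638, 1.19839281085285,
         -1.69065929318241, 0.73248077421585)
HIGHPASS = (1.0, -2.0, 1.0, -1.99004745483398, 0.99007225036621)

# Channel weights: 1.0 for left, right and center, 1.41 for the surrounds.
# The LFE channel has no entry because it is not measured.
CHANNEL_GAIN = {"L": 1.0, "R": 1.0, "C": 1.0, "Ls": 1.41, "Rs": 1.41}


def biquad(samples, coeffs):
    """Run one second-order IIR section (direct form I) over a list of samples."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def mean_square(samples):
    return sum(s * s for s in samples) / len(samples)


def loudness_lufs(channels):
    """Ungated loudness of a dict such as {"L": [...], "R": [...]}."""
    total = 0.0
    for name, samples in channels.items():
        weighted = biquad(biquad(samples, SHELF), HIGHPASS)
        total += CHANNEL_GAIN[name] * mean_square(weighted)
    if total <= 0.0:
        return float("-inf")
    return -0.691 + 10.0 * math.log10(total)


def sine(freq_hz, peak_dbfs, seconds=1.0, rate=48000):
    """Hypothetical helper: a sine wave with the given peak level in dBFS."""
    amp = 10.0 ** (peak_dbfs / 20.0)
    n = int(seconds * rate)
    return [amp * math.sin(2.0 * math.pi * freq_hz * i / rate) for i in range(n)]


# A stereo tone near 1 kHz at -20 dBFS should read close to -20 LUFS.
tone = sine(997.0, -20.0)
print(round(loudness_lufs({"L": tone, "R": tone}), 1))
```

Other sample rates would need recomputed filter coefficients, which
this sketch does not attempt.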

The ITU standard design provides an indicator of real-time program
loudness in Loudness Units (“LU”), where a change of 1 LU is 1 dB.
When the measurement is referenced to nominal full scale, it carries
the designation LUFS.

For long-term measurements a gate is added to pause the measurement
when the signal drops below a level-dependent threshold. This
prevents silence or background sounds from biasing a long-term
integrated loudness value. The ITU algorithm also defines the method
of measuring the “reconstructed” signal peaks that accompany the
loudness graphs. This algorithm can estimate the height of signal
peaks that exceed full scale, in effect displaying a positive dBFS
value!
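
The gating step can be sketched as follows. This is an illustration
under stated assumptions rather than the standard's own code: it
takes a list of block loudness values (in BS.1770-3 these come from
overlapping 400 ms blocks) and applies an absolute gate at –70 LUFS
followed by a relative gate 10 LU below the first-pass average. The
function names are hypothetical.

```python
import math

ABSOLUTE_GATE_LUFS = -70.0  # blocks quieter than this are always ignored
RELATIVE_GATE_LU = -10.0    # second gate, relative to the first-pass average


def power_mean_db(levels_db):
    """Average a list of dB values in the power domain and return dB."""
    powers = [10.0 ** (v / 10.0) for v in levels_db]
    return 10.0 * math.log10(sum(powers) / len(powers))


def integrated_loudness(block_lufs):
    """Gated long-term loudness from a list of per-block loudness values."""
    loud_enough = [v for v in block_lufs if v > ABSOLUTE_GATE_LUFS]
    if not loud_enough:
        return float("-inf")
    relative_gate = power_mean_db(loud_enough) + RELATIVE_GATE_LU
    kept = [v for v in loud_enough if v > relative_gate]
    return power_mean_db(kept)


# Illustrative data: speech at -23 LUFS, quiet room tone, and digital silence.
program = [-23.0] * 10 + [-60.0] * 10 + [-90.0] * 5
print(round(integrated_loudness(program), 1))  # -23.0
```

Without the gates, the quiet blocks in this made-up example would
pull the average several LU below the level of the speech itself.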

THE PROBLEM WITH PEAKS
Assuming for a moment that the ITU loudness meter represents our sensation of
program loudness, the problem with peak-reading meters is illustrated
in Fig. 2. In this chart, five minutes of NPR’s “All Things
Considered,” a nightly news program, is measured for both signal
peak (in red) and loudness (in blue). The data was collected with the
Orban Loudness Meter,
which permits logging the audio data to a CSV file for analysis. In
the early part of the sample, at the left, a vertical arrow indicates
one of the highest local moments of loudness, where the difference
between the two meters is around 12 dB (the difference between the
peaks, in dBFS, and the loudness in LUFS). Toward the center, a high
loudness point coincides with one of the highest signal peaks, with a
difference of 18 dB. Then, a little more to the right, less than a
minute later, the highest moment of loudness occurs, but the signal
peak is only 10 dB above the loudness.

Fig. 3: Original Stream Loudness and Peaks

Fig. 4: Original Streams With Peak Normalization

Fig. 5: Original Streams With Loudness Normalization

The great majority of loudness peaks in Fig. 2 range between –25 and
–20 LUFS, suggesting that the loudness of this unprocessed audio is
relatively even. However, the signal peaks range from about –12 dBFS
to –2 dBFS, a range of at least 10 dB. Had we adjusted parts of this
program to maintain a similar peak level, program loudness would have
varied substantially.
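
A little arithmetic shows why. The sketch below uses three
illustrative moments whose peak-to-loudness differences match those
described for Fig. 2 (12, 18 and 10 dB). The absolute values are
assumptions chosen for the example, not readings from the chart, and
the function name is hypothetical.

```python
def loudness_after_peak_normalization(peak_dbfs, loudness_lufs, target_peak_dbfs):
    """Loudness that results when gain is set so the peak hits the target."""
    gain_db = target_peak_dbfs - peak_dbfs
    return loudness_lufs + gain_db


# (peak in dBFS, loudness in LUFS): assumed values, differences of 12, 18, 10 dB
moments = [(-10.0, -22.0), (-4.0, -22.0), (-11.0, -21.0)]

results = [loudness_after_peak_normalization(p, l, -2.0) for p, l in moments]
print(results)                      # [-14.0, -20.0, -12.0]
print(max(results) - min(results))  # 8.0
```

Three passages that began within 1 LU of each other end up 8 LU apart
once each is pushed to the same –2 dBFS peak.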

Readers who are interested in the accuracy of the ITU loudness meter
can review Appendix 1 of the standard, which describes the years of
psychoacoustic testing that produced the loudness algorithm. The test
data shows good correlation with listener assessment of loudness.

CONSISTENCY AND NORMALIZATION
The following illustrates how loudness metering can be applied to
transmission, such as Internet streaming, to provide more consistent
loudness from stream to stream, and reduce the need for dynamic
compression. The histogram in Fig. 3 shows measurements of 49 public
radio streams carrying NPR’s “Weekend Edition with Scott Simon.”
The blue bars indicate the number of stations at various levels of
loudness for a one-minute portion of the program. The largest
grouping is at –18 LUFS. The gold-colored bars indicate the
distribution of maximum peak levels for the same part of the program,
ranging from –20 dBFS to 0 dBFS.

A common approach in audio editing is to “peak normalize” a section
of audio by adjusting the overall gain of the section so that the
highest peak reaches a target value. This was done for each of the 49
measurements in Fig. 4, where all share the same peak level of –2
dBFS. It is apparent that the streams do not share the same loudness,
partly because of small differences from minute to minute in the
program, but mostly because of differences in the audio processing
for each stream. While it is understandable that audio processing may
vary from stream to stream, it is apparent that setting audio
transmission according to a peak target does not make the loudness
compatible with other streams; aligning to a loudness target does
make the streams more compatible.

The chart in Fig. 5 shows the alternate condition, in which the
loudness of the streams has been aligned, or normalized, to a common
target of –23 LUFS. Notice that the stream samples share a common
loudness, in the blue bar, but the peak levels are distributed from
–16 dBFS to 0 dBFS. While this may seem counterintuitive after years
of looking at peak meters, the effect is quite normal: As long as
peak levels do not exceed full scale, it’s the loudness that we wish
to target, not the signal peaks.
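
The full-scale caveat is easy to check before gain is applied. The
following sketch assumes each stream has already been measured for
long-term loudness and maximum peak. The stream names and values are
made up for illustration and are not taken from Fig. 5.

```python
def normalize_to_loudness(loudness_lufs, peak_dbfs, target_lufs=-23.0):
    """Return (gain in dB, peak after gain in dBFS, True if the peak would clip)."""
    gain_db = target_lufs - loudness_lufs
    new_peak = peak_dbfs + gain_db
    return gain_db, new_peak, new_peak > 0.0


# Hypothetical measurements: (loudness in LUFS, maximum peak in dBFS)
streams = {
    "stream A": (-18.0, -2.0),
    "stream B": (-14.0, -1.0),
    "stream C": (-30.0, -8.0),
    "stream D": (-31.0, -6.0),
}

for name, (loudness, peak) in streams.items():
    gain, new_peak, clips = normalize_to_loudness(loudness, peak)
    note = "exceeds full scale" if clips else "ok"
    print(f"{name}: gain {gain:+.1f} dB, peak {new_peak:+.1f} dBFS, {note}")
```

In this made-up set only stream D, a quiet stream with tall peaks,
would need attention. The others land at the same loudness with
different but legal peak levels.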

Because audio signals that are adjusted to a loudness target are
freer to peak as dictated by their content, it’s natural to ask “What
is a reasonable loudness target?” The European Broadcasting Union had
the same question, and established a “PLOUD” committee to determine
these parameters and develop procedures for use. Based on extensive
study of programs from a range of broadcast material, PLOUD adopted a
target loudness of –23 LUFS for production and transmission. (The EBU
R128 standard and the ATSC A/85 standard for U.S. digital television
share similar values and techniques for loudness normalization.) This
loudness value permits most programs with greater dynamic range and
signal peaks to fit safely under the digital full-scale limit.

LOUDNESS MEASUREMENTS
Loudness meters look much like peak-reading meters in use today. An example is
the K-Meter, a program for Windows and Linux computers, as shown in
Fig. 6: ITU loudness is indicated by the solid green bar while the
momentary signal peak is shown by a single red segment. Loudness
meters often color-code the segments to indicate when the –23 LUFS
target is reached or exceeded. A long-term indication of loudness may
help a mixing engineer or producer to align a program to the target
loudness. The standalone metering system from TC Electronic, shown
in Fig. 7, includes a unique “radar” display of loudness history
that uses less screen space. This look back at earlier program
loudness can help a mixing engineer decide if the current program is
“on track” to deliver the target overall loudness. Advanced meter
systems may include other information about the audio, including
amplitude spectrograms and phase displays.

Fig. 6: K-Meter Display at –20 LUFS With Peak Signal Shown in Red

With the help of loudness meters, especially ones that can display a
measurement log over time, consistency in loudness can be easily
achieved. Fig. 8 illustrates the process, called “loudness
normalization.” In this chart, the stream at the left was logged
for a few minutes, producing the solid blue line for short-term
loudness and the solid red line for signal peaks. It has a long-term
(average) loudness, indicated by the dotted blue line, of
approximately –14 LUFS at the end of the sample period.

Measurements should be taken for longer periods when the program has greater
dynamic range. A second audio stream, shown at right, is logged for a
similar time interval and has a long-term loudness of about –27
LUFS. A listener switching from the first to the second stream would
hear a drop in loudness of approximately 13 dB.

Normalization of these two audio streams to a level of –23 LUFS,
then, simply lowers the encoding gain of stream number one by 9 dB
(from –14 LUFS to –23 LUFS), and raises the gain of stream number two
by 4 dB (from –27 LUFS to –23 LUFS). The two streams now have a
similar loudness.
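
In practice the gain change in decibels is converted to a linear
multiplier before it is applied to the audio samples. A minimal
sketch, using the approximate loudness values quoted above for the
two streams (the function names are hypothetical):

```python
def gain_db_to_linear(gain_db):
    """Convert an amplitude gain in dB to a linear multiplier."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db):
    """Scale a list of samples by the given gain in dB."""
    factor = gain_db_to_linear(gain_db)
    return [s * factor for s in samples]


TARGET_LUFS = -23.0

for name, measured_lufs in (("stream one", -14.0), ("stream two", -27.0)):
    gain_db = TARGET_LUFS - measured_lufs
    factor = gain_db_to_linear(gain_db)
    print(f"{name}: {gain_db:+.1f} dB, multiply samples by {factor:.3f}")
```

Because the measured values are only approximate, the gains that come
out of this calculation are approximate as well.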

It should be noted that normalization in no way dictates how one should
process audio. Some engineers or programmers prize a particular
“sound” resulting from particular processing; normalization just
encourages agreement between the media distributors, which the data
show can please listeners (or at least can diminish their annoyance).
The technique is nothing more than observance of a common standard
for transmission loudness. There is nothing to prevent a rogue
operator from pursuing a “loudness war” on the Internet.

AUDIO DISTRIBUTION BENEFITS
Another consideration in the adoption of loudness-based metering is the
appropriate target level. Based on extensive study of programs from a
range of broadcast material, the EBU adopted a target loudness of –23
LUFS for production and transmission. This loudness value permits
most programs with greater dynamic range and signal peaks to fit
safely under the digital full-scale limit. This would be the
appropriate audio loudness target for production and distribution.

An example of the alignment of the ITU BS.1770 scale, in comparison
to the IEC PPM, BBC PPM and VU meter scales, is shown in Fig. 9. Full
scale (0 dBFS) is indicated for all by the vertical red line to the
right. The ITU loudness meter reads in LU (one LU = one dB), with –23
LUFS being the target. ITU peak readings share the same scale. PPM
meters typically are referenced to 9 dB below full scale, which
represents maximum permissible peaks for program audio. The BBC meter
uses a simple numbered scale with instructions to the user on where
to peak various types of program content. The VU meter, once common
in North America, has a longer response time that requires a greater
back-off from full scale, typically 18 dB or more below digital
clipping. In digital audio systems, Reference Level is commonly a 1
kHz tone at –20 dBFS, which conveniently displays at –20 LUFS on the
loudness scale.

To stream audio, some other factors should be considered in choosing
a target loudness value. For example, streams are all-digital, from
the encoder to the consumer: The HE-AAC stream operated for tests by
NPR Labs has a measured dynamic range over the Internet of 96 dBA,
which is possible with a 16-bit audio system. This is equivalent to
the performance of most audio production centers, so there is no
technical need to compress or limit the dynamic range for Internet
distribution. However, the sound chips in computers, smart phones and
tablets are not quite that good, and certainly the acoustic power
range of amplifiers and speakers limits the practical range of
reproduction, although their capability is far greater than the
dynamic range of most commercially produced music at present.

Fig. 7: TC Electronic Metering With Radar-Like History

Another consideration is keeping up with the loudness of other
content: The average loudness of commercially produced popular music
is substantially higher than –23 LUFS, due to processing in mixing
and production. A study of material on Apple iTunes by mastering
engineer Bob Katz found an average loudness of –16.5 LUFS with a
variation of only a few LU. Katz indicated that by normalizing
commercial music to that loudness target “the debilitating loudness
war has finally been won.” (While achieving this value of loudness
requires dynamic processing, Katz reasons that there is no advantage
in using excessive dynamic compression, thereby allowing producers to
“turn down the volume” on their processing and deliver more
dynamics.)

Fig. 8: Loudness Normalization Technique

Besides the output range capability of devices, such as smart phones,
tablets and car audio systems, the loudness necessary to overcome
background noise is a factor. As noted above, however, additional
audio processing is simply unneeded with commercial music.
Fortunately, the ITU loudness meter includes the objective LRA
(loudness range) tool to determine whether the loudness range is
already sufficiently limited. As the meter comes into increasing use,
we can expect that engineers and producers will rely on it to provide
guidance on whether more dynamic processing is needed or the program
is “just fine as it is.”
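
For readers curious about how a loudness range figure is derived, the
sketch below follows the general approach of EBU Tech 3342 as the
author of this example understands it: gate the short-term loudness
readings, then take the spread between the 10th and 95th percentiles.
The simple percentile rule and the function names are assumptions,
and a commercial meter may differ in detail.

```python
import math


def power_mean_db(levels_db):
    """Average a list of dB values in the power domain and return dB."""
    powers = [10.0 ** (v / 10.0) for v in levels_db]
    return 10.0 * math.log10(sum(powers) / len(powers))


def loudness_range(short_term_lufs):
    """Approximate LRA in LU from a list of short-term loudness readings."""
    audible = [v for v in short_term_lufs if v > -70.0]  # absolute gate
    if not audible:
        return 0.0
    relative_gate = power_mean_db(audible) - 20.0        # relative gate
    kept = sorted(v for v in audible if v > relative_gate)

    def percentile(p):
        # Simple nearest-index rule, assumed here for illustration.
        return kept[int(round((len(kept) - 1) * p / 100.0))]

    return percentile(95) - percentile(10)


# Illustrative readings: a program from -30 to -11 LUFS plus pauses and silence.
readings = [float(v) for v in range(-30, -10)] + [-60.0] * 5 + [-80.0] * 3
print(loudness_range(readings))  # 16.0
```

A heavily processed program would show a range of only a few LU,
which suggests that further compression would add little.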
CONCLUSION

Fig. 9: Audio Scales Compared (Loudness, PPM and VU)

To address the issues of loudness matching, optimal target level and
loudness range for consumer audio content, especially Internet
streams, the Consumer Electronics Association has established a new
“Audio Metrics” working group, R03WG15, sponsored by the R03
Audio Systems Committee, to evaluate techniques for improving
listener satisfaction related to loudness.

The author is the chair of this group, and invites readers to follow the
working group’s progress and comment on their experiences.

If you want to add your comments to the loudness discussion, email
John Kean at nprlabs1@gmail.com and include “Loudness” in the subject
line.
