Loudness Control according to EBU R128


Loudness Paradigm Shift and Tips & Traps in Metering with DAWs

Background of the Introduction of Loudness Measurements in Television and their Practical Impact
Michael Kahsnitz, Senior Director Product Management, RTW GmbH & Co. KG

2012 marked a significant year for the audio departments of German and Austrian broadcasters, as well as for television audiences: in this year, both public and private broadcasting institutions transitioned from the traditional QPPM peak level measurement to loudness normalization according to EBU Technical Recommendation R128. Since then, all new productions have been required to control levels based on loudness criteria, involving the integration and operational transition to suitable measurement tools in line with EBU R128 standards. This article aims to explore the background and practical implications of this substantial paradigm shift from peak level to loudness measurement.
This shift can be viewed as a historic opportunity within the broadcast industry to finally address the global “loudness war.” Complaints about sudden loudness jumps between different program segments and channels have been one of the most frequent sources of viewer dissatisfaction for years, not only in German broadcasting.

Film sound mixing could be considered one of the early adopters of loudness control, with terms like Dialnorm, TASA, and Leq(M) being essential tools in that field for some time. However, whenever a production is intended to leave the “isolated universe” of cinema, such as for broadcast or streaming distribution, specific rules for loudness and program dynamics come into play.

Loudness: What is it?

Unlike traditional peak level measurement using QPPM instruments, which assesses audio signals purely by electrical criteria, loudness measurement takes into account the subjective perception of the human ear. Essentially, we’re aiming to objectively measure a subjective sensory impression. The German term “Lautheit” (loudness) probably captures this concept best, though it’s not to be confused with “Lautstärke,” which refers to sound pressure.

The loudness discussion doesn’t aim to standardize listening volumes in living rooms nationwide; rather, the goal is to eliminate annoying loudness jumps between different channels and program segments through appropriate measurement techniques. The subjective loudness perceived by the listener ultimately depends on many factors that extend beyond simple power and frequency evaluations. These factors, however, are challenging to capture in standardized measurements. They include listener-specific elements such as age, mood, and personal preference, as well as the type and cultural background of the content, listening environment, duration of listening, and, more generally, the listener’s level of interest in the content.

The Problem

The technical capabilities available today through media like Blu-ray Disc, SACD, DVD-Audio, HDTV, and new channel formats—including immersive 3D formats such as 22+2, Auro 3D, Dolby ATMOS, or MPEG-H with additional height channels—are exceptional. High transfer rates, extensive storage capacities, dynamic ranges up to 130 dB, and sampling rates of 192 kHz or higher should theoretically enable perfect results. Unfortunately, the reality, especially in radio and television, is different. A whole arsenal of signal processing tools, including multiband compressors, limiters, and other techniques, is used not so much for quality enhancement but as ammunition in the omnipresent loudness war. Louder is often deemed better, and where a gunshot once sufficed, now an atomic bomb is needed—even if it’s merely an ad for a new chewing gum. Notably, it’s not just commercials that are guilty in this regard but also the station-produced trailers.

How did it come to this?

For decades, QPPM instruments with a 10 ms integration time have been used for level control in professional audio production of all kinds. This means that the peak meter's bar indicates a specific value when the equivalent voltage is present at the input for this time period. This approach is still valid when assessing the technical levels of a transmission path or medium, ensuring that the technical limits of the system in use are not exceeded. The commonly applied headroom of 9 dB above nominal level in television (or 9 dB below the maximum digital level) provides an additional buffer.

Unfortunately, reality often looks quite different. Rather than using the full headroom or allowing for any program dynamics, productions are driven to the absolute maximum level, coupled with strict signal limiting to prevent technical issues like clipping. After investing significant time and energy in the studio to optimize a production’s sound, broadcasters often flatten the result with fully automated loudness maximizers, ironing out every last gap in the spectrum to increase perceived loudness.

Additionally, there is, in my observation, a nearly obsessive adherence to the "0 dB" mark on the QPPM meter during reviews, with no allowance for creative reasons behind brief deviations from the standard level. Contributions that artistically exceed or fall below the specified levels are uniformly rejected, thereby limiting creative freedom on the production side—even though this restriction is unnecessary given today’s excellent technical conditions. Interestingly, "guidelines for level control" have existed for a long time; Michael Dickreiter documented these in his books about 20 years ago, offering clear instructions on how various types of content should be leveled sequentially on a QPPM to ensure consistent loudness for the listener at home. Unfortunately, these practices have largely been forgotten today.

Institutions and Standards

The topic of loudness is far from new and has been a focus for companies like RTW since the early 1990s. The ITU (International Telecommunication Union) began researching and initiating standardization efforts in 2002, with the first ITU documents (ITU-R BS.1770 and BS.1771) published in 2006. These regularly updated ITU recommendations continue to play a crucial role internationally, serving as the foundation for many recommendations and standards developed by other organizations.

In 2008, the EBU (European Broadcasting Union) established the P/Loud working group, with Florian Camerer (ORF) as chairman and driving force. As a representative of RTW, I have been involved in this group since its founding. The group focused on refining and applying ITU guidelines in practice, summarizing its results in the R128 recommendation, first published in 2010. The current ITU BS.1770-4 version incorporates several recommendations from R128.

Many other organizations globally are working towards harmonizing loudness measurements, fortunately collaborating rather than competing. In the U.S., the ATSC (Advanced Television Systems Committee) addresses loudness implementation in TV through its comprehensive document A/85, which, based on ITU guidelines, emphasizes listening conditions and reference playback levels over specific measurement methods. The 2010 CALM Act in the U.S. explicitly prohibits loudness discrepancies between commercials and other programming on U.S. television; since December 2012, the FCC (Federal Communications Commission) can impose fines for non-compliance. Japan's ARIB (Association of Radio Industries and Businesses) has closely aligned its standards with the EBU’s R128. In Europe, the adoption of the R128 standard is nearly universal.


P/Loud and R128
The EBU’s P/Loud working group has developed a holistic approach with its R128 recommendation, initially focused on broadcast audio but now encompassing all distribution channels—TV, radio, internet, CD, DVD, cinema, and more—from production to distribution. Their goal was to create a globally accepted, open standard for loudness implementation, avoiding proprietary solutions. According to the group, loudness metering should replace peak metering as the primary tool for level monitoring throughout many stages of the production and distribution chain, though not completely eliminating the use of peak meters. Binding international standards for comparable measurement values, along with clear guidelines for production and distribution, are considered essential.

The ITU recommendations, which form the foundation of R128, cover many technical aspects, such as weighting filters and TruePeak measurement. However, the ITU’s open guidelines lack specific targets, making it difficult to classify productions directly as "standard-compliant" or "non-compliant." Furthermore, the ITU standard allows for variability in several measurement parameters, which can lead to differences depending on the settings of individual metering devices, thereby reducing immediate comparability. This is where P/Loud stepped in, aiming to provide clear, practical guidelines for evaluating productions. Thus, R128, developed by P/Loud, did not reinvent loudness measurement but instead filled the remaining gaps in the ITU standard to improve measurement comparability.

One of these gaps was the need for a consistent reference point within a program, an anchor. This anchor could, for instance, be the dialogue in a production, though this doesn’t always work reliably, as seen with dialogue-light films like No Country for Old Men, golf broadcasts, or typical pop concert recordings. In these cases, using speech as an anchor point is inadequate. Therefore, one of P/Loud's primary goals was to devise measurement methods that would allow reliable loudness assessment even in such cases, which the initial ITU guidelines did not fully address. Additionally, a single measurement approach was intended to yield meaningful results regardless of genre or channel format.

Measurement Methods

A standard-compliant loudness measurement consists of multiple components, which differ by the number of channels being measured and various time constants. First, there’s single-channel measurement with fast integration, for instance, used to examine the center channel in a 5.1 mix individually. This per-channel instantaneous measurement, designated by the letter “M” for "Momentary," is crucial for real-time monitoring during production and broadcasting, using an integration time of 400 ms. The summing short-term measurement, identified as “S” for "Short-term," applies a 3-second integration window, enabling early detection and correction of tendencies. The long-term integrated measurement, denoted as “I” ("Integrated"), captures the entire length of a segment or program, ranging from a few seconds to several hours in duration. This measurement is typically started and stopped manually by the user and is used for documentation, logging, and quality control purposes.
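The two sliding time windows can be illustrated with a minimal Python sketch. It assumes the signal has already been K-weighted and reduced to one mean-square power value per 100 ms block; that block size, the function names, and the use of the -0.691 offset from the ITU-R BS.1770 loudness formula are assumptions made for illustration, not part of the original article.

```python
import math

BLOCK_MS = 100       # assumed hop size of the incoming power values
OFFSET_DB = -0.691   # constant from the ITU-R BS.1770 loudness formula


def loudness_from_power(mean_square):
    """Convert a K-weighted mean-square value to a loudness value in LUFS."""
    return OFFSET_DB + 10.0 * math.log10(max(mean_square, 1e-12))


def sliding_loudness(block_powers, window_ms):
    """Loudness readings over a sliding rectangular window of window_ms."""
    n = window_ms // BLOCK_MS
    readings = []
    for end in range(n, len(block_powers) + 1):
        window = block_powers[end - n:end]
        readings.append(loudness_from_power(sum(window) / n))
    return readings


def momentary(block_powers):
    """'M' reading: 400 ms window."""
    return sliding_loudness(block_powers, 400)


def short_term(block_powers):
    """'S' reading: 3 s window."""
    return sliding_loudness(block_powers, 3000)
```

With 4 seconds of constant power, `momentary` yields a reading after the first 400 ms, while `short_term` needs the first 3 seconds before it produces a value, which is why the short-term display reacts more slowly.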

A weighting filter is an essential component in loudness measurement, designed to realistically reflect the human ear’s sensitivity to loudness. The currently used weighting filter (K), specified by ITU BS.1770, is based on findings from earlier response curves and is a modified version of the B-weighting curve. In cinema, the CCIR curve has long served as a specific weighting filter, used in Leq(M) measurements.

R128 defines “LUFS” (Loudness Units relative to Full Scale) as the absolute unit for loudness measurement, while relative measurements use "LU" (Loudness Units). The notation follows the same pattern as dBFS and dB. To correctly describe the loudness level of a segment, one would note, for instance, that Lk has a value of -18.5 LUFS. R128 specifies a target level of -23 LUFS with a tolerance of +/- 0.5 LU, and +/- 1 LU for live productions. If a segment needs to be adjusted to the target level, a full remix isn’t necessary: applying a static gain offset equal to the difference (in this case, -4.5 dB) is sufficient.
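The arithmetic behind the static gain offset and the tolerance check can be sketched in a few lines of Python. The function names are hypothetical; the target and tolerance defaults are the R128 values quoted above.

```python
def gain_to_target(measured_lufs, target_lufs=-23.0):
    """Static gain in dB that moves a measured loudness onto the target."""
    return target_lufs - measured_lufs


def within_tolerance(measured_lufs, target_lufs=-23.0, tolerance_lu=0.5):
    """True if the measured loudness lies inside target +/- tolerance."""
    return abs(measured_lufs - target_lufs) <= tolerance_lu


def db_to_linear(gain_db):
    """Convert a gain in dB to the linear factor applied to each sample."""
    return 10.0 ** (gain_db / 20.0)
```

For the segment measured at -18.5 LUFS, `gain_to_target(-18.5)` gives -4.5 dB, which corresponds to multiplying every sample by roughly 0.6.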

The ITU, however, uses LKFS instead of LUFS, with the "K" indicating the weighting filter used. In practice, LKFS and LUFS yield the same measurements. However, the U.S. ATSC standard sets its target level at -24 LKFS, with an allowed tolerance of +/-2 LU. With the updated ATSC A/85 standard (2013), the relative-gated measurement as defined in ITU 1770-3 is also required. Yet, ATSC retains a provision for anchor-based measurements that lead to equivalent loudness results.

In integrated measurement, a gate ensures that only defined program parts are included in the loudness measurement. This gate’s threshold is not set to a fixed level but is instead placed 10 LU below the current measured loudness level. Program parts that fall below this threshold are thus ignored by the measurement. This approach aligns closely with the anchor principle mentioned earlier, ensuring that speech, effects, ambient sounds, and music consistently provide valid loudness measurement results, something that would not be achievable with a speech-only reference. Additionally, a “Silence Gate” ensures that a measurement already started only begins to register with the actual start of modulation, and it automatically excludes low-level signal parts from the measurement.
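The two-stage gating can be sketched as follows. This is a simplified illustration, not a compliant meter: it assumes the loudness of each gating block has already been measured, and it takes the -70 LUFS absolute ("silence") threshold from ITU-R BS.1770 rather than from this article. All names are hypothetical.

```python
import math

ABSOLUTE_GATE_LUFS = -70.0   # silence gate, value assumed from ITU-R BS.1770
RELATIVE_GATE_LU = -10.0     # relative gate: 10 LU below the ungated level


def _mean_loudness(values_lufs):
    """Average in the power domain, then convert back to a dB-type value."""
    mean_power = sum(10.0 ** (v / 10.0) for v in values_lufs) / len(values_lufs)
    return 10.0 * math.log10(mean_power)


def integrated_loudness(block_lufs):
    """Gated program loudness from a list of per-block loudness values."""
    # Stage 1: drop everything below the absolute silence gate.
    audible = [v for v in block_lufs if v > ABSOLUTE_GATE_LUFS]
    if not audible:
        return float("-inf")
    # Stage 2: the relative threshold follows the measured level.
    threshold = _mean_loudness(audible) + RELATIVE_GATE_LU
    gated = [v for v in audible if v > threshold]
    return _mean_loudness(gated)
```

For a program made of equal parts at -23 LUFS and -50 LUFS plus some silence, the quiet passages fall below the relative threshold and the result stays at -23 LUFS, which is the anchor-like behavior described above.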

LFE and Extended Channel Formats

Under the R128 standard, summing measurements in 5.1 surround programs currently consider only the five main channels, excluding the LFE; technically, this makes it a 5.0 measurement. This approach is quite practical since the LFE channel is designed solely as an effects channel rather than a low-frequency supplement to the main speakers.

Currently, the two rear channels in 5.1 programs are weighted with a factor of 1.41 in the measurement process, which corresponds to a gain of roughly +1.5 dB. This adjustment accounts for the evolutionarily based fact that humans are more sensitive to sounds coming from behind than those coming from the front.
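A minimal sketch of the channel summation, assuming the input is one K-weighted mean-square power value per channel. It reads the 1.41 as the linear weighting coefficient that ITU-R BS.1770 applies to the power of the surround channels; the channel labels and function names are hypothetical.

```python
import math

# Weighting coefficients applied to each channel's K-weighted power.
# The LFE channel is deliberately absent, so it is ignored in the sum.
CHANNEL_WEIGHTS = {"L": 1.0, "R": 1.0, "C": 1.0, "Ls": 1.41, "Rs": 1.41}


def summed_loudness(mean_squares):
    """Loudness in LUFS from a dict of channel name -> mean-square power."""
    total = sum(
        CHANNEL_WEIGHTS[ch] * power
        for ch, power in mean_squares.items()
        if ch in CHANNEL_WEIGHTS
    )
    if total <= 0.0:
        return float("-inf")
    return -0.691 + 10.0 * math.log10(total)


def weight_in_db(weight):
    """Express a power weighting coefficient as a gain in dB."""
    return 10.0 * math.log10(weight)
```

`weight_in_db(1.41)` evaluates to about 1.49 dB, so the same signal reads roughly 1.5 LU louder in a rear channel than in a front channel.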

Immersive Audio/3D Audio Formats

The 3D audio formats present new demands on the audio engineer and also include a currently unresolved issue. Let’s first consider the formats most commonly used in the sports sector today, which are 5.1-4 and 7.1-4 (5 or 7 channels in the main layer and 2 or 4 channels in the height layer). For this configuration, there is a clear definition regarding the calculation of the loudness value in ITU 1770-4. The situation changes when additional object channels are added. Their position and level can dynamically change during a segment. This means that loudness calculation is only possible when the positioning data (metadata) is available and can be continuously integrated into a summation based on the definition in 1770-4. RTW provides a tool within its TouchMonitor series, the Immersive Sound Analyzer (ISA), for the representation and measurement of immersive sound.

Loudness Range

The LRA parameter, which represents the result of a statistical method for determining the loudness distribution, allows for an assessment of whether a contribution still requires dynamic processing in relation to its intended distribution pathway or whether it may have already undergone excessive processing. For example, the original soundtrack of a feature film may exhibit higher LRA values than a mix intended for the TV broadcast of the same film.
Taking the film “The Matrix” as an example, the measured loudness distribution ranges from -60 LUFS to nearly 0 LUFS. If we exclude the rarely reached extreme values from the measurements using statistical methods, we find that the example leaves a range of approximately 25 LU, which encompasses the majority of the measured values. This range is referred to as the Loudness Range (LRA) and accurately describes the (in this example, extremely high) dynamic range of a contribution. The same measurement for a modern pop CD would yield an LRA value of only a few LU.
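The statistical idea can be sketched in Python. The parameters used here (gating at -70 LUFS and at 20 LU below the average level, then the spread between the 10th and 95th percentiles of the short-term values) follow the method described in EBU Tech 3342; they are not stated in this article, and the simple percentile indexing is an assumption that may differ slightly from a reference implementation.

```python
import math


def loudness_range(short_term_lufs):
    """Approximate LRA in LU from a list of short-term loudness values."""
    # Ignore silence.
    audible = [v for v in short_term_lufs if v > -70.0]
    if not audible:
        return 0.0
    # Relative gate: 20 LU below the power-averaged level.
    mean_power = sum(10.0 ** (v / 10.0) for v in audible) / len(audible)
    threshold = 10.0 * math.log10(mean_power) - 20.0
    gated = sorted(v for v in audible if v > threshold)
    # Discard the rarely reached extremes at both ends of the distribution.
    low = gated[round(0.10 * (len(gated) - 1))]
    high = gated[round(0.95 * (len(gated) - 1))]
    return high - low
```

A heavily compressed signal whose short-term values barely move yields an LRA near 0 LU, while values spread evenly between -40 and -20 LUFS yield about 17 LU.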

Magic LRA

To simplify the consideration and evaluation of LRA measurements, RTW has introduced a special instrument called "MagicLRA" in some of its product lines. The impetus for this development was the realization that, alongside highly differentiated loudness instruments for the production area—which require a certain level of background knowledge from the user—there is an increasing need for tools that are straightforward and easy to use, particularly for editorial teams or editing stations. This display allows users to capture and categorize the most important loudness parameters without prerequisites and at a glance. Only such instruments enable non-technicians to quickly determine whether a product complies with the applicable requirements or not.

Short Form Content

It has been shown that LRA measurements are insufficient for assessing advertisements due to the short duration of many spots, as statistical methods always require a sufficiently high number of valid measurements and, therefore, a certain minimum time span. Consequently, an additional characteristic, such as maximum short-term loudness, has been introduced. This value can be determined using most of the existing instruments—essentially, it functions as a simple hold feature for short-term measurement (S-measurement). A definition of the values for short program material was established in Supplement 1 (EBU R128 s1-2016 Version 2.0).
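The hold feature mentioned above amounts to keeping the largest short-term reading seen so far. A minimal sketch, with a hypothetical class name:

```python
class MaxShortTermHold:
    """Keeps the highest short-term loudness value seen since the last reset."""

    def __init__(self):
        self.reset()

    def reset(self):
        """Start a new measurement, for example at the beginning of a spot."""
        self.maximum = float("-inf")

    def update(self, short_term_lufs):
        """Feed one short-term reading and return the held maximum."""
        if short_term_lufs > self.maximum:
            self.maximum = short_term_lufs
        return self.maximum
```

Feeding the readings -30.0, -21.5 and -25.0 LUFS leaves the held value at -21.5 LUFS until `reset()` is called.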

Immersive Audio

At the end of 2015, the ITU published the document ITU-R BS.1770-4. This document now describes the weighting in loudness summation for most common multi-channel formats. In relation to Dolby ATMOS or MPEG-H, further definitions will be necessary in the future.

Since that same time, the document AES TD1004.1.15-10 has also provided a recommendation for streaming audio. The target value here is set at -16 LUFS. However, caution is advised. This target value is not equivalent to program loudness as we know it from R128. Streaming audio is measured over different time periods (typically 24 hours), and the target value can vary depending on the provider's concept of the included dynamics. It should fall within a range of -20 LUFS to -16 LUFS. A more detailed description is beyond the scope of this article. Providers in this area should examine the mentioned document very closely.

In the field of terrestrial broadcasting (FM radio), there are interesting studies regarding coverage and intermodulation that could experience significant improvements through the consistent use of loudness assessment and would also save energy in the process.

Particularly in the radio sector, but also for the internet, there is a question of a different target value. This could be set at -18 LUFS. Depending on the program genre, it may also be reasonable to restrict the LRA value, for example, to 6 LU. An example of this would be programs dominated by spoken contributions.

LUFS Reference and Dynamics

LUFS is an absolute unit of measurement, Loudness Units relative to Full Scale, where Full Scale is the highest digital level value in a 16- or 24-bit word.
An understanding of FS, Full Scale, is essential when considering the TruePeak value. The upper limit, FS, of a digital word cannot be exceeded. A 24-bit signal can take $2^{24}$ distinct values, which, expressed in the more tangible unit of dB, corresponds to a dynamic range of about 144 dB.
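The step from the number of values to a figure in dB is the usual amplitude ratio expressed logarithmically, which works out to about 6.02 dB per bit. A short sketch (the function name is hypothetical):

```python
import math


def dynamic_range_db(bits):
    """Theoretical dynamic range of a fixed-point word with the given width."""
    return 20.0 * math.log10(2 ** bits)
```

`dynamic_range_db(24)` gives about 144.5 dB, and `dynamic_range_db(16)` gives about 96.3 dB for a CD-style 16-bit word.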

Why is the High Dynamic Range So Important?

The target loudness value “-I-” for a television production is currently set at -23 LUFS in Europe. This allows us to make the following definition for this point and a stereo signal:
A loudness value of -23 LUFS is achieved when each individual channel in a stereo signal carries a 1000 Hz signal at a level of -23 dBFS (ITU 1770-3).
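This reference point can be checked with simple arithmetic. The sketch below does not implement the K-weighting filter; it assumes that the filter's gain at 1 kHz is about +0.691 dB, which is the value the -0.691 constant in the ITU-R BS.1770 formula is designed to cancel. The names are hypothetical.

```python
import math

K_GAIN_AT_1KHZ_DB = 0.691   # assumed K-weighting gain at 1 kHz
OFFSET_DB = -0.691          # constant from the ITU-R BS.1770 formula


def sine_loudness_lufs(peak_dbfs, channels=2):
    """Loudness of identical 1 kHz sines in `channels` front channels."""
    peak = 10.0 ** (peak_dbfs / 20.0)
    mean_square = peak ** 2 / 2.0   # mean square of a sine wave
    weighted = mean_square * 10.0 ** (K_GAIN_AT_1KHZ_DB / 10.0)
    return OFFSET_DB + 10.0 * math.log10(channels * weighted)
```

With both stereo channels at -23 dBFS the function returns -23 LUFS: the 3 dB lost by going from peak to mean-square level is regained by summing the power of two channels. A single channel at the same level would read about -26 LUFS.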

Now, considering the entire dynamic range of a 24-bit signal of 144 dB, setting the working point at -23 dBFS (-23 LUFS/0 LU) is nearly perfect. There is sufficient headroom, meaning an upward gain reserve, and more than enough distance from the lower limit, which is usually defined by the noise floor. Technically speaking, this provides a large available dynamic range.

Therefore, the adjustment to the target loudness can initially occur without any further consideration of levels. Level consideration becomes essential only when loudness target values of > -23 LUFS are required, as is particularly the case with broadcasting and streaming services. This is where TruePeak measurement comes into focus.

TruePeak

The TruePeak measurement, which is also defined within the ITU recommendation, replaces the previous QPPM measurement and provides increased precision in detecting potential overloads in digital audio signals. This method is particularly relevant in applications where operation is conducted very close to the technical clipping limit, such as in CD mastering, to prevent the frequent occurrence of clipping in the D/A converters of CD or MP3 players and other playback devices.

What is TruePeak?

Image 1: Sampling a 3 kHz Signal
Digital audio information consists of consistently timed segments called samples, each with a numerical equivalent to the actual signal voltage. At a sampling rate of 44.1 kHz, such as that used in CD production, each sample lasts 22.68 µsec. This means that every 22.68 µsec, a new numerical value is determined. However, between two such samples, an analog signal can take on an infinite number of possible signal voltages, and of course, their duration approaches an infinitely short interval. Signals of shorter duration become increasingly high-frequency (f = 1/T). 

Digital systems outside of DAWs typically have a finite value system. When the upper limit is reached, the signals are simply clipped, which leads to unwanted distortion.

So, if you were to normalize the sample values to Full Scale in Image 1, the converted result would look like Image 2. It is clear that this has nothing to do with the original signal.
Image 2: 3 kHz waveform after D/A conversion


To avoid such artifacts as much as possible, the TruePeak measurement was introduced. In this method, the signal is evaluated with at least fourfold oversampling, which in our example means a value roughly every 5.67 µsec. Since this still does not provide absolute certainty, a level ceiling for linear audio data formats in television has been set at -1 dB TruePeak.
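The effect of oversampling can be demonstrated with a small Python sketch. It uses a generic Hann-windowed sinc interpolator, which is an assumption made for illustration and not the interpolation filter specified in ITU-R BS.1770; samples near the start and end of the buffer are skipped for simplicity, and all names are hypothetical.

```python
import math


def _windowed_sinc(tau, half_width):
    """Hann-windowed sinc kernel, zero outside +/- half_width samples."""
    if abs(tau) >= half_width:
        return 0.0
    sinc = 1.0 if tau == 0 else math.sin(math.pi * tau) / (math.pi * tau)
    hann = 0.5 * (1.0 + math.cos(math.pi * tau / half_width))
    return sinc * hann


def true_peak(samples, oversample=4, half_width=16):
    """Largest absolute value found on an `oversample`-times finer time grid."""
    peak = 0.0
    for i in range(half_width, len(samples) - half_width):
        for k in range(oversample):
            t = i + k / oversample   # position between sample i and i + 1
            value = sum(
                samples[n] * _windowed_sinc(t - n, half_width)
                for n in range(i - half_width + 1, i + half_width + 1)
            )
            peak = max(peak, abs(value))
    return peak


def to_db(x):
    """Linear amplitude to dB relative to full scale (1.0)."""
    return 20.0 * math.log10(x)
```

A full-scale sine at a quarter of the sampling rate, sampled 45 degrees off its crests, shows the problem clearly: every sample has a magnitude of about 0.707, so a sample-peak meter reads about -3 dBFS, while the interpolated values reach the actual peak near 0 dB.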




Do all these requirements also apply within a DAW?

The answer is: In most cases, no. Almost all well-known DAWs operate with floating-point arithmetic, which results in an almost unlimited range of values. (Academics may forgive me for the slight imprecision of my statement at this point.)
Image 3: Level measurement and vectorscope as a plugin in the DAW "clipping."

In a DAW, the level meter often indicates clipping, yet, surprisingly, nothing of the kind is audible inside the DAW or visible in the vectorscope.

If we look at the numerical value above the level meter, we see a value of +10 dBFS, theoretically impossible! Clipping is not visible on the vectorscope, and yet the output signal sounds completely distorted. If we measure the same signal in the digital data stream with a standalone program outside the DAW, it looks like Image 4.

This image now correlates again with what we hear. What is the reason for this phenomenon? The initially described floating-point operation makes it possible. Within the DAW, there is no limited value range for the audio signal. However, as soon as the signal leaves the "protected" area, the finite word length of 16 or 24 bits applies again.
Image 4: Level measurement and vectorscope outside the DAW showing "clipping."

For comparison, here are the screenshots for a signal that is not clipped in the DAW.

This means that within a DAW, level assessment is more than questionable, and a vectorscope for displaying overloads is not useful. The situation is different when this display, for example, is considered as standalone software analyzing the summed signal. Since the finite number range of a 24-bit word is applied here, all artifacts that any converter would produce during playback become visible again.
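The difference between the two domains can be sketched in Python. Inside the DAW a sample is a floating-point number that may exceed 1.0 (full scale); on export to a 24-bit word it has to be clamped. The conversion shown is a simplified assumption (no dither, symmetric scaling), and the names are hypothetical.

```python
import math

FULL_SCALE_24 = 2 ** 23 - 1   # largest positive value of a signed 24-bit word


def to_dbfs(sample):
    """Level of a floating-point sample relative to full scale (1.0)."""
    return 20.0 * math.log10(abs(sample))


def float_to_pcm24(sample):
    """Convert a float sample to a 24-bit integer, clipping at full scale."""
    clamped = max(-1.0, min(1.0, sample))
    return round(clamped * FULL_SCALE_24)
```

A floating-point sample of about 3.16 reads +10 dBFS on a meter inside the DAW and is still intact there; `float_to_pcm24` turns it into the same integer as any other value above full scale, and that flattening is the clipping that becomes visible and audible outside the DAW.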

But even though not all detailed issues have been resolved, the initiated paradigm shift from QPPM to loudness measurement has now become irreversible. Radio is on a good path, and the distribution channels via the Internet are also in focus. However, these present new problems as well.

Depending on the streaming service and the loudness target used there, audio productions sound different when downloaded. One approach to counteract this already during mastering lies in examining the micro-dynamic characteristics of a production. A paper published in New York in 2017 addresses this issue (AES 143 EB 373). Overall, it is to be hoped that the now opened opportunities will be utilized in the interest of dynamic audio transmission.

An overview of current loudness requirements can be found at rtw.com/standards.