Audio-Specific Metadata To Enhance the Quality of Audio Streams and Podcasts

Audio-specific metadata was envisioned several years ago in the MPEG-D standard for Dynamic Range Control. The application of this metadata to online content awaited a newer audio codec and the current generation of mobile operating systems. Now that both are becoming widely available, this paper explains how audio content providers can offer new consumer benefits as well as a more compelling listening experience.

This paper and presentation explain the types of audio-specific metadata that describe key characteristics of audio content, such as loudness, dynamic range, and signal peaks. Whether the producer is a small operation or a large-scale organization, the metadata sets are added to the content during encoding, either for real-time distribution (as in streams) or for file storage (as with podcasts).
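To make the shape of such a metadata set concrete, here is a minimal sketch in Python. The field names and the RMS-based loudness measure are illustrative assumptions only: real encoders use ITU-R BS.1770 gated, K-weighted loudness, not a plain RMS.

```python
import math
from dataclasses import dataclass

@dataclass
class AudioMetadata:
    loudness_db: float       # program loudness (here simple RMS, dBFS)
    peak_db: float           # sample peak level (dBFS)
    dynamic_range_db: float  # crude spread: peak minus loudness

def measure(samples):
    """Compute a simplified metadata set for one program.

    This RMS version only illustrates the kind of values carried in
    the metadata; it is not a BS.1770 loudness measurement.
    """
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    loudness_db = 20 * math.log10(rms) if rms > 0 else float("-inf")
    peak_db = 20 * math.log10(peak) if peak > 0 else float("-inf")
    return AudioMetadata(loudness_db, peak_db, peak_db - loudness_db)
```

For a full-scale sine wave, this yields a peak of 0 dBFS and an RMS loudness of about -3 dBFS, so the recorded "dynamic range" spread is about 3 dB.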

In the playback device or system, this metadata is decoded along with the audio data frames. Decoder operations are described through diagrams, showing benefits such as loudness matching across different audio content, which ends the annoyance of sudden loud material and constant reaching for the volume control. It will also be shown that audio dynamic range can be controlled according to the noise environment around the listener: quiet parts of a performance can be raised to audibility, but only for listeners who need it.
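The two decoder-side behaviors described above can be sketched as follows. The target level, knee, and ratio values are assumptions for illustration; an actual MPEG-D DRC decoder interpolates gain curves transmitted in the metadata rather than applying a fixed soft-knee rule like this one.

```python
def playback_gain_db(program_loudness_db, target_db=-16.0):
    """Static gain matching program loudness to the device target.

    target_db is an assumed mobile-style target, not a value from
    the paper. Applied once per program, it removes level jumps
    between items.
    """
    return target_db - program_loudness_db

def drc_gain_db(level_db, noise_floor_db, knee_db=-40.0, ratio=2.0):
    """Lift quiet passages only when the listening environment is noisy.

    In a quiet room (low noise_floor_db) no gain is applied, so the
    full dynamic range is preserved for listeners who want it.
    """
    if noise_floor_db < -60.0 or level_db >= knee_db:
        return 0.0  # quiet environment or loud passage: leave untouched
    return (knee_db - level_db) * (1.0 - 1.0 / ratio)
```

For example, a program measured at -23 LUFS gets a +7 dB static gain toward a -16 LUFS target, and a -60 dB passage is lifted 10 dB on a noisy bus but left alone in a quiet room.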

An audio demonstration is planned to allow the audience to hear the same encoded program under a range of playout conditions on the same device, from riding public transit to full dynamic range for listeners who want the highest fidelity. The production and distribution workflows for adding audio-specific metadata are explained, showing how content producers need to prepare only one target level for all listeners, rather than one for smart speakers, another for fidelity-conscious listeners, and so on.
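The single-target workflow can be summarized in a few lines: the producer encodes one loudness value, and each device class derives its own correction at playback. The per-device target values below are illustrative assumptions, not figures from the paper.

```python
# Assumed per-device playback targets (illustrative only).
DEVICE_TARGETS_DB = {
    "smart_speaker": -14.0,
    "mobile": -16.0,
    "home_theater": -24.0,
}

def render_gain_db(encoded_loudness_db, device):
    """One encoded loudness value serves every device class: the
    producer masters to a single level, and each receiver derives
    its own gain from the transmitted metadata."""
    return DEVICE_TARGETS_DB[device] - encoded_loudness_db
```

A program encoded at -20 LUFS would be raised 4 dB on a mobile device and lowered 4 dB on a home-theater system, with no extra work by the producer.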

John Kean | Cavell Mertz & Associates Inc. | Manassas, Virginia, USA
Alex Kosiorek | Central Sound at Arizona PBS | Phoenix, Arizona, USA
