Atsc MH
ATSC Standard
A/153 Part 8 – HE AAC Audio System Characteristics
Note: The user's attention is called to the possibility that compliance with this standard may
require use of an invention covered by patent rights. By publication of this standard, no position
is taken with respect to the validity of this claim or of any patent rights in connection therewith.
One or more patent holders have, however, filed a statement regarding the terms on which such
patent holder(s) may be willing to grant a license under these rights to individuals or entities
desiring to obtain such a license. Details may be obtained from the ATSC Secretary and the
patent holder.
Revision History
Version                                                      Date
A/153 Part 8:2009, initial version of standard approved      15 October 2009
A/153 Part 8:2012, first revision approved                   18 December 2012
ATSC A/153 Part 8:2012 HE AAC Audio System Characteristics 18 December 2012
Table of Contents
1. SCOPE
1.1 Organization
2. REFERENCES
2.1 Normative References
2.2 Informative References
3. DEFINITION OF TERMS
3.1 Compliance Notation
3.2 Treatment of Syntactic Elements
3.2.1 Reserved Elements
3.3 Acronyms and Abbreviation
3.4 Terms
4. SYSTEM OVERVIEW
4.1 Use of SBR and PS
5. HE AAC V2 CONSTRAINTS
5.1 Audio Elementary Stream Configuration
5.2 RTP Packetization
ANNEX A: RELATIONSHIP BETWEEN MH COMPONENT DATA DESCRIPTOR AND SDP
1.1 Organization
This document is organized as follows:
• Section 1 – Outlines the scope of this Part and provides a general introduction.
• Section 2 – Lists references and applicable documents.
• Section 3 – Provides a definition of terms, acronyms, and abbreviations for this Part.
• Section 4 – System overview.
• Section 5 – System specifications.
• Annex A – Sample SDP file.
2. REFERENCES
All referenced documents are subject to revision. Users of this Standard are cautioned that newer
editions might or might not be compatible.
[7] ATSC: “ATSC Mobile/Handheld Digital Television Standard, Part 3 – Service Multiplex and
Transport Subsystem Characteristics,” Doc. A/153 Part 3:2009, Advanced Television
Systems Committee, Washington, D.C., 15 October 2009.
[8] IETF: “SDP: Session Description Protocol,” Doc. RFC 4566, Internet Engineering Task
Force, Fremont, CA.
[9] ATSC: “Recommended Practice – Techniques for Establishing and Maintaining Audio
Loudness for Digital Television,” Doc. A/85, Advanced Television Systems Committee,
Washington, D.C., 25 July 2011.
3. DEFINITION OF TERMS
With respect to definition of terms, abbreviations, and units, the practice of the Institute of
Electrical and Electronics Engineers (IEEE) as outlined in the Institute’s published standards [2]
shall be used. Where an abbreviation is not covered by IEEE practice or industry practice differs
from IEEE practice, the abbreviation in question is described in Section 3.3 of this document.
3.4 Terms
The following terms are used within this Part.
AAC core codec – The plain AAC codec with AAC Profile (as specified in ISO/IEC 14496-3
[1] Table 1.3).
AAC core channel – The down-mixed mono audio channel within an HE AAC v2 codec.
LKFS – Loudness, K-weighted, relative to full scale, measured with equipment that implements
the algorithm specified by ITU-R BS.1770 [4]. A unit of LKFS is equivalent to a decibel.
reserved – Set aside for future use by a Standard.
MPEG – Refers to standards developed by the ISO/IEC JTC1/SC29 WG11, Moving Picture
Experts Group. MPEG may also refer to the Group.
4. SYSTEM OVERVIEW
Please see ATSC A/153 Part 1 [6] for an overall description of the M/H system. The ATSC
Mobile/Handheld service (M/H) shares the same RF channel as a standard ATSC broadcast
service described in ATSC A/53 [5]. M/H is enabled by using a portion of the total available
~19.4 Mbps bandwidth and utilizing delivery over IP transport. The overall ATSC broadcast
system including standard (TS Main) and M/H systems is illustrated in Figure 4.1.
[Figure 4.1: block diagram. Labeled blocks: Video Subsystem (Video Source Coding and Compression), Audio Subsystem (Audio Source Coding and Compression), Service Multiplex, RTP and IP Encapsulation, IP Transport, Modulation, Ancillary Data, Control Data.]
Figure 4.1 ATSC broadcast system with TS main and M/H services.
This Part relates to the Audio Source Coding and Compression block and specifies audio
coding using MPEG-4 HE AAC v2 as described in ISO/IEC 14496-3 [1], with the constraints
indicated herein. HE AAC v2 is used to code mono or stereo audio. HE AAC v2 is the
combination of three audio coding tools: MPEG-4 AAC, Spectral Band Replication (SBR), and
Parametric Stereo (PS). HE AAC v2 therefore includes both HE AAC and AAC, as illustrated
in Figure 4.2.
[Figure 4.2: nested diagram. HE AAC v2 comprises HE AAC and PS; HE AAC comprises AAC and SBR.]
Figure 4.2 MPEG-4 Audio tools that together create HE AAC v2.
[Figure 4.3: block diagram. Only the labels "HE AAC", "Decoder", and "SBR Encoder" are recoverable.]
Figure 4.3 Simple block diagram showing how AAC and SBR work together.
To achieve an even further bitrate reduction, the number of discrete coded audio channels
may be reduced by utilizing the Parametric Stereo tool in HE AAC v2. In this case, the two-
channel input signal is down-mixed to a mono channel (AAC core channel) for coding and a
parametric description of the stereo representation is added to the bit-stream payload. An HE
AAC v2 decoder first creates this one-channel mono output signal and then renders the two-channel
output by utilizing the additional parametric data. This process is illustrated in Figure 4.4.
1 “Audio spectrum” in this case is the audio bandwidth to be coded, which will vary depending
on the sampling rate being used.
Figure 4.4 Simple block diagram showing how AAC, SBR and PS work
together.
5. HE AAC V2 CONSTRAINTS
The audio content at the input to the HE AAC v2 encoder shall have a target measured loudness
value of –14 LKFS (see footnote 2).
2 Methods to measure loudness are explained in the ATSC Recommended Practice A/85,
“Techniques for Establishing and Maintaining Audio Loudness for Digital Television” [9].
See particularly Section 5.2.
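Because one unit of LKFS is equivalent to one decibel, the level correction needed to bring measured content to the –14 LKFS target is a simple difference. The following is a minimal sketch of that arithmetic only; it assumes the loudness has already been measured elsewhere (for example, by a meter implementing ITU-R BS.1770 [4]), and the function names are illustrative, not taken from any standard.

```python
TARGET_LKFS = -14.0  # target loudness stated in Section 5


def gain_to_target_db(measured_lkfs: float, target_lkfs: float = TARGET_LKFS) -> float:
    """Gain in dB that moves a measured programme loudness to the target.

    A positive result means the content must be raised, a negative
    result means it must be attenuated.
    """
    return target_lkfs - measured_lkfs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude scale factor."""
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    measured = -24.0  # hypothetical measurement, in LKFS
    gain_db = gain_to_target_db(measured)
    print(f"apply {gain_db:+.1f} dB (x{db_to_linear(gain_db):.3f} amplitude)")
```

A fixed gain of this kind changes the measured loudness by the same number of LKFS; it does not account for clipping headroom, which would need to be checked separately.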
The presence of SBR and PS in the audio stream shall be indicated by the usage of explicit
hierarchical signaling. Therefore, the audio stream signaling shall be as follows:
• If SBR data is not present, the audioObjectType indicated by the AudioSpecificConfig shall be
set to the value 2 (indicating AAC LC as given by Table 1.1 of ISO/IEC 14496-3 [1]).
• If SBR data is present, the first audioObjectType indicated by the AudioSpecificConfig shall be
set to the value 5 (indicating HE AAC as given by Table 1.1 of ISO/IEC 14496-3 [1]).
• If SBR data and PS data are present, the first audioObjectType indicated by the
AudioSpecificConfig shall be set to the value 29 (indicating HE AAC v2 as given by Table
1.1 of Amendment 2 of ISO/IEC 14496-3 [1]).
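The three signaling cases above can be checked by reading the first audioObjectType from an AudioSpecificConfig. The sketch below assumes the config is available as a hexadecimal string and relies on the ISO/IEC 14496-3 [1] bit layout in which audioObjectType occupies the first 5 bits, with the value 31 acting as an escape to a 6-bit extension. The function names and the sample hex strings are illustrative and are not taken from this Part.

```python
# Values permitted by the explicit hierarchical signaling rules above.
SIGNALED_TOOLS = {
    2: "AAC LC (no SBR, no PS)",
    5: "HE AAC (SBR present)",
    29: "HE AAC v2 (SBR and PS present)",
}


def first_audio_object_type(config: bytes) -> int:
    """Return the first audioObjectType of an AudioSpecificConfig."""
    if not config:
        raise ValueError("empty AudioSpecificConfig")
    aot = config[0] >> 3  # top 5 bits of the first byte
    if aot == 31:
        # Escape value: the real type is 32 plus the next 6 bits.
        if len(config) < 2:
            raise ValueError("truncated AudioSpecificConfig")
        aot = 32 + (((config[0] & 0x07) << 3) | (config[1] >> 5))
    return aot


def describe_stream(config_hex: str) -> str:
    """Map a hex-encoded AudioSpecificConfig to the tools it signals."""
    aot = first_audio_object_type(bytes.fromhex(config_hex))
    if aot not in SIGNALED_TOOLS:
        raise ValueError(f"audioObjectType {aot} is not one of 2, 5, or 29")
    return SIGNALED_TOOLS[aot]


if __name__ == "__main__":
    # Hypothetical config strings, chosen only to exercise each case.
    for sample in ("1190", "2b0a8800", "eb098800"):
        print(sample, "->", describe_stream(sample))
```

A receiver could use such a check to reject streams whose first audioObjectType falls outside the three permitted values before attempting to initialize a decoder.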
ATSC A/153 Part 8:2012 A/153 Part 8, Annex A 18 December 2012
Annex A:
Relationship between MH Component Data Descriptor and SDP
ATSC M/H transmits SDP messages according to RFC 4566 [8] for announcement of services.
For signaling of audio codec capabilities, however, the MH_component_descriptor() with the
MH_component_data() structure (for Component Type 37) as defined in Section 7.8.1.3 of A/153
Part 3 [7] is used. The MH_component_data() structure and the SDP messages carry many of the
same parameters. It is strongly recommended to use the MH_component_descriptor() with the
MH_component_data() structure for initialization of the audio decoder because the
MH_component_descriptor() is defined to take precedence over the SDP message.
The following section explains the elements in an example SDP message to help clarify how
the audio signaling in ATSC M/H is related to SDP. Consider the following SDP message:
Data about the audio stream (e.g., language), along with the needed parameters from the SDP
file are placed into a structure forming an octet string. The correspondence is shown in Table A.2
and described below the table. This results in an octet string which is the configuration record of
the MH Component Data for HE AAC v2 (Type 37), as specified in A/153 Part 3 [7].
Backwards-compatible audio extensions (e.g., for multi-channel surround sound) rely on the
possibility of transmitting additional config strings. This is enabled by the specified loop in the
MH Component Data for HE AAC v2 (Component Type 37).
Since the length of config strings is available in ATSC M/H, it is possible to either parse all
config strings or skip additional ones with potentially unknown content.
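The value of carrying an explicit length can be illustrated with a short sketch. The actual syntax of the loop is defined in A/153 Part 3 [7] and is not reproduced here; the layout below (each config string preceded by a one-byte length) is purely an assumption made for illustration, as are the function name and sample bytes.

```python
from typing import List


def split_config_strings(payload: bytes) -> List[bytes]:
    """Split a buffer of length-prefixed config strings.

    ASSUMED layout (hypothetical, not the A/153 Part 3 syntax):
    repeated [1-byte length][length bytes of config data].
    """
    configs = []
    pos = 0
    while pos < len(payload):
        length = payload[pos]
        pos += 1
        if pos + length > len(payload):
            raise ValueError("config string runs past end of payload")
        configs.append(payload[pos:pos + length])
        pos += length  # the length lets us step over content we cannot parse
    return configs


if __name__ == "__main__":
    # Hypothetical payload: a 2-byte config followed by a 3-byte extension.
    payload = bytes.fromhex("021190" "03aabbcc")
    first, *extras = split_config_strings(payload)
    print("primary config:", first.hex())
    print("skipped extensions:", [e.hex() for e in extras])
```

A decoder that only understands the first config string can use it and ignore the rest, which is what allows backwards-compatible extensions to be added later.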
The values in the fields sampling_rate and num_audio_channels may be obtained from the
AudioSpecificConfig embedded in the config field of an SDP message, when an SDP message is part
of the source data flow. If SBR data is not present, the sampling_rate parameter in the
MH_component_data() structure corresponds to the samplingFrequencyIndex parameter from the
hexadecimal AudioSpecificConfig string within the descriptor. If SBR data is present, the
sampling_rate parameter in the MH_component_data() structure corresponds to the
extensionSamplingFrequencyIndex parameter from the hexadecimal AudioSpecificConfig string within the
descriptor. The num_audio_channels parameter in the descriptor indicates the number of audio
channels to be rendered. This information may also be obtained from the channelConfiguration
parameter inside the AudioSpecificConfig in conjunction with the information from the profile_level_id.
The base-16 encoded representation of the config string from the above example SDP message is
directly placed in the config field of the descriptor.
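The mapping described above can be sketched as a small parser for the leading fields of a hex-encoded AudioSpecificConfig. It assumes the ISO/IEC 14496-3 [1] field order (audioObjectType, samplingFrequencyIndex, channelConfiguration, then extensionSamplingFrequencyIndex when the first audioObjectType is 5 or 29) and the standard sampling frequency index table. The class and function names, the dictionary keys, and the sample config strings are illustrative and are not part of this Part.

```python
# Sampling frequencies for indices 0..12 (ISO/IEC 14496-3).
SAMPLING_FREQUENCIES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                        22050, 16000, 12000, 11025, 8000, 7350]


class BitReader:
    """Reads big-endian bit fields from a byte string."""

    def __init__(self, data: bytes):
        self.value = int.from_bytes(data, "big")
        self.remaining = len(data) * 8

    def read(self, n: int) -> int:
        if n > self.remaining:
            raise ValueError("truncated AudioSpecificConfig")
        self.remaining -= n
        return (self.value >> self.remaining) & ((1 << n) - 1)


def read_sampling_rate(r: BitReader) -> int:
    index = r.read(4)
    if index == 0xF:  # escape: explicit 24-bit frequency follows
        return r.read(24)
    if index >= len(SAMPLING_FREQUENCIES):
        raise ValueError(f"reserved sampling frequency index {index}")
    return SAMPLING_FREQUENCIES[index]


def parse_audio_specific_config(config_hex: str) -> dict:
    r = BitReader(bytes.fromhex(config_hex))
    aot = r.read(5)
    if aot == 31:  # escape to extended object types
        aot = 32 + r.read(6)
    core_rate = read_sampling_rate(r)      # samplingFrequencyIndex
    channel_configuration = r.read(4)      # channelConfiguration
    sampling_rate = core_rate
    if aot in (5, 29):                     # SBR signaled explicitly
        sampling_rate = read_sampling_rate(r)  # extensionSamplingFrequencyIndex
    return {
        "audio_object_type": aot,
        "core_sampling_rate": core_rate,
        "sampling_rate": sampling_rate,
        "channel_configuration": channel_configuration,
    }


if __name__ == "__main__":
    # Hypothetical config strings: AAC LC, then HE AAC v2.
    print(parse_audio_specific_config("1190"))
    print(parse_audio_specific_config("eb098800"))
```

In the second sample the core channel is mono (channelConfiguration 1) while PS is signaled, so the number of channels to be rendered is two; this is why channelConfiguration alone is not sufficient to derive num_audio_channels.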
The user should note that an ISO/IEC 14496-3 [1] compliant standard HE AAC v2 decoder is
required to render two channels when a genuine mono stream is signaled and sent.