VESA Display Compression-M (VDC-M) Standard
www.vesa.org

Version 1.1
11 May, 2018

Purpose
The purpose of this document is to specify the VESA® Display Compression-M (VDC-M) Standard.

Summary
VDC-M is part of the VESA display compression codec family. The algorithm should operate at a low bit rate
and in real time. A rate controller and rate buffer ensure that no buffer underflow or overflow occurs while
a picture is coded. The encoder also produces a constant-rate bitstream when it is provided with a pixel
data stream at a constant rate.
In most cases, the codec operation is visually lossless. To better ensure interoperability and visually lossless
quality, this Standard normatively specifies both the encoder and decoder.
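
As an informal illustration of the rate buffer idea mentioned above, the following Python sketch tracks the fullness of a toy buffer that receives a variable number of coded bits per block and is drained at a constant rate. It is not the normative rate control of this Standard. The function name, the exception class, the parameter names, and all numeric values are hypothetical and chosen only for illustration.

```python
class RateBufferError(Exception):
    """Raised when the toy rate buffer underflows or overflows."""


def simulate_rate_buffer(block_bits, drain_bits_per_block,
                         buffer_size_bits, initial_fullness=0):
    """Return the buffer fullness after each block time.

    block_bits: coded size of each block in bits (variable).
    drain_bits_per_block: bits removed per block time (constant output rate).
    buffer_size_bits: capacity of the hypothetical buffer.
    initial_fullness: bits already in the buffer before the first block.
    """
    fullness = initial_fullness
    trace = []
    for index, bits in enumerate(block_bits):
        fullness += bits  # the encoder deposits this block's coded bits
        if fullness > buffer_size_bits:
            raise RateBufferError(f"overflow at block {index}")
        if fullness < drain_bits_per_block:
            raise RateBufferError(f"underflow at block {index}")
        fullness -= drain_bits_per_block  # the channel removes a fixed amount
        trace.append(fullness)
    return trace


# Illustrative numbers only: blocks average about 100 bits, drain is 96 bits.
print(simulate_rate_buffer([120, 80, 100, 90, 110], 96, 256, 96))
# prints [120, 104, 108, 102, 116]
```

In this toy model, the output is constant rate because exactly `drain_bits_per_block` bits leave the buffer per block time. A rate controller would adjust the coded block sizes so that neither error condition is ever reached.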

VDC-M Standard Version 1.1


Copyright © 2018 Video Electronics Standards Association. All rights reserved. Page 1 of 173

Contents
Purpose . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1
Preface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .10
Intellectual Property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Patents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Support for this Standard. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

Section 1 Introduction (Informative) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .13


1.1 Document Organization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2 VDC-M C++ Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.3 Document Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.1 Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.2 Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.3.3 Acronyms, Initialisms, and Abbreviations . . . . . . . . . . . . . . . . . . . . . . 19
1.3.4 Glossary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.4 Reference Documents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

Section 2 Theory of Operation (Informative). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .22


2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2 CSC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3 Picture Hierarchy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.3.1 Block Level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.3.2 Slice Level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.3.3 Picture Level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.4 Slice Padding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.4.1 Horizontal Slice Padding. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.4.2 Vertical Slice Padding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.5 Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6 Coding Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6.1 Transform Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6.2 BP Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.6.3 MPP Mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.6.4 Fallback Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.7 Flatness Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.8 RC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.9 EC – Transform and BP Modes Only . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.10 Slice Multiplexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.11 Substream Multiplexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

Section 3 Syntax (Normative) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .43


3.1 PPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2 Frame-level Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.3 Slice-level Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.4 Substream-level Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.5 ECG-level Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

Section 4 Encoding Process (Normative) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .59


4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.2 Block Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.3 CSC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.4 Flatness Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.4.1 Hadamard and Haar Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.4.2 Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.4.3 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.5 RC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.5.1 Rate BF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.5.2 Target Bit Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.5.3 QP Update . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.5.4 RD Cost Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.6 Test Coding Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.6.1 Transform Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.6.2 BP Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.6.3 MPP Mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
4.6.4 MPPF Mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
4.6.5 BP-SKIP Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
4.7 Entropy Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
4.7.1 Component Skip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
4.7.2 ECG Construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
4.7.3 Bit Representations and bitsReq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
4.7.4 Unary Prefix to Signal bitsReq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
4.7.5 CPEC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
4.7.6 VEC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
4.7.7 Rate Buffer Stuffing Bits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
4.7.8 Sign Bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
4.8 Quantizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
4.8.1 Fractional Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
4.8.2 Scalar Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
4.9 Encoder Mode Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
4.10 Block Headers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
4.11 Substream Multiplexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

Section 5 Decoding Process (Normative) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .150


5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
5.2 Substream De-multiplexing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
5.3 Syntax Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
5.3.1 Block Headers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
5.3.2 Syntax Decoding for Modes that Do Not Use EC. . . . . . . . . . . . . . . . 154
5.3.3 Syntax Decoding for Modes that Use EC . . . . . . . . . . . . . . . . . . . . . . 155
5.4 Per-mode Decoding Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
5.4.1 Transform Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
5.4.2 BP Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
5.4.3 MPP Mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
5.4.4 MPPF Mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
5.4.5 BP-SKIP Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
5.5 Update RC State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

Annex A Rate Buffer Guidance (Normative) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .168

Annex B Derivation of Parameters (Informative) . . . . . . . . . . . . . . . . . . . . . . . . . . .169


B.1 RDO Weights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

Annex C Guidance for Rate Buffer Size and Delays in a Practical Implementation (Informative) . . . . . . . . . . . . . .170

Annex D Main Contributor History (Previous Versions). . . . . . . . . . . . . . . . . . . . . .172

Tables
Table 1: Main Contributors to VDC-M v1.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .11
Table 2: Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .12
Table 1-1: Coding Objects Used by EncodeSlice and DecodeSlice Routines. . . . . . . . . . . . . . . . . . .15
Table 1-2: Virtual Routines Provided by Mode Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .15
Table 1-3: Recurring Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16
Table 1-4: C Model Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17
Table 1-5: Highlight and Camel-case Rules Used throughout this Standard . . . . . . . . . . . . . . . . . . . .18
Table 1-6: Acronyms, Initialisms, and Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .19
Table 1-7: Glossary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .20
Table 1-8: Reference Documents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .21
Table 2-1: Parameters Used to Define Slices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .25
Table 2-2: ECGs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .39
Table 2-3: Bits Required for Example Group of Samples that Use SM and 2C Bit Representations . . . . . . . . . . . . .40
Table 2-4: Substream Multiplexing Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .42
Table 3-1: PPS Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .44
Table 3-2: Frame-level Syntax Fields (Constant Bit Rate Mode) . . . . . . . . . . . . . . . . . . . . . . . . . . . .49
Table 3-3: Slice-level Syntax Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .50
Table 3-4: Substream-level Syntax for Substreams 0 through 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . .51
Table 3-5: syntaxElementSubstream[0] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .52
Table 3-6: syntaxElementSubstream[1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .54
Table 3-7: syntaxElementSubstream[2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .55
Table 3-8: syntaxElementSubstream[3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .56
Table 3-9: ECG-level Syntax for a Given Component k . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .58
Table 4-1: Component Width and Height Parameters for Different Chroma Sampling Formats . . . .60
Table 4-2: Equation-based Implementation of the Color Space Transforms between RGB and YCoCg Color Spaces . . . . . . . . . . . . .61
Table 4-3: Flatness Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .61
Table 4-4: Hadamard Shift Parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .62
Table 4-5: Normalization of Hadamard Transform Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . .63
Table 4-6: Complexity Measure for Flatness Detection Calculation . . . . . . . . . . . . . . . . . . . . . . . . . .63
Table 4-7: Flatness Detection Previous Block Complexity Calculation. . . . . . . . . . . . . . . . . . . . . . . .64
Table 4-8: Flatness Detection Next Block Flatness Calculation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .64
Table 4-9: Flatness Detection Classification Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .65
Table 4-10: Rate BF Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .67
Table 4-11: Mechanisms that Ensure modeRate Is Strictly Greater than avgBlockBits when underflowPrevention Flag Is Set . . . . . . . . . . . . .68
Table 4-12: Constants Used for Calculating rcFullness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .69
Table 4-13: rcOffset BF Calculation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .71
Table 4-14: BF Calculation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .72
Table 4-15: Update Target Rate Scaling Factor and Threshold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .72
Table 4-16: targetRateBase Target Rate Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .73
Table 4-17: Constants Used for Calculating targetRateBase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .73
Table 4-18: targetRateDelta Target Rate Calculation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .73
Table 4-19: minQpOffset Calculation for RC QP Update . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .75
Table 4-20: maxQp Calculation for RC QP Update . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .75
Table 4-21: QP Update for Flatness Type 0 (Very Flat). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .76
Table 4-22: QP Update for Flatness Type 1 (Somewhat Flat) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .76
Table 4-23: QP Update for Flatness Type 2 (Complex-to-flat) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .76
Table 4-24: QP Update for Flatness Type 3 (Flat-to-complex) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .76
Table 4-25: modeRdCost Calculation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .77
Table 4-26: lambdaFullness Encoder Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .77
Table 4-27: lambdaBitrate Encoder Calculation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .78
Table 4-28: modeDistortion Calculation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .79
Table 4-29: Transform Mode Stages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .80
Table 4-30: intraListA as a Function of Block Position within a Slice . . . . . . . . . . . . . . . . . . . . . . . . .82
Table 4-31: Intra Prediction Mode for FBLS Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .82
Table 4-32: Intra Prediction Modes for NFBLS Blocks (4:4:4, Luma Component) . . . . . . . . . . . . . . .84
Table 4-33: Intra Prediction Modes for NFBLS Blocks (4:2:2, Chroma Components) . . . . . . . . . . . . .85
Table 4-34: Intra Prediction Modes for NFBLS Blocks (4:2:0, Chroma Components) . . . . . . . . . . . . .87
Table 4-35: Forward Discrete Cosine Transform Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .88
Table 4-36: Discrete Cosine Transform Pre-shift . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .89
Table 4-37: 8-point Forward Discrete Cosine Transform Applied to Selected Row r of Pre-shifted Residual Block R(i, j) . . . . . . . . . . . . .91
Table 4-38: 4-point Forward Discrete Cosine Transform Applied to Row r of Pre-shifted Residual Block R(i, j) . . . . . . . . . . . . .92
Table 4-39: 2-point Forward Haar Transform for Column C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .93
Table 4-40: Forward Discrete Cosine Transform Post-shift. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .93
Table 4-41: Encoder Intra Predictor Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .94
Table 4-42: NFBLS Block Intra Predictor Signaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .95
Table 4-43: Mapping for Re-ordering Quantized Transform Coefficients for EC – 8x2 Components Only . . . . . . . . . . . . .96
Table 4-44: BPV Search Ranges. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .100
Table 4-45: BPV Search Operation for 2x2 and 2x1 Partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .101
Table 4-46: BPV Search SAD Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .103
Table 4-47: YCoCg SAD for 2x2 Partition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .104
Table 4-48: YCoCg SAD for 2x1 Partition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .104
Table 4-49: Residual Sub-blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .108
Table 4-50: BP Mode Partition Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .111
Table 4-51: mppQp Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .114
Table 4-52: mppMinStepSize for Various Bit Depths and ssm_max_se_size. . . . . . . . . . . . . . . . . . .115
Table 4-53: MPP Mode Midpoint Calculation for Sub-blocks within a Given Component k . . . . . . .116
Table 4-54: MPP Mode Prediction, Quantization, Inverse Quantization, Reconstruction, and Error Diffusion for a 2x2 Sub-block . . . . . . . . . . . . .119
Table 4-55: MPP Mode Quantized Residual Distribution among Substreams 0 through 3 . . . . . . . . .120
Table 4-56: BP and BP-SKIP Mode Syntax Differences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .122
Table 4-57: Component Skip Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .123
Table 4-58: ECG Construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .124
Table 4-59: ecgDataActive[ecgIdx] for Transform Mode Component . . . . . . . . . . . . . . . . . . . . . . . .125
Table 4-60: ecgDataActive[ecgIdx] for BP Mode Component . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .125
Table 4-61: Mapping from bitsReq to Sample Ranges in SM and 2C Bit Representations . . . . . . . . .128
Table 4-62: bitsReq Unary Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .129
Table 4-63: VEC Encoding – Calculate vecCodeSymbol from ECG Samples . . . . . . . . . . . . . . . . . .130
Table 4-64: vecGrK Parameter for VEC Is Determined by Group’s Bit Representation and Component Index . . . . . . . . . . . . .131
Table 4-65: VEC vecCodeNumber Encoding, Using Golomb-Rice Coding . . . . . . . . . . . . . . . . . . . .131
Table 4-66: Example Golomb-Rice Coding Entries – bitsReq = 1, vecGrK = 1 . . . . . . . . . . . . . . . . .132
Table 4-67: Example Golomb-Rice Coding Entries – bitsReq = 1, vecGrK = 2 . . . . . . . . . . . . . . . . .132
Table 4-68: Example Golomb-Rice Coding Entries – bitsReq = 2, vecGrK = 5 . . . . . . . . . . . . . . . . .132
Table 4-69: Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .134
Table 4-70: qpRc to qpMod Mapping. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .136
Table 4-71: Transform Mode Fractional Forward Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . .138
Table 4-72: Transform Mode Fractional Inverse Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .138
Table 4-73: BP Mode Fractional Forward Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .140
Table 4-74: BP Mode Fractional Inverse Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .140
Table 4-75: Scalar Quantization of Residual Sample R(i, j). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .141
Table 4-76: Mode Selection Conditions for Enforcing Correct Rate Buffer Behavior for Given modeRate . . . . . . . . . . . . .142
Table 4-77: Encoding Process Block Headers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .144
Table 4-78: Mode Header Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .145
Table 4-79: Flatness Header Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .145
Table 4-80: De-multiplexer Model Elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .146
Table 4-81: PPS Parameter num_extra_mux_bits Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . .147
Table 4-82: Encoder Substream Multiplexer Check for Mux-word Requests, Substream ssmIdx . . .148
Table 4-83: Substream Multiplexer Encoder Update of De-multiplexer Model State for Substream ssmIdx . . . . . . . . . . . . .149
Table 5-1: Substream De-multiplexer Mux Word Requests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .151
Table 5-2: Substream Decoder Slice Syntax. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .153
Table 5-3: Decoding Process Block Headers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .154
Table 5-4: Component Skip Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .155
Table 5-5: Vector Entropy Code Decoding Samples from vecCodeSymbol . . . . . . . . . . . . . . . . . . .157
Table 5-6: Inverse Discrete Cosine Transform Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .159
Table 5-7: 2-point Inverse Haar Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .159
Table 5-8: 8-point Inverse Discrete Cosine Transform Applied to Row r of Y(i, j). . . . . . . . . . . . . .161
Table 5-9: 4-point Inverse Discrete Cosine Transform Applied to Row r of Y(i, j) . . . . . . . . . . . . .162
Table 5-10: Inverse Discrete Cosine Transform Post-shift. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .163
Table 5-11: Example BPV Distribution – bpvTable Value 1011 . . . . . . . . . . . . . . . . . . . . . . . . . . . . .164
Table D-1: Main Contributor History (Previous Versions) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .172

Figures
Figure 1-1: Encoder and Decoder Test Model High-level Operation . . . . . . . . . . . . . . . . . . . . . . . . . .14
Figure 1-2: Example VDC-M 8x2 Block with Index (5, 1) Highlighted . . . . . . . . . . . . . . . . . . . . . . . .18
Figure 2-1: High-level Encoding Operation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .22
Figure 2-2: High-level Decoding Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .22
Figure 2-3: Forward and Inverse YCoCg Transform Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .23
Figure 2-4: Picture Element Hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24
Figure 2-5: Blocks within a Slice are Processed in Block-raster Order. . . . . . . . . . . . . . . . . . . . . . . . .24
Figure 2-6: 1080p Frame with sliceHeight = 108, (Left) 2 Slices/line, (Right) 4 Slices/line . . . . . . . .25
Figure 2-7: Example Slice Demonstrating Blocklines within FBLS and NFBLS . . . . . . . . . . . . . . . . .25
Figure 2-8: Example Horizontal Slice Padding Using Pixel Replication . . . . . . . . . . . . . . . . . . . . . . .27
Figure 2-9: Example Slice of Width 1920 with (Top) slicesPerLine = 1, (Bottom) slicesPerLine = 2; Horizontal Slice Padding Is Not Required . . . . . . . . . . . . .27
Figure 2-10: Example Slice of Width 1000 with (Top) slicesPerLine = 2, (Bottom) slicesPerLine = 4; Horizontal Slice Padding Is Required in Both Cases . . . . . . . . . . . . .28
Figure 2-11: Transform Mode Operation from the Encoder Side . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .30
Figure 2-12: Transform Mode Operation from the Decoder Side . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .31
Figure 2-13: BP Mode Operation from the Encoder Side . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .32
Figure 2-14: Example Result of BPV Search for the First Two Sub-blocks of a Block that Are Not within the Slice’s First Blockline . . . . . . . . . . . . .33
Figure 2-15: BP Mode Operation from the Decoder Side . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .34
Figure 2-16: Encoder Side MPP Mode Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35
Figure 2-17: Decoder Side MPP Mode Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35
Figure 2-18: Complex-to-flat Transition Example. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .37
Figure 2-19: RC Flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .38
Figure 2-20: Example Configuration with slicesPerLine = 2, in Which Chunks from Slices 0 and 1 Will Be Multiplexed into the Bitstream, Followed by Chunks from Slices 2 and 3, etc. . . . . . . . . . . . . .41
Figure 4-1: Encoder Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .59
Figure 4-2: Block Component Sizes for Different Chroma Sampling Formats. . . . . . . . . . . . . . . . . . .60
Figure 4-3: 8-point Forward Hadamard Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .62
Figure 4-4: 4-point Forward Hadamard Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .63
Figure 4-5: RC Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .66
Figure 4-6: rcOffsetInit as a Function of Block Time within a Slice. . . . . . . . . . . . . . . . . . . . . . . . . . .70
Figure 4-7: rcOffset as a Function of Block Time within a Slice . . . . . . . . . . . . . . . . . . . . . . . . . . . . .71
Figure 4-8: RC QP Update Logic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .74
Figure 4-9: QP Update Mapping from diffBits to qpIndex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .74
Figure 4-10: QP Update Mapping from rcFullness to qpUpdateMode . . . . . . . . . . . . . . . . . . . . . . . . . .75
Figure 4-11: Encoder Operations for a Transform Mode Block . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .81
Figure 4-12: Transform Mode Prediction and Reconstruction Buffers . . . . . . . . . . . . . . . . . . . . . . . . . .81
Figure 4-13: Intra Prediction Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .83
Figure 4-14: Forward Discrete Cosine Transform Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .88
Figure 4-15: Butterfly Structure for 8-point Forward Discrete Cosine Transform . . . . . . . . . . . . . . . . .90


Figure 4-16: Butterfly Structure for 4-point Forward Discrete Cosine Transform
Figure 4-17: Transform Mode ECG Structure – 8x2 Component
Figure 4-18: Transform Mode ECG Structure – 4x2 and 4x1 Components
Figure 4-19: Example Transform Component Data Blocks with Corresponding lastSigPos, ECG Structures
Figure 4-20: BP Mode Encoder Overview
Figure 4-21: BP Mode Partitions – 2x2 and 2x1
Figure 4-22: BP Mode Prediction and Reconstruction Buffers
Figure 4-23: BPV Search Range for FBLS and NFBLS Blocks
Figure 4-24: BPV Search Range Candidate 2x2 Partitions
Figure 4-25: BPV Search Range Candidate 2x1 Partitions for Source Partition in First Line of Sub-block
Figure 4-26: BPV Search Range Candidate 2x1 Partitions for Source Partition in Second Line of Sub-block
Figure 4-27: Example 2x2 and 2x1 Source and Candidate Partitions for SAD Calculation
Figure 4-28: BP Mode Partitions for 4:2:2 and 4:2:0 Use Cases
Figure 4-29: BPV Search Range for 4:2:2 and 4:2:0 FBLS and NFBLS Chroma Components
Figure 4-30: BPV Search Range Positions at Slice Boundaries
Figure 4-31: BP Mode ECG Structure, All Components (4:4:4) or Luma Component Only (4:2:2 and 4:2:0)
Figure 4-32: BP Mode ECG Structure, Chroma Components (4:2:2 and 4:2:0)
Figure 4-33: Possible Partition Grids (16) for BP Partition Selection
Figure 4-34: MPP Mode Prediction and Reconstruction Buffers
Figure 4-35: MPP Mode Bits per Pixel to bppIndex Mapping
Figure 4-36: MPP Mode Midpoint Is Calculated from the Current Block’s Reconstructed Spatial Neighbors
Figure 4-37: MPP Mode Error Diffusion within a 2x2 Sub-block
Figure 4-38: MPPF Mode Bits per Component Is Defined in PPS Parameter mppf_bits_per_comp
Figure 4-39: Transform Mode Component ECG Construction Example
Figure 4-40: BP Mode Component ECG Construction Example
Figure 4-41: Entropy Encoding Flowchart to Select EC Type
Figure 4-42: VEC Encoding Process
Figure 4-43: Substream Multiplexer
Figure 4-44: Substream Multiplexing Delay
Figure 5-1: Substream De-multiplexer
Figure 5-2: Transform Mode Syntax Parsing
Figure 5-3: BP Mode Syntax Parsing
Figure 5-4: MPP Mode Syntax Parsing
Figure 5-5: MPPF Mode Syntax Parsing
Figure 5-6: BP-SKIP Mode Syntax Parsing
Figure 5-7: Inverse Discrete Cosine Transform Overview
Figure 5-8: Butterfly Structure for 8-point Inverse Discrete Cosine Transform
Figure 5-9: Butterfly Structure for 4-point Inverse Discrete Cosine Transform


Preface
Intellectual Property
Copyright © 2018 Video Electronics Standards Association. All rights reserved.
While every precaution has been taken in the preparation of this Standard, the Video Electronics
Standards Association and its contributors assume no responsibility for errors or omissions
and make no warranties, expressed or implied, of functionality or suitability for any purpose.

Trademarks
VESA is a registered trademark of the Video Electronics Standards Association.
All other trademarks used within this document are the property of their respective owners.

Patents
VESA® draws attention to the fact that compliance with this Standard might involve the use
of a patent or other intellectual property right (collectively, “IPR”) concerning VESA Display
Compression-M (VDC-M). VESA takes no position concerning the evidence, validity, and/or
scope of this IPR. At the time of publication, there are currently no IPRs specific to VDC-M
to list in this Standard.
Attention is drawn to the possibility that some of the elements of this VESA Standard might
be the subject of IPRs external to this Standard. VESA shall not be held responsible for identifying
any or all such IPRs, and has made no inquiry into the possible existence of any such IPRs.
THIS STANDARD IS BEING OFFERED WITHOUT ANY WARRANTY WHATSOEVER,
AND IN PARTICULAR, ANY WARRANTY OF NON-INFRINGEMENT IS EXPRESSLY
DISCLAIMED. ANY IMPLEMENTATION OF THIS STANDARD SHALL BE MADE
ENTIRELY AT THE IMPLEMENTER’S OWN RISK, AND NEITHER VESA, NOR ANY
OF ITS MEMBERS OR SUBMITTERS, SHALL HAVE ANY LIABILITY WHATSOEVER
TO ANY IMPLEMENTER OR THIRD PARTY FOR ANY DAMAGES OF ANY NATURE
WHATSOEVER DIRECTLY OR INDIRECTLY ARISING FROM THE IMPLEMENTATION
OF THIS STANDARD.

Support for this Standard


To obtain the latest Standard and any support documentation, contact VESA.
If you have a product that incorporates VDC-M, ask the company that manufactured your
product for assistance. If you are a manufacturer, VESA can assist you with any clarification
you might require. Submit all comments to [email protected].


Acknowledgments
This document would not have been possible without the efforts of the VESA Display
Compression-M Task Group. In particular, Table 1 lists the individuals and their companies
that contributed significant time and knowledge to this version of the Standard.

Table 1: Main Contributors to VDC-M v1.1


Company Name Contribution
Analogix Semiconductor Peter Halenbeck Contributor
Greg Stewart Contributor
Avatar Tech Pubs Trish McDermott Technical Writer
Bitec Spain S.L. Damian Sanchez Contributor
Broadcom Corp. Rick Berard Contributor
Fred Walls Contributor, Reviewer
DisplayLink Corp. Eric Hamaker Contributor
Shumin Tian Contributor
Extron Electronics Alex Petrulian Contributor, Reviewer
Hardent, Inc. Simon Bussières Contributor
Alain Legault Contributor
Avrum Warshawsky Contributor, Reviewer
MediaTek, Inc. Li-Heng Chen Contributor, Reviewer
Tung-Hsing Wu Contributor, Reviewer
Parade Technologies, Ltd. Craig Wiley Contributor
Qualcomm, Inc. James Goel Contributor, Reviewer, Quality Subgroup Chair
Ike Ikizyan Contributor
Natan Jacobson Document Editor, Primary Technical Contributor
Rajan Joshi Primary Technical Contributor
Vijayaraghavan Thirumalai Primary Technical Contributor
Samsung Electronics Co., Ltd. Greg Cook Contributor, Reviewer
Hojun Jung Contributor, Reviewer
Taewoo Kim Contributor, Reviewer
Deoksoo Park Contributor, Reviewer
Dale Stolitzka Task Group Chair, Contributor, Reviewer
Synaptics, Inc. Bruce Chin Contributor
Synopsys Carlos Ferreir Contributor
Pedro Miguel Contributor
VESA David Braun Contributor
Bill Lempesis Contributor
Christine Wentker Contributor
York University Robert Allison Contributor
Matthew Cutone Contributor
Marc Dalecki Contributor
Lesley Deas Contributor
Yuqian Hou Contributor
Aishwarya Sudhama Contributor
Laurie Wilcox Contributor

Revision History

Table 2: Revision History


Date Version Description
May 11, 2018 1.1 • Table 4-16 – Changed p equation
• Table 4-53 – Updated to match normative test model behavior;
Changed bias to a branch statement
• Section 4.6.3.4 – Clarified selection of MPP color space
• Section 4.6.4 – Clarified selection of MPPF color space
• Added changes made to version 1.1 of this Standard by way of the following
VDC-M v1.0 SCRs:
• VDC-M 1.0.0 Rate Buffer Guidance Annex SCR
• VDC-M 1.0.0 guidance for MPPF bits per component SCR
• VDC-M 1.0 C Code RdCost optimization
• VDC-M 1.0.0 12bpc min QP bug fix
• VDC-M 1.0.0 guidance for minimum slice parameters
• VDC-M 1.0 PPS to reduce decoder calculations
• VDC-M 1.0.0 stable sort for intra prediction first stage
• VDC-M 1.0 document and code clipping logic in Co Cg and BP
• VDC-M 1.0.0 guidance for supported bit depths
• VDC-M 1.0 test model QP logic simplification
• VDC-M 1.0 test model transform mode parallel implementation
• VDC-M 1.0 test model MPP clipping modifications
• VDC-M 1.0 test model additional reconstructed YCoCg buffer
• Applied minor grammatical edits
February 9, 2018 1.0 Initial release of the Standard.


1 Introduction (Informative)
1.1 Document Organization
This Standard is organized into the following sections and annexes:
• Section 1 – Introduction (Informative)
This section defines the high-level industry needs for VDC-M and the resulting technical
objectives that the remaining sections of this Standard are intended to satisfy. This section
also includes document conventions and references.
• Section 2 – Theory of Operation (Informative)
This section describes the codec operation from a high-level perspective.
• Section 3 – Syntax (Normative)
This section describes the VDC-M syntax at three levels – picture, substream, and entropy
coding group.
• Section 4 – Encoding Process (Normative)
This section describes the operations that shall be performed by a VDC-M-compliant encoder.
• Section 5 – Decoding Process (Normative)
This section describes the operations that shall be performed by a VDC-M-compliant decoder.
• Annex A – Rate Buffer Guidance (Normative)
This annex provides guidance for calculating rate buffer size-related PPS parameters.
• Annex B – Derivation of Parameters (Informative)
This annex provides the derivation of certain codec parameters.
• Annex C – Guidance for Rate Buffer Size and Delays in a Practical Implementation
(Informative)
This annex provides guidance for a practical implementation of the rate buffer, including delays
associated with the rate buffer and with substream multiplexing.
• Annex D – Main Contributor History (Previous Versions)


1.2 VDC-M C++ Model


The VDC-M test model is written in C++ and includes a solution file that is compatible with
Microsoft Visual Studio 2013 and later. A makefile is also provided for compilation in Linux. The
model has been tested on x86 and x86-64 platforms. The code is split into two projects, as follows:
• Encoder project – Takes as input a source image and generates a bitstream file. As an option,
the encoder can also generate a reconstructed image.
• Decoder project – Takes as input a bitstream file, and generates a reconstructed image.

Figure 1-1 illustrates the encoder and decoder project high-level code flow.

[Figure 1-1 flowchart. Encoder: VdcmEncoder.cpp calls InitAndParseCommandLine(), CalculateSlicePadding(), and then Encode(). Encode() in EncTop.cpp loads the source (PPM/DPX/RAW), runs EncodeSlice() for each slice, saves the VDC-M bitstream, and saves the reconstructed image. Decoder: VdcmDecoder.cpp calls InitAndParseCommandLine() and then Decode(). Decode() in DecTop.cpp loads the VDC-M bitstream, runs DecodeSlice() for each slice, and saves the reconstructed image.]

Figure 1-1: Encoder and Decoder Test Model High-level Operation

Within the code, the routines EncodeSlice and DecodeSlice are responsible for most encoding
and decoding operations. Each time either of these routines is executed, local storage is allocated
for a set of coding objects that execute the per-block coding and decoding processes. Each
coding object is implemented as a class with appropriate constructor, destructor, memory
management, etc. For example, Transform mode is implemented as class TransMode, which
is described in the source files TransMode.(cpp|h). An object of type TransMode is created
in EncodeSlice and DecodeSlice. The TransMode object is used to perform all Transform
mode operations.


Table 1-1 lists the set of coding objects that are used for the encoding/decoding process.

Table 1-1: Coding Objects Used by EncodeSlice and DecodeSlice Routines


Object | EncodeSlice | DecodeSlice | Description
BpMode/BpSkipMode | ✔ | ✔ | Derived from Mode class. Used to test/encode BP mode and BP-SKIP mode for each block time.
BpvSearchRange | ✔ | ✔ | Member of BpMode/BpSkipMode. Stores the BPV search range.
EntropyCoder | ✔ | ✔ | Performs entropy coding of quantized transform coefficients (Transform mode) and quantized residuals (BP mode).
FlatnessDetection | ✔ | | Performs Hadamard transform and classifies a flatness type for each block time.
IntraPrediction | ✔ | ✔ | Member of TransMode. Used to calculate intra predicted blocks for each intra predictor type.
MppMode | ✔ | ✔ | Derived from Mode class. Used to test/encode MPP mode for each block time.
MppfMode | ✔ | ✔ | Derived from Mode class. Used to test/encode MPPF mode for each block time.
Pps | ✔ | ✔ | Maintains the picture parameter set state.
RateControl | ✔ | ✔ | Maintains the rate controller state and performs per-block target bit rate calculation and QP update.
SubStreamMux | ✔ | ✔ | Handles substream multiplexing.
TransMode | ✔ | ✔ | Derived from Mode class. Used to test/encode Transform mode for each block time.
VideoFrame | ✔ | | Storage for pixel data (source).
VideoFrame | ✔ | ✔ | Storage for pixel data (reconstructed).

Each coding mode class (e.g., Transform, BP, MPP, etc.) is derived from the Mode class,
which provides operations common to all modes, such as performing color space transform and
calculating distortion (error between source and reconstructed samples). The Mode class provides
the virtual routines listed in Table 1-2, which are implemented in each of the derived coding
mode classes.

Table 1-2: Virtual Routines Provided by Mode Class


Virtual Routine Description
Decode (decoder only) Decode a block using the specified coding mode.
Encode (encoder only) Encode a block using the specified coding mode. This will occur only for the mode
selected by encoder mode selection.
InitModeSpecific Set up the coding mode.
Test (encoder only) Test the coding mode, providing a rate, distortion, and RD cost. This is performed
for all modes during each block time.
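
The relationship between the Mode class and the derived coding mode classes can be pictured with the following C++ sketch. It is an illustration only and is not taken from the test model: the routine names come from Table 1-2, but the signatures, the ModeResult structure, the ToyMode class, and the SelectMode helper are hypothetical. Choosing the mode with the lowest RD cost is likewise a simplifying assumption; the encoder mode selection rules are specified in Section 4.

```cpp
#include <cstddef>
#include <limits>
#include <memory>
#include <vector>

// Hypothetical result of testing one coding mode on one block.
struct ModeResult {
    int rate;        // number of bits the mode would spend
    int distortion;  // error between source and reconstructed samples
    int rdCost;      // combined cost used for mode selection
};

// Base class exposing the virtual routines named in Table 1-2.
// The signatures are assumptions made for this sketch.
class Mode {
public:
    virtual ~Mode() = default;
    virtual void InitModeSpecific() = 0;  // set up the coding mode
    virtual ModeResult Test() = 0;        // encoder only: rate, distortion, RD cost
    virtual void Encode() = 0;            // encoder only: run for the selected mode
    virtual void Decode() = 0;            // decoder only
};

// Toy derived mode with a fixed rate and distortion (illustration only).
class ToyMode : public Mode {
public:
    ToyMode(int rate, int distortion) : rate_(rate), distortion_(distortion) {}
    void InitModeSpecific() override {}
    ModeResult Test() override {
        // Assumed cost function: distortion plus rate with unit weight.
        return {rate_, distortion_, distortion_ + rate_};
    }
    void Encode() override { encoded_ = true; }
    void Decode() override {}
    bool encoded() const { return encoded_; }

private:
    int rate_;
    int distortion_;
    bool encoded_ = false;
};

// Test every mode for the current block time and return the index of the
// mode with the lowest RD cost (the first one wins on a tie).
std::size_t SelectMode(const std::vector<std::unique_ptr<Mode>>& modes) {
    std::size_t best = 0;
    int bestCost = std::numeric_limits<int>::max();
    for (std::size_t i = 0; i < modes.size(); ++i) {
        const ModeResult r = modes[i]->Test();
        if (r.rdCost < bestCost) {
            bestCost = r.rdCost;
            best = i;
        }
    }
    return best;
}
```

In this sketch, Test is called on every mode during each block time, and Encode would then be called only on the mode whose index SelectMode returns.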


1.3 Document Conventions


1.3.1 Symbols
Table 1-3 lists symbols that recur throughout this Standard. The notation X (i, j) is used to refer
to a block of data, where:
• i is the column index
• j is the row index

Table 1-3: Recurring Symbols


Symbol | Description | Definition
$X(i, j)$ | Source block. |
$P(i, j)$ | Predicted block. |
$R(i, j)$ | Residual block. | $R(i, j) = X(i, j) - P(i, j)$
$R_q(i, j)$ | Quantized residual block. | $R_q(i, j) = Q[R(i, j)]$
$\hat{R}(i, j)$ | Reconstructed residual block. | $\hat{R}(i, j) = Q^{-1}[R_q(i, j)]$ (a) or $\hat{R}(i, j) = \mathrm{DCT}^{-1}[\hat{T}(i, j)]$ (b, c)
$\hat{X}(i, j)$ | Reconstructed block. | $\hat{X}(i, j) = \hat{R}(i, j) + P(i, j)$
$T(i, j)$ | Transform coefficients. | $T(i, j) = \mathrm{DCT}[R(i, j)]$
$T_q(i, j)$ | Quantized transform coefficients. | $T_q(i, j) = Q[T(i, j)]$
$\hat{T}(i, j)$ | Reconstructed transform coefficients. | $\hat{T}(i, j) = Q^{-1}[T_q(i, j)]$

a. $Q^{-1}$ = Inverse quantization.
b. The $\mathrm{DCT}[\cdot]$ and $\mathrm{DCT}^{-1}[\cdot]$ functions are fixed-point approximations to the discrete cosine transform,
described in Section 4.6.1 and Section 5.4.1.2, respectively.
c. $\mathrm{DCT}^{-1}$ = Inverse DCT.


1.3.2 Notations

1.3.2.1 C Model Operators


Table 1-4 lists operators that are used throughout the C model to calculate certain quantities.

Table 1-4: C Model Operators


Operator Description
| Bitwise or.
|| Logical or.
& Bitwise and.
&& Logical and.
= Assignment operator.
== Comparison operator.
% Modulo.
>> Bit shift right.
<< Bit shift left.
[•] Accessor (zero-based).
! Logical negation.
<, ≤ Less than, Less than or equal to.
>, ≥ Greater than, Greater than or equal to.
+= Addition assignment operator (x += a is equivalent to x = x + a).
–= Subtraction assignment operator (x –= a is equivalent to x = x – a).
>>= Bit shift right assignment operator (x >>= a is equivalent to x = x >> a).
<<= Bit shift left assignment operator (x <<= a is equivalent to x = x << a).
min (a, x) Return the minimum of a, x.
max (a, x) Return the maximum of a, x.
clip (a, b, x) Return the result of clipping x into the range [a, b] (equivalent to min (max (x, a), b)).
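
As an illustration of the last row of Table 1-4, the clip operator can be written in C++ as shown below. This is a minimal sketch and is not copied from the test model; the int parameter type and the precondition a ≤ b are assumptions.

```cpp
#include <algorithm>

// clip(a, b, x): limit x to the closed range [a, b], written exactly as the
// equivalent expression in Table 1-4: min(max(x, a), b).
// Assumes a <= b; the int type is an assumption made for this sketch.
inline int clip(int a, int b, int x) {
    return std::min(std::max(x, a), b);
}
```

For example, clipping a sample into the 8-bit range [0, 255] leaves in-range values unchanged and maps any value below 0 or above 255 to the nearest bound.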


1.3.2.2 Highlight and Camel-case Rules


Table 1-5 lists the formatting that is used to help distinguish between PPS parameters,
and C model notes, variables, functions/classes, and constants.

Table 1-5: Highlight and Camel-case Rules Used throughout this Standard
Example Description Formatting
rc_target_rate_scale PPS parameter. Lowercase, underscore delimited, red, bold.
VEC_MAPPING_TABLE C model note. Uppercase, underscore delimited, red, bold.
IntraPrediction C function / C class. Starts with uppercase, camel-case, blue, bold.
flatnessType C variable. Starts with lowercase, camel-case, blue.
dctShift8 C constant. Starts with lowercase, camel-case, red.
bpcTemp C temporary variable. Starts with lowercase, camel-case, black.

1.3.2.3 Block Dimensions


Block dimensions are indicated as MxN, where:
• M denotes the number of columns within the block
• N denotes the number of rows within the block

Matrices are indexed in row-major order with zero-based indexing. In Figure 1-2, an 8x2 block
is displayed with index (5, 1) highlighted.

Figure 1-2: Example VDC-M 8x2 Block with Index (5, 1) Highlighted
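
The indexing convention above can be made concrete with a short C++ sketch. The helper name BlockOffset and the constants are hypothetical; the sketch only assumes what is stated in this section, namely that i is the column index, j is the row index, and the block is stored in row-major order with zero-based indexing.

```cpp
#include <cstddef>

// Dimensions of a VDC-M block (MxN = 8x2): M columns, N rows.
constexpr std::size_t kBlockWidth = 8;   // M
constexpr std::size_t kBlockHeight = 2;  // N

// Linear offset of element (i, j) in a row-major 8x2 block, where
// i is the column index and j is the row index (both zero-based).
constexpr std::size_t BlockOffset(std::size_t i, std::size_t j) {
    return j * kBlockWidth + i;
}
```

With this convention, index (5, 1) in Figure 1-2 is the sixth sample of the second row, at linear offset 1 × 8 + 5 = 13.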


1.3.3 Acronyms, Initialisms, and Abbreviations


Table 1-6 lists terms that are used throughout this Standard.

Table 1-6: Acronyms, Initialisms, and Abbreviations


Term Definition
2C 2’s Complement (bit representation).
BF Buffer Fullness (rate buffer).
BP Block Prediction (mode).
bpc bits per component (8, 10, or 12).
bpp bits per pixel.
BP-SKIP Block Prediction Skip (mode).
BPV Block Prediction Vector.
CBR Constant Bit Rate.
CPEC Common-prefix Entropy Code (entropy coding group type).
CSC Color-Space Conversion.
DCT Discrete Cosine Transform (mode).
EC Entropy Coding.
ECG Entropy Coding Group.
FIFO First-In, First-Out buffer.
FBLS First BlockLine in Slice.
FSF Funnel Shifter Fullness.
HAD HADamard transform.
hrd hypothetical reference decoder.
LUT Look-Up Table.
MPP MidPoint Prediction (mode).
MPPF MidPoint Prediction Fallback (mode).
NFBLS Non-First BlockLine in Slice.
QP Quantization Parameter.
RC Rate Control.
RD Rate Distortion (cost function that includes both rate and distortion).
RDO, rdo Rate-Distortion Optimization.
RGB Red/Green/Blue (color space).
SAD Sum of Absolute Differences.
SM Sign/Magnitude (bit representation).
SSM, ssm SubStream Multiplexing.
VEC Vector Entropy Code (entropy coding group type).
YUV Luma/chroma color space (e.g., YCoCg or YCbCr).


1.3.4 Glossary
Table 1-7 lists terms that are used throughout this Standard.

Table 1-7: Glossary


Term Description
4:2:0 Chroma sampling format with subsampling by two across each row and column for
chroma components.
4:2:2 Chroma sampling format with subsampling by two across each row for chroma components.
4:4:4 Chroma sampling format with three samples for each pixel (no subsampling).
balance FIFO FIFO used by substream multiplexer to store bits at encoder side awaiting generation
of mux words.
bitstream Encoder-generated compressed bitstream that must contain syntax as specified in Section 3.
block Fundamental coding unit for VDC-M with a dimension of 8x2 pixels. Each block is coded using
one of the available coding modes.
blockline Set of two raster scanlines that intersect a given 8x2 block.
block time Amount of time or number of clock cycles associated with encoding or decoding a single
8x2 block.
chunk Portion of the bitstream that comprises a set of data bytes. A slice contains as many chunks
as the number of blocklines within the slice.
entropy decoder Process within the decoder for parsing groups of bits that have been entropy-coded.
entropy encoder Process within the encoder for generating groups of bits that have been entropy-coded.
flatness detection Process within the encoder for detecting regions of low/high complexity to allow the codec
to adapt quickly to changing content.
funnel shifter FIFO used by the substream de-multiplexer that receives bits in units of mux words from the
bitstream and provides bits to the entropy decoder and bitstream parser.
inverse quantization Mapping of a quantized value back to its original range.
mux word Unit of bit transmission for substream multiplexing. During each block time, the de-multiplexer
may request up to one mux word per substream.
quantization Mapping of a value from a given range to a smaller range. Information is lost during this process.
rate control Process within the codec for adjusting the QP during each block time, ensuring that the rate buffer
never underflows or overflows, and ensuring that at least one mode is available to the encoder
for each block time.
reconstruction Process within the decoder for determining the final output sample values, and within the encoder
for determining the reconstructed values.
residual Difference between the source sample and predicted sample.
RD cost “Cost” associated with a coding mode that is a function of the mode’s rate (number of bits) and
distortion (difference between original and reconstructed). Used by encoder mode selection.
sample Single component within a pixel.
slice Picture is spatially partitioned into a set of non-overlapping slices. Each slice can be
independently encoded/decoded from all other slices.
slice multiplexing Process of interleaving the data of multiple slices into a bitstream.
substream multiplexing (SSM) Process of removing mux words of data from the SSM balance FIFOs, and then inserting those
mux words into the bitstream in the proper order as expected by the substream de-multiplexer.
syntax element Bits for a single substream that are needed to encode or decode a given block. The number of bits
varies, depending on the block content.

1.4 Reference Documents


Table 1-8: Reference Documents
Document | Version/Revision | Publication Date | Referenced As
Recommendation T.50, INTERNATIONAL REFERENCE ALPHABET (IRA): INFORMATION TECHNOLOGY – 7-BIT CODED CHARACTER SET FOR INFORMATION INTERCHANGE, Table 4/T.50 (a) | | September, 1992 | T.50, Table 4/T.50
VDC-M Test Model (b) | 1.1 | April 30, 2018 |
VESA Glossary of Terms (b) | Current | Current |
VESA Intellectual Property Rights (IPR) Policy (b) | 200D | March 27, 2017 |

a. Published by International Telecommunication Union. See www.itu.int/rec/T-REC-T.50-199209-I/en.
b. See www.vesa.org/vesa-member/downloads/.


2 Theory of Operation (Informative)


2.1 Overview
This section describes the codec operation from a high-level perspective. The descriptions provided
in this section are informative and are not intended to specify the exact behavior of the encoder or
the decoder. A compliant encoder must follow the operations described in Section 4, and generate
a bitstream with syntax as specified in Section 3. A compliant decoder must be able to decode
that bitstream, following the operations specified in Section 5.
Figure 2-1 and Figure 2-2 illustrate the high-level encoder and decoder process flows, respectively.
Further description is provided for each of the processes in this flow throughout the remainder
of this Standard.

[Figure 2-1 block diagram. Blocks: Source Buffer, CSC (RGB to YCoCg), Flatness Detection, Rate Control, Test All Coding Modes, Encoder Mode Selection, Encode Using Selected Mode, Entropy Coder, Substream Multiplexer, Rate Buffer, and Reconstructed Buffer, with a "Source Is RGB?" decision (Y/N). Labeled signals: Rate, QP, Reconstructed Samples, and Bitstream.]

Figure 2-1: High-level Encoding Operation

[Figure 2-2 block diagram. Blocks: Rate Buffer, Substream De-multiplexer, Entropy Decoder, Rate Control, Decode Block Mode, Reconstructed Buffer, and CSC (YCoCg to RGB), with a "Source Is RGB?" decision (Y/N). Labeled signals: Bitstream, Rate, QP, Reconstructed Samples, and Reconstructed Image.]

Figure 2-2: High-level Decoding Operation

The encoder’s computational complexity is greater than that of the decoder due to process
advantages that are typically associated with the encoder. The encoder will make many
comparisons and decisions during the encoding process, the results of which will be signaled
explicitly to the decoder in the bitstream syntax. This allows for the decoder to be implemented
in a relatively smaller design.


2.2 CSC
The YCoCg-R lossless color space transform is used for RGB source data. From this point forward,
the term “YCoCg” is used to refer to YCoCg-R. Figure 2-3 illustrates the transform matrix for this
color-space conversion (CSC).

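Forward:
\[
\begin{bmatrix} Y \\ Co \\ Cg \end{bmatrix} =
\begin{bmatrix} 1/4 & 1/2 & 1/4 \\ 1 & 0 & -1 \\ -1/2 & 1 & -1/2 \end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
\]

Inverse:
\[
\begin{bmatrix} R \\ G \\ B \end{bmatrix} =
\begin{bmatrix} 1 & 1/2 & -1/2 \\ 1 & 0 & 1/2 \\ 1 & -1/2 & -1/2 \end{bmatrix}
\begin{bmatrix} Y \\ Co \\ Cg \end{bmatrix}
\]

Figure 2-3: Forward and Inverse YCoCg Transform Matrices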

This representation requires the use of one extra bit of precision for the chrominance (chroma)
components (Co and Cg). The representation of samples within the Co and Cg components
is signed, in contrast to the unsigned source data. The following example illustrates the range
of samples after CSC for 8-bpc unsigned RGB source data:
• Y component – For eight unsigned bits, the sample range is [0, 255]
• Co component – For nine signed bits, the sample range is [-255, 255]
• Cg component – For nine signed bits, the sample range is [-255, 255]
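
The sample ranges above can be checked with a short C++ sketch of the YCoCg-R transform. The sketch uses the widely known integer lifting form of YCoCg-R, which matches the matrices in Figure 2-3 up to integer rounding; it is offered as an illustration and is not copied from the test model. The type names Rgb and YCoCg and both function names are hypothetical.

```cpp
// Hypothetical pixel containers for this sketch.
struct Rgb { int r, g, b; };
struct YCoCg { int y, co, cg; };

// Forward YCoCg-R using integer lifting steps. Each step is exactly
// invertible, which is what makes the transform lossless.
// Note: >> on a negative int is assumed to be an arithmetic shift
// (guaranteed from C++20, and the behavior of mainstream compilers before).
inline YCoCg RgbToYCoCg(const Rgb& p) {
    const int co = p.r - p.b;
    const int t = p.b + (co >> 1);
    const int cg = p.g - t;
    const int y = t + (cg >> 1);
    return {y, co, cg};
}

// Inverse YCoCg-R: undo the lifting steps in reverse order.
inline Rgb YCoCgToRgb(const YCoCg& p) {
    const int t = p.y - (p.cg >> 1);
    const int g = p.cg + t;
    const int b = t - (p.co >> 1);
    const int r = b + p.co;
    return {r, g, b};
}
```

For 8-bpc input, Co = R − B spans [-255, 255] and therefore needs nine signed bits, while Y stays within the unsigned 8-bit range of the source.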


2.3 Picture Hierarchy


The codec operates on pictures, where a picture may also be referred to as an “image” or “frame.”
For a given picture, the codec settings must remain constant, as defined in the picture parameter set
(PPS; see Table 3-1). A picture may be further partitioned into a set of non-overlapping slices,
as illustrated in Figure 2-4.

[Figure 2-4 diagram: a picture contains one or more slices, and each slice contains blocks.]

Figure 2-4: Picture Element Hierarchy

Slices are independent, and may be processed in parallel. In this case, slice multiplexing is used
to combine data from multiple slices into the bitstream. Each slice is composed of an integer
number of non-overlapping 8x2 pixel blocks.
The pixels within each block are defined by an unsigned triplet of values, each with a fixed
number of bits. For example, if the source content is 4:4:4 RGB at 10bpc, each pixel contains
a 10-bit unsigned sample for each component.
The following sub-sections describe blocks, slices, and pictures in further detail.

2.3.1 Block Level


Each slice is composed of an integer number of 8x2 blocks, organized in a 2D array, which will
be processed by both the encoder and decoder in block-raster order, as illustrated in Figure 2-5.
The encoder will select a single coding mode for each block during encode operations. In addition,
flatness detection and rate control are both updated each block time.

0 1 ... N–2 N–1

N N+1 ... 2N – 2 2N – 1

2N 2N + 1 ... 3N – 2 3N – 1

... ... ... ... ...

Figure 2-5: Blocks within a Slice are Processed in Block-raster Order
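The block-raster order of Figure 2-5 can be sketched as a mapping from block index to pixel offset within the slice. The helper name `block_origin` is hypothetical, and the sketch assumes the slice width has already been padded to a multiple of the block width.

```python
BLOCK_WIDTH = 8   # pixels, from the 8x2 block size
BLOCK_HEIGHT = 2  # raster lines per blockline


def block_origin(block_index, slice_width):
    """Return the (x, y) pixel offset of a block inside its slice."""
    if slice_width % BLOCK_WIDTH != 0:
        raise ValueError("slice width must be a multiple of the block width")
    blocks_per_row = slice_width // BLOCK_WIDTH  # N in Figure 2-5
    row, col = divmod(block_index, blocks_per_row)
    return col * BLOCK_WIDTH, row * BLOCK_HEIGHT
```

For a 960-pixel-wide slice, N = 120, so block 119 ends the first blockline and block 120 starts the second.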


2.3.2 Slice Level


Each picture is partitioned into a set of one or more non-overlapping slices. Dependency is
not allowed between slices, so slices may be processed in parallel and any single-bit
transmission error will affect at most one slice. Table 2-1 lists the two parameters that are
used to define slices.

Table 2-1: Parameters Used to Define Slices


Parameter       Definition
sliceHeight     Number of scanlines within a slice. Typically, this is selected to evenly partition
                the frame’s height. For example, a slice height of 108 lines will evenly divide
                a Full HD frame (1920x1080) into 10 rows of slices.
slicesPerLine   Number of slices into which the horizontal component of the frame resolution
                is evenly divided. For example, for a Full HD frame with slice height of 108 and
                2 slices/line, the dimension of each slice will be 960x108. (See Figure 2-6, left.)
                If 4 slices/line are specified, the dimension of each slice will be 480x108.
                (See Figure 2-6, right.)

[Figure: a 1080-line frame divided into slices of height 108; the left panel shows two 960-wide slices per line, the right panel shows four 480-wide slices per line]
Figure 2-6: 1080p Frame with sliceHeight = 108, (Left) 2 Slices/line, (Right) 4 Slices/line
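The slice dimensions in Table 2-1 follow directly from the two parameters. The sketch below shows the arithmetic for the case in which no padding is needed. The function name `slice_layout` is hypothetical, and the sketch rejects dimensions that do not divide evenly rather than padding them.

```python
def slice_layout(frame_width, frame_height, slice_height, slices_per_line):
    """Return (slice_width, slice_height, total_slices) for an evenly divided frame."""
    if frame_width % slices_per_line or frame_height % slice_height:
        raise ValueError("this sketch ignores padding; dimensions must divide evenly")
    slice_width = frame_width // slices_per_line
    slice_rows = frame_height // slice_height
    return slice_width, slice_height, slice_rows * slices_per_line
```

With a Full HD frame and sliceHeight = 108, two slices per line give twenty 960x108 slices and four slices per line give forty 480x108 slices.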

A distinction is made between the first and non-first blocklines of a slice (FBLS and NFBLS,
respectively) because only NFBLS blocks can use the previous reconstructed line for prediction.
For this reason, more bit rate will be assigned to FBLS blocks to compensate for the reduction
in valid predictors. (See Section 2.8 for further details.) In addition, because this codec uses an
8x2 block size, both raster lines that intersect the current block are denoted as the current blockline.
Therefore, the FBLS is the first blockline, as illustrated in Figure 2-7.
[Figure: a slice of height sliceHeight; the first blockline is labeled FBLS and the remaining blocklines are labeled NFBLS]

Figure 2-7: Example Slice Demonstrating Blocklines within FBLS and NFBLS


Coding performance is better for larger slices than smaller slices. At minimum, each slice should
contain approximately 30k pixels. For example, the following slice configurations meet the
minimum requirement for a 1920x1080 picture:
• Landscape orientation (1920x1080), 2 slices/line – Slice size 960x32
• Landscape orientation (1920x1080), 4 slices/line – Slice size 480x64
• Portrait orientation (1080x1920), 1 slice/line – 1080x32

A larger slice height can be specified to further improve performance. When possible, a sliceHeight
of 108 is recommended to maximize performance.

2.3.3 Picture Level


Each picture has a specified width, height, bit depth, and chroma-sampling format. For example,
a common use case is 3840x2160 RGB 4:4:4 at 10bpc. The encoder is responsible for splitting
the source picture into a set of non-overlapping slices, which are then compressed and inserted
into the bitstream. The decoder will receive the slice data, and then decompress each slice
to generate the reconstructed picture.
Slice padding is used as discussed in Section 2.4 if either of the following conditions is met:
• Slice width is not divisible by block width
• Source height is not divisible by slice height


2.4 Slice Padding


Slice padding is used to ensure the following:
• Slice size is evenly divisible by the block size
• Slice height is uniform for all slices in the picture

The horizontal and vertical padding steps are handled separately, as described in the sub-sections
that follow. Padding itself is accomplished by pixel replication:
• Horizontal padding – Last valid source column will be replicated
• Vertical padding – Last valid source row will be replicated

Figure 2-8 illustrates an example of horizontal padding.

Slice Width (Before Padding)

A B C D E F
... G H I J K L

Block 0 Block N – 2 Block N – 1

Slice Width (After Padding)


A B C D E F F F
... G H I J K L L L
Block 0 Block N – 2 Block N – 1

Figure 2-8: Example Horizontal Slice Padding Using Pixel Replication
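Padding by pixel replication, as shown in Figure 2-8, can be sketched in a few lines. The helper names are hypothetical and rows are modeled as plain Python lists.

```python
def pad_row(row, padded_width):
    """Horizontal padding: replicate the last valid sample of a row."""
    if not row or padded_width < len(row):
        raise ValueError("row must be non-empty and no wider than the padded width")
    return list(row) + [row[-1]] * (padded_width - len(row))


def pad_rows(rows, padded_height):
    """Vertical padding: replicate the last valid row."""
    if not rows or padded_height < len(rows):
        raise ValueError("need at least one row and a padded height >= row count")
    extra = padded_height - len(rows)
    return [list(r) for r in rows] + [list(rows[-1]) for _ in range(extra)]
```

Applied to the top row of Figure 2-8, `pad_row(list("ABCDEF"), 8)` yields `A B C D E F F F`.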

2.4.1 Horizontal Slice Padding


If the source width is divisible by (8 × slicesPerLine), horizontal slice padding is not required.
For example, horizontal slice padding is not required for a source picture of width 1920 with either
slicesPerLine = 1 or slicesPerLine = 2, as illustrated in Figure 2-9.

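[Figure: a 1920-wide source; the top panel shows one 1920-wide slice, the bottom panel shows two 960-wide slices; the widths are unchanged after the padding step]

Figure 2-9: Example Slice of Width 1920 with (Top) slicesPerLine = 1, (Bottom) slicesPerLine = 2; Horizontal Slice Padding Is Not Required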


If the source width is not divisible by (8 × slicesPerLine), horizontal padding will be identically
added to each slice, as illustrated in Figure 2-10 for a source width of 1000:
• If slicesPerLine = 2, each initial slice width is:

  $$\frac{1000}{2} = 500$$

  The final (padded) slice width is determined by:

  $$8 \times \left\lceil \frac{500}{8} \right\rceil = 504$$

• If slicesPerLine = 4, each initial slice width is:

  $$\frac{1000}{4} = 250$$

  The final (padded) slice width is determined by:

  $$8 \times \left\lceil \frac{250}{8} \right\rceil = 256$$

[Figure: a 1000-wide source; top panel, two 500-wide slices each padded by 4 to 504 (total 1008); bottom panel, four 250-wide slices each padded by 6 to 256 (total 1024)]

Figure 2-10: Example Slice of Width 1000 with (Top) slicesPerLine = 2, (Bottom) slicesPerLine = 4; Horizontal Slice Padding Is Required in Both Cases
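The padded slice width is the initial slice width rounded up to the next multiple of the block width. A minimal sketch follows. The function name is hypothetical, and the sketch assumes the source width divides evenly by slicesPerLine, a case this section does not otherwise address.

```python
def padded_slice_width(source_width, slices_per_line, block_width=8):
    """Return (initial_width, padded_width, padding_columns) for each slice."""
    if source_width % slices_per_line:
        raise ValueError("sketch assumes source width divides evenly by slicesPerLine")
    initial = source_width // slices_per_line
    padded = -(-initial // block_width) * block_width  # ceiling to a multiple of 8
    return initial, padded, padded - initial
```

For a source width of 1000, this gives 4 padding columns per slice with two slices per line and 6 with four slices per line.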


2.4.2 Vertical Slice Padding


Vertical slice padding is performed if the source height is not evenly divisible by the slice height.
This will occur, for example, for a WUXGA frame (1920x1200) with a slice height configured
to 108 lines. In this case, the frame contains 11 full slices and one partial slice of height
1200 – (11 × 108) = 12. This partial slice will be vertically padded to 108 such that
the effective frame height becomes 12 × 108 = 1296. Thus, the total bit allocation for the
compressed frame (assuming 6bpp) will be 1920 × 1296 × 6 = 14,929,920 bits, rather than
1920 × 1200 × 6 = 13,824,000 bits. This extra allocation can be avoided by selecting a slice
height that evenly divides 1200.
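The WUXGA arithmetic above can be reproduced with a small helper. The function name is hypothetical, and the bit count covers only width × padded height × bpp as in the text.

```python
def padded_frame_bits(width, height, slice_height, bpp):
    """Return (padded_height, total_bits) after vertical slice padding."""
    slice_rows = -(-height // slice_height)  # ceiling division
    padded_height = slice_rows * slice_height
    return padded_height, width * padded_height * bpp
```

A slice height of 100, which divides 1200 evenly, brings the allocation back to 13,824,000 bits.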

2.5 Quantization
This codec uses quantization to achieve the fixed-rate compression goal. The quantization is
controlled by a quantization parameter (QP) that is maintained by the rate control algorithm at both
the encoder and decoder. The rate control algorithm will typically lower the QP for “easy” or “flat”
content, and increase the QP for “difficult” or “complex” content. (For further details regarding QP,
see Section 4.5.3.) This codec uses two different quantization methods, as described in
Section 4.8.1 and Section 4.8.2.

2.6 Coding Modes


For each block within a slice, the encoder tests the following set of coding modes:
• Transform Mode
• BP Mode
• MPP Mode
• Fallback Modes
  • MPPF Mode
  • BP-SKIP Mode

For each mode, the encoder determines a rate (i.e., the total of all syntax bits needed to transmit
the block using the coding mode) and a distortion. The rate and distortion are used to calculate an
RD cost. Next, the encoder selects the mode that minimizes the RD cost, subject to rate control
constraints that ensure the rate buffer does not underflow or overflow. The selected mode is
transmitted in the bitstream, and the encoder repeats the process for the next block.
Each coding mode is described in the subsections that follow.
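The selection step can be sketched as a constrained minimization. This is an illustration only: the cost formula D + λ·R is a common Lagrangian form assumed here (the actual RD cost calculation is defined elsewhere in the Standard), and `ModeResult`, `select_mode`, and the `min_bits`/`max_bits` limits are hypothetical stand-ins for the rate control constraints.

```python
from dataclasses import dataclass


@dataclass
class ModeResult:
    name: str
    rate_bits: int    # total syntax bits needed to code the block in this mode
    distortion: int   # distortion measured for this mode


def select_mode(results, lam, max_bits, min_bits=0):
    """Pick the feasible mode with the lowest assumed cost D + lam * R."""
    feasible = [m for m in results if min_bits <= m.rate_bits <= max_bits]
    if not feasible:
        raise ValueError("no mode satisfies the rate constraints")
    return min(feasible, key=lambda m: m.distortion + lam * m.rate_bits)
```

When the upper rate limit is tight, only low-rate candidates such as the fallback modes remain feasible, which mirrors their role in preventing buffer overflow.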


2.6.1 Transform Mode


Transform mode is similar to the Block Transform mode that is found in modern video codecs
(e.g., AVC/HEVC), with several modifications to allow for high-throughput, low-latency, and
low-complexity operation. The transform itself is a fixed-point approximation to the Discrete
Cosine Transform (DCT).
As with other modes in this codec, this mode is asymmetric in complexity between the encoder and
decoder. The encoder is responsible for testing a set of intra predictors for each block to determine
the intra predictor that produces the lowest RD cost. The selected intra prediction mode is signaled
explicitly in the bitstream, such that the decoder need only parse the information and calculate a
single intra predicted block.
Figure 2-11 illustrates the overall flow of Transform mode from the encoder side. For source
content X(i, j) in the RGB color space, CSC will be used such that Transform mode is evaluated
in the YCoCg space. If the source content is YCbCr, CSC will not be used and Transform mode
will operate natively in the YCbCr color space.

[Figure: encoder flow for Transform mode. The current block and reconstructed neighbors feed intra prediction. For each tested intra predictor (one for FBLS, four for NFBLS), the encoder calculates the residual, transform, and quantization, passes the result to the entropy encoder, and applies inverse quantization and inverse transform to calculate the distortion and RD cost. The intra predictor with minimum RD cost is selected (modeRdCost).]

Figure 2-11: Transform Mode Operation from the Encoder Side


The intra predictor will use a fixed intra mode for all FBLS blocks. For NFBLS blocks, eight intra
modes are available. In this case, the intra prediction block will calculate all eight intra predictors,
and then calculate the sum of absolute differences (SAD) between each intra predictor and the
original block. The four intra modes with the lowest distortion will be selected and tested during
the remainder of the mode operation.
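The shortlist step for NFBLS blocks can be sketched as a SAD ranking. Blocks are modeled as flat lists of samples, the function names are hypothetical, and the tie-breaking rule (lower mode index wins) is an assumption of this sketch.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def shortlist_intra_modes(source, predictors, keep=4):
    """Return the indices of the `keep` predictors with the lowest SAD."""
    # sorted() is stable, so equal SADs keep the lower mode index first.
    ranked = sorted(range(len(predictors)), key=lambda k: sad(source, predictors[k]))
    return ranked[:keep]
```

Only the shortlisted predictors go through the full residual, transform, and quantization path, which limits encoder work per block.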
For a given intra predictor, the residual block R(i, j) is calculated. This is the difference between
the source block X(i, j) and the intra predicted block P(i, j). The discrete cosine transform (DCT)
is applied to the residual block, resulting in a block of transform coefficients T(i, j). The transform
coefficients are then quantized to produce a quantized transform coefficient block Tq(i, j). These
quantized transform coefficients are transmitted in the bitstream, embedded in entropy coding
groups (ECGs). The size of these ECGs determines the rate for the transform block. Finally, inverse
quantization $\hat{T}(i, j) = Q^{-1}[T_q(i, j)]$ and inverse transform $\hat{R}(i, j) = \mathrm{DCT}^{-1}[\hat{T}(i, j)]$ are applied such
that the distortion (SAD) can be calculated between the residual and reconstructed residual blocks
($R(i, j)$ and $\hat{R}(i, j)$, respectively). The RD cost information is calculated from the rate and distortion.
Figure 2-12 illustrates the overall flow of Transform mode from the decoder side. The intra
prediction mode is parsed from the bitstream.

[Figure: decoder flow for Transform mode. The bitstream passes through entropy decoding, inverse quantization, and inverse transform; the result is combined with the calculated intra predictor to produce the reconstructed block.]

Figure 2-12: Transform Mode Operation from the Decoder Side


2.6.2 BP Mode
In Block Prediction (BP) mode, the current block is spatially predicted from a set of reconstructed
neighboring samples, referred to as the “Block Prediction Vector (BPV) search range.” The BPV
search range consists of 64 valid positions. The encoder is responsible for testing all BPV search
range positions for each sub-block and partition type to find the best match.
Before prediction, the current block is partitioned into four 2x2 sub-blocks. Figure 2-13 illustrates
the BP encoding steps.

[Figure: encoder flow for BP mode. Steps marked 4x are performed for each 2x2 sub-block; steps marked 16x are performed for each partition grid. The source block and BPV search range feed two parallel paths: a BPV search for 2x1 partitions (2 × 32x for FBLS, 2 × 64x for NFBLS) and a BPV search for 2x2 partitions (32x for FBLS, 64x for NFBLS). Each path calculates the residual, quantization, and inverse quantization. The quantized residuals and BPVs are used to calculate the entropy coding rate, the reconstructed residuals are used to calculate the distortion, and the RD cost is calculated so that the minimum can be selected. The partition grid and the reconstructed block are then constructed.]

Figure 2-13: BP Mode Operation from the Encoder Side


Each sub-block is predicted from the BPV search range using either a 2x2 or 2x1 partition. In the
former case, a single block prediction vector (BPV) from the search range is used to generate
the prediction for the 2x2 sub-block. In the case that the 2x1 partition is selected, the sub-block
will be represented by two different BPVs. The first BPV generates a 2x1 predicted block for the
upper two samples within the sub-block, while the second BPV generates a 2x1 predicted block
for the lower two samples. Figure 2-14 illustrates an example of this for the first two sub-blocks
of a given NFBLS block, where P represents a partition and SB represents a sub-block. The top
illustration shows the 2x2 predicted partitions PA0 and PA1. The bottom illustration shows the
2x1 predicted partitions PB0 through PB3.

 
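[Figure: top, the 2x2 predicted partitions PA0 and PA1 for the first two sub-blocks (SB); bottom, the 2x1 predicted partitions PB0 through PB3 for the same two sub-blocks]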

Figure 2-14: Example Result of BPV Search for the First Two Sub-blocks
of a Block that Are Not within the Slice’s First Blockline

The encoder performs a search to find the BPV that minimizes the prediction residuals for
all 2x2 and 2x1 partitions for each 2x2 sub-block of the current block. The search operation is
independent between the two partition types, and between the four sub-blocks within the current
block. Ultimately, the encoder will independently select a partition type for each of the sub-blocks.
The BPV search result is a set of BPVs and a predicted block P(i, j) for each partition type
for each sub-block. Next, the residual is calculated as R(i, j) = X(i, j) – P(i, j). Because there
are two options for the partition type, two residual sub-blocks are calculated, as follows:
1 One that is associated with the 2x2 partitions
2 One that is associated with the 2x1 partitions

The following steps are performed in parallel for each residual block:
1 Forward quantization is performed on all residual samples, and the quantized residuals Rq(i, j)
are used to calculate the entropy coding cost of each 2x2 sub-block.
2 Inverse quantization is performed to obtain the reconstructed residuals $\hat{R}(i, j)$, from which each
sub-block’s distortion is calculated.
3 For each 2x2 sub-block, the encoder selects between the 2x2 and 2x1 partitions, based on the
rate/distortion trade-off.
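The independent search over the two partition types can be sketched for a single 2x2 sub-block. This is a simplified illustration: the candidate lists stand in for the BPV search range (whose actual layout is not described in this section), SAD is assumed as the matching metric, and all names are hypothetical. The sub-block is a flat list in the order top-left, top-right, bottom-left, bottom-right.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_bpv(target, candidates):
    """Return (index, sad) of the candidate that best matches the target."""
    costs = [sad(target, c) for c in candidates]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]


def search_sub_block(sub_block, cand_2x2, cand_2x1):
    """Search both partition types for one 2x2 sub-block."""
    top, bottom = sub_block[:2], sub_block[2:]
    bpv, sad_2x2 = best_bpv(sub_block, cand_2x2)   # one BPV for the whole sub-block
    bpv_top, sad_top = best_bpv(top, cand_2x1)     # first BPV: upper two samples
    bpv_bot, sad_bot = best_bpv(bottom, cand_2x1)  # second BPV: lower two samples
    return {"2x2": ((bpv,), sad_2x2), "2x1": ((bpv_top, bpv_bot), sad_top + sad_bot)}
```

The 2x1 partition can match the source more closely because each row chooses its own BPV, but it costs a second BPV in the syntax; the final choice is made on the rate/distortion trade-off after quantization, which this sketch does not model.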


BP mode syntax includes the following:


• BPVs for each sub-block
• Entropy-coded quantized residuals

For RGB source content, BP is calculated in the YCoCg color space. If the source content
is YCbCr, BP will be calculated natively in the YCbCr color space.
BP operation from the decoder side is less complicated than BP operation from the encoder side,
as illustrated in Figure 2-15. The quantized residuals are obtained from the bitstream through the
entropy decoder, while the BPV values and partition structure are parsed directly. The BPV search
range is identical between the encoder and decoder, and consists of reconstructed samples that are
causally available.
The partition structure and the BPVs are used to generate the predicted block P(i, j), while the
quantized residuals are inverse quantized to obtain the reconstructed residuals $\hat{R}(i, j)$. Finally,
these two are added together to generate the reconstructed block, which is subject to CSC
if necessary.

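[Figure: decoder flow for BP mode. The bitstream passes through entropy decoding and inverse quantization; the BPVs are parsed and used to calculate the predicted block, which is combined with the reconstructed residuals to produce the reconstructed block.]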

Figure 2-15: BP Mode Operation from the Decoder Side


2.6.3 MPP Mode


In Midpoint Prediction (MPP) mode, each sample in the current block is predicted from a single
value (the midpoint predictor), which is calculated as the average of spatially neighboring pixels.
The residual block in MPP mode is treated differently from the residual blocks in Transform and
BP modes. Rather than using a fractional quantizer and entropy coding, MPP mode uses a scalar
quantizer and directly codes quantized residual samples, using a fixed step-size for each sample that
is derived from the QP.
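A rough sketch of the MPP idea follows. Everything specific in it is an assumption made for illustration: the rounded mean used for the midpoint, the round-to-nearest quantizer, and the `step` argument, which stands in for the step size that the Standard derives from the QP. The function names are hypothetical.

```python
def midpoint(neighbors):
    """Rounded integer mean of the neighboring samples (assumed form)."""
    return (sum(neighbors) + len(neighbors) // 2) // len(neighbors)


def quantize(residual, step):
    """Scalar quantizer: round to nearest level, symmetric about zero."""
    sign = -1 if residual < 0 else 1
    return sign * ((abs(residual) + step // 2) // step)


def dequantize(level, step):
    """Inverse scalar quantizer."""
    return level * step


def mpp_code_block(samples, neighbors, step):
    """Return (midpoint, quantized levels, reconstructed samples)."""
    mp = midpoint(neighbors)
    levels = [quantize(s - mp, step) for s in samples]
    recon = [mp + dequantize(q, step) for q in levels]
    return mp, levels, recon
```

Because every sample uses the same predictor and the same step size, each quantized residual can be written with a fixed number of bits, so no entropy coder is needed.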
For RGB source content, MPP mode is tested in both the RGB and YCoCg color spaces. The color
space that generates the smaller RD cost is selected, and a color space flag is included in the
bitstream to indicate which color space is used for the MPP mode. For YCbCr source content,
MPP mode is tested only in native YCbCr mode, and the color space selection step is skipped.
In this case, the color space flag is omitted from the bitstream.
Figure 2-16 illustrates the encoder operation for MPP mode.

[Figure: encoder flow for MPP mode. The midpoint is calculated from the reconstructed neighbors; the residual of the current block is calculated, quantized, and inverse quantized; the rate, distortion, and RD cost are then calculated. These steps run twice for RGB source (tested in the RGB and YCoCg color spaces) and once for YCbCr source (tested only in the YCbCr color space). The color space with minimum RD cost is selected (modeRdCost).]

Figure 2-16: Encoder Side MPP Mode Operation

Figure 2-17 illustrates the decoder operation for MPP mode. The quantized residuals are parsed
directly from the bitstream, and then inverse quantized using a scalar quantizer. These values are
then added to the midpoint predictor, and CSC is performed if required.

[Figure: decoder flow for MPP mode. The bitstream is parsed and inverse quantized; the result is combined with the midpoint calculated from the reconstructed neighbors to produce the reconstructed block.]

Figure 2-17: Decoder Side MPP Mode Operation


2.6.4 Fallback Modes


In addition to the three modes introduced so far, there are also two fallback modes:
• MPPF Mode
• BP-SKIP Mode

These fallback modes are selected only when all three non-fallback modes are too expensive to be
selected by the rate control mechanism. This may occur, for example, if an entire slice is complex
(e.g., white noise is present in all three color components). The combination of rate control and the
design of the encoder mode selection ensures that buffer overflow does not occur when a fallback
mode is selected. Furthermore, the fallback modes have rates that are strictly less than the average
block rate, which ensures that the rate buffer will not overflow.
Both fallback modes are discussed in the sub-sections that follow.

2.6.4.1 MPPF Mode


Midpoint Prediction Fallback (MPPF) mode is a modification of MPP mode with a reduced number
of bits per sample. The number of bits per sample for MPPF mode is predetermined, based
on the compressed bit rate, and stored in the PPS because this mode must offer a rate that is strictly
less than the average block rate.
In addition, the number of bits per block required by MPPF mode is used by the encoder to refine
the mode selection mechanism such that during any block time, the number of bits available in the
rate buffer is greater than or equal to the number of bits per block required by MPPF mode.
As with MPP mode, MPPF mode for RGB input is tested in both the RGB and YCoCg color
spaces. These are tested in parallel and the one with the lowest RD cost is selected.
MPPF mode is typically used for slices that contain a significant amount of random noise, where
the rate buffer will be operating in a nearly full state for many block times.

2.6.4.2 BP-SKIP Mode


Block Prediction Skip (BP-SKIP) mode uses the result of the BPV search operation performed
by BP mode, but omits the quantized residuals from the bitstream. As such, the expected rate
is significantly lower than BP mode, while the expected distortion is higher.
For each 2x2 sub-block, the encoder selects between the 2x2 and 2x1 partitions, based on the
rate/distortion trade-off. BP-SKIP mode syntax includes only the set of BPVs.
BP-SKIP mode is useful as a fallback mode, and may offer better image quality than MPPF mode
for certain image content (e.g., graphics content). Additionally, BP-SKIP mode will save bits in the
syntax when BP mode produces a 0 residual in all three color components.


2.7 Flatness Detection


A flatness detection algorithm is run for each block at the encoder to detect changes in image
content so that the QP can quickly adapt to the change. Each block’s flatness is calculated from
a complexity measure that is based on the Hadamard transform coefficients. A small complexity
value is associated with a “flat” block. A large complexity value is associated with a “complex”
block. The four flatness types are:
• Very flat
• Somewhat flat
• Complex-to-flat transition
• Flat-to-complex transition

If none of these flatness types are detected, the current block will not have flatness information.
Rate control uses the flatness type to update the QP value. For example, if the current block is
detected as a complex-to-flat transition (see Figure 2-18), the QP should be immediately decreased
to avoid creating artifacts within the flat portion of the block. Likewise, the QP can be increased for
a flat-to-complex transition.

[Figure: three adjacent blocks labeled “Complex” Block, Complex-to-flat, and “Flat” Block]

Figure 2-18: Complex-to-flat Transition Example

The flatness type is signaled explicitly in the bitstream as part of the mode header so that the
decoder does not need to perform the flatness detection steps.


2.8 RC
The rate control (RC) algorithm is responsible for determining the QP for each block time. The
QP is calculated implicitly and identically by both the encoder and decoder based on the RC state
(which must match). Because of this, the QP is not signaled in the bitstream syntax.
In general, RC aims to set a low QP value for easy/flat content and a high QP value for difficult/
complex content. This procedure will maximize image quality, because artifacts are most apparent
in flat regions, while complex regions provide visual masking, making the artifacts less perceptible.
The QP value for each block is determined by the following factors:
• Previous block QP
• Target bit rate
• Number of bits used to code the previous block
• Flatness information

The RC model incorporates a buffer model (the rate buffer) that is present in both the encoder
and decoder. From the encoder perspective, bits are placed into the rate buffer from the substream
multiplexer balance FIFOs. (See Section 2.10.) For constant bit rate (CBR) codec operation, bits
are removed from the rate buffer at a constant rate and placed into the bitstream for transmission.
The encoder must ensure that this rate buffer never underflows or overflows. From the decoder
perspective, bits enter the rate buffer from the bitstream at a constant rate and feed the substream
de-multiplexer’s funnel shifters. (See Section 2.10. For further details regarding the rate buffer
and associated delays, see Annex A.)
Figure 2-19 illustrates the RC model flow. This model is used by both the encoder and decoder
to implicitly derive the QP used by the block. Flatness information, calculated by the encoder,
is signaled explicitly in the bitstream so that both the encoder and decoder use the same flatness
type information for operating the RC model.

[Figure: RC flow. The per-slice RC parameters are initialized once per slice. Then, for each block in the slice: buffer fullness is updated; flatness and target rate are updated; the QP is updated from the target rate, previous QP, and previous block bits; lambda is calculated; all modes are tested using the QP and lambda; encoder mode selection uses the resulting RD costs; and the best mode is encoded.]

Figure 2-19: RC Flow


In addition to setting the QP, the RC model is responsible for ensuring that the rate buffer never
underflows or overflows. During each block time, a fixed number of bits are removed from the rate
buffer, while a variable number of bits are added depending on the image content and the mode
selected. The rate controller has both long- and short-term mechanisms that are used to prevent
rate buffer underflow and overflow:
• QP is a long-term mechanism that is maintained by RC. The QP is updated during each
block time. To avoid overflow, a high QP is selected if the rate buffer is almost full.
To avoid underflow, a low QP is selected if the rate buffer is almost empty.
• underflowPrevention mode is a short-term mechanism that is used to prevent rate buffer
underflow. This mode is enabled if the rate buffer is sufficiently close to being empty,
and will pad bits into the syntax to ensure that Transform and BP modes have rates that are
strictly greater than avgBlockBits. This will ensure that the rate buffer cannot underflow.
• Presence of the MPPF and BP-SKIP fallback modes is a short-term mechanism that is used to
prevent rate buffer overflow. By design, these modes have a rate that is less than avgBlockBits,
thereby ensuring that the rate buffer will not overflow if they are selected.
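The buffer behavior described above can be illustrated with a toy encoder-side model. It assumes that the fixed number of bits removed per block time equals avgBlockBits and that the buffer starts at a caller-chosen fullness; delays and the other details covered in Annex A are ignored. The function name is hypothetical.

```python
def simulate_rate_buffer(block_bits, avg_block_bits, buffer_size, initial_fullness=0):
    """Track buffer fullness per block time; raise on underflow or overflow."""
    fullness = initial_fullness
    trace = []
    for bits in block_bits:
        # A variable number of bits is added, a fixed number is removed.
        fullness += bits - avg_block_bits
        if fullness < 0:
            raise ValueError("rate buffer underflow")
        if fullness > buffer_size:
            raise ValueError("rate buffer overflow")
        trace.append(fullness)
    return trace
```

A block coded with fewer bits than avgBlockBits lowers the fullness, which is why the fallback modes cannot cause overflow; a block coded with more bits raises it, which is why padding Transform and BP modes above avgBlockBits prevents underflow.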

The lambda in Figure 2-19 is a Lagrangian parameter that is associated with the RD cost
calculation. (For further details, see Section 4.5.4.1.)
One final step performed by RC is to guarantee that a minimum number of bits (minBlockBits),
which is determined by the compressed bit rate, remains available for each remaining block
within a slice. The encoder mode selection will disable any mode for the current block that
causes the available rate per block for the remainder of the slice to fall below minBlockBits.

2.9 EC – Transform and BP Modes Only


Entropy coding is used only by Transform and BP modes, as follows:
• Transform mode – Uses entropy coding for transmitting quantized transform coefficients
• BP mode – Uses entropy coding for transmitting quantized residuals

The other three modes use the following mechanisms for signaling quantized residuals:
• MPP and MPPF modes – All quantized residuals are coded with fixed-length codes
• BP-SKIP mode – No quantized residuals are present in the syntax

In the entropy coder, groups of samples are combined and then transmitted using a common prefix
to allow for efficient decoding. An entropy-coded group of samples is referred to as an “entropy
coding group” (ECG), and each component will be represented by one or more ECGs. Table 2-2
lists the ECGs that are used by Transform and BP modes.

Table 2-2: ECGs


ECG                                   Description
Common-prefix Entropy Coding (CPEC)   Each ECG of N samples is represented by a variable-length prefix,
                                      followed by N fixed-length suffixes.
Vector Entropy Coding (VEC)           Each ECG of N samples is represented by a variable-length prefix,
                                      followed by a Golomb-Rice code that uniquely determines a vector
                                      of samples.


Entropy coder input is one of the following:


• Group of quantized residuals, –or–
• Group of quantized transform coefficients

Values within the group are either in the sign/magnitude (SM) or 2’s complement (2C) bit
representation. For a given ECG, the maximum number of bits for all samples within the group
is determined (bitsReq). Note that for SM bit representation, the sign bits will also be signaled
in the bitstream for any sample that has a non-zero value.
For example, suppose that a group of four samples is provided, such as [-8, 0, 14, 5]. Table 2-3 lists
the number of bits that are required for each of those samples. The required bitsReq for the group is
equal to the maximum number of bits required for all the samples within the group. In this example,
bitsReq is 4 bits in the SM bit representation and 5 bits in the 2C bit representation.

Table 2-3: Bits Required for Example Group of Samples that Use SM and 2C Bit Representations

Sample   Number of Bits Required (SM)   Number of Bits Required (2C)
-8       4                              4
0        1                              1
14       4                              5
5        3                              4
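The per-sample counts in Table 2-3 and the resulting bitsReq can be reproduced with a short sketch. The function names are hypothetical. Following the table, the SM count covers the magnitude only (at least one bit) because the sign bit is signaled separately.

```python
def bits_sm(value):
    """Magnitude bits for sign/magnitude; zero still needs one bit."""
    return max(1, abs(value).bit_length())


def bits_2c(value):
    """Smallest n such that -2**(n-1) <= value <= 2**(n-1) - 1."""
    if value >= 0:
        return value.bit_length() + 1
    return (-value - 1).bit_length() + 1


def bits_req(group, representation):
    """bitsReq is the largest per-sample requirement within the group."""
    per_sample = bits_sm if representation == "SM" else bits_2c
    return max(per_sample(v) for v in group)
```

For the group [-8, 0, 14, 5], this gives bitsReq = 4 in SM and bitsReq = 5 in 2C, matching the text.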

The entropy coder determines the ECGs for each component. The size of each ECG will depend
on the mode, color component, and chroma sampling format. For example, BP mode uses a
uniform ECG for each component, as follows:
• For 4:4:4, the 16 samples in each color component will be distributed among four ECGs
of four samples each.
• For 4:2:2, the luma component (16 samples) will be distributed among four ECGs
of four samples each. The chroma components (eight samples) will be distributed
among two ECGs of four samples each.
• For 4:2:0, the luma component (16 samples) will be distributed among four ECGs
of four samples each. The chroma components (four samples each) will each form
a single ECG of four samples.
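The BP mode grouping above reduces to dividing each component's sample count by the four-sample ECG size. A minimal sketch, with hypothetical names:

```python
SAMPLES_PER_ECG = 4  # BP mode uses uniform four-sample ECGs

# Chroma samples per component in one 8x2 block, by chroma sampling format.
CHROMA_SAMPLES = {"4:4:4": 16, "4:2:2": 8, "4:2:0": 4}


def bp_ecg_counts(chroma_format):
    """Return (ECGs for component 0, ECGs for each of components 1 and 2)."""
    luma_samples = 16  # 8x2 block
    chroma_samples = CHROMA_SAMPLES[chroma_format]
    return luma_samples // SAMPLES_PER_ECG, chroma_samples // SAMPLES_PER_ECG
```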

The distribution of samples within ECGs is non-uniform for Transform mode due to the varying
frequency content that is represented by each transform coefficient. (See Section 4.6.1.5.)


2.10 Slice Multiplexing


If slicesPerLine is greater than 1, slice multiplexing will be used to multiplex data from the slices
into the bitstream. This is done in units of chunks, where the chunk’s size (in bytes) is indicated
by PPS parameter chunk_size. (For further details regarding chunks, see Section 3.2.)
A slice’s total compressed bitstream is evenly split into an integer number of chunks that is equal
to the number of lines (raster lines rather than blocklines) within the slice.
In the example illustrated in Figure 2-20, slicesPerLine = 2. This bitstream will contain chunks in
the following order:
(s0, c0), (s1, c0), (s0, c1), (s1, c1), …, (s0, cN – 1), (s1, cN – 1), …
(s2, c0), (s3, c0), (s2, c1), (s3, c1), …, (s2, cN – 1), (s3, cN – 1), …
(s4, c0),…

where:
• A tuple (sx, cy) provides the slice and chunk indices, respectively
• N provides the number of chunks per slice
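
The chunk ordering above can be sketched as a small generator. This is an illustrative model only; the function name and its parameters are hypothetical, and it assumes that slices are numbered in raster order as in the example.

```python
def chunk_order(num_slices, slices_per_line, chunks_per_slice):
    """Return (slice index, chunk index) tuples in bitstream order."""
    order = []
    # Walk one row of slices at a time (slices 0..1, then 2..3, and so on).
    for first in range(0, num_slices, slices_per_line):
        row = range(first, min(first + slices_per_line, num_slices))
        # Within a row, chunk c of every slice is sent before chunk c + 1.
        for c in range(chunks_per_slice):
            for s in row:
                order.append((s, c))
    return order


order = chunk_order(num_slices=4, slices_per_line=2, chunks_per_slice=3)
```

With slicesPerLine = 2 and N = 3, the list starts with (0, 0), (1, 0), (0, 1), (1, 1), matching the pattern shown above.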

[Diagram: a frame of width frame_width divided into rows of sliceHeight lines. The first row contains Slice 0 and Slice 1, the second row contains Slice 2 and Slice 3, and so on.]

Figure 2-20: Example Configuration with slicesPerLine = 2, in Which Chunks from Slices 0 and 1 Will Be Multiplexed into the Bitstream, Followed by Chunks from Slices 2 and 3, etc.

2.11 Substream Multiplexing


Substream multiplexing (SSM) is implemented to allow for parallel parsing of the compressed
bitstream. For each block, the bitstream syntax is distributed among four substreams; the
distribution depends on the coding mode. For example, the syntax for Transform mode is
distributed among the four substreams, as follows:
• Substream 0 – Mode header and mode-specific information
• Substream 1 – Component 0 data (Y)
• Substream 2 – Component 1 data (Co/Cb)
• Substream 3 – Component 2 data (Cg/Cr)

Table 2-4 describes SSM-related terms.

Table 2-4: Substream Multiplexing Terms

Term Description
Syntax Element: Sum of the bits that comprise the information from one substream for a given block.
Size depends on the block mode, QP, substream index, and other factors:
• Minimum allowable syntax element size, ssmMinSeSize, is fixed at two bits
• Maximum allowable syntax element size, ssm_max_se_size, is set by the PPS, with a default
value of 128 bits
The encoder ensures that no syntax element is greater than ssm_max_se_size. If a coding mode
generates a larger syntax element during testing, the encoder will disable the mode for the current
block time. With one syntax element from each substream, the decoder will have sufficient bits
to decode a block.
mux word: Size is equal to ssm_max_se_size.
SSM communication between the encoder and decoder occurs through the use of a packet size
of one mux word. This means that the encoder cannot transmit a chunk of data to the decoder
at a granularity that is greater than one mux word.
Bits are removed from the balance FIFO and placed into the bitstream, one mux word at a time.
From the de-multiplexer perspective, a mux word will be requested when the funnel shifter fullness
is strictly less than ssm_max_se_size.
Balance FIFO, ssmBalanceFifo (Encoder): Used for each substream at the SSM encoder to ensure
that sufficient bits are available to generate a mux word when one is requested. Bits are placed in
ssmBalanceFifo before being transmitted to the rate buffer and bitstream. During the initial SSM
delay, the balance FIFOs are filled without any bits being removed. Parameters have been designed
such that the balance FIFO does not overflow during the initial SSM delay.
Funnel Shifter, ssmFunnelShifter (Decoder): Included by the SSM de-multiplexer for each substream.
When the de-multiplexer receives a mux word, the mux word will be placed into the funnel shifter.
The de-multiplexer will request a mux word whenever the ssmFunnelShifter fullness is strictly less
than ssm_max_se_size.

The SSM multiplexer at the encoder includes a model of the de-multiplexer such that each mux
word is placed into the bitstream in the correct order.
An additional delay of one block time exists between Substream 0 and the other three substreams.
This is referred to as ssmSkew, and is done such that the decoder will receive mode-specific
information one block time before the data for each block.

3 Syntax (Normative)
This section describes the VDC-M syntax at three levels – picture, substream, and entropy
coding group.

3.1 PPS
C source code shall always be trusted when in conflict with the content of this Standard.
The picture parameter set (PPS) has a total byte length of 128 bytes. This PPS shall be transmitted
from the encoder to the decoder, such that both can produce the same model state. The mechanism
for transmitting the PPS is not normative, and may for example be outside the link.
Table 3-1 lists the syntax and size of each PPS field. RESERVED fields shall be given an integer
number that represents the total number of bits, without a designation of the type, and filled with 0s.
Any field listed as “Nu” is an unsigned N-bit integer. For example, the valid range for 7u is [0, 127].
Finally, the version_release field is a one-byte ASCII character within a specified range,
as follows:
• 0x00 – Null (e.g., v1.0 as opposed to v1.0a)
• 0x61 through 0x7a – Lowercase characters (a through z)

Table 3-1 also lists the address of each PPS field in terms of byte and doubleword addresses
(Bx and DWx, respectively). A doubleword contains four bytes; therefore, the total PPS size of
128 bytes corresponds to 32 doublewords. The bytes within a doubleword shall be in big-endian
order (i.e., the first byte in the bitstream shall be DW0[31:24] – the most significant byte).
For a PPS field that is greater than eight bits in size, the value shall be in big-endian order.
For example:
chunk_size = (B22 × 256) + B23
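
The big-endian byte layout can be illustrated with a minimal parsing sketch that reads a few fields from a 128-byte PPS. The function names are hypothetical, only a small subset of fields is decoded, and the byte and doubleword positions are taken from Table 3-1; the C source code remains the authoritative reference.

```python
def pps_field(pps: bytes, first_byte: int, num_bytes: int) -> int:
    """Read a multi-byte PPS field stored in big-endian order."""
    return int.from_bytes(pps[first_byte:first_byte + num_bytes], "big")


def parse_pps_subset(pps: bytes) -> dict:
    if len(pps) != 128:
        raise ValueError("PPS must be exactly 128 bytes")
    release = pps[2]
    # version_release is either 0x00 (null) or a lowercase ASCII letter.
    if release != 0x00 and not (0x61 <= release <= 0x7A):
        raise ValueError("version_release out of range")
    dw4 = pps_field(pps, 16, 4)  # doubleword 4 = bytes B16..B19
    return {
        "version_major": pps[0] & 0x7F,          # B0[6:0]
        "version_minor": pps[1] & 0x7F,          # B1[6:0]
        "version_release": "" if release == 0 else chr(release),
        "frame_width": pps_field(pps, 4, 2),     # B4..B5
        "frame_height": pps_field(pps, 6, 2),    # B6..B7
        "bits_per_pixel": (dw4 >> 16) & 0x3FF,   # DW4[25:16]
        "chunk_size": pps_field(pps, 22, 2),     # (B22 * 256) + B23
    }


# Build an example PPS with made-up values for illustration only.
example = bytearray(128)
example[0] = 1
example[1] = 1
example[4:6] = (1920).to_bytes(2, "big")
example[6:8] = (1080).to_bytes(2, "big")
example[16:18] = (96).to_bytes(2, "big")
example[22:24] = (810).to_bytes(2, "big")
fields = parse_pps_subset(bytes(example))
```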

Parameters listed are either independent or dependent. Independent parameters include properties
of the source picture, encoder/decoder settings, LUTs, and tuning. These must be determined first.
Dependent parameters are configured based on the independent parameters. For example, many
of the PPS fields that relate to rate control are configured as a function of the slice size, compressed
bit rate, etc.

Table 3-1: PPS Fields


Field Bits Address Independent? Description
RESERVED 1 B0[7]
DW0[31]
version_major 7u B0[6:0] Yes Version major (x.yz).
DW0[30:24] Valid range in decimal
is [1, 127].
RESERVED 1 B1[7]
DW0[23]
version_minor 7u B1[6:0] Yes Version minor (x.yz).
DW0[22:16] Valid range in decimal
is [0, 127].
version_release 8 B2[7:0] Yes Version release (x.yz).
(1x DW0[15:8] ASCII character in
ASCII) {0x0, 0x61 through 0x7a}.
See T.50, Table 4/T.50.
pps_identifier 8u B3[7:0] Yes Unique PPS identifier.
DW0[7:0]
frame_width 16u B4[7:0] – B5[7:0] Yes Frame width (pixels).
DW1[31:16] Original frame width
(i.e., not including padding).
frame_height 16u B6[7:0] – B7[7:0] Yes Frame height (pixels).
DW1[15:0]
slice_width 16u B8[7:0] – B9[7:0] Yes Slice width (pixels).
DW2[31:16] Slice width should be a
multiple of 8 (i.e., includes
slice padding, if necessary)
and shall be at least 64 pixels.
slice_height 16u B10[7:0] – B11[7:0] Yes Slice height (pixels).
DW2[15:0] Slice height shall be at least
16 lines.
slice_num_px 32u B12[7:0] – B15[7:0] No, Total number of pixels
DW3[31:0] slice_num_px = per slice.
slice_width × Slice shall contain at least
slice_height 4096 pixels.
RESERVED 6 B16[7:2]
DW4[31:26]

bits_per_pixel 10u B16[1:0] – B17[7:0] Yes Compressed bit rate.
DW4[25:16] Full 10-bit unsigned
integer value is used
in all calculations.
Value includes four fractional
bits. Divide by 16 to determine
the effective bpp (actual bits/
pixel of the compressed
bitstream), such as in the
examples provided below.
bits_per_pixel Effective
bpp
96 6.0
97 6.0625
128 8.0
RESERVED 8 B18[7:0]
DW4[15:8]
bits_per_component 4u B19[7:4] Yes 0x0 = 8bpc.
DW4[7:4] 0x1 = 10bpc.
0x2 = 12bpc.
0x3 through 0xF
are RESERVED.
source_color_space 2u B19[3:2] Yes Source color space.
DW4[3:2] 0x0 = RGB source.
0x1 = YUV source.
0x2 and 0x3 are RESERVED.
chroma_format 2u B19[1:0] Yes Source chroma sampling
DW4[1:0] format.
0x0 = 4:4:4.
0x1 = 4:2:2.
0x2 = 4:2:0.
0x3 = RESERVED.
RESERVED 16 B20[7:0] – B21[7:0]
DW5[31:16]
chunk_size 16u B22[7:0] – B23[7:0] No, Chunk size used for slice
DW5[15:0] See Section 3.2 multiplexing (bytes).

RESERVED 16 B24[7:0] – B25[7:0]
DW6[31:16]

rc_buffer_init_size 16u B26[7:0] – B27[7:0] No, Initial state for rate buffer
DW6[15:0] See Annex A fullness (bits).

rc_stuffing_bits 8u B28[7:0] No, RC stuffing bits that are


DW7[31:24] See used to prevent underflow
Section 4.7.7 when underflowPrevention
is active (bits).
rc_init_tx_delay 8u B29[7:0] No, During the initial transmission
DW7[23:16] See Annex A delay, the rate buffer is
filled; however, no bits
are transmitted.
Measured in block times.
rc_buffer_max_size 16u B30[7:0] – B31[7:0] No, Rate buffer maximum
DW7[15:0] See Annex A fullness (bits).

rc_target_rate_threshold 32u B32[7:0] – B35[7:0] No, Threshold that is used for


DW8[31:0] See updating the RC target rate
Section 4.5.2 scale factor.
rc_target_rate_scale 8u B36[7:0] No, Scale factor that is used for
DW9[31:24] See calculating the RC target rate.
Section 4.5.2
rc_fullness_scale 8u B37[7:0] No, Scale factor that is used for
DW9[23:16] See calculating rcFullness.
Section 4.5.1.2
rc_fullness_offset_threshold 16u B38[7:0] – B39[7:0] No, Number of blocklines that
DW9[15:0] See are needed to ramp down the
Section 4.5.1.2 RC maximum buffer fullness
at the end of the slice
(shall be within the range
[3, (sliceHeight / 2) – 1]).
rc_fullness_offset_slope 24u B40[7:0] – B42[7:0] No, Slope with which to ramp
DW10[31:8] See down the buffer fullness after
Section 4.5.1.2 rc_fullness_offset_threshold
is reached.
RESERVED 4 B43[7:4]
DW10[7:4]
rc_target_rate_extra_fbls 4u B43[3:0] Yes, Additional bits used for FBLS
DW10[3:0] See block target rate (bits/pixel).
Section 4.5.2
flatness_qp_very_flat_fbls 8u B44[7:0] Yes, QP for FBLS block flatType0
DW11[31:24] See flatness classification.
Section 4.5.3
flatness_qp_very_flat_nfbls 8u B45[7:0] Yes, QP for NFBLS block
DW11[23:16] See flatType0 flatness
Section 4.5.3 classification.

flatness_qp_somewhat_flat_fbls 8u B46[7:0] Yes, QP for FBLS block flatType1
DW11[15:8] See flatness classification.
Section 4.5.3
flatness_qp_somewhat_flat_nfbls 8u B47[7:0] Yes, QP for NFBLS block
DW11[7:0] See flatType1 flatness
Section 4.5.3 classification.
flatness_qp_lut[0] 8u B48[7:0] Yes, LUT that is used to determine
DW12[31:24] See flatness QP as a function
Section 4.5.3 of buffer fullness.
… 8u …
flatness_qp_lut[7] 8u B55[7:0]
DW13[7:0]
max_qp_lut[0] 8u B56[7:0] Yes, LUT that is used for
DW14[31:24] See determining maximum QP.
Section 4.5.3
… 8u …
max_qp_lut[7] 8u B63[7:0]
DW15[7:0]
target_rate_delta_lut[0] 8u B64[7:0] Yes, LUT that is used for
DW16[31:24] See calculating the RC target rate.
Section 4.5.2
… 8u …
target_rate_delta_lut[15] 8u B79[7:0]
DW19[7:0]
RESERVED 8 B80[7:0]
DW20[31:24]
mppf_bits_per_comp (R/Y) 4u B81[7:4] Yes, Array that contains MPPF
DW20[23:20] See bits/component for each
Section 4.6.4 color component for
two color spaces.
First 12 bits = RGB –or–
YCbCr step size.
Last 12 bits = YCoCg
step size.
See Section 4.6.4.
mppf_bits_per_comp (G/Cb) 4u B81[3:0]
DW20[19:16]
mppf_bits_per_comp (B/Cr) 4u B82[7:4]
DW20[15:12]
mppf_bits_per_comp (Y) 4u B82[3:0]
DW20[11:8]
mppf_bits_per_comp (Co) 4u B83[7:4]
DW20[7:4]
mppf_bits_per_comp (Cg) 4u B83[3:0]
DW20[3:0]
RESERVED 24 B84[7:0] – B86[7:0]
DW21[31:8]

ssm_max_se_size 8u B87[7:0] Yes, Maximum SSM syntax
DW21[7:0] See Section 4.11 element size (SSM mux
word size is also equal
to ssm_max_se_size).
RESERVED 24 B88[7:0] – B90[7:0]
DW22[31:8]
slice_num_bits (upper byte) 8u B91[7:0] No, Total number of bits per slice.
DW22[7:0] slice_num_bits = (slice_num_px × bits_per_pixel) >> 4
slice_num_bits (lower four bytes) 32u B92[7:0] – B95[7:0]
DW23[31:0]
RESERVED 8 B96[7:0]
DW24[31:24]
chunk_adj_bits 8u B97[7:0] No, Adjustment per blockline
DW24[23:16] See to byte-align the chunk size.
Section 4.5.1.1
num_extra_mux_bits 16u B98[7:0] – B99[7:0] No, Overhead bits used by
DW24[15:0] See Section 4.11 substream multiplexing.

RESERVED 224 B100[7:0] – B127[7:0]
DW25[31:0] – DW31[31:0]

3.2 Frame-level Syntax


Slice multiplexing is used for a frame if the number of slices per line is greater than 1. In this case,
the slices are processed in raster order with the data from each blockline interleaved in units of
chunk_size, where the number of bytes within a chunk is equal to:
chunk_size = ( ( (slice_width × bits_per_pixel + 15) >> bppFractionalBits) + 7) >> 3

where:
• bppFractionalBits = 4

A chunk contains one line worth of data, and must be an integer number of bytes. Rounding of bits
up to the nearest byte is accomplished by ( (bits + 7) >> 3).
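
As a worked illustration of the chunk size formula, the following sketch computes chunk_size together with the effective bpp and the slice_num_bits value given in Table 3-1. The function names are hypothetical and the slice dimensions in the examples are arbitrary.

```python
BPP_FRACTIONAL_BITS = 4  # bits_per_pixel carries four fractional bits


def effective_bpp(bits_per_pixel: int) -> float:
    """Convert the 10-bit PPS value to actual bits per pixel."""
    return bits_per_pixel / (1 << BPP_FRACTIONAL_BITS)


def chunk_size_bytes(slice_width: int, bits_per_pixel: int) -> int:
    """Bytes per chunk: one line of one slice, rounded up to whole bytes."""
    bits_per_line = (slice_width * bits_per_pixel + 15) >> BPP_FRACTIONAL_BITS
    return (bits_per_line + 7) >> 3


def slice_num_bits(slice_width: int, slice_height: int,
                   bits_per_pixel: int) -> int:
    """Total bits per slice: (slice_num_px * bits_per_pixel) >> 4."""
    return (slice_width * slice_height * bits_per_pixel) >> BPP_FRACTIONAL_BITS


chunk_a = chunk_size_bytes(1080, 96)   # 6.0 bpp, 1080-pixel-wide slice
chunk_b = chunk_size_bytes(64, 97)     # 6.0625 bpp, 64-pixel-wide slice
```

For a 1080-pixel-wide slice at 6.0 bpp, each line needs 6480 bits, which is exactly 810 bytes. For a 64-pixel-wide slice at 6.0625 bpp, each line needs 388 bits, which rounds up to 49 bytes.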

Table 3-2 defines the chunk order within the bitstream. Note that each slice will contain two chunks
for each blockline.
Table 3-2: Frame-level Syntax Fields (Constant Bit Rate Mode)
Field Size (Bits) Format
for (slice_y = 0; slice_y < frame_height; slice_y += slice_height) {
for (blockline = 0; blockline < (slice_height / 2); blockline ++) {
for (slice_x = 0; slice_x < frame_width; slice_x += slice_width) {
first chunk from blockline of slice (slice_y, slice_x) chunk_size See Section 3.3
}
for (slice_x = 0; slice_x < frame_width; slice_x += slice_width) {
second chunk from blockline of slice (slice_y, slice_x) chunk_size See Section 3.3
}
}
}

3.3 Slice-level Syntax


The slice-level syntax (see Table 3-3) is composed of mux words from the four substreams. These
mux words are contained within the slice-level syntax, in the order in which the de-multiplexer
will require them. Each mux word is exactly muxWordSize in bits. The de-multiplexer will
request a mux word for each substream for which the funnel shifter fullness is strictly less
than ssm_max_se_size.
The parameter ssmSeSize refers to the actual size (in bits) of a syntax element from a given
substream. From the encoder’s perspective, this will be found in the ssmSyntaxFifo. (See
Section 5.2.) From the decoder’s perspective, this specifies the number of bits that are needed
in a substream to decode the current block. (See Section 4.11.)

Table 3-3: Slice-level Syntax Fields


Field Size (Bits) Format
for (blockIdx = 0; blockIdx < blocksTotal; blockIdx ++) {
for (ssmIdx = 0; ssmIdx < 4; ssmIdx ++) {
if (funnelShifterFullness[ssmIdx] < ssm_max_se_size) {
Mux word from substream ssmIdx muxWordSize Substream data
(See Section 3.4)
funnelShifterFullness[ssmIdx] += muxWordSize
}
funnelShifterFullness[ssmIdx] -= ssmSeSize[ssmIdx][blockIdx]
}
}
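
The request rule in Table 3-3 can be modeled with a short simulation. This is a simplified sketch that follows the table literally: it takes muxWordSize to be equal to ssm_max_se_size, does not model ssmSkew or the encoder balance FIFOs, and uses made-up syntax element sizes. The function name is hypothetical.

```python
def mux_word_requests(ssm_max_se_size, se_sizes):
    """Return (blockIdx, ssmIdx) pairs in the order mux words are requested.

    se_sizes[ssmIdx][blockIdx] is the syntax element size in bits.
    """
    mux_word_size = ssm_max_se_size
    fullness = [0, 0, 0, 0]  # funnel shifter fullness per substream
    requests = []
    for block_idx in range(len(se_sizes[0])):
        for ssm_idx in range(4):
            # Request a mux word only when fullness is strictly below the limit.
            if fullness[ssm_idx] < ssm_max_se_size:
                requests.append((block_idx, ssm_idx))
                fullness[ssm_idx] += mux_word_size
            # Consume the syntax element for this block.
            fullness[ssm_idx] -= se_sizes[ssm_idx][block_idx]
    return requests


se_sizes = [
    [10, 10, 10],      # substream 0
    [128, 128, 128],   # substream 1
    [2, 2, 2],         # substream 2
    [100, 30, 100],    # substream 3
]
requests = mux_word_requests(128, se_sizes)
```

In this example, substreams 0 and 2 hold enough bits after their second mux word, so only substreams 1 and 3 request a mux word for the third block.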

3.4 Substream-level Syntax


For each block within a slice, the compressed syntax is distributed among four substreams, as listed
in Table 3-4. The possible syntax associated with a block for a given substream is a syntax element.
The minimum size of a syntax element is two bits. The maximum size of a syntax element is
specified by PPS parameter ssm_max_se_size. Each substream includes different syntax within
its syntax element, as indicated in Table 3-4’s Format column.
Table 3-4: Substream-level Syntax for Substreams 0 through 3
Field Size (Bits) Format
for (blockIdx = 0; blockIdx < blocksTotal; blockIdx ++) {
syntaxElementSubstream[0] Variable See Table 3-5
(2 to ssm_max_se_size)
}
for (blockIdx = 0; blockIdx < blocksTotal; blockIdx ++) {
syntaxElementSubstream[1] Variable See Table 3-6
(2 to ssm_max_se_size)
}
for (blockIdx = 0; blockIdx < blocksTotal; blockIdx ++) {
syntaxElementSubstream[2] Variable See Table 3-7
(2 to ssm_max_se_size)
}
for (blockIdx = 0; blockIdx < blocksTotal; blockIdx ++) {
syntaxElementSubstream[3] Variable See Table 3-8
(2 to ssm_max_se_size)
}

Table 3-5: syntaxElementSubstream[0]


Syntax Size (Bits) Format
modeSameFlag 1 Flag
if (modeSameFlag == 0) {
curBlockMode 2 to 3 Unsigned,
See Section 4.10
}
flatnessFlag 1 Flag
if (flatnessFlag == 1) {
flatnessType 2 Unsigned,
See Section 4.10
}
if cur block is Transform mode {
if cur block in NFBLS {
intraPredictor 3 Unsigned
See Section 4.6.1.1
}
}
else if cur block is MPP mode {
if (source_color_space == 0) {
mppColorSpace 1 Flag (RGB source only)
}
mppStepSize 3 bits if bpc ≤ 8 Unsigned
4 bits if bpc > 8
if chroma format is 4:4:4 {
4 samples from Component 0 4 × mppBitsPerComp[0] Signed
4 samples from Component 1 4 × mppBitsPerComp[1] Signed
4 samples from Component 2 4 × mppBitsPerComp[2] Signed
}
else if chroma format is 4:2:2 {
8 samples from Component 0 8 × mppBitsPerComp[0] Signed
}
else if chroma format is 4:2:0 {
6 samples from Component 0 6 × mppBitsPerComp[0] Signed
}
}
else if cur block is MPPF mode {
if (source_color_space == 0) {
mppfColorSpace 1 Flag (RGB source only)
}

if chroma format is 4:4:4 {
4 samples from Component 0 4 × mppfBitsPerComp[0] Signed
4 samples from Component 1 4 × mppfBitsPerComp[1] Signed
4 samples from Component 2 4 × mppfBitsPerComp[2] Signed
}
else if chroma format is 4:2:2 {
8 samples from Component 0 8 × mppfBitsPerComp[0] Signed
}
else if chroma format is 4:2:0 {
6 samples from Component 0 6 × mppfBitsPerComp[0] Signed
}
}
else if cur block is BP/BP-SKIP mode {
bpvTable 4 Unsigned
See Section 4.6.2.6
BPV for Sub-block 0 FBLS – 5 or 10 Unsigned
NFBLS – 6 or 12 See Section 4.6.2.6
}

Table 3-6: syntaxElementSubstream[1]


Syntax Size (Bits) Format
if cur block is Transform mode {
lastSigPos 2 to 4 See Section 4.6.1.5.1
entropy coding groups (4x) 4x (0 to 50) See Section 3.5
}
else if cur block is BP mode {
BPV for Sub-block 1 FBLS – 5 or 10 Unsigned
NFBLS – 6 or 12 See Section 4.6.2.6
entropy coding groups (4x) 4x (0 to 50) See Section 3.5
}
else if cur block is MPP mode {
if chroma format is 4:4:4 {
12 samples from Component 0 12 × mppBitsPerComp[0] Signed
}
else if chroma format is 4:2:2 {
8 samples from Component 0 8 × mppBitsPerComp[0] Signed
}
else if chroma format is 4:2:0 {
6 samples from Component 0 6 × mppBitsPerComp[0] Signed
}
}
else if cur block is MPPF mode {
if chroma format is 4:4:4 {
12 samples from Component 0 12 × mppfBitsPerComp[0] Signed
}
else if chroma format is 4:2:2 {
8 samples from Component 0 8 × mppfBitsPerComp[0] Signed
}
else if chroma format is 4:2:0 {
6 samples from Component 0 6 × mppfBitsPerComp[0] Signed
}
}

Table 3-7: syntaxElementSubstream[2]


Syntax Size (Bits) Format
if cur block is Transform mode {
ecCompSkip 1 to 2 Flag
if (ecCompSkip == 0) {
lastSigPos 2 to 4 See Section 4.6.1.5.1
}
entropy coding groups (4x) 4x (0 to 50) See Section 3.5
}
else if cur block is BP mode {
BPV for Sub-block 2 FBLS – 5 or 10 Unsigned
NFBLS – 6 or 12 See Section 4.6.2.6
ecCompSkip 1 Flag
entropy coding groups (4x) 4x (0 to 50) See Section 3.5
}
else if cur block is MPP mode {
if chroma format is 4:4:4 {
12 samples from Component 1 12 × mppBitsPerComp[1] Signed
}
else if chroma format is 4:2:2 {
8 samples from Component 1 8 × mppBitsPerComp[1] Signed
}
else if chroma format is 4:2:0 {
4 samples from Component 0 4 × mppBitsPerComp[0] Signed
2 samples from Component 1 2 × mppBitsPerComp[1] Signed
}
}
else if cur block is MPPF mode {
if chroma format is 4:4:4 {
12 samples from Component 1 12 × mppfBitsPerComp[1] Signed
}
else if chroma format is 4:2:2 {
8 samples from Component 1 8 × mppfBitsPerComp[1] Signed
}
else if chroma format is 4:2:0 {
4 samples from Component 0 4 × mppfBitsPerComp[0] Signed
2 samples from Component 1 2 × mppfBitsPerComp[1] Signed
}
}

Table 3-8: syntaxElementSubstream[3]


Syntax Size (Bits) Format
if cur block is Transform mode {
ecCompSkip 1 to 2 Flag
if (ecCompSkip == 0) {
lastSigPos 2 to 4 See Section 4.6.1.5.1
}
entropy coding groups (4x) 4x (0 to 50) See Section 3.5
}
else if cur block is BP mode {
BPV for Sub-block 3 FBLS – 5 or 10 Unsigned
NFBLS – 6 or 12 See Section 4.6.2.6
ecCompSkip 1 Flag
entropy coding groups (4x) 4x (0 to 50) See Section 3.5
}
else if cur block is MPP mode {
if chroma format is 4:4:4 {
12 samples from Component 2 12 × mppBitsPerComp[2] Signed
}
else if chroma format is 4:2:2 {
8 samples from Component 2 8 × mppBitsPerComp[2] Signed
}
else if chroma format is 4:2:0 {
2 samples from Component 1 2 × mppBitsPerComp[1] Signed
4 samples from Component 2 4 × mppBitsPerComp[2] Signed
}
}
else if cur block is MPPF mode {
if chroma format is 4:4:4 {
12 samples from Component 2 12 × mppfBitsPerComp[2] Signed
}
else if chroma format is 4:2:2 {
8 samples from Component 2 8 × mppfBitsPerComp[2] Signed
}
else if chroma format is 4:2:0 {
2 samples from Component 1 2 × mppfBitsPerComp[1] Signed
4 samples from Component 2 4 × mppfBitsPerComp[2] Signed
}
}
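
As a cross-check of Table 3-5 through Table 3-8, the sketch below records how many MPP (or MPPF) samples of each component are carried in each substream and confirms that the totals match the component sizes of a block. The dictionary layout and function names are hypothetical, and the step sizes used in the usage line are arbitrary.

```python
# Per chroma format: one dict per substream, mapping component -> sample count.
MPP_SAMPLES = {
    "4:4:4": [{0: 4, 1: 4, 2: 4}, {0: 12}, {1: 12}, {2: 12}],
    "4:2:2": [{0: 8}, {0: 8}, {1: 8}, {2: 8}],
    "4:2:0": [{0: 6}, {0: 6}, {0: 4, 1: 2}, {1: 2, 2: 4}],
}


def samples_per_component(chroma_format):
    """Total samples of each component summed over the four substreams."""
    totals = [0, 0, 0]
    for substream in MPP_SAMPLES[chroma_format]:
        for comp, count in substream.items():
            totals[comp] += count
    return totals


def mpp_sample_bits(chroma_format, ssm_idx, bits_per_comp):
    """Bits used by the samples in one substream (headers not included)."""
    layout = MPP_SAMPLES[chroma_format][ssm_idx]
    return sum(count * bits_per_comp[comp] for comp, count in layout.items())


totals_420 = samples_per_component("4:2:0")
bits_ss2 = mpp_sample_bits("4:2:0", 2, [3, 2, 2])
```

The split spreads the samples so that no single substream carries a whole block, which helps keep each syntax element within ssm_max_se_size.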

3.5 ECG-level Syntax


Entropy coding is used for Transform and BP modes. There are four entropy coding groups (ECGs)
for each component. Each ECG contains between 0 and ecgMaxBits bits, where ecgMaxBits is
equal to 50, and is constructed as described in Section 4.7.2. The number of samples within
an ECG (ecgNumSamples) shall depend on the coding mode, chroma format, component skip,
and other factors.
A sample within an ECG will be denoted as ecgSample[x]. The process of entropy coding
is discussed further in Section 4.7. The process of entropy decoding is discussed further in
Section 5.3.3.

Table 3-9: ECG-level Syntax for a Given Component k


Syntax Size (Bits) Format
for (ecgIdx = 0; ecgIdx < 4; ecgIdx ++) {
if (ecgDataActive[ecgIdx]) { ecgDataActive is dependent on
coding mode, chroma format, and
lastSigPos for Transform mode
See Section 4.7.2
ecgSkip[ecgIdx] 1 Skip flag
if (ecgSkip[ecgIdx] == 0) {
bitsReq 1 to (bpc + 1) Unary prefix
See Table 4-62
if (BP mode && bitsReq ≤ 2) {
vecEcPrefix 1 to 12 Unary prefix
See Table 4-65
vecEcSuffix 1 to 5 Unsigned, See parameter vecGrK
See Section 4.7.6 (Table 4-64)
} else {
for (x = 0; x < ecgNumSamples; x ++) { ecgNumSamples is dependent on
coding mode, chroma format,
component skip, etc.
See Section 4.7.2
ecgSample[x] bitsReq Unsigned (SM ECG)
Signed (2C ECG)
}
}
}
}
if (underflowPrevention && ecgIdx > 0) { See Section 4.7.7
stuffing bits rc_stuffing_bits Fixed-length stuffing word
}
if (ecgIdx == 3 && ! isCompSkip) {
sign bits 0 to 16 See Section 4.7.8
if (Transform mode && ! (lastSigPos == 0 && k == 0)) {
signLastSigPos 1 See Section 4.6.1.5.1
}
}
}

4 Encoding Process (Normative)


This section describes the operations that shall be performed by a VDC-M-compliant encoder.

4.1 Overview
C source code shall always be trusted when in conflict with the content of this Standard.
The encoder shall perform the encoding process, as illustrated in Figure 4-1, once per block time
within a slice, as follows:
1 Flatness detection shall be updated, based on source data from the current, previous, and next
blocks. The result of flatness detection is an updated flatness type for the current block.
2 Rate controller’s state shall be updated, which shall result in an updated buffer fullness value,
quantization parameter (QP), and target bit rate (targetRate).
3 Using the updated QP value, all available coding modes shall be tested. In the context of this
Standard, testing refers to the encoder’s simulation of the mode in which the reconstructed
block is calculated.
4 Number of required bits (modeRate) and error (modeDistortion) shall be calculated.
Additionally, the RD cost (modeRdCost) shall be computed from the rate and distortion,
as described in Section 4.9.
5 Encoder mode selection block shall select the best mode, based on several criteria.
(See Section 4.9.)
6 Selected mode shall be encoded.
7 Steps 1 through 6 shall be repeated for the next block.

[Block diagram: for the current block, Flatness Detection feeds Update Rate Control State, which feeds Test All Coding Modes. The tested modes are Transform, Block Prediction, Midpoint Prediction, Midpoint Prediction Fallback (Fallback #1), and Block Prediction Skip (Fallback #2). Each mode reports modeRate/modeDistortion/modeRdCost to Mode Selection, which feeds Encode Selected Mode.]

Figure 4-1: Encoder Overview

4.2 Block Dimensions


The encoder will select a mode from the set of available modes for each block, based on the
trade-off between the rate and distortion of each mode. The time associated with performing all
operations required to encode a single block will be denoted as a block time. The dimension of each
component within a block depends on the source content’s chroma-sampling format, as illustrated
in Figure 4-2.

[Diagram: block sizes of Component 0, Component 1, and Component 2 for the 4:4:4, 4:2:2, and 4:2:0 chroma sampling formats.]

Figure 4-2: Block Component Sizes for Different Chroma Sampling Formats

Table 4-1 defines variables that are used throughout this Standard to describe operations without
having to create special cases for the different chroma sampling formats.

Table 4-1: Component Width and Height Parameters for Different Chroma Sampling Formats
Component Size compSamples compSamplesLog2 compWidthLog2 compHeightLog2
8x2 16 4 3 1
4x2 8 3 2 1
4x1 4 2 2 0

For example, the average of all samples in a component could be written as follows:
X̄ = ( (∑ (i, j) ∈ X X(i, j) ) + (1 << (compSamplesLog2 – 1) ) ) >> compSamplesLog2

Bracket notation is used to obtain the value from Table 4-1 for a specific component. The variable k
is typically used to index between the three color components. For example, k == 0 will index
the red component in RGB, or the luma component in YCbCr/YCoCg. Thus, the three color
components of a block that uses 4:2:2 chroma sampling are determined, as follows:
• compSamples[0] = 16
• compSamples[1] = 8
• compSamples[2] = 8
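
The bracket notation and the rounded average can be sketched as follows. The lookup structure and function names are hypothetical, and the mapping from chroma format to component size is an assumption based on Table 4-1 and the sample counts stated in this Standard.

```python
# Component size -> (compSamples, compSamplesLog2, compWidthLog2, compHeightLog2)
TABLE_4_1 = {
    "8x2": (16, 4, 3, 1),
    "4x2": (8, 3, 2, 1),
    "4x1": (4, 2, 2, 0),
}

# Assumed component sizes for k = 0, 1, 2 in each chroma sampling format.
COMPONENT_SIZES = {
    "4:4:4": ["8x2", "8x2", "8x2"],
    "4:2:2": ["8x2", "4x2", "4x2"],
    "4:2:0": ["8x2", "4x1", "4x1"],
}


def comp_samples(chroma_format: str, k: int) -> int:
    """compSamples[k] for the given chroma sampling format."""
    return TABLE_4_1[COMPONENT_SIZES[chroma_format][k]][0]


def rounded_mean(samples) -> int:
    """Average with rounding, using shifts as in the equation above."""
    log2 = len(samples).bit_length() - 1
    if len(samples) != 1 << log2:
        raise ValueError("sample count must be a power of two")
    return (sum(samples) + (1 << (log2 - 1))) >> log2


sizes_422 = [comp_samples("4:2:2", k) for k in range(3)]
mean_example = rounded_mean([1, 2, 3, 4])
```

Because every component holds a power-of-two number of samples, the division reduces to a right shift by compSamplesLog2.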

4.3 CSC
A lossless color-space conversion (CSC) is used to convert source content in the RGB color space
to YCoCg. The YCoCg representation requires the use of one extra bit of precision for the
chrominance (chroma) components (Co and Cg). The bit representation of the chroma components
shall be signed. Therefore, if the source content is RGB 8bpc, the luminance (luma) component
of YCoCg will be represented by eight unsigned bits, while the Co and Cg components will be
represented by nine signed bits. Table 4-2 lists implementations of forward and inverse transforms.

Table 4-2: Equation-based Implementation of the Color Space Transforms between RGB and YCoCg Color Spaces
Color Space Transform Equations
RGB to YCoCg Co = R – B
temp = B + (Co >> 1)
Cg = G – temp
Y = temp + (Cg >> 1)
YCoCg to RGB temp = Y – (Cg >> 1)
G = Cg + temp
B = temp – (Co >> 1)
R = B + Co
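
The equations in Table 4-2 translate directly into code. The sketch below is illustrative; the function names are hypothetical, and it relies on Python's right shift being an arithmetic (floor) shift for negative integers, which is assumed to match the intended behavior of >> on signed values.

```python
def rgb_to_ycocg(r: int, g: int, b: int):
    """Forward lossless transform from Table 4-2."""
    co = r - b
    temp = b + (co >> 1)
    cg = g - temp
    y = temp + (cg >> 1)
    return y, co, cg


def ycocg_to_rgb(y: int, co: int, cg: int):
    """Inverse transform from Table 4-2."""
    temp = y - (cg >> 1)
    g = cg + temp
    b = temp - (co >> 1)
    r = b + co
    return r, g, b


# Pure red at 8bpc: luma stays within 8 unsigned bits, chroma needs 9 signed bits.
y, co, cg = rgb_to_ycocg(255, 0, 0)
restored = ycocg_to_rgb(y, co, cg)
```

The inverse undoes each forward step in reverse order using the same shifted terms, which is why the round trip is exact despite the truncating shifts.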

In the case of RGB source content, CSC is used by flatness detection as well as by each individual
coding mode. If the source content is YCbCr, this step is skipped, and the codec natively handles
the YCbCr data. The reconstructed frame shall be stored in the same color space as the
source content.

4.4 Flatness Detection


Flatness detection is updated at each block time, based on the previous, current, and next blocks’
complexity. Block complexity is calculated as the sum of the absolute values of the normalized
AC Hadamard transform coefficients, as described in Section 4.4.2. The complexity values are used
to classify the current block into a set of possible flatness types, as listed in Table 4-3. A block that
is not classified as any of the four flatness types shall be denoted as a non-flat block.

Table 4-3: Flatness Types


Flatness Type Description
0 Current block is very flat.
1 Current block is somewhat flat.
2 Current block is a complex-to-flat transition.
3 Current block is a flat-to-complex transition.

The flatness type is signaled explicitly in the bitstream as part of the flatness header. (See
Section 4.10.) For this reason, flatness detection shall not be performed at the decoder side.

4.4.1 Hadamard and Haar Transforms


For RGB source content, source data shall be color-space transformed, and the Hadamard transform
shall be applied in the YCoCg color space. This step shall be skipped for YCbCr source content.
Before the Hadamard transform is applied, shift values shall be pre-calculated, as described in
Table 4-4, based on the source content’s bit depth (bpc). The value hadPrecision = 6 bits. These
shifts shall ensure that the Hadamard coefficients are never more than 16 bits in dynamic range.
The parameter hadShiftA shall be applied during the first pass of the transform, which is the
Hadamard transform applied to the block’s rows. The parameter hadShiftB shall be applied during
the second pass, in which the Haar transform is applied to the block’s columns. The total shift
(hadTotalShift) shall be stored for the normalization step.

Table 4-4: Hadamard Shift Parameters

bpcInput = bpc + 1, if the component is Co or Cg
bpcInput = bpc, otherwise
bpcTemp = hadPrecision + compWidthLog2 + bpcInput
hadShiftA = max (0, bpcTemp – 16)
bpcTemp = min (bpcTemp, 16)
hadShiftB = compHeightLog2 + bpcTemp – 16
hadTotalShift = hadShiftA + hadShiftB
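
The shift derivation in Table 4-4 can be sketched as follows. This is a minimal illustration rather than normative text: the function name and argument names are hypothetical, and it assumes that Co and Cg components use bpc + 1 as their input bit depth while all other components use bpc.

```python
HAD_PRECISION = 6  # hadPrecision, as stated in the text


def hadamard_shifts(bpc, comp_width_log2, comp_height_log2, is_co_or_cg):
    """Return (hadShiftA, hadShiftB, hadTotalShift) per Table 4-4."""
    # Assumption: Co/Cg carry one extra bit after the color-space transform.
    bpc_input = bpc + 1 if is_co_or_cg else bpc
    bpc_temp = HAD_PRECISION + comp_width_log2 + bpc_input
    had_shift_a = max(0, bpc_temp - 16)      # applied in the row pass
    bpc_temp = min(bpc_temp, 16)
    had_shift_b = comp_height_log2 + bpc_temp - 16  # applied in the column pass
    return had_shift_a, had_shift_b, had_shift_a + had_shift_b
```

For example, an 8-bpc luma component of an 8x2 block (width log2 of 3, height log2 of 1) yields shifts of 1 and 1, for a total shift of 2.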

Figure 4-3 illustrates the 8-point forward Hadamard transform. This transform shall be applied
to the rows of each component that has a width of eight samples. This shall be the case for all
components if 4:4:4 chroma subsampling is used. For 4:2:2 and 4:2:0 content, the 8-point transform
shall be applied to the luma component, while the 4-point transform illustrated in Figure 4-4 shall
be applied to the two chroma components.

[Figure 4-3 diagram: 8-point butterfly network with inputs X0 through X7, intermediate
stages S0 through S3 / D0 through D3 and E0 through E3 / F0 through F3, and outputs
T0 through T7. The line symbol of each edge denotes its weight (+1 or -1).]

Figure 4-3: 8-point Forward Hadamard Transform


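[Figure 4-4 diagram: 4-point butterfly network with inputs X0 through X3, intermediate
stage S0, S1, D0, D1, and outputs T0 through T3. The line symbol of each edge denotes
its weight (+1 or -1).]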

Figure 4-4: 4-point Forward Hadamard Transform

The second transform pass is a Haar transform (equivalent to a 2-point forward Hadamard
transform), which is applied to the block’s columns. This step shall be skipped for the chroma
components of 4:2:0 source data. Finally, the Hadamard transform coefficients are normalized,
as described in Table 4-5.

Table 4-5: Normalization of Hadamard Transform Coefficients


shift = hadPrecision + 2 – hadTotalShift
HAD(i, j) = (HAD(i, j) + (1 << (shift – 1) ) ) >> shift

4.4.2 Complexity
The complexity value of each block is calculated from the sum of the absolute value of the
normalized Hadamard transform coefficients. This sum shall exclude the DC coefficient
(HAD(0, 0)). If the input bit depth is greater than 8bpc, an additional scaling is performed.
(See Table 4-6.)

Table 4-6: Complexity Measure for Flatness Detection Calculation


if (bpc == 8) {
absHadCoeffs = ∑i, j ∈ (X – X (0, 0) ) | HAD(i, j) |
} else {
shift = bpc – 8
absHadCoeffs = ∑i, j ∈ (X – X (0, 0) ) | (HAD(i, j) + (1 << (shift – 1) ) ) >> shift |
}
temp = (absHadCoeffs + (1 << (compSamplesLog2 – 1) ) ) >> compSamplesLog2
complexity = { (temp + 1) >> 1,   Co or Cg component
             { temp,              otherwise
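
The complexity measure of Table 4-6 can be sketched as follows. This is an illustrative reading, not a reference implementation: the function and argument names are hypothetical, and the sketch assumes that the rounding shift for bit depths above 8 bpc is applied to each coefficient before the absolute value is taken. Python's `>>` on a negative integer rounds toward negative infinity, which may differ from a given hardware implementation.

```python
def block_complexity(had, bpc, comp_samples_log2, is_co_or_cg):
    """Compute the complexity of one component.

    had: 2-D list of normalized coefficients HAD(i, j); had[0][0] is DC.
    """
    shift = bpc - 8
    total = 0
    for i, row in enumerate(had):
        for j, coeff in enumerate(row):
            if i == 0 and j == 0:
                continue  # the DC coefficient is excluded from the sum
            if shift > 0:
                # Assumed order: round and scale first, then take abs().
                coeff = (coeff + (1 << (shift - 1))) >> shift
            total += abs(coeff)
    temp = (total + (1 << (comp_samples_log2 - 1))) >> comp_samples_log2
    return (temp + 1) >> 1 if is_co_or_cg else temp
```

With fifteen AC coefficients of value 16 in an 8x2 component at 8 bpc, the sum is 240 and the resulting complexity is 15 (or 8 for a Co or Cg component).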


4.4.3 Classification
The current block’s flatness is classified into different types, based on the previous, current,
and next blocks’ complexity values. These are denoted as complexityPrev, complexityCur,
complexityNext. The classification is performed as follows:
1 isPrevBlockComplex and isNextBlockFlat values are calculated, as described in Table 4-7
and Table 4-8. Parameter maxLineComplexity represents the maximum complexity of
a block within the current blockline, which is updated during each block time, using the
following logic:
maxLineComplexity = { complexityCur,                             first block in blockline
                    { max (maxLineComplexity, complexityCur),    otherwise

The maxLineComplexity calculation is done such that the block complexity thresholds can
adapt to changing content throughout a slice.

Table 4-7: Flatness Detection Previous Block Complexity Calculation [a]

if (first block in blockline) {
isPrevBlockComplex = { complexityPrev > 90,   complexityCur ≤ 50
                     { false,                 otherwise
} else if (last block in blockline) {
isPrevBlockComplex = (complexityCur ≤ 50)
} else {
if (chroma format is 444 or 422) {
threshold = (maxLineComplexity >> 1) + (maxLineComplexity >> 3)
isPrevBlockComplex = { true,                         complexityNext ≤ 6
                     { complexityPrev > threshold,   complexityNext ≤ 25
                     { false,                        otherwise
} else if (chroma format is 420) {
isPrevBlockComplex = { true,    complexityNext ≤ 20 && complexityPrev > 40
                     { false,   otherwise
}
}
a. complexityPrev is 0 for the first block within a slice.

Table 4-8: Flatness Detection Next Block Flatness Calculation


if (! isEdgeColumn) {
conditionA = (complexityCur – complexityNext) > (maxLineComplexity >> 4)
conditionB = complexityNext < ( (maxLineComplexity >> 2) + (maxLineComplexity >> 3) )
isNextBlockFlat = { true,    conditionA && conditionB
                  { false,   otherwise
}


2 Per-block flatness classification is performed, as described in Table 4-9. Note that
isEdgeColumn is true for any block within the first and last column of a given slice.
isFlatToComplex is a three-element Boolean array that is shifted by one position during
each block time, where the three elements represent the previous-previous, previous, and
current block time flat-to-complex indicators, respectively. isFlatToComplex is implemented
such that a one-block delay can be observed between the determination of a flat-to-complex
transition and the associated QP update.
The flatness type assigned to the current block is the flatType from Table 4-9 that has a true
Boolean value and the lowest index. For example, if flatType0 is true, and flatType2 is also
true, flatType0 will be the final flatness type. If all four flatType are false, the current block
will be classified as a non-flat block.

Table 4-9: Flatness Detection Classification Rules


flatType0 = (complexityCur ≤ 1)
flatType1 = (complexityCur > 1) && (complexityCur ≤ 3)
if (isEdgeColumn) {
flatType2 = isPrevBlockComplex
isFlatToComplex[2] = false
} else {
flatType2 = isPrevBlockComplex && isNextBlockFlat
conditionA = (complexityPrev ≤ 3) && (complexityNext ≥ 50)
conditionB = (! isFlatToComplex[0]) && (! isFlatToComplex[1])
isFlatToComplex[2] = conditionA && conditionB
}
flatType3 = isFlatToComplex[1]
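
The two classification steps (Table 4-7 through Table 4-9) can be sketched together as follows. This is a hedged illustration: all function and argument names are hypothetical, the chroma format is passed as a string, and the sketch assumes that the isFlatToComplex array is shifted at the start of each block time, before the new element is written.

```python
def is_prev_block_complex(prev, cur, nxt, max_line, first, last, chroma):
    """Table 4-7: decide whether the previous block counts as complex."""
    if first:
        return prev > 90 if cur <= 50 else False
    if last:
        return cur <= 50
    if chroma in ("444", "422"):
        threshold = (max_line >> 1) + (max_line >> 3)
        if nxt <= 6:
            return True
        if nxt <= 25:
            return prev > threshold
        return False
    return nxt <= 20 and prev > 40  # 4:2:0


def is_next_block_flat(cur, nxt, max_line):
    """Table 4-8 (only evaluated for non-edge columns)."""
    cond_a = (cur - nxt) > (max_line >> 4)
    cond_b = nxt < ((max_line >> 2) + (max_line >> 3))
    return cond_a and cond_b


def classify(prev, cur, nxt, max_line, first, last, chroma, is_edge, f2c):
    """Table 4-9. f2c is the 3-element isFlatToComplex list, updated in place.

    Returns the flatness type (0..3), or None for a non-flat block.
    """
    f2c[0], f2c[1] = f2c[1], f2c[2]  # assumed: shift before the update
    prev_complex = is_prev_block_complex(prev, cur, nxt, max_line,
                                         first, last, chroma)
    flat = [cur <= 1, 1 < cur <= 3, False, False]
    if is_edge:
        flat[2] = prev_complex
        f2c[2] = False
    else:
        flat[2] = prev_complex and is_next_block_flat(cur, nxt, max_line)
        cond_a = prev <= 3 and nxt >= 50
        cond_b = not f2c[0] and not f2c[1]
        f2c[2] = cond_a and cond_b
    flat[3] = f2c[1]
    for flat_type, flag in enumerate(flat):
        if flag:
            return flat_type  # the lowest true index wins
    return None
```

Under the assumed shift order, a flat-to-complex transition detected in one block time is reported as flatness type 3 in the following block time, which matches the one-block delay described above.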


4.5 RC
The rate control algorithm shall perform a sequence of operations for each block time
to ensure the following:
• Number of bits that are used to code a slice is bounded, as follows:
• sliceBits <= (8 × chunk_size × slice_height)
• Rate buffer does not underflow or overflow
• At least one coding mode is available for mode selection, for each block
• Number of bits remaining in the rate buffer at the end of a slice is less than a specified threshold

This Standard discusses the ideal rate buffer. A hardware implementation may require an increased
rate buffer size, however, for proper operation.
At the beginning of each block time, the rate control algorithm shall update several quantities that
will be used by the remainder of operations during that block time, as follows:
1 Rate buffer fullness (bufferFullness, rcFullness) is updated, based on the number of bits that
are used to code the previous block, as well as the number of bits transmitted to the bitstream.
(For further details, see Section 4.5.1.)
2 targetRate for the current block time is calculated, as described in Section 4.5.2.
3 QP value (qp) is updated. (See Section 4.5.3.) This QP will be used by each of the coding
modes during the mode testing phase.

After all modes have been tested, and an RD cost has been calculated for each mode, the encoder
shall select a mode from the available coding modes that minimizes the RD cost while enforcing
proper rate buffer operation. Any mode that violates the rate buffer constraints will be disabled
for the current block time. Finally, the mode selected by rate control shall be encoded, and the
corresponding rate (modeRate) shall be used to update the buffer fullness, target rate, and QP for
the next block time.
Figure 4-5 illustrates the loop that is used for updating the buffer fullness, target rate, and QP.
This loop runs once per block time. The flatness detection algorithm (see Section 4.4)
provides a flatnessType to the QP update logic.

[Figure 4-5 diagram: blocks Update Rate Buffer Fullness, Update Target Rate, Update QP,
Test All Modes, Encoder Mode Selection, and Encode Best Mode, connected in a loop.
Signals shown: bufferFullness, rcFullness, targetRate, flatnessType, qp, and modeRate.]

Figure 4-5: RC Loop


4.5.1 Rate BF
The rate controller shall maintain two measures of rate buffer fullness (BF), as described
in Table 4-10.

Table 4-10: Rate BF Measures


Rate BF Measure | Description | Described In
bufferFullness | Number of bits present within the rate buffer. Shall always be within the range [0, rc_buffer_max_size]. | Section 4.5.1.1
rcFullness | Abstracted version of the buffer fullness that considers the current block’s position within the slice. Used to ensure that bufferFullness is less than rc_buffer_init_size at the end of a slice. A complementary offset shall be used at the beginning of a slice to ensure that a rate buffer overflow does not occur during the initial transmission delay. | Section 4.5.1.2

4.5.1.1 BF
The physical rate buffer’s fullness (bufferFullness) shall be updated at the beginning of each block
time. First, bufferFullness shall be incremented by the number of bits that are used to code the
previous block (modeRate). Next, bits shall be removed from the rate buffer for any block time
after the initial transmission delay, which is determined by PPS parameter rc_init_tx_delay
(measured in block times).
After the initial transmission delay, avgBlockBits bits shall be removed from the rate buffer,
and then placed into the bitstream for each block time. The parameter avgBlockBits gives a block’s
average bit rate, which is calculated from PPS parameter bits_per_pixel, as follows:
avgBlockBits = bits_per_pixel

Because PPS parameter bits_per_pixel is represented with four fractional bits of precision
(that is, its stored value is 16 times the bit rate in bits per pixel), and there are 16 pixels
in an 8x2 block, avgBlockBits shall always have an integer value.
For the last block time during which a specific chunk is being filled, additional bits may be
removed from the rate buffer and placed into the chunk to byte-align the chunk. The total
number of alignment bits for one blockline (two chunks) shall be calculated, as follows:
blChunkAdjBits = (16 × chunk_size) –
( ( (slice_width << 1) × bits_per_pixel) >> bppFractionalBits)

where blChunkAdjBits has a minimum and maximum value of 0 and 15 bits, respectively.
For an individual chunk, the total adjustment bits will be calculated, as follows:
chunkAdjBits = { (blChunkAdjBits + 1) >> 1,   even chunk (0, 2, 4, …)
               { blChunkAdjBits >> 1,         odd chunk (1, 3, 5, …)

Therefore, chunkAdjBits has a maximum value of eight bits (chunkAdjBitsMax = 8).
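
The alignment-bit arithmetic above can be sketched as follows. This is a minimal illustration with hypothetical function names; it assumes that chunk_size is expressed in bytes, that bppFractionalBits is 4 (matching the four fractional bits of bits_per_pixel), and the slice parameters in the usage note are example values only.

```python
BPP_FRACTIONAL_BITS = 4  # assumed value of bppFractionalBits


def blockline_chunk_adj_bits(chunk_size, slice_width, bits_per_pixel):
    """Alignment bits for one blockline (two chunks).

    chunk_size is in bytes; bits_per_pixel is the raw PPS value
    (four fractional bits).
    """
    payload = ((slice_width << 1) * bits_per_pixel) >> BPP_FRACTIONAL_BITS
    return 16 * chunk_size - payload


def chunk_adj_bits(bl_chunk_adj_bits, chunk_index):
    """Split the blockline total between the even and odd chunk."""
    if chunk_index % 2 == 0:
        return (bl_chunk_adj_bits + 1) >> 1
    return bl_chunk_adj_bits >> 1
```

For a hypothetical slice that is 90 pixels wide at 6.5 bits per pixel (bits_per_pixel = 104) with a 74-byte chunk, the blockline total is 14 bits, split as 7 and 7. A total of 15 bits would be split as 8 for the even chunk and 7 for the odd chunk.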


4.5.1.1.1 Underflow Prevention


The encoder shall set a flag to activate a rate buffer underflow prevention mechanism for any block
time in which underflow is possible, as follows:
underflowPrevention = { true,    bufferFullness < (2 × avgBlockBits + chunkAdjBitsMax)
                      { false,   otherwise

where:
• chunkAdjBitsMax = (blChunkAdjBits + 1) >> 1
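
The flag computation can be sketched as follows. The function name is hypothetical, and the numeric values in the usage note are example inputs rather than values taken from this Standard.

```python
def underflow_prevention(buffer_fullness, avg_block_bits, bl_chunk_adj_bits):
    """Return True when the underflow prevention mechanism is active."""
    chunk_adj_bits_max = (bl_chunk_adj_bits + 1) >> 1
    return buffer_fullness < 2 * avg_block_bits + chunk_adj_bits_max
```

With avgBlockBits = 96 and blChunkAdjBits = 14, the threshold is 199 bits, so a buffer holding 198 bits triggers the flag while one holding 199 bits does not.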

If the underflowPrevention flag is set, the two mechanisms described in Table 4-11 shall be used
to ensure that modeRate is strictly greater than avgBlockBits for the current block time for all
non-fallback modes, which is sufficient to ensure that at least one coding mode is available to the
encoder mode-selection algorithm. Note that fallback modes shall not be used in this situation
because the fallback mode rates will fall below avgBlockBits.

Table 4-11: Mechanisms that Ensure modeRate Is Strictly Greater than avgBlockBits when underflowPrevention Flag Is Set

Mode | Description
Transform and BP | Fixed-size stuffing bits (with value 0) shall be inserted into the ECG syntax to increase the modeRate. (See Section 4.7.7.)
MPP [a] | QP shall be set equal to flatness_qp_very_flat_fbls. This is done to ensure that the MPP mode rate is sufficient because fewer samples are present in the chroma components for 4:2:2 and 4:2:0 content.
a. 4:2:2 and 4:2:0 chroma-sampled content.

4.5.1.1.2 Overflow Prevention


Rate buffer overflow prevention is enforced through use of the mode selection algorithm. (See
Section 4.9.) A mode will be disallowed for the current block time if its selection would cause
a bufferFullness overflow. The presence of fallback modes (MPPF and BP-SKIP) helps to avoid
overflow because these modes have rates that are strictly less than avgBlockBits. That is, in the
worst case, at least one fallback mode will always be available to encoder mode selection that will
cause a decrease in bufferFullness (thus avoiding an overflow).


4.5.1.2 RC Fullness
The rate controller maintains a second representation of the rate buffer fullness (rcFullness) to
enforce a maximum fullness of the encoder rate buffer at the end of a slice. rcFullness ensures that
an overflow will not occur during the initial transmission delay of the following slice. rcFullness is
represented as an unsigned 16-bit integer, and shall be updated at the beginning of each block time,
using the updated bufferFullness value. Table 4-12 lists the constants that shall be used for
calculating rcFullness.

Table 4-12: Constants Used for Calculating rcFullness


Constant Description
rcFullnessScaleApproxBits = 4 Additional bits of precision.
rcFullnessRangeBits = 16 Dynamic range of rcFullness.
rcOffsetBits = 16 Additional bits of precision that are used in tracking rcOffset.

The rcFullness calculation involves a scale factor rc_fullness_scale, which is pre-calculated and
stored in the PPS, as follows:
rc_fullness_scale = ⌊(1 << (rcFullnessScaleApproxBits + rcFullnessRangeBits) ) / rc_buffer_max_size⌋
The rc_fullness_scale value shall be represented by eight unsigned bits.
The rcFullness calculation includes two offset parameters (rcOffsetInit and rcOffset), which depend
on the current block’s position within the slice.
Offset parameter rcOffsetInit is used for blocks that occur during the initial transmission delay at
the beginning of a slice. This is to compensate for the fact that during the initial transmission delay,
bits will enter the rate buffer but will not be transmitted to the bitstream. The initial rcOffsetInit
value is calculated, as follows:
rcOffsetInit = rc_init_tx_delay × avgBlockBits


The rcOffsetInit value shall decrease by avgBlockBits for each block time, until rcOffsetInit is 0.
At this point, rcOffsetInit shall be 0 for all remaining blocks within the slice. This behavior is
illustrated in Figure 4-6.

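[Figure 4-6 diagram: Buffer Range (vertical axis) versus Block Time (horizontal axis,
with T0, Tx, and TN marked). Labels: rcOffsetInit, rc_buffer_init_size,
Effective Buffer Range, Offset.]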

Figure 4-6: rcOffsetInit as a Function of Block Time within a Slice

where:
• T0 is the first block time within the slice
• TN is the last block time within the slice
• Tx is the block time at the end of the initial transmission delay


Offset parameter rcOffset is an increasing positive offset that is applied throughout the slice, as
illustrated in Figure 4-7 and described in Table 4-13. rcOffset is used to ramp down the effective
rate buffer size so that the rate buffer fullness at the end of the slice is less than rc_buffer_init_size,
which is necessary to ensure correct rate buffer behavior from one slice to the next. The rcOffset
value is 0 for all block times within a slice for which the block index is less than a threshold
rcOffsetStart (indicated as Ty in Figure 4-7). rcOffsetStart is derived from PPS parameter
rc_fullness_offset_threshold, which is the number of blocklines over which rcOffset will ramp up.
The relationship is determined, as follows:
rcOffsetStart = blocksPerSlice – (rc_fullness_offset_threshold × blocksPerBlockLine)

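[Figure 4-7 diagram: rcOffset (vertical axis, up to rc_buffer_max_size – rc_buffer_init_size)
versus Block Time (horizontal axis, with T0, Ty, and TN marked). Labels:
Effective Buffer Range, Offset.]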

Figure 4-7: rcOffset as a Function of Block Time within a Slice

where:
• T0 is the first block time within the slice
• TN is the last block time within the slice
• Ty is the block time in which rcOffset starts to ramp up from 0, which is controlled
by PPS parameter rc_fullness_offset_threshold

Table 4-13: rcOffset BF Calculation


if (blockIndex ≥ rcOffsetStart) {
bufFracBitsAccum += (rc_fullness_offset_slope & 0xFFFF)
rcOffset = (rc_fullness_offset_slope >> rcOffsetBits) + (bufFracBitsAccum >> rcOffsetBits)
bufFracBitsAccum &= 0xFFFF
if (isLastBlock && bufFracBitsAccum ≥ (1 << (rcOffsetBits – 1) ) ) {
rcOffset += 1
}
}


Because rc_fullness_offset_threshold is measured in blocklines, the following bounds represent
the valid range:
rc_fullness_offset_threshold ∈ [3, (slice_height / 2) – 1]

The rate at which rcOffset increases is also pre-calculated and stored in the PPS (using 16 unsigned
bits), as follows:
rc_fullness_offset_slope = ⌊( (rc_buffer_max_size – rc_buffer_init_size) << rcOffsetBits) / (rc_fullness_offset_threshold × blocksPerBlockLine)⌋
rcFullness is then calculated from bufferFullness, rcOffset, and rcOffsetInit, as described
in Table 4-14. This is done once per block time, as illustrated earlier in Figure 4-5.

Table 4-14: BF Calculation


temp = rc_fullness_scale × (bufferFullness + rcOffset + rcOffsetInit)
rcFullness = clip (0, (1 << rcFullnessRangeBits) – 1, temp >> rcFullnessScaleApproxBits)
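
The scale factor and the Table 4-14 calculation can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the division that produces rc_fullness_scale is taken to round down, and the buffer size of 8192 bits in the usage note is an example value only.

```python
RC_FULLNESS_SCALE_APPROX_BITS = 4
RC_FULLNESS_RANGE_BITS = 16


def rc_fullness_scale(rc_buffer_max_size):
    """Pre-calculated PPS scale factor (floor division assumed)."""
    total_bits = RC_FULLNESS_SCALE_APPROX_BITS + RC_FULLNESS_RANGE_BITS
    return (1 << total_bits) // rc_buffer_max_size


def rc_fullness(scale, buffer_fullness, rc_offset, rc_offset_init):
    """Map the offset buffer fullness onto the 16-bit rcFullness range."""
    temp = scale * (buffer_fullness + rc_offset + rc_offset_init)
    upper = (1 << RC_FULLNESS_RANGE_BITS) - 1
    return max(0, min(upper, temp >> RC_FULLNESS_SCALE_APPROX_BITS))
```

With an 8192-bit buffer the scale factor is 128, a half-full buffer with no offsets maps to 32768, and a completely full buffer is clipped to 65535.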

4.5.2 Target Bit Rate


The rate control algorithm shall calculate the target bit rate (targetRate) at the beginning of each
block time. targetRate is calculated based on the number of bits and pixels that remain within the
slice. At the beginning of a slice, these two values shall be as follows:
B0 = (8 × slice_height × chunk_size)
P0 = (slice_num_px)
where:
• B is the number of bits that remain within the slice
• P is the number of pixels that remain within the slice

targetRate is calculated as the sum of three parameters:


targetRate = targetRateBase + targetRateDelta + targetRateDeltaFbls

Parameter targetRateBase approximates the average bit rate for all remaining blocks within a slice,
and is calculated from the number of bits and pixels that remain within the slice at the current
block time. The rate controller maintains a scaling factor targetRateScale, which is updated
per block time, as described in Table 4-15.

Table 4-15: Update Target Rate Scaling Factor and Threshold


if (P < targetRateThreshold) {
targetRateScale –= 1
targetRateThreshold = 1 << (targetRateScale – 1)
}


The targetRateScale value for the first block time within a slice is equal to PPS parameter
rc_target_rate_scale, which is calculated as follows:

rc_target_rate_scale = ⌊log2 (P0)⌋ + 1


where:
• P0 is the total number of pixels within the slice

Additionally, the rate controller maintains a threshold targetRateThreshold to determine when
targetRateScale shall be updated. This threshold is applied as a lower bound to the number of pixels
that remain within the slice. When the number of pixels that remain within the slice is smaller than
targetRateThreshold, both targetRateScale and targetRateThreshold shall be re-calculated. This
logic is described in Table 4-15. A slice’s initial targetRateThreshold value is stored in PPS
parameter rc_target_rate_threshold, which is calculated as follows:
rc_target_rate_threshold = 1 << (rc_target_rate_scale – 1)
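
The initialization and per-block update of the scaling factor and threshold can be sketched as follows. The function names are hypothetical, and the sketch assumes that the logarithm in rc_target_rate_scale is rounded down, so that the scale equals the number of bits needed to represent P0.

```python
def initial_target_rate_params(slice_num_px):
    """Return (rc_target_rate_scale, rc_target_rate_threshold) for a slice."""
    # int.bit_length() equals floor(log2(n)) + 1 for any positive integer n.
    scale = slice_num_px.bit_length()
    threshold = 1 << (scale - 1)
    return scale, threshold


def update_target_rate_params(pixels_left, scale, threshold):
    """Table 4-15: lower the scale once the remaining pixels drop below the threshold."""
    if pixels_left < threshold:
        scale -= 1
        threshold = 1 << (scale - 1)
    return scale, threshold
```

For a hypothetical slice of 1000 pixels, the initial scale is 10 and the initial threshold is 512. When 511 pixels remain, the scale drops to 9 and the threshold to 256.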

After targetRateThreshold and targetRateScale are updated, targetRateBase shall be
calculated as indicated in Table 4-16. Table 4-17 lists the constants that shall be used
for calculating targetRateBase.

Table 4-16: targetRateBase Target Rate Calculation


p = min (63, ( (P << targetRateBaseBits) + (1 << (targetRateScale – 1) ) ) >> targetRateScale)
a = (targetRateScale + targetRateBaseBits)
targetRateBase = (16 × B × targetRateInverseLut[p – 32] + (1 << (a – 1) ) ) >> a

Table 4-17: Constants Used for Calculating targetRateBase


Constant Description
targetRateBaseBits = 6 Additional bits of precision that are used in calculating targetRateBase.
targetRateLutBits = 4 Bits that are used to address target_rate_delta_lut, which has 16 entries.
targetRateInverseLut Array of 32, 8-bit values that are used to store an approximation of 1/x
for x ∈ [0.5, 1].
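
The targetRateBase calculation of Table 4-16 can be sketched as follows. The contents of targetRateInverseLut are not listed in this section, so the sketch substitutes a hypothetical stand-in table built as round(4096 / (32 + i)), which is one way to approximate 1/x in 8 bits; it should not be read as the normative table. The function name and the slice values in the usage note are likewise assumptions.

```python
TARGET_RATE_BASE_BITS = 6

# Hypothetical stand-in for targetRateInverseLut (not the normative values):
# entry i approximates 4096 / (32 + i), which fits in 8 bits (128 down to 65).
INVERSE_LUT = [(4096 + (32 + i) // 2) // (32 + i) for i in range(32)]


def target_rate_base(bits_left, pixels_left, scale):
    """Approximate 16 * bits_left / pixels_left without a divider."""
    rounding = 1 << (scale - 1)
    p = min(63, ((pixels_left << TARGET_RATE_BASE_BITS) + rounding) >> scale)
    a = scale + TARGET_RATE_BASE_BITS
    return (16 * bits_left * INVERSE_LUT[p - 32] + (1 << (a - 1))) >> a
```

With 768 pixels and 4608 bits remaining (6 bits per pixel) at a scale of 10, the sketch returns 96 bits per block. With 1000 pixels and 6000 bits remaining it returns 95, which shows the approximation error introduced by the 8-bit table.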

The offset targetRateDelta is calculated as a function of rcFullness, using look-up table
PPS parameter target_rate_delta_lut, as described in Table 4-18.

Table 4-18: targetRateDelta Target Rate Calculation


shift = rcFullnessRangeBits – targetRateLutBits
clipMax = (1<< targetRateLutBits) – 1
index = clip (0, clipMax, (rcFullness + (1 << (shift – 1) ) ) >> shift)
targetRateDelta = target_rate_delta_lut[index]
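
The look-up of Table 4-18 can be sketched as follows. The function name is hypothetical, and the table passed in the usage note is a placeholder, since the contents of target_rate_delta_lut come from the PPS.

```python
RC_FULLNESS_RANGE_BITS = 16
TARGET_RATE_LUT_BITS = 4


def target_rate_delta(rc_fullness, target_rate_delta_lut):
    """Select the target rate offset from the rounded top 4 bits of rcFullness."""
    shift = RC_FULLNESS_RANGE_BITS - TARGET_RATE_LUT_BITS
    clip_max = (1 << TARGET_RATE_LUT_BITS) - 1
    index = (rc_fullness + (1 << (shift - 1))) >> shift
    index = max(0, min(clip_max, index))
    return target_rate_delta_lut[index]
```

Passing `list(range(16))` as a placeholder table exposes the selected index directly: an rcFullness of 32768 selects entry 8, and 65535 is clipped to entry 15.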


Offset targetRateDeltaFbls applies to blocks within the first blockline of a slice and is calculated,
as follows:
targetRateDeltaFbls = { 16 × rc_target_rate_extra_fbls,   FBLS block
                      { 0,                                NFBLS block

4.5.3 QP Update
The QP value shall be updated at the beginning of each block time, after targetRate is updated.
QP is used in the testing phase for all coding modes. Figure 4-8 illustrates the QP update logic.

[Figure 4-8 diagram: blocks RC Mode Classification, Delta QP LUT Based on rcFullness,
Calculate diffBits, Calculate qpIndex, an adder, Clip, and Flatness QP Adjustment.
Signals shown: rcFullness, qpUpdateMode, prevBlockRate, targetRate, diffBits, qpIndex,
deltaQp, prevQp, tempQp, minQp, maxQpLut, flatnessType, and finalQp.]

Figure 4-8: RC QP Update Logic

The QP update logic executes, as follows:


1 QP shall be updated based on the difference between the rate that was used for coding the
previous block and the current block’s target rate, as follows:
diffBits = prevBlockRate – targetRate
2 diffBits value shall then map to a qpIndex value, using the mapping illustrated in Figure 4-9.
For example, if diffBits is equal to +40 and the chroma format is 4:4:4, qpIndex is equal to 2.

abs(diffBits) thresholds at which qpIndex steps to the next value:

diffBits ≥ 0 (qpIndex 0 through 5)
Chroma Format   0→1   1→2   2→3   3→4   4→5
4:4:4           10    29    50    60    70
4:2:2            9    26    45    54    63
4:2:0            8    23    40    48    55

diffBits < 0 (qpIndex 0 through 4)
Chroma Format   0→1   1→2   2→3   3→4
4:4:4           10    20    35    65
4:2:2            9    18    31    58
4:2:0            8    16    28    51

Figure 4-9: QP Update Mapping from diffBits to qpIndex


3 Current rate buffer fullness state will be mapped to a qpUpdateMode value, using the mapping
illustrated in Figure 4-10.

rcFullness range      qpUpdateMode
0 to 7864             4
7864 to 15729         3
15729 to 49807        0
49807 to 57672        1
57672 to 65535        2

Figure 4-10: QP Update Mapping from rcFullness to qpUpdateMode

4 Parameter deltaQp is accessed from a pre-defined constant LUT that is addressed
by qpUpdateMode and qpIndex, as follows:
deltaQp = { qpIncrementTable[qpUpdateMode][qpIndex],    diffBits ≥ 0
          { -qpDecrementTable[qpUpdateMode][qpIndex],   diffBits < 0

5 QP for the current block is calculated as the sum of the previous block time QP (prevQp)
and deltaQp, clipped between a minimum and maximum QP value, as follows:
qp = clip (minQp + minQpOffset, maxQp, prevQp + deltaQp)
where:
• minQp is the minimum allowable QP, which is calculated as a function of bit depth,
as follows:
• 8-bpc source content – minQp is 16
• 10-bpc source content – minQp is 0
• 12-bpc source content – minQp is -16
• minQpOffset – See Table 4-19
• maxQp – See Table 4-20
• This maxQp value overrides the global maximum QP value of 72;
thus, it can also be considered to be an offset from the global maximum QP

Table 4-19: minQpOffset Calculation for RC QP Update


minQpOffset = 8
if ( (bufferFullness ≤ (rc_buffer_init_size >> 2) ) || (rcFullness < 9830) ) {
minQpOffset = 0
}

Table 4-20: maxQp Calculation for RC QP Update


maxQp = max_qp_lut[rcFullness >> 13]
if (rcFullness > 62259) {
maxQp = max_qp_lut[7] + 4
}


6 QP will be further updated, based on the current block’s flatness classification, as described
in Table 4-21 through Table 4-24. (For further details regarding flatness detection, see
Section 4.4.) The set of flatness types and the effect of QP update for each are discussed in
the sub-sections that follow. Note that this is the final QP value as maintained by the rate
controller. The quantizer used by each coding mode shall operate on a modified QP value,
which is derived from this rate control QP, while also factoring in the bit depth. (See
Section 4.8 for further details.)

Table 4-21: QP Update for Flatness Type 0 (Very Flat)


qpTarget = { flatness_qp_very_flat_fbls,    X ∈ FLS
           { flatness_qp_very_flat_nfbls,   otherwise
qp = min (qp, qpTarget)

Table 4-22: QP Update for Flatness Type 1 (Somewhat Flat)


qpTarget = { flatness_qp_somewhat_flat_fbls,    X ∈ FLS
           { flatness_qp_somewhat_flat_nfbls,   otherwise
qp = min (qp, qpTarget)

Table 4-23: QP Update for Flatness Type 2 (Complex-to-flat)


qpTarget = flatness_qp_lut [rcFullness >> (rcFullnessRangeBits – flatnessQpLutBits) ]
qp = min (qp, qpTarget)

Table 4-24: QP Update for Flatness Type 3 (Flat-to-complex)


qpTarget = flatness_qp_lut [rcFullness >> (rcFullnessRangeBits – flatnessQpLutBits) ]
qp = max (qp, qpTarget)
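
The six QP update steps can be sketched end to end as follows. This is a hedged illustration: all function names are hypothetical; the contents of qpIncrementTable, qpDecrementTable, and max_qp_lut are not listed in this section and are therefore passed in as arguments; and the sketch assumes that each threshold in Figure 4-9 and Figure 4-10 is the inclusive lower bound of the next range, which the figures do not state explicitly.

```python
import bisect

# Thresholds transcribed from Figure 4-9 (keyed by chroma format).
POS_THRESHOLDS = {"444": [10, 29, 50, 60, 70],
                  "422": [9, 26, 45, 54, 63],
                  "420": [8, 23, 40, 48, 55]}
NEG_THRESHOLDS = {"444": [10, 20, 35, 65],
                  "422": [9, 18, 31, 58],
                  "420": [8, 16, 28, 51]}
# Boundaries and modes transcribed from Figure 4-10.
MODE_BOUNDS = [7864, 15729, 49807, 57672]
MODES = [4, 3, 0, 1, 2]


def qp_index(diff_bits, chroma):
    """Step 2: map diffBits to qpIndex (boundary inclusivity is assumed)."""
    table = POS_THRESHOLDS if diff_bits >= 0 else NEG_THRESHOLDS
    return bisect.bisect_right(table[chroma], abs(diff_bits))


def qp_update_mode(rc_fullness):
    """Step 3: map rcFullness to qpUpdateMode."""
    return MODES[bisect.bisect_right(MODE_BOUNDS, rc_fullness)]


def rc_qp(prev_qp, prev_block_rate, target_rate, rc_fullness, buffer_fullness,
          rc_buffer_init_size, min_qp, chroma, inc_table, dec_table, max_qp_lut):
    """Steps 1 through 5: rate-control QP before the flatness adjustment."""
    diff_bits = prev_block_rate - target_rate
    idx = qp_index(diff_bits, chroma)
    mode = qp_update_mode(rc_fullness)
    delta = inc_table[mode][idx] if diff_bits >= 0 else -dec_table[mode][idx]
    # Table 4-19
    low_buffer = buffer_fullness <= (rc_buffer_init_size >> 2)
    min_qp_offset = 0 if (low_buffer or rc_fullness < 9830) else 8
    # Table 4-20
    if rc_fullness > 62259:
        max_qp = max_qp_lut[7] + 4
    else:
        max_qp = max_qp_lut[rc_fullness >> 13]
    return max(min_qp + min_qp_offset, min(max_qp, prev_qp + delta))


def flatness_adjust(qp, flat_type, qp_target):
    """Step 6: Tables 4-21 through 4-24; qp_target is chosen by the caller."""
    if flat_type in (0, 1, 2):
        return min(qp, qp_target)
    if flat_type == 3:
        return max(qp, qp_target)
    return qp  # non-flat block
```

The worked example from step 2 holds under this reading: `qp_index(40, "444")` returns 2.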


4.5.4 RD Cost Calculation


The RD cost value (modeRdCost) is calculated as a function of the rate (modeRate) and distortion
(modeDistortion), as described in Table 4-25. The two Lagrangian parameters (lambdaBitrate and
lambdaFullness) are described in the following sub-sections.

Table 4-25: modeRdCost Calculation


modeRdCost = modeDistortion + ( (modeRate × lambdaBitrate × lambdaFullness) >> 13)
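
The cost function of Table 4-25 can be sketched as follows. The function name is hypothetical, and the lambda values in the usage note are example inputs rather than values from the encoder's look-up tables.

```python
def mode_rd_cost(mode_distortion, mode_rate, lambda_bitrate, lambda_fullness):
    """Combine distortion and rate; the product of lambdas is scaled by 2**-13."""
    return mode_distortion + ((mode_rate * lambda_bitrate * lambda_fullness) >> 13)
```

For example, a distortion of 100 with a rate of 96 bits, lambdaBitrate = 64, and lambdaFullness = 128 gives a rate term of 96 and an RD cost of 196.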

4.5.4.1 BF Lambda – Encoder Only

lambdaFullness is calculated once at the beginning of each block time, using the updated
rcFullness value. Because the RD cost calculation is needed only for encoder mode selection,
the lambdaFullness calculation will not be performed during decoding.
The pre-defined array lambdaFullnessLut is stored by the encoder and has 16 entries. Interpolation
is used to approximate a 6-bit look-up from the stored 4-bit LUT, as described in Table 4-26.
The first step reduces the 16-bit scale of rcFullness down to six bits.

Table 4-26: lambdaFullness Encoder Calculation


idx = clip (0, 63, (rcFullness + 512) >> 10)
idxMod = idx & 0x03
a = lambdaFullnessLut [clip (0, 15, (idx >> 2) ) ]
b = lambdaFullnessLut [clip (0, 15, (idx >> 2) + 1) ]
lambdaFullness = ( (4 – idxMod) × a + idxMod × b + 2) >> 2
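
The interpolation of Table 4-26 can be sketched as follows. The contents of lambdaFullnessLut are not listed in this section, so the table in the usage note is a hypothetical ramp used only to make the interpolation visible; the function name is also an assumption.

```python
def lambda_fullness(rc_fullness, lut):
    """Interpolate a 16-entry table to approximate a 64-entry look-up."""
    idx = max(0, min(63, (rc_fullness + 512) >> 10))  # 16-bit scale to 6 bits
    idx_mod = idx & 0x03                              # position between entries
    a = lut[max(0, min(15, idx >> 2))]
    b = lut[max(0, min(15, (idx >> 2) + 1))]
    return ((4 - idx_mod) * a + idx_mod * b + 2) >> 2
```

With the placeholder table `[16 * i for i in range(16)]`, an rcFullness of 5120 lands one quarter of the way between entries 1 and 2 and yields 20. At the top of the range both table reads are clipped to entry 15, so the result saturates at that entry's value.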


4.5.4.2 Bit Rate Lambda – Encoder Only

lambdaBitrate is calculated for each coding mode during a block time, based on the modeRate
determined during the testing phase. Because the RD cost calculation is needed only for
encoder mode selection, the lambdaBitrate calculation will not be performed during decoding.
The pre-defined array lambdaBitrateLut is stored by the encoder and includes 16 entries.
Interpolation is used to approximate a 6-bit look-up from the stored 16 entries, as described in
Table 4-27. A scaling factor (lambdaBitrateScale) is applied to modeRate, which is calculated
from the worst-case block rate (maxBlockRate), as follows:

lambdaBitrateScale = ⌊( (1 << lambdaBitrateBits) + (maxBlockRate >> 1) ) / maxBlockRate⌋
where:
• lambdaBitrateBits = 12

The maximum block rate is calculated, as follows:


maxBlockRate = maxHeaderRate + ∑k=0..2 (compSamples[k] × compBitDepth[k])

where:
• maxHeaderRate is 10 bits for 8bpc content, and 11 bits otherwise
• compBitDepth[k] is in the RGB/YCbCr color space

Note: Quantities maxBlockRate and lambdaBitrateScale shall be constant for a given PPS.

The bit rate used for lambda approximates the bit-rate ratio between the compressed and
uncompressed blocks (modeRate and maxBlockRate, respectively). The normalization factor
of maxBlockRate is incorporated in lambdaBitrateScale.

Table 4-27: lambdaBitrate Encoder Calculation


idx = clip (0, 63, (modeRate × lambdaBitrateScale) >> 6)
idxMod = idx & 0x03
a = lambdaBitrateLut [clip (0, 15, (idx >> 2) ) ]
b = lambdaBitrateLut [clip (0, 15, (idx >> 2) + 1) ]
lambdaBitrate = 2 × ( ( (4 – idxMod) × a + idxMod × b + 2) >> 2)
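
The lambdaBitrate derivation can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the division in lambdaBitrateScale is taken to round down, the first line of Table 4-27 is read as assigning idx (which the following lines consume), and the table in the usage note is a placeholder because the contents of lambdaBitrateLut are not listed in this section.

```python
LAMBDA_BITRATE_BITS = 12


def max_block_rate(comp_samples, comp_bit_depth, bpc):
    """Worst-case block rate: header bits plus all uncompressed samples."""
    max_header_rate = 10 if bpc == 8 else 11
    samples = sum(s * d for s, d in zip(comp_samples, comp_bit_depth))
    return max_header_rate + samples


def lambda_bitrate_scale(max_rate):
    """Rounded reciprocal of maxBlockRate with 12 bits of precision."""
    return ((1 << LAMBDA_BITRATE_BITS) + (max_rate >> 1)) // max_rate


def lambda_bitrate(mode_rate, scale, lut):
    """Interpolate the 16-entry table using the normalized mode rate."""
    idx = max(0, min(63, (mode_rate * scale) >> 6))
    idx_mod = idx & 0x03
    a = lut[max(0, min(15, idx >> 2))]
    b = lut[max(0, min(15, (idx >> 2) + 1))]
    return 2 * (((4 - idx_mod) * a + idx_mod * b + 2) >> 2)
```

For an 8-bpc 4:4:4 block (three components of 16 samples at 8 bits each), maxBlockRate is 394 and the scale is 10. A 96-bit mode then maps to index 15, and the placeholder table `[16 * i for i in range(16)]` yields a lambdaBitrate of 120.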


4.6 Test Coding Modes


The encoder shall test all available coding modes for each block time. For each coding mode,
a rate modeRate (in bits) and distortion modeDistortion (SAD) are calculated, in addition
to an RD cost (modeRdCost), as described in Section 4.5.4.

After testing, the encoder will select a mode for the current block that minimizes modeRdCost,
in addition to satisfying a set of additional constraints. Encoder mode selection is described in
Section 4.9.
A coding mode’s total distortion (modeDistortion) is determined from each component’s SAD,
as listed in Table 4-28. Here, rdoWeight is a rate-distortion optimization (RDO) weight, which is
further described in Annex B.

Table 4-28: modeDistortion Calculation


Color Space in Which Mode Is Tested | modeDistortion Calculation
YCoCg | modeDistortion = ∑k ( (rdoWeight[k] × SAD[k] + 128) >> 8), where rdoWeight = [443, 181, 222] (see Annex B for further details)
RGB or YCbCr | modeDistortion = ∑k (SAD[k])
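
The distortion calculation of Table 4-28 can be sketched as follows. The function name and the string used to select the color space are hypothetical; the weights are the rdoWeight values given in the table.

```python
RDO_WEIGHT = (443, 181, 222)  # rdoWeight for Y, Co, Cg


def mode_distortion(sad, color_space):
    """Sum the per-component SADs, weighted when the mode is tested in YCoCg."""
    if color_space == "YCoCg":
        return sum((w * s + 128) >> 8 for w, s in zip(RDO_WEIGHT, sad))
    return sum(sad)  # RGB or YCbCr: plain sum
```

For per-component SADs of (100, 50, 50), the YCoCg weighting gives 173 + 35 + 43 = 251, whereas the unweighted sum used for RGB or YCbCr is 200.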


4.6.1 Transform Mode


Transform mode consists of three stages, as described in Table 4-29 and illustrated in Figure 4-11.

Table 4-29: Transform Mode Stages


Stage | Description | Described In
1 | All available intra prediction modes (intraListA) are tested such that the encoder can downselect to intraListB. intraListB is composed of four or fewer intra prediction modes, culled from intraListA, using the minimum SAD. In this case, the down-selection process is not required for FBLS blocks because there is only one available intra predictor (intraFbls). | Section 4.6.1.1
2 | Block residual is calculated for each intra predictor within intraListB. Transform and quantization are applied to each residual, such that the entropy coding cost can be calculated from the quantized transform coefficients. Next, inverse quantization and inverse DCT are applied to calculate the block distortion. Finally, the RD cost for each intra predictor within intraListB is calculated. | Section 4.6.1.2, Section 4.6.1.3, Section 5.4.1 [a]
3 | Encoder selects the intra predictor from intraListB that produces the lowest RD cost. This rate, distortion, and RD cost shall be used for the final Transform mode block. Finally, the encoder adds the predicted block to the reconstructed residual to generate the reconstructed block for future prediction. | Section 4.6.1.4
a. Inverse DCT.


[Figure 4-11 is a flowchart. Inputs: source block and reconstructed neighbors. For an FBLS block, intraFbls is calculated; otherwise, eight intra predictors (intraListA) are calculated and the best four of eight intra modes are selected, using minimum SAD, to form intraListB. The following steps are performed for each intra predictor in intraListB (1x for FBLS, 4x for NFBLS): calculate residual, transform, quantization, calculate EC rate, inverse quantization, inverse transform, calculate distortion, calculate RD cost. The encoder then selects the intra predictor with minimum RD cost and calculates the reconstructed block.]

Figure 4-11: Encoder Operations for a Transform Mode Block

Figure 4-12 illustrates the Transform mode prediction and reconstruction buffers.

[Figure 4-12 is a two-panel diagram, one panel for an RGB source and one for a YCbCr source. Panel labels: Source Buffer (RGB or YCbCr) with the current block; CSC; Reconstruction Buffer (RGB, YCoCg, or YCbCr) with the reconstructed previous blockline used for intra prediction and the current blockline.]

Figure 4-12: Transform Mode Prediction and Reconstruction Buffers


4.6.1.1 Stage 1 – Intra Prediction


The set of available intra predictors (intraListA) depends on whether the block is an FBLS
or NFBLS block, as described in Table 4-30.

Table 4-30: intraListA as a Function of Block Position within a Slice


Block Position intraListA
FBLS intraFbls
NFBLS intraDc, intraVert, intraVertLeft, intraVertRight,
intraDiagLeft, intraDiagRight, intraHorizLeft, intraHorizRight

In the first stage of Transform mode, the encoder tests all intra predictors in intraListA to determine
the SAD between the source block and predicted block. The four intra predictors that generate the
smallest SAD are added to intraListB. If there is a tie in SAD, the encoder shall select the intra
predictors that have the smaller index. This step can be skipped for FBLS blocks, where the only
valid intra predictor is intraFbls, as indicated by Table 4-31.

Table 4-31: Intra Prediction Mode for FBLS Blocks

intraFbls

\[
P(i, j) =
\begin{cases}
0, & \text{Co or Cg chroma component} \\
1 \ll (\text{bitDepth}[k] - 1), & \text{otherwise}
\end{cases}
\]

For intra prediction, the SAD is performed in the YCoCg color space for RGB input; otherwise,
the SAD is performed in the native color space. The SAD is the direct sum of the SAD for each
component. For example, in the YCoCg color space:
SAD = SADY + SADCo + SADCg
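
The down-selection from intraListA to intraListB can be sketched in Python as follows. The function names block_sad and build_intra_list_b, the block layout (a list of components, each a flat list of samples), and the interpretation of "index" as the position of a predictor in the list passed to the function are assumptions made for this illustration.

```python
def block_sad(source, predicted):
    """Direct sum of the per-component SADs of two blocks.

    Each block is a list of components; each component is a flat
    list of samples (hypothetical layout for this sketch).
    """
    return sum(
        abs(s - p)
        for src_comp, pred_comp in zip(source, predicted)
        for s, p in zip(src_comp, pred_comp)
    )


def build_intra_list_b(source, predictions, max_modes=4):
    """Return the indices of the predictors with the smallest SAD.

    predictions[idx] is the predicted block of predictor idx.
    Ties in SAD are resolved in favor of the smaller index.
    """
    sads = [block_sad(source, pred) for pred in predictions]
    ranked = sorted(range(len(sads)), key=lambda idx: (sads[idx], idx))
    return ranked[:max_modes]
```

For an FBLS block, the list of predictions holds a single entry, so the function simply returns index 0.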

Figure 4-13 illustrates the intra prediction modes for NFBLS blocks. Table 4-32 describes the
values of each intra predictor’s predicted block.


[Figure 4-13 contains eight panels, one per intra prediction mode: DC, Vertical (V), Vertical Left (VL), Vertical Right (VR), Diagonal Left (DL), Diagonal Right (DR), Horizontal Left (HL), and Horizontal Right (HR). Each panel shows the row of reconstructed neighbor samples A-4 through A11 above the 8x2 block and the prediction direction.]

Figure 4-13: Intra Prediction Modes


Table 4-32: Intra Prediction Modes for NFBLS Blocks (4:4:4, Luma Component)

intraDc
    \( \mathrm{DC} = \left( \sum_{i=0}^{7} A_i \right) \gg 3 \)
    P(i, j) = DC

intraVert
    P(i, j) = Ai

intraDiagLeft
    P(0, 0) = (A0 + (A1 << 1) + A2 + 2) >> 2
    P(1, 0) = P(0, 1) = (A1 + (A2 << 1) + A3 + 2) >> 2
    P(2, 0) = P(1, 1) = (A2 + (A3 << 1) + A4 + 2) >> 2
    …
    P(7, 0) = P(6, 1) = (A7 + (A8 << 1) + A9 + 2) >> 2
    P(7, 1) = (A8 + (A9 << 1) + A10 + 2) >> 2

intraDiagRight
    P(0, 1) = (A-3 + (A-2 << 1) + A-1 + 2) >> 2
    P(0, 0) = P(1, 1) = (A-2 + (A-1 << 1) + A0 + 2) >> 2
    P(1, 0) = P(2, 1) = (A-1 + (A0 << 1) + A1 + 2) >> 2
    …
    P(6, 0) = P(7, 1) = (A4 + (A5 << 1) + A6 + 2) >> 2
    P(7, 0) = (A5 + (A6 << 1) + A7 + 2) >> 2

intraVertLeft
    P(0, 0) = (A0 + A1 + 1) >> 1
    P(0, 1) = (A0 + (A1 << 1) + A2 + 2) >> 2
    P(1, 0) = (A1 + A2 + 1) >> 1
    P(1, 1) = (A1 + (A2 << 1) + A3 + 2) >> 2
    …
    P(7, 0) = (A7 + A8 + 1) >> 1
    P(7, 1) = (A7 + (A8 << 1) + A9 + 2) >> 2

intraVertRight
    P(0, 0) = (A-1 + A0 + 1) >> 1
    P(0, 1) = (A-2 + (A-1 << 1) + A0 + 2) >> 2
    P(1, 0) = (A0 + A1 + 1) >> 1
    P(1, 1) = (A-1 + (A0 << 1) + A1 + 2) >> 2
    …
    P(7, 0) = (A6 + A7 + 1) >> 1
    P(7, 1) = (A5 + (A6 << 1) + A7 + 2) >> 2

intraHorizLeft
    P(0, 0) = (A1 + (A2 << 1) + A3 + 2) >> 2
    P(1, 0) = P(0, 1) = (A2 + (A3 << 1) + A4 + 2) >> 2
    P(2, 0) = P(1, 1) = (A3 + (A4 << 1) + A5 + 2) >> 2
    …
    P(7, 0) = P(6, 1) = (A8 + (A9 << 1) + A10 + 2) >> 2
    P(7, 1) = (A9 + (A10 << 1) + A11 + 2) >> 2

intraHorizRight
    P(0, 1) = (A-4 + (A-3 << 1) + A-2 + 2) >> 2
    P(0, 0) = P(1, 1) = (A-3 + (A-2 << 1) + A-1 + 2) >> 2
    P(1, 0) = P(2, 1) = (A-2 + (A-1 << 1) + A0 + 2) >> 2
    …
    P(6, 0) = P(7, 1) = (A3 + (A4 << 1) + A5 + 2) >> 2
    P(7, 0) = (A4 + (A5 << 1) + A6 + 2) >> 2
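
Three of the luma predictors in Table 4-32 can be sketched in Python as shown below. The helper names (A, smooth3, intra_dc, intra_vert, intra_diag_left) and the storage of the neighbor row as a list with an offset of four are assumptions made for this example, and the rows elided by "…" in the table are assumed to follow the same pattern as the rows that are listed. Each function returns the predicted block as two rows of eight samples, indexed as P[j][i].

```python
OFFSET = 4  # neighbors[OFFSET + n] holds A_n for n = -4 .. 11


def A(neighbors, n):
    """Return neighbor sample A_n from the offset list."""
    return neighbors[OFFSET + n]


def smooth3(neighbors, c):
    """Rounded (1, 2, 1) filter centered on A_c."""
    return (A(neighbors, c - 1) + (A(neighbors, c) << 1)
            + A(neighbors, c + 1) + 2) >> 2


def intra_dc(neighbors):
    """intraDc: every sample equals the mean of A_0 .. A_7."""
    dc = sum(A(neighbors, i) for i in range(8)) >> 3
    return [[dc] * 8 for _ in range(2)]


def intra_vert(neighbors):
    """intraVert: each column copies the neighbor directly above it."""
    row = [A(neighbors, i) for i in range(8)]
    return [list(row), list(row)]


def intra_diag_left(neighbors):
    """intraDiagLeft: row 0 is centered on A_(i+1), row 1 on A_(i+2)."""
    return [
        [smooth3(neighbors, i + 1) for i in range(8)],
        [smooth3(neighbors, i + 2) for i in range(8)],
    ]
```

For a linear ramp of neighbor values, the (1, 2, 1) filter returns its center sample unchanged, which makes the diagonal shift between the two rows easy to observe.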


For 4:2:2 and 4:2:0 chroma components, the chroma samples are considered by the codec as
co-sited with the even luma samples. For this reason, the interpolation angles are adjusted
as detailed in Table 4-33 and Table 4-34 for 4:2:2 and 4:2:0, respectively.

Table 4-33: Intra Prediction Modes for NFBLS Blocks (4:2:2, Chroma Components)

intraDc
    \( \mathrm{DC} = \left( \sum_{i=0}^{3} A_i \right) \gg 2 \)
    P(i, j) = DC

intraVert
    P(i, j) = Ai

intraDiagLeft
    AB = (A0 + A1 + 1) >> 1
    BC = (A1 + A2 + 1) >> 1
    CD = (A2 + A3 + 1) >> 1
    DE = (A3 + A4 + 1) >> 1
    EF = (A4 + A5 + 1) >> 1
    P(0, 0) = AB
    P(1, 0) = BC
    P(2, 0) = CD
    P(3, 0) = DE
    P(0, 1) = (AB + (A1 << 1) + BC + 2) >> 2
    P(1, 1) = (BC + (A2 << 1) + CD + 2) >> 2
    P(2, 1) = (CD + (A3 << 1) + DE + 2) >> 2
    P(3, 1) = (DE + (A4 << 1) + EF + 2) >> 2

intraDiagRight
    AB = (A-2 + A-1 + 1) >> 1
    BC = (A-1 + A0 + 1) >> 1
    CD = (A0 + A1 + 1) >> 1
    DE = (A1 + A2 + 1) >> 1
    EF = (A2 + A3 + 1) >> 1
    P(0, 0) = BC
    P(1, 0) = CD
    P(2, 0) = DE
    P(3, 0) = EF
    P(0, 1) = (AB + (A-1 << 1) + BC + 2) >> 2
    P(1, 1) = (BC + (A0 << 1) + CD + 2) >> 2
    P(2, 1) = (CD + (A1 << 1) + DE + 2) >> 2
    P(3, 1) = (DE + (A2 << 1) + EF + 2) >> 2

intraVertLeft
    AB = (A0 + A1 + 1) >> 1
    BC = (A1 + A2 + 1) >> 1
    CD = (A2 + A3 + 1) >> 1
    DE = (A3 + A4 + 1) >> 1
    P(0, 0) = (A0 + AB + 1) >> 1
    P(1, 0) = (A1 + BC + 1) >> 1
    P(2, 0) = (A2 + CD + 1) >> 1
    P(3, 0) = (A3 + DE + 1) >> 1
    P(0, 1) = (A0 + (AB << 1) + A1 + 2) >> 2
    P(1, 1) = (A1 + (BC << 1) + A2 + 2) >> 2
    P(2, 1) = (A2 + (CD << 1) + A3 + 2) >> 2
    P(3, 1) = (A3 + (DE << 1) + A4 + 2) >> 2

intraVertRight
    AB = (A-1 + A0 + 1) >> 1
    BC = (A0 + A1 + 1) >> 1
    CD = (A1 + A2 + 1) >> 1
    DE = (A2 + A3 + 1) >> 1
    P(0, 0) = (AB + A0 + 1) >> 1
    P(1, 0) = (BC + A1 + 1) >> 1
    P(2, 0) = (CD + A2 + 1) >> 1
    P(3, 0) = (DE + A3 + 1) >> 1
    P(0, 1) = (A-1 + (AB << 1) + A0 + 2) >> 2
    P(1, 1) = (A0 + (BC << 1) + A1 + 2) >> 2
    P(2, 1) = (A1 + (CD << 1) + A2 + 2) >> 2
    P(3, 1) = (A2 + (DE << 1) + A3 + 2) >> 2

intraHorizLeft
    AB = (A0 + A1 + 1) >> 1
    BC = (A1 + A2 + 1) >> 1
    CD = (A2 + A3 + 1) >> 1
    DE = (A3 + A4 + 1) >> 1
    EF = (A4 + A5 + 1) >> 1
    P(0, 0) = (AB + (A1 << 1) + BC + 2) >> 2
    P(1, 0) = (BC + (A2 << 1) + CD + 2) >> 2
    P(2, 0) = (CD + (A3 << 1) + DE + 2) >> 2
    P(3, 0) = (DE + (A4 << 1) + EF + 2) >> 2
    P(0, 1) = (A1 + (BC << 1) + A2 + 2) >> 2
    P(1, 1) = (A2 + (CD << 1) + A3 + 2) >> 2
    P(2, 1) = (A3 + (DE << 1) + A4 + 2) >> 2
    P(3, 1) = (A4 + (EF << 1) + A5 + 2) >> 2

intraHorizRight
    AB = (A-2 + A-1 + 1) >> 1
    BC = (A-1 + A0 + 1) >> 1
    CD = (A0 + A1 + 1) >> 1
    DE = (A1 + A2 + 1) >> 1
    EF = (A2 + A3 + 1) >> 1
    P(0, 0) = (AB + (A-1 << 1) + BC + 2) >> 2
    P(1, 0) = (BC + (A0 << 1) + CD + 2) >> 2
    P(2, 0) = (CD + (A1 << 1) + DE + 2) >> 2
    P(3, 0) = (DE + (A2 << 1) + EF + 2) >> 2
    P(0, 1) = (A-2 + (AB << 1) + A-1 + 2) >> 2
    P(1, 1) = (A-1 + (BC << 1) + A0 + 2) >> 2
    P(2, 1) = (A0 + (CD << 1) + A1 + 2) >> 2
    P(3, 1) = (A1 + (DE << 1) + A2 + 2) >> 2


Table 4-34: Intra Prediction Modes for NFBLS Blocks (4:2:0, Chroma Components)

intraDc
    \( \mathrm{DC} = \left( \sum_{i=0}^{3} A_i \right) \gg 2 \)
    P(i, j) = DC

intraVert
    P(i, j) = Ai

intraDiagLeft
    AB = (A0 + A1 + 1) >> 1
    BC = (A1 + A2 + 1) >> 1
    CD = (A2 + A3 + 1) >> 1
    DE = (A3 + A4 + 1) >> 1
    EF = (A4 + A5 + 1) >> 1
    P(0, 0) = (AB + (A1 << 1) + BC + 2) >> 2
    P(1, 0) = (BC + (A2 << 1) + CD + 2) >> 2
    P(2, 0) = (CD + (A3 << 1) + DE + 2) >> 2
    P(3, 0) = (DE + (A4 << 1) + EF + 2) >> 2

intraDiagRight
    AB = (A-2 + A-1 + 1) >> 1
    BC = (A-1 + A0 + 1) >> 1
    CD = (A0 + A1 + 1) >> 1
    DE = (A1 + A2 + 1) >> 1
    EF = (A2 + A3 + 1) >> 1
    P(0, 0) = (AB + (A-1 << 1) + BC + 2) >> 2
    P(1, 0) = (BC + (A0 << 1) + CD + 2) >> 2
    P(2, 0) = (CD + (A1 << 1) + DE + 2) >> 2
    P(3, 0) = (DE + (A2 << 1) + EF + 2) >> 2

intraVertLeft
    AB = (A0 + A1 + 1) >> 1
    BC = (A1 + A2 + 1) >> 1
    CD = (A2 + A3 + 1) >> 1
    DE = (A3 + A4 + 1) >> 1
    P(0, 0) = AB
    P(1, 0) = BC
    P(2, 0) = CD
    P(3, 0) = DE

intraVertRight
    AB = (A-1 + A0 + 1) >> 1
    BC = (A0 + A1 + 1) >> 1
    CD = (A1 + A2 + 1) >> 1
    DE = (A2 + A3 + 1) >> 1
    P(0, 0) = AB
    P(1, 0) = BC
    P(2, 0) = CD
    P(3, 0) = DE

intraHorizLeft
    BC = (A1 + A2 + 1) >> 1
    CD = (A2 + A3 + 1) >> 1
    DE = (A3 + A4 + 1) >> 1
    EF = (A4 + A5 + 1) >> 1
    P(0, 0) = BC
    P(1, 0) = CD
    P(2, 0) = DE
    P(3, 0) = EF

intraHorizRight
    AB = (A-2 + A-1 + 1) >> 1
    BC = (A-1 + A0 + 1) >> 1
    CD = (A0 + A1 + 1) >> 1
    DE = (A1 + A2 + 1) >> 1
    P(0, 0) = AB
    P(1, 0) = BC
    P(2, 0) = CD
    P(3, 0) = DE


For each intra predictor, the residual is calculated from the source and predicted blocks, as follows:
R(i, j) = X(i, j) – P(i, j) ∀i, j ∈ current block

In the second stage of Transform mode, the intra predictors in intraListB will be tested throughout
Transform mode, as illustrated in Figure 4-11.
Any intra predictor sample that is located to the left of or above the slice boundary will be set
equal to half the component’s dynamic range. For example, for 8bpc content in YCoCg color space,
any sample that is located outside the slice will have the following value:
Y = 128, Co = 0, Cg = 0

Any intra predictor sample that is located to the right of the slice boundary will be horizontally
replicated from the last valid sample in the line.
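
The boundary rules above can be sketched in Python as follows. The function names default_sample and neighbor_row, the use of None to represent a missing line above the slice, and the assumption that the previous line is exactly one slice wide are all choices made for this illustration. Co and Cg are treated as signed components whose mid-range value is 0, consistent with the 8bpc example.

```python
def default_sample(bit_depth, is_co_or_cg=False):
    """Half of the component's dynamic range (0 for Co and Cg)."""
    return 0 if is_co_or_cg else 1 << (bit_depth - 1)


def neighbor_row(prev_line, x0, bit_depth, is_co_or_cg=False,
                 first=-4, last=11):
    """Build the neighbor samples A_first .. A_last for a block at column x0.

    prev_line: reconstructed samples of the line above within the slice,
               or None when that line lies above the slice boundary.
    """
    row = []
    for n in range(first, last + 1):
        x = x0 + n
        if prev_line is None or x < 0:
            # Left of or above the slice: use the mid-range value.
            row.append(default_sample(bit_depth, is_co_or_cg))
        elif x >= len(prev_line):
            # Right of the slice: replicate the last valid sample.
            row.append(prev_line[-1])
        else:
            row.append(prev_line[x])
    return row
```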

4.6.1.2 Stage 2 – Forward Discrete Cosine Transform


The forward discrete cosine transform (DCT) is applied to the residual values from the intra
prediction step. The forward transform is applied in two passes, depending on the dimensions
of the current block component. (See Figure 4-14 and Table 4-35.) Encoder implementations
shall produce results that match the formulations of each of these functions.

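[Figure 4-14 is a block diagram with one signal path per block component size:
• 8x2 block component: 8-point forward DCT, followed by the 2-point forward Haar transform
• 4x2 block component: 4-point forward DCT, followed by the 2-point forward Haar transform
• 4x1 block component: 4-point forward DCT only
Signal labels along the path: R(i, j), pre-shift, forward DCT, Y(i, j), forward Haar transform, Z(i, j), post-shift, T(i, j).]

Figure 4-14: Forward Discrete Cosine Transform Overview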

Table 4-35: Forward Discrete Cosine Transform Overview

Block Component                    Description                                          Described In
8x2                                • 8-point forward DCT for each row                   Section 4.6.1.2.2.1
(All 4:4:4 components and          • 2-point forward Haar transform for each column     Section 4.6.1.2.3
luma component)
4x2                                • 4-point forward DCT for each row                   Section 4.6.1.2.2.2
(4:2:2 chroma component)           • 2-point forward Haar transform for each column     Section 4.6.1.2.3
4x1                                • 4-point forward DCT for row only                   Section 4.6.1.2.2.2
(4:2:0 chroma component)


4.6.1.2.1 Pre-shift
Before the first transform pass is conducted, the intra-predicted residuals R(i, j) are pre-shifted,
as described in Table 4-36. The pre-shifted residuals are denoted as R(i, j). This step increases
the dynamic range such that the error associated with forward transform is minimized. The
following parameters are associated with the pre-shift:
• dctFwShift = 2
• dctFwRound = 2

Table 4-36: Discrete Cosine Transform Pre-shift

for (r = 0; r < numRows; r ++) {
    for (c = 0; c < numCols; c ++) {
        R(c, r) = R(c, r) << dctFwShift
    }
}

4.6.1.2.2 Horizontal Transform Pass


After pre-shift is applied, forward DCT shall be applied to each row within the current block,
as follows:
• 8-point forward DCT – Used for 8x2 components (see Section 4.6.1.2.2.1)
• 4-point forward DCT – Applied to 4x2 and 4x1 components (see Section 4.6.1.2.2.2)

The result of this step is a set of temporary transform coefficients Y(i, j). For a 4:2:0 chroma
component, these transform coefficients are post-shifted, as detailed later in Table 4-40, and
the forward transform step is complete. For all other cases, a vertical transform pass is applied.
(See Section 4.6.1.2.3.)


4.6.1.2.2.1 8-point Forward Discrete Cosine Transform


The 8-point forward DCT is implemented in a butterfly structure, as illustrated in Figure 4-15 and
described in Table 4-37. Input to the 8-point forward DCT shall be the set of pre-shifted residual
values R(i, j) from intra prediction. Output from the 8-point forward DCT shall be a set of
transform coefficients Y(i, j). The following coefficients shall be used to calculate the 8-point
forward DCT:
• dctA8 = 17
• dctB8 = 41
• dctD8 = 22
• dctE8 = 94
• dctG8 = 111
• dctZ8 = 64
• dctShift8 = 7
• dctShift4 = 6

[Figure 4-15 is a butterfly diagram. Inputs R0 through R7 feed an even path (ea0–ea3, eb0–eb3, ec0–ec3) and an odd path (da0–da3, db0–db3, dc0–dc3, dd0 and dd3), which produce outputs Y0 through Y7.]

Figure 4-15: Butterfly Structure for 8-point Forward Discrete Cosine Transform


Table 4-37: 8-point Forward Discrete Cosine Transform Applied to Selected Row r
of Pre-shifted Residual Block R(i, j)

// first stage
for (i = 0; i < 4; i ++) {
    ea(i, r) = R(i, r) + R(7 – i, r)
    da(7 – i, r) = R(i, r) – R(7 – i, r)
}

// second stage, even coefficients
// (two iterations; each iteration writes one sum and one difference)
for (i = 0; i < 2; i ++) {
    eb(i, r) = ea(i, r) + ea(3 – i, r)
    eb(3 – i, r) = ea(i, r) – ea(3 – i, r)
}

// third stage, even coefficients
ec(0, r) = eb(0, r) + eb(1, r)
ec(1, r) = eb(0, r) – eb(1, r)
ec(2, r) = (dctA8 × eb(2, r) + dctB8 × eb(3, r) ) >> dctShift4
ec(3, r) = (dctA8 × eb(3, r) – dctB8 × eb(2, r) ) >> dctShift4

// second stage, odd coefficients (odd-path terms use indices 4 through 7)
db(4, r) = (dctE8 × da(4, r) + dctZ8 × da(7, r) ) >> dctShift8
db(7, r) = (dctE8 × da(7, r) – dctZ8 × da(4, r) ) >> dctShift8
db(5, r) = (dctG8 × da(5, r) + dctD8 × da(6, r) ) >> dctShift8
db(6, r) = (dctG8 × da(6, r) – dctD8 × da(5, r) ) >> dctShift8

// third stage, odd coefficients
dc(4, r) = db(4, r) + db(6, r)
dc(5, r) = -db(5, r) + db(7, r)
dc(6, r) = -db(6, r) + db(4, r)
dc(7, r) = db(5, r) + db(7, r)

// fourth stage, odd coefficients
dd(4, r) = dc(7, r) – dc(4, r)
dd(7, r) = dc(7, r) + dc(4, r)

// final stage (re-ordering)
// odd-path indices 4 through 7 here correspond to 0 through 3 in Figure 4-15
Y(0, r) = ec(0, r)
Y(1, r) = dd(7, r)
Y(2, r) = ec(2, r)
Y(3, r) = dc(5, r)
Y(4, r) = ec(1, r)
Y(5, r) = dc(6, r)
Y(6, r) = ec(3, r)
Y(7, r) = dd(4, r)
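
The 8-point butterfly can be expressed in Python as in the sketch below, which operates on a single row of eight pre-shifted residuals. Two assumptions are made and should be checked against a reference implementation: the even-path second stage performs two sum/difference pairs (as in the 4-point transform), and the odd-path terms that the figure numbers 0 through 3 are the terms that the pseudocode numbers 4 through 7. The function name fdct8 is hypothetical. Python's >> on negative integers is an arithmetic (floor) shift.

```python
DCT_A8, DCT_B8, DCT_D8 = 17, 41, 22
DCT_E8, DCT_G8, DCT_Z8 = 94, 111, 64
DCT_SHIFT8, DCT_SHIFT4 = 7, 6


def fdct8(R):
    """8-point forward DCT butterfly for one row of pre-shifted residuals."""
    # First stage: sums feed the even path, differences feed the odd path.
    ea = [R[i] + R[7 - i] for i in range(4)]
    da = {7 - i: R[i] - R[7 - i] for i in range(4)}

    # Even path (same structure as the 4-point transform).
    eb = [ea[0] + ea[3], ea[1] + ea[2], ea[1] - ea[2], ea[0] - ea[3]]
    ec = [
        eb[0] + eb[1],
        eb[0] - eb[1],
        (DCT_A8 * eb[2] + DCT_B8 * eb[3]) >> DCT_SHIFT4,
        (DCT_A8 * eb[3] - DCT_B8 * eb[2]) >> DCT_SHIFT4,
    ]

    # Odd path, second stage.
    db4 = (DCT_E8 * da[4] + DCT_Z8 * da[7]) >> DCT_SHIFT8
    db7 = (DCT_E8 * da[7] - DCT_Z8 * da[4]) >> DCT_SHIFT8
    db5 = (DCT_G8 * da[5] + DCT_D8 * da[6]) >> DCT_SHIFT8
    db6 = (DCT_G8 * da[6] - DCT_D8 * da[5]) >> DCT_SHIFT8

    # Odd path, third and fourth stages.
    dc4, dc5 = db4 + db6, -db5 + db7
    dc6, dc7 = -db6 + db4, db5 + db7
    dd4, dd7 = dc7 - dc4, dc7 + dc4

    # Re-ordering into frequency order Y0 .. Y7.
    return [ec[0], dd7, ec[2], dc5, ec[1], dc6, ec[3], dd4]
```

A constant row produces only a DC term: fdct8([5] * 8) returns 40 followed by seven zeros.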


4.6.1.2.2.2 4-point Forward Discrete Cosine Transform


The 4-point forward DCT is implemented in a butterfly structure, as illustrated in Figure 4-16 and
described in Table 4-38. Input to the 4-point forward DCT shall be the pre-shifted residual block
R(i, j) from intra prediction. Output from the 4-point forward DCT shall be a set of transform
coefficients Y(i, j). The following coefficients shall be used:
• dctA4 = 17
• dctB4 = 41
• dctShift4 = 6

[Figure 4-16 is a butterfly diagram. Inputs R0 through R3 feed intermediate terms a0 through a3, which produce outputs Y0 through Y3.]

Figure 4-16: Butterfly Structure for 4-point Forward Discrete Cosine Transform

Table 4-38: 4-point Forward Discrete Cosine Transform Applied to Row r
of Pre-shifted Residual Block R(i, j)

a(0, r) = R(0, r) + R(3, r)
a(3, r) = R(0, r) – R(3, r)
a(1, r) = R(1, r) + R(2, r)
a(2, r) = R(1, r) – R(2, r)

Y(0, r) = a(0, r) + a(1, r)
Y(2, r) = a(0, r) – a(1, r)
Y(1, r) = (dctA4 × a(2, r) + dctB4 × a(3, r) ) >> dctShift4
Y(3, r) = (dctA4 × a(3, r) – dctB4 × a(2, r) ) >> dctShift4
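
A direct Python transcription of the 4-point butterfly is shown below as a sketch. The function name fdct4 is hypothetical. The right shift is assumed to be an arithmetic shift, which is how Python's >> behaves for negative integers, so negative intermediate values round toward negative infinity.

```python
DCT_A4, DCT_B4, DCT_SHIFT4 = 17, 41, 6


def fdct4(R):
    """4-point forward DCT butterfly for one row of pre-shifted residuals."""
    a0 = R[0] + R[3]
    a3 = R[0] - R[3]
    a1 = R[1] + R[2]
    a2 = R[1] - R[2]
    return [
        a0 + a1,                                  # Y0
        (DCT_A4 * a2 + DCT_B4 * a3) >> DCT_SHIFT4,  # Y1
        a0 - a1,                                  # Y2
        (DCT_A4 * a3 - DCT_B4 * a2) >> DCT_SHIFT4,  # Y3
    ]
```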


4.6.1.2.3 Vertical Transform Pass (2-point Forward Haar Transform)


In the second pass, the 2-point forward Haar transform shall be applied to each column in Y(i, j),
as described in Table 4-39. The 2-point forward Haar transform (equivalent to a 2-point forward
DCT) is a simple sum/difference of the DCT transform coefficients Y(i, j) in each column.
The result of this operation shall be a set of transform coefficients Z(i, j).

Table 4-39: 2-point Forward Haar Transform for Column c

for (c = 0; c < numCols; c ++) {
    Z(c, 0) = Y(c, 0) + Y(c, 1)
    Z(c, 1) = Y(c, 0) – Y(c, 1)
}

4.6.1.2.4 Post-shift
The result of the forward Haar transform shall be post-shifted, as described in Table 4-40, using the
same dctFwRound and dctFwShift values defined in Section 4.6.1.2.1. The resulting transform
coefficients T(i, j) are not orthonormal. The normalization step shall be combined with the
quantization step described in Section 4.8.1.3.

Table 4-40: Forward Discrete Cosine Transform Post-shift

for (r = 0; r < numRows; r ++) {
    for (c = 0; c < numCols; c ++) {
        T(c, r) = (Z(c, r) + dctFwRound) >> dctFwShift
    }
}

Note: For 4:2:0 chroma components, the vertical (Haar) pass is skipped, and therefore the
input to the post-shift shall be the block of temporary transform coefficients Y(i, j).
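
The complete forward path (pre-shift, horizontal pass, vertical Haar pass, post-shift) can be sketched in Python as follows for 4x2 and 4x1 components. The function name forward_transform, the row-list block layout, and the row_dct parameter are assumptions made for this example; the compact fdct4 helper repeats the 4-point butterfly so that the block is self-contained.

```python
DCT_FW_SHIFT = 2
DCT_FW_ROUND = 2
DCT_A4, DCT_B4, DCT_SHIFT4 = 17, 41, 6


def fdct4(R):
    """4-point forward DCT butterfly for one row."""
    a0, a3 = R[0] + R[3], R[0] - R[3]
    a1, a2 = R[1] + R[2], R[1] - R[2]
    return [a0 + a1, (DCT_A4 * a2 + DCT_B4 * a3) >> DCT_SHIFT4,
            a0 - a1, (DCT_A4 * a3 - DCT_B4 * a2) >> DCT_SHIFT4]


def forward_transform(residual, row_dct=fdct4):
    """residual: list of one or two rows. Returns T as a list of rows."""
    # Pre-shift: raise the dynamic range before the transform.
    shifted = [[v << DCT_FW_SHIFT for v in row] for row in residual]
    # Horizontal pass: DCT on each row.
    Y = [row_dct(row) for row in shifted]
    if len(Y) == 2:
        # Vertical pass: 2-point Haar (sum/difference) per column.
        Z = [[a + b for a, b in zip(Y[0], Y[1])],
             [a - b for a, b in zip(Y[0], Y[1])]]
    else:
        # Single-row component (4:2:0 chroma): Haar pass is skipped.
        Z = Y
    # Post-shift with rounding.
    return [[(z + DCT_FW_ROUND) >> DCT_FW_SHIFT for z in row] for row in Z]
```

Passing an 8-point row transform as row_dct would cover 8x2 components in the same way.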

4.6.1.3 Stage 2 – Quantization and Reconstruction


The forward transform normalization and quantization steps are combined, as described in
Section 4.8.1.2. After normalization and quantization, the quantized transform coefficients Tq(i, j)
are used to estimate the entropy coding cost, as described in Section 4.7.3. Inverse quantization
is then applied to determine the reconstructed transform coefficients, Q-1[Tq(i, j)]. Inverse
DCT is then applied to the reconstructed transform coefficients to determine the reconstructed
prediction residuals, which are clipped to the valid residual space. The distortion for the
current intra predictor is determined as the SAD between the residual block R(i, j) and the
reconstructed residual block. Inverse DCT normative behavior is described
in Section 5.4.1.


4.6.1.4 Stage 3 – Intra Predictor Decision


The second stage of Transform mode, illustrated in Figure 4-11, is performed for each of the
intra predictors within intraListB. Rate and distortion are calculated for each option, as follows:
• modeRate – Sum of all header bits and the entropy coding cost of the quantized transform
coefficients for all components
• modeDistortion – SAD between the residual block R(i, j) and the reconstructed residual
block, as described in Table 4-28

modeRdCost is calculated for each intra predictor in intraListB, as described in Section 4.6.
The encoder shall select the intra predictor, as per Table 4-41. Any intra predictor that generates a
syntax element larger than ssm_max_se_size shall be considered as invalid and disallowed by the
encoder.

Table 4-41: Encoder Intra Predictor Selection

minIntraRdCost = largeInt
minIntraRate = largeInt
bestIntraIdx = -1
for (intraPred = 0; intraPred < numIntraPreds; intraPred ++) {
    conditionA = intraRdCost[intraPred] < minIntraRdCost
    conditionB = (intraRdCost[intraPred] == minIntraRdCost) && (intraRate[intraPred] < minIntraRate)
    if (intraIsValid[intraPred] && (conditionA || conditionB) ) {
        bestIntraIdx = intraPred
        minIntraRdCost = intraRdCost[intraPred]
        minIntraRate = intraRate[intraPred]
    }
}

where:
• largeInt = 999999
• numIntraPreds = 1 for FBLS blocks and 4 for NFBLS blocks
• intraRdCost[intraPred] = Associated intra predictor’s RD cost
• intraRate[intraPred] = Associated intra predictor’s rate
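
The selection logic of Table 4-41 translates to Python as in the following sketch. The function name select_intra_predictor and the use of parallel lists for RD cost, rate, and validity are assumptions made for this example.

```python
LARGE_INT = 999999


def select_intra_predictor(rd_cost, rate, is_valid):
    """Return the index of the valid predictor with the lowest RD cost.

    Equal RD costs are resolved in favor of the lower rate.
    Returns -1 when no predictor is valid.
    """
    best_idx, min_cost, min_rate = -1, LARGE_INT, LARGE_INT
    for idx in range(len(rd_cost)):
        condition_a = rd_cost[idx] < min_cost
        condition_b = rd_cost[idx] == min_cost and rate[idx] < min_rate
        if is_valid[idx] and (condition_a or condition_b):
            best_idx, min_cost, min_rate = idx, rd_cost[idx], rate[idx]
    return best_idx
```

In the first test case below, predictor 3 has the lowest RD cost but is marked invalid (for example, because it exceeds ssm_max_se_size), so predictor 2 is chosen over predictor 1 on the basis of its lower rate.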


The block is then reconstructed, based on the selected intra predictor. The final rate, distortion,
and RD cost for that intra predictor are assigned as the final modeRate, modeDistortion, and
modeRdCost, respectively, for Transform mode.
Intra predictor signaling, using three bits, shall be conducted for the NFBLS blocks,
as per Table 4-42.

Table 4-42: NFBLS Block Intra Predictor Signaling

Intra Predictor      Code
intraDc              000
intraVert            001
intraDiagRight       010
intraDiagLeft        011
intraVertRight       100
intraVertLeft        101
intraHorizRight      110
intraHorizLeft       111
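
For reference, the code assignments of Table 4-42 can be held in a lookup table, as in the Python sketch below. The names INTRA_PREDICTOR_CODE and intra_predictor_bits are hypothetical; the sketch only formats the three-bit code and does not model how the bits are packed into the bitstream.

```python
# Three-bit codes from Table 4-42.
INTRA_PREDICTOR_CODE = {
    "intraDc": 0b000,
    "intraVert": 0b001,
    "intraDiagRight": 0b010,
    "intraDiagLeft": 0b011,
    "intraVertRight": 0b100,
    "intraVertLeft": 0b101,
    "intraHorizRight": 0b110,
    "intraHorizLeft": 0b111,
}


def intra_predictor_bits(name):
    """Return the three-bit code of an intra predictor as a string."""
    return format(INTRA_PREDICTOR_CODE[name], "03b")
```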


4.6.1.5 EC
Entropy coding (EC) of the Transform mode quantized transform coefficients is conducted
as described in Section 4.7. For any 8x2 component, prior to entropy coding, the quantized
transform coefficients are re-ordered so that they are in the correct group order, as illustrated in
Figure 4-17.

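[Figure 4-17 shows three views of an 8x2 component:
• Quantized transform coefficients in raster order: S0 through S7 (first row), S8 through S15 (second row)
• The same positions labeled with the re-ordered indices: T0 T1 T2 T4 T5 T9 T10 T11 (first row), T3 T6 T7 T8 T12 T13 T14 T15 (second row)
• Grouping of the re-ordered coefficients into ECGs: ECG0 = T4 through T8, ECG1 = T1 through T3, ECG2 = T9 through T15, ECG3 = T0]

Figure 4-17: Transform Mode ECG Structure – 8x2 Component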

Table 4-43 describes the mapping between the quantized transform coefficients and
re-ordered coefficients.

Table 4-43: Mapping for Re-ordering Quantized Transform Coefficients
for EC – 8x2 Components Only

ecIndexMappingTransform[16] = {0, 1, 2, 4, 5, 9, 10, 11, 3, 6, 7, 8, 12, 13, 14, 15}
for (i = 0; i < 16; i ++) {
    T[ecIndexMappingTransform[i] ] = S[i]
}

where:
• S = Quantized transform coefficients
• T = Re-ordered coefficients
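
The re-ordering of Table 4-43 can be written in Python as in the sketch below. The function name reorder_for_ec is hypothetical; the mapping array is taken from the table.

```python
EC_INDEX_MAPPING_TRANSFORM = [0, 1, 2, 4, 5, 9, 10, 11,
                              3, 6, 7, 8, 12, 13, 14, 15]


def reorder_for_ec(S):
    """Map 16 raster-order coefficients S to the re-ordered list T."""
    T = [0] * 16
    for i, value in enumerate(S):
        T[EC_INDEX_MAPPING_TRANSFORM[i]] = value
    return T
```

For example, the coefficient at raster position 8 (first sample of the second row) becomes T3, and the coefficient at raster position 3 becomes T4.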

This re-ordering step is not required for 4x2 or 4x1 components, as illustrated in Figure 4-18.

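[Figure 4-18 shows two layouts:
• 4x2 component: coefficients S0 through S3 (first row) and S4 through S7 (second row), coded in the order S0 through S7 and assigned to ECG0 and ECG1
• 4x1 component: coefficients S0 through S3, all assigned to ECG0]

Figure 4-18: Transform Mode ECG Structure – 4x2 and 4x1 Components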


4.6.1.5.1 Last Significant Position


For quantized transform coefficient entropy coding, the last non-zero sample’s position within each
component (lastSigPos) is calculated and signaled in the syntax. Using the entropy coding sample
order, the last significant position is the largest sample index within a component that has a non-
zero value. For an 8x2 component, the ordering will be T. Any entropy coding group for which all
sample indices are greater than lastSigPos can be omitted from the bitstream
(i.e., ecgDataActive[ecgIdx] is false (see Section 4.7.2)). Figure 4-19 illustrates several
example components.

[Figure 4-19 lists five example 8x2 components. Each example gives the two rows of quantized coefficients (positions labeled as in Figure 4-17), the coefficients carried in each active ECG, the resulting lastSigPos, and the ECGs for which ecgDataActive is true.]

Example 1
    Row 0: -1 0 0 0 0 0 0 0
    Row 1: 0 0 0 0 0 0 0 0
    ECG3: -1
    lastSigPos = 0, ecgDataActive is true for ECG3

Example 2
    Row 0: -5 7 -1 0 0 0 0 0
    Row 1: 0 0 0 0 0 0 0 0
    ECG1: 7 -1; ECG3: -5
    lastSigPos = 2, ecgDataActive is true for ECG1, ECG3

Example 3
    Row 0: 2 0 1 -1 0 0 0 0
    Row 1: 0 0 0 0 0 0 0 0
    ECG0: -1; ECG1: 0 1 0; ECG3: 2
    lastSigPos = 4, ecgDataActive is true for ECG0, ECG1, ECG3

Example 4
    Row 0: 0 3 6 -2 0 1 1 -1
    Row 1: 7 3 0 -1 -1 0 0 0
    ECG0: -2 0 3 0 -1; ECG1: 3 6 7; ECG2: 1 1 -1 -1; ECG3: 0
    lastSigPos = 12, ecgDataActive is true for ECG0, ECG1, ECG2, ECG3

Example 5
    Row 0: 10 0 -2 -5 0 6 8 -10
    Row 1: 3 3 0 5 -1 0 4 -7
    ECG0: -5 0 3 0 5; ECG1: 0 -2 3; ECG2: 6 8 -10 -1 0 4 -7; ECG3: 10
    lastSigPos = 15, ecgDataActive is true for ECG0, ECG1, ECG2, ECG3

Figure 4-19: Example Transform Component Data Blocks with Corresponding lastSigPos,
ECG Structures

For chroma components, lastSigPos will be 0 if only the first coefficient has a non-zero
value. For the luma component, lastSigPos will be 0 in the following two cases:
• All coefficients are 0 (because there is no component skip flag for Component 0)
• First coefficient is non-zero, and all other coefficients are 0

lastSigPos is signaled explicitly in the bitstream at the beginning of each component, using a fixed
number of bits (bitsLastSigPos), as follows:
bitsLastSigPos = compSamplesLog2
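
The calculation of lastSigPos and of the active ECGs for an 8x2 component can be sketched in Python as follows. The ECG membership used here (ECG0 = T4 through T8, ECG1 = T1 through T3, ECG2 = T9 through T15, ECG3 = T0) is read from the ECG structure figure for 8x2 components and should be treated as an assumption of this example, as should the names last_sig_pos and ecg_data_active. The input T is the list of re-ordered coefficients.

```python
# Assumed ECG membership for an 8x2 component (indices into T).
ECG_MEMBERS_8X2 = {
    0: range(4, 9),
    1: range(1, 4),
    2: range(9, 16),
    3: range(0, 1),
}


def last_sig_pos(T):
    """Largest index in T that holds a non-zero value (0 if none)."""
    for idx in range(len(T) - 1, -1, -1):
        if T[idx] != 0:
            return idx
    return 0  # all-zero component, as described for the luma component


def ecg_data_active(T):
    """An ECG is active unless all of its indices exceed lastSigPos."""
    pos = last_sig_pos(T)
    return {ecg: min(members) <= pos
            for ecg, members in ECG_MEMBERS_8X2.items()}
```

For a component of 16 samples, and assuming that compSamplesLog2 denotes the base-2 logarithm of the sample count, lastSigPos would be signaled with four bits.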


For a given component, all samples must be signaled, up to and including the last significant
sample. All samples after lastSigPos are omitted because their value will be 0. Any ECG for which
all sample indices occur after lastSigPos will also be omitted. For example, if lastSigPos = 8 for
the luma component (see Figure 4-17), ECG2 is omitted because all samples in this group have
an index that is larger than 8.
After lastSigPos is calculated, the encoder shall calculate the sign of the last significant coefficient
(signLastSigPos) to avoid possible ambiguity later, as follows:
\[
\text{signLastSigPos} =
\begin{cases}
1, & \text{coeff}[\text{lastSigPos}] < 0 \\
0, & \text{coeff}[\text{lastSigPos}] > 0
\end{cases}
\]

The coefficient at lastSigPos is then modified because it is known that the original value cannot
have been 0. Modification is performed as follows:
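\[
\text{coeff}[\text{lastSigPos}] =
\begin{cases}
\text{coeff}[\text{lastSigPos}] + 1, & \text{coeff}[\text{lastSigPos}] < 0 \\
\text{coeff}[\text{lastSigPos}] - 1, & \text{coeff}[\text{lastSigPos}] > 0
\end{cases}
\]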

One final step shall occur if coeff[lastSigPos] == 0 after this modification (i.e., coeff[lastSigPos] ==
±1 before the modification). In this case, the encoder will signal signLastSigPos, using one bit to
avoid possible ambiguity. This bit will be transmitted along with the sign bits in ECG3. (For further
details regarding EC sign bits, see Section 4.7.8.)
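
The handling of the last significant coefficient can be sketched in Python as follows. The function name encode_last_sig_coeff and its return layout are hypothetical, and the treatment of an all-zero component (returned unmodified, with no extra sign bit) is an assumption, because the text above defines the sign only for non-zero values.

```python
def encode_last_sig_coeff(coeff, last_sig_pos):
    """Return (modified coefficients, signLastSigPos, extra_sign_bit_needed)."""
    value = coeff[last_sig_pos]
    if value == 0:
        # All-zero component: nothing to modify (assumption of this sketch).
        return list(coeff), 0, False
    sign = 1 if value < 0 else 0
    out = list(coeff)
    # Move the magnitude one step toward zero; the original was non-zero.
    out[last_sig_pos] = value + 1 if value < 0 else value - 1
    # If the result is 0, the original was +1 or -1 and its sign must be
    # signaled explicitly with one bit.
    return out, sign, out[last_sig_pos] == 0
```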


4.6.2 BP Mode
In Block Prediction (BP) mode, the current block is split into partitions, and each partition is
predicted from a set of spatially neighboring samples (the BPV search range). For each source
partition within the block, the encoder searches the BPV search range for a candidate partition that
minimizes distortion. This distortion is calculated per-pixel rather than per-component (i.e., a single
BPV will be used for all three color components of a given partition). The best candidate’s position
within the search range is the block prediction vector (BPV) for that partition. The BPV is signaled
explicitly within the bitstream syntax, such that the decoder can perform the same prediction
without requiring a search operation.
The BP mode syntax includes the BPVs for each partition, in addition to entropy-coded quantized
prediction residuals for each component. Figure 4-20 illustrates BP mode encoder operation.

[Figure 4-20 is a flowchart. Inputs: source block and BPV search range. Two search branches are shown: BPV search with 2x1 partitions (2 × 32x for FBLS, 2 × 64x for NFBLS) and BPV search with 2x2 partitions (32x for FBLS, 64x for NFBLS). Each branch outputs a BPV and a prediction, followed by: calculate residual, quantization, inverse quantization, calculate entropy coding rate, calculate residual distortion, and calculate RD cost. A "select minimum" step is performed for each 2x2 sub-block (4x), and the partition grid is then constructed to produce the reconstructed block.]

Figure 4-20: BP Mode Encoder Overview

BP mode uses partitioning to split the current block into two types of non-overlapping partitions,
2x2 and 2x1, as illustrated in Figure 4-21. The BPV search operation is performed for each search
range position for each partition within the block. The encoder selects a partition option for each
2x2 sub-block, based on RD cost. For example, in Figure 4-21, the leftmost 2x2 sub-block can
be represented by a single BPV (A) –or– two BPVs (A0, A1).

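[Figure 4-21 shows the two partition types for an 8x2 block:
• 2x2 block prediction partitions: A, B, C, D
• 2x1 block prediction partitions: A0, B0, C0, D0 (first row) and A1, B1, C1, D1 (second row)]

Figure 4-21: BP Mode Partitions – 2x2 and 2x1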


Figure 4-22 illustrates the BP mode prediction and reconstruction buffers.

[Figure 4-22 is a two-panel diagram, one panel for an RGB source and one for a YCbCr source. Panel labels: Source Buffer (RGB or YCbCr) with the current block; CSC; Reconstruction Buffer (RGB, YCoCg, or YCbCr) with the previous blockline containing BPV Search Range A and BPV Search Range B, and the current blockline containing BPV Search Range C and the reconstructed samples.]

Figure 4-22: BP Mode Prediction and Reconstruction Buffers

4.6.2.1 BPV Search


The BPV search range consists of reconstructed samples that neighbor the current block,
as illustrated in Figure 4-23. In total, there are:
• 32 valid positions within the search range for FBLS blocks
• 64 valid positions within the search range for NFBLS blocks

[Figure 4-23 shows the search range samples for each block type:
• FBLS block: C0 through C32 (first row) and C33 through C65 (second row) of the current blockline
• NFBLS block: A0 through A7 and B0 through B24 from the previous reconstructed line, plus C0 through C32 (first row) and C33 through C65 (second row) of the current blockline]

Figure 4-23: BPV Search Range for FBLS and NFBLS Blocks

The BPV search range is split into three parts, as described in Table 4-44.

Table 4-44: BPV Search Ranges

Search Range                   Description
bpvSearchRangeA (A0…A7)        Samples from the previous reconstructed line, above and to the left of the current block.
bpvSearchRangeB (B0…B24)       Samples from the previous reconstructed line, above and to the right of the current block.
bpvSearchRangeC (C0…C65)       Samples from the reconstructed portion of the current blockline, left of the current block.


For each sub-block, the BPV search operation shall be performed independently for all 2x2 and
2x1 partitions, as described in Table 4-45.

Table 4-45: BPV Search Operation for 2x2 and 2x1 Partitions
Description                                   Candidate Partitions    Search Ranges
Compare the 2x2 source partition for the      BPV 0 through 6         bpvSearchRangeA, bpvSearchRangeC
sub-block with all candidate 2x2 partitions   BPV 7                   bpvSearchRangeA, bpvSearchRangeB, bpvSearchRangeC
within the search range (see Figure 4-24).    BPV 8 through 31        bpvSearchRangeB (vertically replicated)
                                              BPV 32 through 63       bpvSearchRangeC

Compare the top 2x1 source partition for      BPV 0 through 6         bpvSearchRangeA
the sub-block with all candidate 2x1          BPV 7                   bpvSearchRangeA, bpvSearchRangeB
partitions (see Figure 4-25).                 BPV 8 through 31        bpvSearchRangeB
                                              BPV 32 through 63       bpvSearchRangeC (first row)

Compare the bottom 2x1 source partition for   BPV 0 through 6         bpvSearchRangeC (first row)
the sub-block with all candidate 2x1          BPV 8 through 31        bpvSearchRangeB
partitions (see Figure 4-26).                 BPV 32 through 63       bpvSearchRangeC (second row)



Figure 4-24: BPV Search Range Candidate 2x2 Partitions

[Figure 4-25 shows the search range (A0…A7 and B0…B24 above C0…C32 and C33…C65) in three panels
that mark the candidate 2x1 partitions for BPV 0 and BPV 7, BPV 8 and BPV 31, and BPV 32 and BPV 63.]

Figure 4-25: BPV Search Range Candidate 2x1 Partitions for Source Partition in First Line of Sub-block


Figure 4-26: BPV Search Range Candidate 2x1 Partitions for Source Partition in Second Line of Sub-block

For FBLS blocks, the search range shall consist of up to 32 positions because search ranges A and
B are unavailable. In this case, each BPV shall be signaled in the bitstream using five bits.
For NFBLS blocks, the search range shall consist of up to 64 positions, and six bits shall be used
to signal each BPV.
During the BPV search operation, for each 2x2 and 2x1 partition within the current block, the SAD
shall be calculated between the source partition and each candidate partition within the search
range. The candidate partition that generates the minimum SAD when compared with the source
partition shall be selected. The SAD between the source partition and candidate partition shall
be calculated over all samples within the partitions, as per Table 4-46.

Table 4-46: BPV Search SAD Calculation


Color Space BPV Search SAD Calculation
YCoCg SADY + ( (SADCo + SADCg + 1) >> 1)
YCbCr SADY + SADCb + SADCr


Figure 4-27 demonstrates the SAD calculation for a given 2x2 and 2x1 source and candidate
partition in the YCoCg color space:

4:4:4, 2x2 Partition (for each of the Y, Co, and Cg components):
    Source Partition    Candidate Partition
    A  B                E  F
    C  D                G  H

4:4:4, 2x1 Partition (for each of the Y, Co, and Cg components):
    Source Partition    Candidate Partition
    A  B                C  D

Figure 4-27: Example 2x2 and 2x1 Source and Candidate Partitions for SAD Calculation

Table 4-47: YCoCg SAD for 2x2 Partition


SADY = | AY – EY | + | BY – FY | + | CY – GY | + | DY – HY |
SADCo = | ACo – ECo | + | BCo – FCo | + | CCo – GCo | + | DCo – HCo |
SADCg = | ACg – ECg | + | BCg – FCg | + | CCg – GCg | + | DCg – HCg |
SAD2x2 = SADY + ( (SADCo + SADCg + 1) >> 1)

Table 4-48: YCoCg SAD for 2x1 Partition


SADY = | AY – CY | + | BY – DY |
SADCo = | ACo – CCo | + | BCo – DCo |
SADCg = | ACg – CCg | + | BCg – DCg |
SAD2x1 = SADY + ( (SADCo + SADCg + 1) >> 1)
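
The combination in Table 4-46 through Table 4-48 can be sketched in a few lines. The following is a minimal illustration only, not reference code: the function names, the dictionary layout, and the sample values are assumptions made for this example.

```python
def sad(source, candidate):
    """Sum of absolute differences over two equal-length sample lists."""
    return sum(abs(s - c) for s, c in zip(source, candidate))


def bpv_search_sad_ycocg(src, cand):
    """Combine per-component SADs as in the YCoCg row of Table 4-46.

    src and cand map "Y", "Co", "Cg" to sample lists
    (four samples for a 2x2 partition, two for a 2x1 partition).
    """
    sad_y = sad(src["Y"], cand["Y"])
    sad_co = sad(src["Co"], cand["Co"])
    sad_cg = sad(src["Cg"], cand["Cg"])
    # The two chroma SADs are summed, then halved with rounding.
    return sad_y + ((sad_co + sad_cg + 1) >> 1)


# Illustrative 2x2 partitions (values invented for this example).
source = {"Y": [100, 102, 98, 101], "Co": [3, -2, 0, 1], "Cg": [5, 5, 4, 6]}
candidate = {"Y": [101, 100, 98, 105], "Co": [1, -2, 2, 1], "Cg": [5, 7, 4, 6]}

# SADY = 7, SADCo = 4, SADCg = 2, so SAD2x2 = 7 + ((4 + 2 + 1) >> 1) = 10
sad_2x2 = bpv_search_sad_ycocg(source, candidate)
```

The same function covers the 2x1 case when each list holds two samples.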


4.6.2.1.1 BPV Search for 4:2:2 and 4:2:0 Chroma Components


For 4:2:2 and 4:2:0 chroma components, each partition will contain fewer samples than in the
4:4:4/luma case, as illustrated in Figure 4-28.

4:2:2, 2x2 Partition:
    Luma:  source AY BY / CY DY;  candidate EY FY / GY HY
    Cb:    source ACb / CCb;      candidate ECb / GCb
    Cr:    source ACr / CCr;      candidate ECr / GCr

4:2:0, 2x2 Partition:
    Luma:  source AY BY / CY DY;  candidate EY FY / GY HY
    Cb:    source ACb;            candidate ECb
    Cr:    source ACr;            candidate ECr

4:2:2, 2x1 Partition; 4:2:0, 2x1 Partition (Top Row):
    Luma:  source AY BY;          candidate CY DY
    Cb:    source ACb;            candidate CCb
    Cr:    source ACr;            candidate CCr

4:2:0, 2x1 Partition (Bottom Row):
    Luma:  source AY BY;          candidate CY DY
    (no chroma samples)

Figure 4-28: BP Mode Partitions for 4:2:2 and 4:2:0 Use Cases

For 4:2:2 content, a 2x2 sub-block contains four luma samples, and two samples each for Cb/Cr.
In this case, the SAD for a 2x1 partition is calculated in the same way for partitions within the
current block’s top and bottom rows.
SAD2x2 = ( | AY – EY | + | BY – FY | + | CY – GY | + | DY – HY | )
+ ( | ACb – ECb | + | CCb – GCb | ) + ( | ACr – ECr | + | CCr – GCr | )
SAD2x1 = ( | AY – CY | + | BY – DY | ) + | ACb – CCb | + | ACr – CCr |

For 4:2:0 content, a 2x2 sub-block contains four luma samples, and one sample each for Cb/Cr.
Therefore, the SAD for a 2x2 partition shall be calculated as follows:
SAD2x2 = ( | AY – EY | + | BY – FY | + | CY – GY | + | DY – HY | ) + | ACb – ECb | + | ACr – ECr |

For 2x1 partitions in 4:2:0, the chroma sample shall be aligned with the current block’s top row.
Therefore, the SAD for a 2x1 partition in the top row shall contain a chroma sample, while
the SAD for the bottom row shall not. This is necessary to ensure that the error measured for
2x2 and 2x1 partitions is consistent. The following are the SADs for a 4:2:0 2x1 partition:
SAD2x1 top = ( | AY – CY | + | BY – DY | ) + | ACb – CCb | + | ACr – CCr |
SAD2x1 bottom = ( | AY – CY | + | BY – DY | )
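
The consistency argument above can be checked numerically. The sketch below is illustrative only (function names and sample values are assumptions): when the two 2x1 candidates coincide with the rows of the 2x2 candidate, the top and bottom 4:2:0 SADs add up to the 2x2 SAD, because the single chroma sample pair is counted once, in the top row.

```python
def abs_diff_sum(a, b):
    """Sum of absolute differences over two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def sad_420_2x2(src_y, cand_y, src_cb, cand_cb, src_cr, cand_cr):
    # Four luma samples plus one Cb and one Cr sample.
    return abs_diff_sum(src_y, cand_y) + abs(src_cb - cand_cb) + abs(src_cr - cand_cr)


def sad_420_2x1_top(src_y, cand_y, src_cb, cand_cb, src_cr, cand_cr):
    # Two luma samples plus the chroma samples, which are aligned with the top row.
    return abs_diff_sum(src_y, cand_y) + abs(src_cb - cand_cb) + abs(src_cr - cand_cr)


def sad_420_2x1_bottom(src_y, cand_y):
    # Two luma samples only; no chroma term.
    return abs_diff_sum(src_y, cand_y)


# Invented sample values: luma in raster order (top row, then bottom row).
src_y, cand_y = [50, 52, 49, 47], [51, 52, 45, 47]
src_cb, cand_cb, src_cr, cand_cr = 120, 118, 130, 131

full = sad_420_2x2(src_y, cand_y, src_cb, cand_cb, src_cr, cand_cr)
top = sad_420_2x1_top(src_y[:2], cand_y[:2], src_cb, cand_cb, src_cr, cand_cr)
bottom = sad_420_2x1_bottom(src_y[2:], cand_y[2:])
```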


Each portion of the search range shall contain half as many samples for 4:2:2 and 4:2:0 FBLS
and NFBLS chroma components, as illustrated in Figure 4-29.

4:2:2 FBLS:
    C0  C1  ... C12 C13 C14 C15 C16
    C17 C18 ... C29 C30 C31 C32 C33

4:2:2 NFBLS:
    A0  A1  A2  A3  B0  B1  B2  B3  B4  ... B11 B12
    C0  C1  ... C12 C13 C14 C15 C16
    C17 C18 ... C29 C30 C31 C32 C33

4:2:0 FBLS:
    C0  C1  ... C12 C13 C14 C15 C16

4:2:0 NFBLS:
    A0  A1  A2  A3  B0  B1  B2  B3  B4  ... B11 B12
    C0  C1  ... C12 C13 C14 C15 C16

Figure 4-29: BPV Search Range for 4:2:2 and 4:2:0 FBLS and NFBLS Chroma Components

For a given luma search range position (srPos) within a search range, the chroma search range
position (srPosChroma) is calculated as follows, where the offset in search range C is due
to the one-sample shift of search range C relative to A and B:
srPosChroma = (srPos + 1) >> 1,   if srId == 2
            = srPos >> 1,         otherwise
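
As a small sketch of this mapping (the function name is an assumption; srId == 2 is taken to denote search range C, as the surrounding text implies):

```python
def sr_pos_chroma(sr_pos, sr_id):
    """Map a luma search range position to the chroma search range position."""
    if sr_id == 2:
        # Search range C: compensate for its one-sample shift relative to A and B.
        return (sr_pos + 1) >> 1
    return sr_pos >> 1


positions_c = [sr_pos_chroma(p, 2) for p in range(6)]   # [0, 1, 1, 2, 2, 3]
positions_ab = [sr_pos_chroma(p, 0) for p in range(6)]  # [0, 0, 1, 1, 2, 2]
```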


4.6.2.1.2 BPV Search Range at Slice Boundaries


The BPV search range shall shift by 8 pixels per block time along with the current block,
as illustrated in Figure 4-30. This is straightforward for blocks that are located within the
middle of the slice. However, for blocks that are located near the left and right slice boundaries,
certain search range positions may change. For example, in the last three blocks
of a blockline, certain search range positions for bpvSearchRangeB will be within the
current blockline’s reconstructed portion, at the slice’s left edge.

[Figure 4-30 shows successive block positions across a blockline, from the slice’s left edge to
its right edge, with the current block and the positions of bpvSearchRangeA, bpvSearchRangeB,
and bpvSearchRangeC marked for each position.]

Figure 4-30: BPV Search Range Positions at Slice Boundaries


4.6.2.2 Residuals
BP mode shall calculate two residual sub-blocks for each sub-block within the current block,
as described in Table 4-49. The combination of all 2x2 residual sub-blocks will give the
2x2 residual block R2x2(i, j). The combination of all 2x1 residual sub-blocks will give
the 2x1 residual block R2x1(i, j).

Table 4-49: Residual Sub-blocks


Symbol         Description                                            Definition
R̃2x2(i, j)    Calculated as the difference between the source        R̃2x2(i, j) = X(i, j) – P2x2(i, j)
               sub-block and the 2x2 candidate partition that is
               selected for the sub-block during a BPV search.
               (See Figure 4-24.)
R̃2x1(i, j)    Calculated as the difference between the source        R̃2x1(i, j) = X(i, j) – P2x1(i, j)
               sub-block and the two candidate 2x1 partitions that
               are selected for the sub-block.
               (See Figure 4-25 and Figure 4-26.)

where:
• R̃ is a temporary BP mode variable

4.6.2.3 Quantization
BP mode shall use the fractional quantizer, as described in Section 4.8.1.3, for 2x2 and
2x1 residual blocks, as follows:
Q [R2x2(i, j)] = Rq2x2(i, j)
Q [R2x1(i, j)] = Rq2x1(i, j)


4.6.2.4 EC
BP mode quantized residual entropy coding shall be conducted as described in Section 4.7 with the
entropy coding group sample distributions illustrated in Figure 4-31 and Figure 4-32. This shall be
performed for each of the 16 possible partition combinations described in Section 4.6.2.5.

Quantized BP Residuals:
    S0  S1  S2  S3  S4  S5  S6  S7
    S8  S9  S10 S11 S12 S13 S14 S15

EC Group 0: S0 S1 S8  S9
EC Group 1: S2 S3 S10 S11
EC Group 2: S4 S5 S12 S13
EC Group 3: S6 S7 S14 S15

Figure 4-31: BP Mode ECG Structure, All Components (4:4:4) or Luma Component Only (4:2:2 and 4:2:0)

4:2:2:
    S0 S1 S2 S3
    S4 S5 S6 S7
    EC Group 0: S0 S1 S4 S5
    EC Group 1: S2 S3 S6 S7

4:2:0:
    S0 S1 S2 S3
    EC Group 0: S0 S1 S2 S3

Figure 4-32: BP Mode ECG Structure, Chroma Components (4:2:2 and 4:2:0)


4.6.2.5 Partition Decision


The encoder shall determine the rate, distortion, and RD cost for each of the 16 possible partition
grids shown in Figure 4-33, where each partition grid is constructed from the corresponding
sub-blocks. For example, the entropy coding rate for Partition Grid 7 shall be calculated based
on the quantized residuals Rq2x1 for Sub-block 0 and the quantized residuals Rq2x2 for the
remaining sub-blocks. The rate shall include the entropy coding rate, as well as the BPV signaling
cost and all header bits. The SAD shall be calculated in the YCoCg color space for RGB input
(see Section 4.6), or in the YCbCr color space for YCbCr input.

[Figure 4-33 shows sixteen partition grids, numbered 0 through 15. Each grid assigns either one
2x2 partition or a pair of 2x1 partitions to each of the four sub-blocks. Grid 0 uses 2x1
partitions for all four sub-blocks, and Grid 15 uses 2x2 partitions for all four sub-blocks.]

Figure 4-33: Possible Partition Grids (16) for BP Partition Selection


The final partition information is selected by minimizing the RD cost among all 16 possible
combinations, as described in Table 4-50. The minimum rate is used to break ties.

Table 4-50: BP Mode Partition Selection


bestGrid = 0
minGridRdCost = largeInt
minGridRate = largeInt
for (grid = 0; grid < 16; grid++) {
    conditionA = (partitionGridRdCost[grid] < minGridRdCost)
    conditionB = (partitionGridRdCost[grid] == minGridRdCost) && (partitionGridRate[grid] < minGridRate)
    if (conditionA || conditionB) {
        minGridRdCost = partitionGridRdCost[grid]
        minGridRate = partitionGridRate[grid]
        bestGrid = grid
    }
}

Sub-block 0 = 2x2,   if ( (bestGrid & 0x8) >> 3) == 1
            = 2x1,   otherwise

Sub-block 1 = 2x2,   if ( (bestGrid & 0x4) >> 2) == 1
            = 2x1,   otherwise

Sub-block 2 = 2x2,   if ( (bestGrid & 0x2) >> 1) == 1
            = 2x1,   otherwise

Sub-block 3 = 2x2,   if (bestGrid & 0x1) == 1
            = 2x1,   otherwise

where:
• largeInt = 999999
• partitionGridRdCost[grid] = Associated partition grid’s RD cost
• partitionGridRate[grid] = Associated partition grid’s rate
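
The selection loop and bit mapping of Table 4-50 translate directly into runnable form. The following is a sketch for illustration, not normative text; the function names and the cost and rate values are invented for this example.

```python
LARGE_INT = 999999


def select_partition_grid(rd_costs, rates):
    """Return the grid index with minimum RD cost; the lower rate breaks ties."""
    best_grid = 0
    min_cost = LARGE_INT
    min_rate = LARGE_INT
    for grid in range(16):
        lower_cost = rd_costs[grid] < min_cost
        tie_lower_rate = rd_costs[grid] == min_cost and rates[grid] < min_rate
        if lower_cost or tie_lower_rate:
            min_cost = rd_costs[grid]
            min_rate = rates[grid]
            best_grid = grid
    return best_grid


def partitions_for_grid(grid):
    """Bit 3 of the grid index selects Sub-block 0, bit 0 selects Sub-block 3."""
    return ["2x2" if (grid >> (3 - n)) & 1 else "2x1" for n in range(4)]


# Invented costs: Grids 7 and 13 tie on RD cost, Grid 13 has the lower rate.
rd_costs = [500] * 16
rates = [200] * 16
rd_costs[7], rates[7] = 300, 180
rd_costs[13], rates[13] = 300, 170

best = select_partition_grid(rd_costs, rates)   # 13
layout = partitions_for_grid(best)              # ["2x2", "2x2", "2x1", "2x2"]
```

For Grid 7, partitions_for_grid returns 2x1 for Sub-block 0 and 2x2 for the remaining sub-blocks, which matches the Partition Grid 7 example in the text.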


4.6.2.6 BPV Signaling


BPVs shall be signaled explicitly by the encoder. In addition, a table (bpvTable) shall be signaled
to define the partition selection. bpvTable shall be signaled as a fixed-length 4-bit field in
Substream 0. Each bit corresponds to one of the four sub-blocks within the current block.
A signal with a value of 1 corresponds to a 2x2 partition. A signal with a value of 0 corresponds
to a 2x1 partition. For example, consider a block in which the encoder selects the following
partition types for the four sub-blocks:
2x2, 2x2, 2x1, 2x2

In this case, bpvTable shall be signaled as 1101 in Substream 0.


The BPVs for each sub-block shall be signaled in the corresponding substream. The syntax for the
above example shall be as follows:
• Sub-block 0 – Signal one BPV in Substream 0
• Sub-block 1 – Signal one BPV in Substream 1
• Sub-block 2 – Signal two BPVs in Substream 2
• Sub-block 3 – Signal one BPV in Substream 3
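
The example above can be reproduced with a short sketch. The helper names are hypothetical, and the bit string is written in the same order as the example (Sub-block 0 first). The per-BPV bit counts are those stated in Section 4.6.2.1: five bits for FBLS blocks and six bits for NFBLS blocks.

```python
def bpv_table_bits(partitions):
    """Build the 4-bit bpvTable string: 1 for a 2x2 partition, 0 for 2x1."""
    return "".join("1" if p == "2x2" else "0" for p in partitions)


def bpvs_per_substream(partitions):
    """A 2x2 partition needs one BPV; a pair of 2x1 partitions needs two."""
    return {n: (1 if p == "2x2" else 2) for n, p in enumerate(partitions)}


def bpv_signaling_bits(partitions, is_fbls):
    """Total bits spent on BPVs (excluding the 4-bit bpvTable itself)."""
    bits_per_bpv = 5 if is_fbls else 6
    return sum(bpvs_per_substream(partitions).values()) * bits_per_bpv


partitions = ["2x2", "2x2", "2x1", "2x2"]
table = bpv_table_bits(partitions)        # "1101"
counts = bpvs_per_substream(partitions)   # {0: 1, 1: 1, 2: 2, 3: 1}
```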


4.6.3 MPP Mode


In Midpoint Prediction (MPP) mode, each sample in the current block is predicted from a
“midpoint” value (mp) which is calculated from the block’s previously reconstructed neighbors.
As with BP mode, the MPP block is split into 2x2 sub-blocks, with each sub-block being
completely independent. The error in each prediction is diffused to neighboring samples within
the same sub-block. Figure 4-34 illustrates the MPP mode prediction and reconstruction buffers.

[Figure 4-34 shows three panels: MPP Mode (RGB), MPP Mode (YCbCr), and MPP Mode (YCoCg). Each
panel shows the source buffer and current block, the midpoint source for NFBLS blocks (the
previous blockline of the reconstruction buffer), and the midpoint source for FBLS blocks (the
reconstructed portion of the current blockline). CSC blocks mark the points at which RGB data is
converted to or from YCoCg.]

Figure 4-34: MPP Mode Prediction and Reconstruction Buffers


4.6.3.1 MPP Mode Step Size


MPP mode modifies the rate control-maintained QP before the QP is used to calculate the step size.
The steps in Table 4-51 are performed to determine mppQp as a function of the rate control QP
and the current buffer fullness state. The value bppIndex is obtained from the compressed bit rate,
as illustrated in Figure 4-35.

Table 4-51: mppQp Calculation


minBpp = 64
bppIndex = clip (0, 7, (bits_per_pixel – minBpp + 15) >> 4)

if ( (rcFullness >> 13) == 7) {
    maxQp = mppMaxQpLut[chromaFormat][bppIndex]
    offset = 16
} else if ( (rcFullness >> 13) == 6) {
    maxQp = mppMaxQpLut[chromaFormat][bppIndex] – 8
    offset = 8
} else {
    maxQp = max_qp_lut[rcFullness >> 13]
    offset = 0
}
mppQp = clip (minQp, maxQp, qp + offset)
if (underflowPrevention && (chromaFormat == 420 || chromaFormat == 422) ) {
    mppQp = min (mppQp, flatness_qp_very_flat_fbls)
}

bppIndex 0 1 2 3 4 5 6 7

bpp 64 80 96 112 128 144 160

Figure 4-35: MPP Mode Bits per Pixel to bppIndex Mapping
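
The bppIndex line of Table 4-51 can be exercised on its own. This is a sketch with assumed helper names; clip takes its arguments in the same order as in the pseudocode (lower bound, upper bound, value).

```python
def clip(lo, hi, value):
    """Clamp value to the inclusive range [lo, hi]."""
    return max(lo, min(hi, value))


def bpp_index(bits_per_pixel, min_bpp=64):
    """bppIndex as computed in the first two lines of Table 4-51."""
    return clip(0, 7, (bits_per_pixel - min_bpp + 15) >> 4)


# Values at and just above the boundaries listed in Figure 4-35.
indices = {bpp: bpp_index(bpp) for bpp in (64, 65, 80, 81, 160, 161)}
```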

The scalar quantizer used by MPP mode has a step size that is determined by the value
mppStepSize, which will be signaled explicitly in the bitstream, as follows:
mppStepSize = clip (mppMinStepSize, bpc – 1, ( (mppQp – 16) >> 3) + (bpc – 8) )
where:
• mppMinStepSize is determined as described in Table 4-52


Table 4-52: mppMinStepSize for Various Bit Depths and ssm_max_se_size


Bits per Component    Chroma Sampling Format
                      4:4:4                             4:2:2    4:2:0
8                     0                                 0        0
10                    1 if ssm_max_se_size == 128       0        0
                      0 if ssm_max_se_size == 144
12                    3 if ssm_max_se_size == 128       0        0
                      1 if ssm_max_se_size == 144
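
The mppStepSize formula and Table 4-52 can be combined into a short sketch. The function names and the string encoding of the chroma format are assumptions made for this illustration; the table values are those listed above.

```python
def clip(lo, hi, value):
    """Clamp value to the inclusive range [lo, hi]."""
    return max(lo, min(hi, value))


def mpp_min_step_size(bpc, chroma_format, ssm_max_se_size):
    """Look up mppMinStepSize as listed in Table 4-52."""
    if chroma_format != "4:4:4" or bpc == 8:
        return 0
    table = {10: {128: 1, 144: 0}, 12: {128: 3, 144: 1}}
    return table[bpc][ssm_max_se_size]


def mpp_step_size(mpp_qp, bpc, min_step_size):
    """mppStepSize = clip(mppMinStepSize, bpc - 1, ((mppQp - 16) >> 3) + (bpc - 8))."""
    return clip(min_step_size, bpc - 1, ((mpp_qp - 16) >> 3) + (bpc - 8))


# 8 bpc, 4:4:4, mppQp = 40: ((40 - 16) >> 3) + 0 = 3
step_8bpc = mpp_step_size(40, 8, mpp_min_step_size(8, "4:4:4", 128))
# 12 bpc, 4:4:4, mppQp = 0: (-16 >> 3) + 4 = 2, raised to the minimum of 3
step_12bpc = mpp_step_size(0, 12, mpp_min_step_size(12, "4:4:4", 128))
```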

In the RGB color space, mppStepSize is used as the quantization step size for all three color
components. In the YCoCg color space, a remapping is performed, using the fixed tables
mppStepSizeMapCo and mppStepSizeMapCg:

mppStepSizeComp = mppStepSizeMapCo[mppStepSize],   if mppColorSpace == 1, Co component
                = mppStepSizeMapCg[mppStepSize],   if mppColorSpace == 1, Cg component
                = mppStepSize,                     otherwise

After the encoder determines mppStepSize, the encoder shall calculate a midpoint value for each
2x2 sub-block, as described in Section 4.6.3.2.
The number of bits per sample in each component for MPP mode (mppBitsPerComp) is calculated
for each component, as follows:
bitDepth = bpc + 1,   for the Co or Cg component
         = bpc,       otherwise
mppBitsPerComp[k] = (bitDepth – mppStepSizeComp[k])

Quantized MPP residuals within the bitstream shall be signaled as unsigned values. After
the quantized residuals are calculated as described in Section 4.6.3.3, they are mapped to unsigned
values by subtracting the value minCode, where minCode = - (1 << (mppBitsPerComp – 1) ).
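
A brief sketch of the bit-count and unsigned mapping described above follows. The function names are assumptions, and the per-component step size is passed in directly because the contents of mppStepSizeMapCo and mppStepSizeMapCg are not given in this section.

```python
def mpp_bits_per_comp(bpc, step_size_comp, is_co_or_cg):
    """mppBitsPerComp = bitDepth - mppStepSizeComp, with bitDepth = bpc + 1 for Co/Cg."""
    bit_depth = bpc + 1 if is_co_or_cg else bpc
    return bit_depth - step_size_comp


def to_unsigned(quantized_residual, bits_per_comp):
    """Subtract minCode = -(1 << (bits - 1)) so the signaled value is non-negative."""
    min_code = -(1 << (bits_per_comp - 1))
    return quantized_residual - min_code


bits = mpp_bits_per_comp(bpc=8, step_size_comp=3, is_co_or_cg=False)  # 5
# With 5 bits, signed residuals -16..15 map to unsigned codes 0..31.
codes = [to_unsigned(r, bits) for r in (-16, 0, 15)]
```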

4.6.3.2 Calculate Midpoints


The encoder shall calculate a midpoint value for each 2x2 sub-block within the current block. This
calculation will depend on the current block’s position within the slice, as described in Table 4-53.
The mapping between reference data and color space for MPP mode is shown in Figure 4-34. CSC
blocks are indicated where necessary to convert reference data to the required color space. There is
one special case for NFBLS blocks that are tested within the YCoCg color space. In this case, the
mean value is calculated from the previous reconstructed line data (within the RGB color space),
and CSC is used to convert the final mean from RGB to YCoCg.


Table 4-53: MPP Mode Midpoint Calculation for Sub-blocks within a Given Component k

middle = 0,                      for the Co or Cg component
       = 1 << (bitDepth – 1),    otherwise
bias = 0,                             if mppStepSize[k] == 0
     = 1 << (mppStepSize[k] – 1),     otherwise
for (subblock = 0; subblock < 4; subblock++) {
    if (chroma format is 420 or 422 && k > 0 && subblock ≥ 2) {
        mean = 0
    } else {
        sbx = subblock << 1
        if (first block in slice) {
            mean = middle
        } else if (FBLS block) {
            if (chroma format is 420 && k > 0) {
                mean = ( Xprev(sbx, 0) + Xprev(sbx + 1, 0) ) >> 1
            } else {
                mean = ( Xprev(sbx, 0) + Xprev(sbx + 1, 0) + Xprev(sbx, 1) + Xprev(sbx + 1, 1) ) >> 2
            }
        } else if (NFBLS block) {
            mean = ( XprevRecLine(sbx, 0) + XprevRecLine(sbx + 1, 0) ) >> 1
            if (original color space is RGB and testing in YCoCg) {
                perform color space conversion on mean (RGB → YCoCg)
            }
        }
    }
    maxClip = min ( (1 << bitDepth) – 1, middle + 2 × bias)
    mp = clip (middle, maxClip, mean + 2 × bias)
}
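
The final two lines of Table 4-53 can be transcribed literally, as in the sketch below. The function name and argument layout are assumptions; the mean is taken as an input rather than derived from neighboring samples, and clip takes (lower bound, upper bound, value) as in the pseudocode.

```python
def clip(lo, hi, value):
    """Clamp value to the inclusive range [lo, hi]."""
    return max(lo, min(hi, value))


def midpoint(mean, bit_depth, step_size, is_co_or_cg):
    """Literal transcription of the middle, bias, maxClip, and mp lines of Table 4-53."""
    middle = 0 if is_co_or_cg else 1 << (bit_depth - 1)
    bias = 0 if step_size == 0 else 1 << (step_size - 1)
    max_clip = min((1 << bit_depth) - 1, middle + 2 * bias)
    return clip(middle, max_clip, mean + 2 * bias)


# 8-bit non-Co/Cg component, step size 3: middle = 128, bias = 4, maxClip = 136
mp_inside = midpoint(125, 8, 3, False)   # 125 + 8 = 133, within [128, 136]
mp_low = midpoint(100, 8, 2, False)      # 104 is below middle, clipped to 128
mp_high = midpoint(130, 8, 2, False)     # 134 is above maxClip = 132
```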


For blocks that are located within the first blockline of a slice, each sub-block’s mp is calculated
from the corresponding sub-block within the previous reconstructed block. (See Figure 4-36, top.)
For NFBLS blocks, the immediate vertical neighbors of each sub-block are used instead.
(See Figure 4-36, bottom.) For the first block within a slice, no spatial neighbors will be used.

FBLS:
    Previous Reconstructed Block    Current Block
    Arec Brec ...                   A B ...
    Crec Drec ...                   C D ...

NFBLS:
    Previous Reconstructed Line:    x y ...
    Current Block:                  A B ...
                                    C D ...

Figure 4-36: MPP Mode Midpoint Is Calculated from the Current Block’s Reconstructed Spatial Neighbors


4.6.3.3 Prediction, Quantization, Inverse Quantization, Reconstruction, and Error Diffusion
MPP mode uses a scalar quantizer, which is described in Section 4.8.2. In the following, A, B, C,
and D are the four samples within a 2x2 sub-block, and mp is the corresponding midpoint value.
The prediction error is diffused within each 2x2 sub-block, as illustrated in Figure 4-37, if
allowErrorDiffusion is true:
allowErrorDiffusion = true,    if (bpc – mppStepSizeComp) ≤ mppErrorDiffusionThreshold
                    = false,   otherwise

where:
• mppErrorDiffusionThreshold = 3

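[Figure 4-37 shows the 2x2 sub-block (A and B in the top row, C and D in the bottom row) at
successive stages labeled Predict, Update, Diffuse Error, and Reconstructed. The samples are
replaced one at a time by their reconstructed values, in the order Arec, Brec, Crec, Drec.]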

Figure 4-37: MPP Mode Error Diffusion within a 2x2 Sub-block


Table 4-54 describes the processes of prediction, quantization, inverse quantization, reconstruction,
and error diffusion.

Table 4-54: MPP Mode Prediction, Quantization, Inverse Quantization, Reconstruction, and Error Diffusion for a 2x2 Sub-block

Ares = A – mp
Aq = Q[Ares]
Arec = Q-1[Aq] + mp
sampleError = A – Arec
B' = B + ( (sampleError + 1) >> 1),   if allowErrorDiffusion
   = B,                               otherwise
C' = C + ( (sampleError + 1) >> 1),   if allowErrorDiffusion
   = C,                               otherwise
if (chroma == 444 || comp == 0) {
    Bres = B' – mp
    Bq = Q[Bres]
    Brec = Q-1[Bq] + mp
    sampleError = B – Brec
    C'' = C' + ( (sampleError + 1) >> 1),   if allowErrorDiffusion
        = C',                               otherwise
    D' = D + ( (sampleError + 1) >> 1),     if allowErrorDiffusion
       = D,                                 otherwise
}
if (chroma == 444 || chroma == 422 || comp == 0) {
    Cres = C'' – mp
    Cq = Q[Cres]
    Crec = Q-1[Cq] + mp
    sampleError = C – Crec
    D'' = D' + ( (sampleError + 1) >> 1),   if allowErrorDiffusion
        = D',                               otherwise
}
if (chroma == 444 || comp == 0) {
    Dres = D'' – mp
    Dq = Q[Dres]
    Drec = Q-1[Dq] + mp
}
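
The sequence in Table 4-54 can be traced for the 4:4:4 (or luma) case with the sketch below. The quantize and dequantize functions are hypothetical stand-ins, because the scalar quantizer of Section 4.8.2 is not reproduced here; they assume simple rounding with a power-of-two step and no clipping. Only the ordering of prediction, reconstruction, and diffusion follows the table.

```python
def quantize(residual, step_size):
    """Hypothetical stand-in for Q[]: round to nearest multiple of 2**step_size."""
    if step_size == 0:
        return residual
    offset = 1 << (step_size - 1)
    if residual >= 0:
        return (residual + offset) >> step_size
    return -((-residual + offset) >> step_size)


def dequantize(level, step_size):
    """Hypothetical stand-in for Q-1[]."""
    return level * (1 << step_size)


def mpp_subblock_444(a, b, c, d, mp, step_size, allow_error_diffusion):
    """Follow the order of Table 4-54 when all four samples are coded."""
    def code(sample):
        level = quantize(sample - mp, step_size)
        return level, dequantize(level, step_size) + mp

    def diffuse(sample, error):
        return sample + ((error + 1) >> 1) if allow_error_diffusion else sample

    a_q, a_rec = code(a)
    err = a - a_rec
    b1, c1 = diffuse(b, err), diffuse(c, err)

    b_q, b_rec = code(b1)
    err = b - b_rec            # the table uses the original B here
    c2, d1 = diffuse(c1, err), diffuse(d, err)

    c_q, c_rec = code(c2)
    err = c - c_rec            # the table uses the original C here
    d2 = diffuse(d1, err)

    d_q, d_rec = code(d2)
    return (a_q, b_q, c_q, d_q), (a_rec, b_rec, c_rec, d_rec)


# A flat sub-block of value 130 with mp = 128 and step size 2.
with_diffusion = mpp_subblock_444(130, 130, 130, 130, 128, 2, True)
without_diffusion = mpp_subblock_444(130, 130, 130, 130, 128, 2, False)
```

In this invented example the reconstructions are (132, 128, 132, 132) with diffusion and (132, 132, 132, 132) without it, so diffusion moves the sub-block average from 132 toward the source value of 130.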


4.6.3.4 Color Space Selection


For RGB source content, MPP mode is tested in both the RGB and YCoCg color spaces. The
YCoCg reconstructed block will be color-space converted back to RGB so that distortion for both
modes is calculated in the RGB color space. The encoder will select the color space that generates
the minimum distortion (i.e., the RGB color space shall be selected if modeDistortionRgb ≤
modeDistortionYCoCg). A 1-bit flag mppColorSpace is signaled in the bitstream so that the
decoder does not need to compare the two color spaces.
For YCbCr source content, MPP mode is tested only in the YCbCr color space. As such, distortion
comparison is not required, and the syntax does not require the mppColorSpace flag.

4.6.3.5 Substream Mapping


The MPP quantized residuals will be signaled in the bitstream, uniformly distributed among the
substreams, as described in Table 4-55. For example, if MPP mode is being used in the RGB
color space, the first four samples from each component will be signaled in Substream 0, while
the remaining 12 samples of Components 0, 1, and 2 will be signaled in Substreams 1, 2, and 3,
respectively.

Table 4-55: MPP Mode Quantized Residual Distribution among Substreams 0 through 3
Chroma Format     Substream 0           Substream 1            Substream 2             Substream 3
4:4:4,            Component 0[0 – 3]    Component 0[4 – 15]    Component 1[4 – 15]     Component 2[4 – 15]
12 samples/ssm    Component 1[0 – 3]
                  Component 2[0 – 3]
4:2:2,            Component 0[0 – 7]    Component 0[8 – 15]    Component 1[0 – 7]      Component 2[0 – 7]
8 samples/ssm
4:2:0,            Component 0[0 – 5]    Component 0[6 – 11]    Component 0[12 – 15]    Component 1[2 – 3]
6 samples/ssm                                                  Component 1[0 – 1]      Component 2[0 – 3]
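
The uniform distribution in Table 4-55 can be verified with a small data-structure sketch. The dictionary name and tuple layout (component, first sample, last sample) are assumptions made for this illustration; the ranges are copied from the table.

```python
# Each substream lists (component, first_sample, last_sample), inclusive.
MPP_SUBSTREAM_MAP = {
    "4:4:4": [
        [(0, 0, 3), (1, 0, 3), (2, 0, 3)],
        [(0, 4, 15)],
        [(1, 4, 15)],
        [(2, 4, 15)],
    ],
    "4:2:2": [
        [(0, 0, 7)],
        [(0, 8, 15)],
        [(1, 0, 7)],
        [(2, 0, 7)],
    ],
    "4:2:0": [
        [(0, 0, 5)],
        [(0, 6, 11)],
        [(0, 12, 15), (1, 0, 1)],
        [(1, 2, 3), (2, 0, 3)],
    ],
}


def samples_per_substream(chroma_format):
    """Count the quantized residuals carried by each of Substreams 0 through 3."""
    return [
        sum(last - first + 1 for _, first, last in entries)
        for entries in MPP_SUBSTREAM_MAP[chroma_format]
    ]


counts = {fmt: samples_per_substream(fmt) for fmt in MPP_SUBSTREAM_MAP}
```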


4.6.4 MPPF Mode


MPPF mode is mostly identical to MPP mode, with the exception that the step sizes used are fixed,
rather than being tied to the QP. As with MPP mode, MPPF mode is tested in both the RGB and
YCoCg color spaces for RGB input. The number of bits that need to be allocated to each sample is
derived from the PPS parameter mppf_bits_per_comp value, as illustrated in Figure 4-38. For
example, mppfBitsPerComp for the red component is in DW20[23:20].

[Figure 4-38 shows the 32-bit PPS word DW20 (bytes B80 through B83, bits 31 down to 0). Its
fields, from most significant to least significant, are RESERVED, R/Y, G/Cb, B/Cr, Y, Co, and Cg.]

Figure 4-38: MPPF Mode Bits per Component Is Defined in PPS Parameter mppf_bits_per_comp

In the PPS, mppfBitsPerComp must be specified such that the maximum rate is strictly less than
the average block rate (avgBlockBits). An MPPF block’s maximum rate is stored in the parameter
minBlockBits, which the rate controller uses to ensure that at least one mode is available to
encoder mode selection during each block time. (See Section 4.9.) The parameter maxHeaderLen
is eight bits for RGB source content, and seven bits for YCbCr source content. This is the sum of
the maximum mode header (four bits), the maximum flatness header (three bits), and, for RGB source
content only, one bit to signal the MPPF color space (mppfColorSpace).

minBlockBits = ( k ∈ [0, 2 ]
compSamples[k] × mppfBitsPerComp[k] ) + maxHeaderLen

For RGB source content, mppf_bits_per_comp shall be specified such that the following holds:


k ∈ [0, 2 ]
mppfBitsPerCompRgb[k] ≥ 
k ∈ [0, 2 ]
mppfBitsPerCompYCoCg[k]
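
As a worked sketch of these two expressions, assuming each operator over k ∈ [0, 2] is a summation: the function names and all numeric values below are invented for illustration, and the figure of 16 samples per component assumes a 4:4:4 block with samples S0 through S15, as in Figure 4-31.

```python
def min_block_bits(comp_samples, mppf_bits_per_comp, max_header_len):
    """Sum of samples x bits per component, plus the maximum header length."""
    return sum(n * b for n, b in zip(comp_samples, mppf_bits_per_comp)) + max_header_len


def rgb_constraint_holds(bits_rgb, bits_ycocg):
    """Check that the RGB bit allocation total is at least the YCoCg total."""
    return sum(bits_rgb) >= sum(bits_ycocg)


# RGB source content: maxHeaderLen is 8 bits (4 mode + 3 flatness + 1 color space).
block_bits = min_block_bits([16, 16, 16], [4, 4, 4], 8)   # 16*12 + 8 = 200
constraint_ok = rgb_constraint_holds([4, 4, 4], [5, 3, 3])
```

A PPS would then need avgBlockBits to be strictly greater than this minBlockBits value.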

The remainder of MPPF mode is identical to MPP mode, with the following mapping between
mppfBitsPerComp and mppfStepSizeComp:
bitDepth = bpc + 1,   for the Co or Cg component
         = bpc,       otherwise
mppfStepSizeComp[k] = bitDepth – mppfBitsPerComp[k]

As with MPP mode, the encoder shall select the color space that generates the minimum
distortion. For MPPF mode, the RGB color space shall be selected if modeDistortionRgb <
modeDistortionYCoCg. The selected color space (mppfColorSpace) is then signaled explicitly
in the bitstream.
Quantized MPPF residuals within the bitstream shall be signaled as unsigned values. After
the quantized residuals are calculated as described in Section 4.6.3.3, they are mapped to unsigned
values by subtracting the value minCode, where minCode = - (1 << (mppfBitsPerComp – 1) ).


4.6.5 BP-SKIP Mode


Block Prediction Skip (BP-SKIP) mode is a variant of BP mode. Whereas the quantized residuals
are entropy coded and transmitted in BP mode, they are skipped in BP-SKIP mode. For this reason,
BP-SKIP mode is typically executed as a fallback mode with low rate and high distortion.
Table 4-56 lists the syntax differences between BP and BP-SKIP modes.

Table 4-56: BP and BP-SKIP Mode Syntax Differences


Syntax                        BP Mode    BP-SKIP Mode
BPV Table                     ✔          ✔
BPV                           ✔          ✔
ECGs (Quantized Residuals)    ✔          –

BPV search is not repeated for BP-SKIP mode. Instead, the residuals and quantized residuals
determined in BP mode may be directly re-used. The procedure described in Section 4.6.2.5 for
BP partition selection is repeated for BP-SKIP; however, in this case, the rate is determined solely
by header bits and BPV signaling.


4.7 Entropy Encoding


Syntax for the following modes shall include ECGs, which are composed as follows:
• Transform mode – ECGs contain quantized transform coefficients
• BP mode – ECGs contain quantized residuals

There are four ECGs for each component, indexed by ecgIdx. The contents of each ECG shall
depend on the coding mode, chroma format, component skip, component data, and whether
rate buffer underflow prevention is enabled. Each ECG contains between 0 and ecgMaxBits bits,
where ecgMaxBits is equal to 50, and is constructed as described in Section 4.7.2.
Each component’s ECGs shall be transmitted within a separate substream, as follows:
• Substream 0 – No ECGs
• Substream 1 – ECGs for Component 0
• Substream 2 – ECGs for Component 1
• Substream 3 – ECGs for Component 2

4.7.1 Component Skip


If all samples within a component are 0, a component skip flag shall be signaled within the
bitstream. The component skip flag’s value shall depend on the block mode, as described
in Table 4-57. Transform mode’s component skip flag shall be one bit larger to ensure that
ssmMinSeSize is equal to two bits.

Table 4-57: Component Skip Flags


Mode Component Skip Inactive Component Skip Active
Transform 0 (One bit) 11 (Two bits)
BP 0 (One bit) 1 (One bit)


4.7.2 ECG Construction


Each ECG contains between 0 and ecgMaxBits bits, where ecgMaxBits is equal to 50. During the
encoder’s testing phase, any mode that generates an ECG that is larger than ecgMaxBits shall
be invalidated for the current block time. In Transform mode, this will invalidate only the current
intra predictor that is being tested, and others may still be used as long as they do not violate this
constraint. Several factors allow for a 0-bit ECG:
• Component skip is active
• Component is a 4:2:2 or 4:2:0 chroma component
• Transform mode component with a small lastSigPos value (see Section 4.6.1.5.1)

The bits that comprise each ECG are divided into three categories, as described in Table 4-58.

Table 4-58: ECG Construction


Category         Bit Description
Data             Required if at least one bit of information needs to be signaled for the samples
                 within the group. The indicator ecgDataActive is used to determine whether the
                 data portion is necessary. (See Table 4-59 and Table 4-60.) If
                 ecgDataActive[ecgIdx] is true, the ECG’s data portion will be signaled using some
                 combination of the following:
                 • Group Skip flag (see Section 4.7.3)
                 • Unary prefix (see Section 4.7.4)
                 • Common-prefix Entropy coding (CPEC; see Section 4.7.5)
                 • Vector Entropy coding (VEC; see Section 4.7.6)
Stuffing Bits    If underflowPrevention is enabled, fixed-length stuffing words are present in the
                 last three ECGs within each component. (See Section 4.7.7.)
Sign Bits        Sign bits for all non-zero samples within an SM ECG are grouped in ECG3.
                 (See Section 4.7.8.)


Table 4-59: ecgDataActive[ecgIdx] for Transform Mode Component


Case                                 ecgDataActive[0]  ecgDataActive[1]  ecgDataActive[2]  ecgDataActive[3]
ecCompSkip                           False             False             False             False
4:4:4, lastSigPos ∈ [9, 15]          True              True              True              True
4:4:4, lastSigPos ∈ [4, 8]           True              True              False             True
4:4:4, lastSigPos ∈ [1, 3]           False             True              False             True
4:4:4, lastSigPos == 0               False             False             False             True
4:2:2, k > 0, lastSigPos == 0        True              False             False             False
4:2:2, k > 0, lastSigPos ∈ [1, 7]    True              True              False             False
4:2:0, k > 0, lastSigPos ∈ [0, 3]    True              False             False             False

Table 4-60: ecgDataActive[ecgIdx] for BP Mode Component


Case ecgDataActive[0] ecgDataActive[1] ecgDataActive[2] ecgDataActive[3]
ecCompSkip False False False False
4:4:4 True True True True
4:2:2, k == 0 True True True True
4:2:0, k == 0 True True True True
4:2:2, k > 0 True True False False
4:2:0, k > 0 True False False False
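
As an illustration, Table 4-59 and Table 4-60 can be expressed as small lookup functions. This is a minimal sketch and not the C model: the function names and the string encoding of the chroma format are hypothetical, and the Transform mode function assumes that a luma component (k == 0) in 4:2:2 or 4:2:0 follows the 4:4:4 rows, which Table 4-59 does not state explicitly.

```python
def ecg_data_active_transform(chroma_format, k, last_sig_pos, comp_skip):
    """Return ecgDataActive[0..3] for a Transform mode component (Table 4-59)."""
    if comp_skip:
        return [False, False, False, False]
    if k > 0 and chroma_format == "4:2:0":
        # lastSigPos is in [0, 3]; only ECG0 carries data
        return [True, False, False, False]
    if k > 0 and chroma_format == "4:2:2":
        # lastSigPos is in [0, 7]; ECG1 is active only when lastSigPos >= 1
        return [True, last_sig_pos >= 1, False, False]
    # 4:4:4 rows (assumed to apply to luma of subsampled formats as well)
    return [last_sig_pos >= 4, last_sig_pos >= 1, last_sig_pos >= 9, True]


def ecg_data_active_bp(chroma_format, k, comp_skip):
    """Return ecgDataActive[0..3] for a BP mode component (Table 4-60)."""
    if comp_skip:
        return [False, False, False, False]
    if k > 0 and chroma_format == "4:2:2":
        return [True, True, False, False]
    if k > 0 and chroma_format == "4:2:0":
        return [True, False, False, False]
    return [True, True, True, True]
```

A call such as ecg_data_active_transform("4:4:4", 0, 6, False) corresponds to the "4:4:4, lastSigPos ∈ [4, 8]" row.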


Figure 4-39 illustrates example ECG constructions for a Transform mode color component.
Note that any data in ecgIdx == 3 shall be coded using 2’s complement (2C) representation,
while all other ECGs shall be coded using sign/magnitude (SM). Table 4-59 describes the possible
ecgDataActive[ecgIdx] values for Transform mode.

                 ECG0              ECG1              ECG2              ECG3

4:4:4, k = any, ecCompSkip = 0, underflowPrevention = 1, lastSigPos = 15
Data             5 Samples (SM)    3 Samples (SM)    7 Samples (SM)    1 Sample (2C)
Stuffing Bits                      Yes               Yes               Yes
Sign Bits                                                              Yes

4:4:4, k > 0, ecCompSkip = 1, underflowPrevention = 1
Data             –                 –                 –                 –
Stuffing Bits                      Yes               Yes               Yes
Sign Bits                                                              –

4:4:4, k > 0, ecCompSkip = 1, underflowPrevention = 0
Data             –                 –                 –                 –
Stuffing Bits                      –                 –                 –
Sign Bits                                                              –

4:4:4, k = any, ecCompSkip = 0, underflowPrevention = 0, lastSigPos = 6
Data             3 Samples (SM)    3 Samples (SM)    –                 1 Sample (2C)
Stuffing Bits                      –                 –                 –
Sign Bits                                                              Yes

4:4:4, k = 0, ecCompSkip = 0, underflowPrevention = 0, lastSigPos = 0
Data             –                 –                 –                 1 Sample (2C)
Stuffing Bits                      –                 –                 –
Sign Bits                                                              –

4:4:4, k > 0, ecCompSkip = 0, underflowPrevention = 0, lastSigPos = 0
Data             –                 –                 –                 1 Sample (2C)
Stuffing Bits                      –                 –                 –
Sign Bits                                                              Yes

4:2:2, k > 0, ecCompSkip = 0, underflowPrevention = 0, lastSigPos = 7
Data             1 Sample (SM)     7 Samples (SM)    –                 –
Stuffing Bits                      –                 –                 –
Sign Bits                                                              Yes

4:2:0, k > 0, ecCompSkip = 0, underflowPrevention = 0, lastSigPos = 2
Data             3 Samples (SM)    –                 –                 –
Stuffing Bits                      –                 –                 –
Sign Bits                                                              Yes

Figure 4-39: Transform Mode Component ECG Construction Example


Figure 4-40 illustrates example ECG constructions for a BP mode color component. As with
Transform mode, data shall be coded using 2C bit representation if ecgIdx == 3, and coded using
SM bit representation otherwise. Table 4-60 describes the possible ecgDataActive[ecgIdx] values
for BP mode.

                 ECG0              ECG1              ECG2              ECG3

4:4:4, k = any, ecCompSkip = 0, underflowPrevention = 1
Data             4 Samples (SM)    4 Samples (SM)    4 Samples (SM)    4 Samples (2C)
Stuffing Bits                      Yes               Yes               Yes
Sign Bits                                                              Yes

4:4:4, k > 0, ecCompSkip = 1, underflowPrevention = 1
Data             –                 –                 –                 –
Stuffing Bits                      Yes               Yes               Yes
Sign Bits                                                              –

4:4:4, k > 0, ecCompSkip = 1, underflowPrevention = 0
Data             –                 –                 –                 –
Stuffing Bits                      –                 –                 –
Sign Bits                                                              –

4:4:4, k = any, ecCompSkip = 0, underflowPrevention = 0
Data             4 Samples (SM)    4 Samples (SM)    4 Samples (SM)    4 Samples (2C)
Stuffing Bits                      –                 –                 –
Sign Bits                                                              Yes

4:2:2, k > 0, ecCompSkip = 0, underflowPrevention = 1
Data             4 Samples (SM)    4 Samples (SM)    –                 –
Stuffing Bits                      Yes               Yes               Yes
Sign Bits                                                              Yes

4:2:0, k > 0, ecCompSkip = 0, underflowPrevention = 1
Data             4 Samples (SM)    –                 –                 –
Stuffing Bits                      Yes               Yes               Yes
Sign Bits                                                              Yes

Figure 4-40: BP Mode Component ECG Construction Example


4.7.3 Bit Representations and bitsReq


Each ECG with valid data shall either be coded using group skip, Common-prefix Entropy
Coding (CPEC), –or– Vector Entropy Coding (VEC), as determined by the logic illustrated
in Figure 4-41. (For details regarding CPEC and VEC, see Section 4.7.5 and Section 4.7.6,
respectively.)

Calculate bitsReq
  → bitsReq > 0 ?
      N → Code ECG Using Group Skip
      Y → BP mode && bitsReq ≤ 2 ?
            Y → Code ECG Using VEC
            N → Code ECG Using CPEC

Figure 4-41: Entropy Encoding Flowchart to Select EC Type

The number of bits required for each sample, bitsReq, depends on whether the current group uses
SM or 2C bit representation. This is determined using the following rule:
• SM bit representation – Used if ecgIdx < 3
• 2C bit representation – Used if ecgIdx == 3

The encoder determines bitsReq as the minimum value (as described in Table 4-61) that can fully
represent all samples within the current ECG. Note that for SM representation, the sample’s
magnitude shall be used to determine bitsReq. Therefore, the effective range on the sample itself
would be symmetric around 0 (e.g., [-3, 3]). For example, an ECG that uses SM bit representation
where the magnitude of all samples is within the range [0, 7] shall correspond with a bitsReq of 3.
Likewise, an ECG that uses 2C bit representation shall have a bitsReq of 3 if all samples are within
the range [-4, 3].

Table 4-61: Mapping from bitsReq to Sample Ranges in SM and 2C Bit Representations
bitsReq    Range (SM)        Range (2C)
1          [0, 1]            [-1, 0]
2          [0, 3]            [-2, 1]
3          [0, 7]            [-4, 3]
4          [0, 15]           [-8, 7]
5          [0, 31]           [-16, 15]
…          …                 …
N          [0, 2^N – 1]      [-2^(N – 1), 2^(N – 1) – 1]
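
The bitsReq rule of Table 4-61 and the selection logic of Figure 4-41 can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and returning 0 for an all-zero 2C group is an assumption based on the statement that bitsReq == 0 is handled by group skip.

```python
def sample_bits_2c(s):
    """Smallest n such that -2^(n-1) <= s <= 2^(n-1) - 1 (0 for s == 0, assumed)."""
    if s == 0:
        return 0
    if s > 0:
        return s.bit_length() + 1
    return (-s - 1).bit_length() + 1


def bits_req(samples, ecg_idx):
    """bitsReq for one ECG: SM if ecgIdx < 3, 2C if ecgIdx == 3."""
    if ecg_idx < 3:
        # SM: only the magnitude counts; the sign is signaled separately
        return max(abs(s).bit_length() for s in samples)
    return max(sample_bits_2c(s) for s in samples)


def select_ec_type(bits, is_bp_mode):
    """Entropy coding type for an ECG, following Figure 4-41."""
    if bits == 0:
        return "group skip"
    if is_bp_mode and bits <= 2:
        return "VEC"
    return "CPEC"
```

For example, an SM group with magnitudes in [0, 7] yields bitsReq = 3, as does a 2C group with samples in [-4, 3].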


4.7.4 Unary Prefix to Signal bitsReq


The syntax for all non-skipped ECGs consists of two parts:
• Unary prefix – Used to signal bitsReq
• Suffix – Used to signal the group samples (CPEC or VEC (see Section 4.7.5 or Section 4.7.6,
respectively))

The unary prefix shall be generated, as described in Table 4-62. The case of bitsReq == 0 does not
correspond with a unary prefix because such a group would be coded using a group skip flag
instead. For bitsReq < 6, the unary prefix depends on the coding mode, as detailed in Table 4-62.
For bitsReq ≥ 6, the unary prefix shall be the same for all modes, signaled as (bitsReq – 1) 1s
followed by a 0.
For a non-skipped group, first the groupSkip flag shall be signaled as 0 to specify that group skip
is disabled. Next, the unary prefix shall be transmitted as described above. The suffix is the final
step of entropy coding and shall be handled differently, depending on whether the current group
uses CPEC or VEC.

Table 4-62: bitsReq Unary Coding


Unary Prefix    bitsReq               bitsReq                 bitsReq
                (Transform Mode,      (Transform Mode,        (BP Mode)
                Luma Component)       Chroma Components)
0               2                     2                       1
10              3                     1                       2
110             4                     3                       3
1110            5                     4                       4
11110           1                     5                       5
111110          6                     6                       6
1111110         7                     7                       7
…               …                     …                       …
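
The mapping in Table 4-62 can be sketched as a short function. The names below (unary_prefix and the kind strings) are hypothetical and chosen only for this illustration; the bit patterns are taken from the table.

```python
# Order in which bitsReq values 1..5 are assigned to prefixes "0", "10", "110", ...
_PREFIX_ORDER = {
    "transform_luma":   [2, 3, 4, 5, 1],
    "transform_chroma": [2, 1, 3, 4, 5],
    "bp":               [1, 2, 3, 4, 5],
}


def unary_prefix(bits_req, kind):
    """Return the unary prefix (as a bit string) that signals bitsReq."""
    if bits_req < 1:
        raise ValueError("bitsReq == 0 is signaled with the group skip flag")
    if bits_req >= 6:
        ones = bits_req - 1  # same for all modes
    else:
        ones = _PREFIX_ORDER[kind].index(bits_req)
    return "1" * ones + "0"
```

For a non-skipped group, this prefix would follow the groupSkip flag (signaled as 0) and precede the CPEC or VEC suffix.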


4.7.5 CPEC
Each sample within a CPEC ECG will be signaled as a fixed-length field, where the length is
bitsReq. For SM bit representation, the magnitude shall be signaled in the fixed-length field, and
the sign information will be signaled later. (See Section 4.7.8.) For 2C bit representation, each
sample shall be directly coded using the 2C, which includes both sign and magnitude information.
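
A minimal sketch of the CPEC suffix follows. The function name is hypothetical, and the most-significant-bit-first ordering of each field is an assumption made for the illustration.

```python
def cpec_fields(samples, bits_req, ecg_idx):
    """Return one fixed-length bit string of bitsReq bits per sample."""
    mask = (1 << bits_req) - 1
    fields = []
    for s in samples:
        if ecg_idx < 3:
            value = abs(s)      # SM: magnitude only, sign bits are sent in ECG3
        else:
            value = s & mask    # 2C: two's complement in bitsReq bits
        fields.append(format(value, "0{}b".format(bits_req)))
    return fields
```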

4.7.6 VEC
VEC is handled in three stages, as illustrated in Figure 4-42.
• ECG samples to scalar vecCodeSymbol mapping
• vecCodeSymbol to vecCodeNumber mapping, using an LUT
• vecCodeNumber signaling, using Golomb-Rice coding

[S0, S1, S2, S3] → Vector -> Scalar → vecCodeSymbol → Mapping (LUT) → vecCodeNumber → VEC Code → Bits

Figure 4-42: VEC Encoding Process

First, the four samples within the ECG shall be mapped to a scalar value vecCodeSymbol, using the
logic described in Table 4-63. This mapping is one-to-one; thus, each possible sample vector will
generate a unique vecCodeSymbol.

Table 4-63: VEC Encoding – Calculate vecCodeSymbol from ECG Samples


bitMask = (1 << bitsReq) - 1
vecCodeSymbol = 0
for (sample s in ECG) {
    vecCodeSymbol = (vecCodeSymbol << bitsReq) | (s & bitMask)
}


vecCodeSymbol to vecCodeNumber mapping shall be performed, using a table look-up. The
table itself is stored by both the encoder and decoder, and is a one-to-one mapping between
vecCodeSymbol and vecCodeNumber. (For the table definition, see VEC_MAPPING_TABLE
in the C model.) vecCodeNumber is then encoded into the bitstream, using a Golomb-Rice
code with fixed parameter vecGrK, where vecGrK is determined by the group’s component,
representation, and bitsReq, as described in Table 4-64. Table 4-65 describes the encoding process.

Table 4-64: vecGrK Parameter for VEC Is Determined by Group’s Bit Representation and Component Index

Bit Representation    Component       bitsReq    vecGrK
SM                    Luma, Chroma    1          2
2C                    Luma, Chroma    1          1
SM, 2C                Luma, Chroma    2          5

Table 4-65: VEC vecCodeNumber Encoding, Using Golomb-Rice Coding


prefix = vecCodeNumber >> vecGrK
suffix = vecCodeNumber & ( (1 << vecGrK) - 1)
while (prefix > 0) {
    signal "1" bit
    prefix -= 1
}
signal "0" bit
if (vecGrK > 0) {
    signal suffix using vecGrK bits
}

Each Golomb-Rice code contains a unary prefix, followed by a fixed-length suffix of length
vecGrK. The coding size is calculated as 2^(4 × bitsReq) because each vector contains four samples,
each represented by bitsReq bits. If the prefix reaches the maximum prefix length, the ending
0 bit in the unary prefix can be omitted because the decoder can infer the ending 0 bit.

Example coding entries are provided in the following tables:


• bitsReq = 1, vecGrK = 1 – See Table 4-66
• bitsReq = 1, vecGrK = 2 – See Table 4-67
• bitsReq = 2, vecGrK = 5 – See Table 4-68


Table 4-66: Example Golomb-Rice Coding Entries – bitsReq = 1, vecGrK = 1


x    x >> 1    x % 2    Prefix    Suffix    Size (Bits)
0 0 0 0 0 2
1 0 1 0 1 2
2 1 0 10 0 3
3 1 1 10 1 3
8 4 0 11110 0 6
13 6 1 1111110 1 8
14 7 0 1111111 0 8
15 7 1 1111111 1 8

Table 4-67: Example Golomb-Rice Coding Entries – bitsReq = 1, vecGrK = 2


x    x >> 2    x % 4    Prefix    Suffix    Size (Bits)
0 0 0 0 00 3
1 0 1 0 01 3
2 0 2 0 10 3
3 0 3 0 11 3
4 1 0 10 00 4
8 2 0 110 00 5
11 2 3 110 11 5
12 3 0 111 00 5
15 3 3 111 11 5

Table 4-68: Example Golomb-Rice Coding Entries – bitsReq = 2, vecGrK = 5


x    x >> 5    x % 32    Prefix    Suffix    Size (Bits)
0 0 0 0 00000 6
31 0 31 0 11111 6
32 1 0 10 00000 7
128 4 0 11110 00000 10
176 5 16 111110 10000 11
223 6 31 1111110 11111 12
224 7 0 1111111 00000 12
255 7 31 1111111 11111 12
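
The entries in Table 4-66 through Table 4-68 can be reproduced with the sketch below. It covers the vector-to-scalar mapping and the Golomb-Rice stage only; the VEC_MAPPING_TABLE look-up is defined in the C model and is not reproduced here. The function names are hypothetical, and the maximum prefix length of ((1 << (4 × bitsReq)) - 1) >> vecGrK is an assumption inferred from the example tables rather than a value stated in the text.

```python
def vec_code_symbol(samples, bits_req):
    """Pack the ECG samples into a scalar symbol (Table 4-63)."""
    mask = (1 << bits_req) - 1
    symbol = 0
    for s in samples:
        symbol = (symbol << bits_req) | (s & mask)
    return symbol


def golomb_rice_bits(code_number, vec_gr_k, bits_req):
    """Return (prefix, suffix) bit strings for vecCodeNumber."""
    # Largest possible prefix value for this code size (assumed, see lead-in)
    max_prefix = ((1 << (4 * bits_req)) - 1) >> vec_gr_k
    prefix_len = code_number >> vec_gr_k
    prefix = "1" * prefix_len
    if prefix_len < max_prefix:
        prefix += "0"  # the terminating 0 is omitted at the maximum prefix length
    suffix = ""
    if vec_gr_k > 0:
        suffix = format(code_number & ((1 << vec_gr_k) - 1), "0{}b".format(vec_gr_k))
    return prefix, suffix
```

For instance, golomb_rice_bits(176, 5, 2) gives the prefix 111110 and suffix 10000 listed in Table 4-68, for a total of 11 bits.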


4.7.7 Rate Buffer Stuffing Bits


The rate control algorithm shall enable Underflow Prevention mode (underflowPrevention) for
block times in which the following condition is met:
bufferFullness < avgBlockBits

To prevent the rate buffer from underflowing, the syntax includes stuffing bits for this block
time to ensure that the block rate is strictly greater than avgBlockBits. The stuffing bits shall
be composed of nine stuffing words (rateBufferStuffingWord), allocated as three stuffing words
each to Substreams 1 through 3. The stuffing word’s size is determined by PPS parameter
rc_stuffing_bits. These stuffing words shall be included in the last three ECGs for each substream.
Note that these stuffing words shall be transmitted regardless of whether ecgDataActive is false
for a given ecgIdx.
The rc_stuffing_bits size shall be determined by the average block rate, as follows:

rc_stuffing_bits = ⌈(avgBlockBits – 8) / 9⌉

The quantity (avgBlockBits – 8) accounts for the largest required stuffing, given a block with the
minimum syntax element size of two bits for all four substreams (eight bits). This stuffing is divided
into nine equal words because the rate buffer underflow prevention stuffing will be present in
three ECGs per component.
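
The underflow condition and the stuffing word size can be sketched as follows. The function names are hypothetical; the arithmetic follows the condition and the ceiling expression given in this section.

```python
def underflow_prevention_enabled(buffer_fullness, avg_block_bits):
    """Underflow Prevention mode is enabled when the buffer holds less than avgBlockBits."""
    return buffer_fullness < avg_block_bits


def rc_stuffing_bits(avg_block_bits):
    """Size of one stuffing word: ceil((avgBlockBits - 8) / 9)."""
    return -(-(avg_block_bits - 8) // 9)  # integer ceiling division
```

With avgBlockBits == 96, rc_stuffing_bits returns 10, so the nine stuffing words contribute 90 bits in total.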
For example, if avgBlockBits == 96 (6 bits/pixel), each stuffing word shall contain the following:

⌈(96 – 8) / 9⌉ = 10 bits

4.7.8 Sign Bits
The final entropy coding step shall signal the sign bits for all non-zero samples within groups that
are coded using SM representation. These sign bits shall be grouped together and signaled as part
of the ecgIdx == 3 syntax.
For Transform mode, signLastSigPos shall be signaled as the final sign bit, if required, as described
in Section 4.6.1.5.1.


4.8 Quantizer
Quantization is used by each mode, as described in Table 4-69. The amount of quantization used
is tied to the QP. The QP value used by each type of quantizer will be derived from the QP value
that is maintained by the rate control model. This is described in Section 4.8.1.1 and Section 4.8.2
for the fractional and scalar quantizers, respectively.

Table 4-69: Quantization


Mode                Quantizer Used                                       Described In
Transform and BP    Fractional                                           Section 4.8.1
MPP and MPPF        Scalar                                               Section 4.8.2
BP-SKIP             Not applicable (does not use quantization            –
                    because no residuals are transmitted)


4.8.1 Fractional Quantization


The fractional quantizer used for Transform and BP modes has increased granularity with respect
to a scalar quantizer. The octave size of the VDC-M fractional quantizer is 8, meaning that
a QP of n + 8 will quantize with a step size of twice that of a QP of n. The minimum allowable
QP for 8bpc content is 16, and the maximum allowable QP is 72. For content with a bit depth
greater than 8, the maximum QP is maintained while the minimum QP is reduced by
(bpc – 8) << 3, as follows:

minQp = { 16,     bpc == 8
        { 0,      bpc == 10
        { -16,    bpc == 12
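
The three cases above follow directly from the (bpc – 8) << 3 reduction, as the following sketch shows. The function name is hypothetical.

```python
def min_qp(bpc):
    """Minimum fractional-quantizer QP for a given bit depth."""
    return 16 - ((bpc - 8) << 3)
```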

The fractional quantizer’s behavior is different for Transform and BP modes because of the
difference in dynamic range between transformed residuals T(i, j) (Transform mode) and residuals
R(i, j) (BP mode). In addition, a transform normalization factor is present in the quantization
of Transform mode that is not present for BP mode.
The QP value that is used by the fractional quantizer is derived from the rate control QP,
as described in Section 4.8.1.1.


4.8.1.1 QP Mapping
At the beginning of each block time, the rate control algorithm shall update the QP as described
in Section 4.5.3. During this block time, the fractional quantizer shall use a modified QP value
(qpMod), which is determined from the rate control QP (qpRc), coding mode, component index,
and bit depth, as described in Table 4-70. Outside of this section, general discussion with regard
to the QP shall refer only to qpRc.

Table 4-70: qpRc to qpMod Mapping

offset = { 4,    4:4:4, FBLS
         { 2,    4:4:4, NFBLS
         { 0,    otherwise

qpTempA = { qpRc,                                   Transform mode
          { clip (minQp, maxQp, qpRc + offset),     BP mode

if (color space is RGB) {
    qpTempB = qpTempA
} else if (color space is YCbCr) {
    qpTempB = { qpTempA,                            qpTempA < 16 || k == 0
              { quantStepChroma[qpTempA - 16],      otherwise
} else if (color space is YCoCg) {
    if (qpTempA < 16) {
        qpTempB = { qpTempA,        k == 0
                  { qpTempA + 8,    otherwise
    } else {
        qpTempB = { qpTempA,                        k == 0
                  { quantStepCo[qpTempA - 16],      k == 1
                  { quantStepCg[qpTempA - 16],      k == 2
        if (k > 0 && FBLS && qpTempA ≤ maxQp) {
            qpTempB = clip (minQp, 72, qpTempB)
        }
    }
}
qpMod = qpTempB + ( (bpc - 8) << 3)


4.8.1.2 Fractional Quantization for Transform Mode


In Transform mode, the set of transform coefficients T(i, j) is quantized, using the fractional
quantizer to generate a set of quantized transform coefficients Tq(i, j). Inverse quantization
is performed on Tq(i, j) to generate the reconstructed transform coefficients T̂(i, j).
The forward transform normalization step and quantization are combined into a single step, using
a pre-defined matrix. The matrix for forward quantization is quantTableForwardDct. The matrix
for inverse quantization is quantTableInverseDct. Each of these matrices shall be defined separately
for 8x2, 4x2, and 4x1 block components, and the dimension of each quantization table matrix is
(compWidth >> 1) × 8. Therefore, there will be a total of six matrices defined for forward/inverse
normalization and quantization, as follows:
• quantTableForwardDct8x2 / quantTableInverseDct8x2 (dimension is 4x8)
• quantTableForwardDct4x2 / quantTableInverseDct4x2 (dimension is 2x8)
• quantTableForwardDct4x1 / quantTableInverseDct4x1 (dimension is 2x8)

Each row of the quantization table specifies the coefficients for a given fractional quantization step
(qpRem = QP & 0x07). A mapping is then defined between each sample position within a row and
the associated quantization coefficient in the quantization table. Both rows of the current block use
the same mapping.
• quantTableMapping8x2 = [0, 1, 2, 3, 0, 3, 2, 1]
• quantTableMapping4x2 = [0, 1, 0, 1]
• quantTableMapping4x1 = [0, 1]
From the above matrices and mapping array, the forward and inverse quantization procedures
are described in Table 4-71 and Table 4-72, respectively. The specific quantization table and
mapping tables used will be the ones associated with the current block component dimension.
For an 8x2 block component:
• quantTableForwardDct = quantTableForwardDct8x2
• quantTableMapping = quantTableMapping8x2

In addition, the following parameters define the shift associated with the Transform mode
quantization coefficients and dead zone.
• dctQuantBits = 8
• dctQuantDeadZone = 102

where:
• Dead zone size is 102 / 256 ≈ 0.4


Table 4-71: Transform Mode Fractional Forward Quantization


qpScale = qpMod >> 3
qpRem = qpMod & 0x07
for (r = 0; r < numRows; r ++) {
    for (c = 0; c < numCols; c ++) {
        q = quantTableForwardDct[qpRem][quantTableMapping[c] ]
        s = sign (T(c, r) )
        Tq(c, r) = s × ( (q × | T(c, r) | + (dctQuantDeadZone << qpScale) ) >> (dctQuantBits + qpScale) )
    }
}

Table 4-72: Transform Mode Fractional Inverse Quantization


qpScale = qpMod >> 3
qpRem = qpMod & 0x07
for (r = 0; r < numRows; r ++) {
    for (c = 0; c < numCols; c ++) {
        q = quantTableInverseDct[qpRem][quantTableMapping[c] ]
        T̂(c, r) = (Tq(c, r) × q) << qpScale
    }
}
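
The forward and inverse procedures above can be sketched in runnable form as follows. This is an illustration under stated assumptions: the function names are hypothetical, the quantization tables are passed in as arguments because their actual values are defined elsewhere in the Standard, and the flat tables used in the usage note are placeholders that do not correspond to the real table contents.

```python
DCT_QUANT_BITS = 8
DCT_QUANT_DEAD_ZONE = 102


def quantize_transform(coeffs, qp_mod, fwd_table, mapping):
    """Forward quantization of rows of transform coefficients (Table 4-71)."""
    qp_scale = qp_mod >> 3
    qp_rem = qp_mod & 0x07
    out = []
    for row in coeffs:
        q_row = []
        for c, t in enumerate(row):
            q = fwd_table[qp_rem][mapping[c]]
            mag = (q * abs(t) + (DCT_QUANT_DEAD_ZONE << qp_scale)) >> (DCT_QUANT_BITS + qp_scale)
            q_row.append(-mag if t < 0 else mag)
        out.append(q_row)
    return out


def dequantize_transform(qcoeffs, qp_mod, inv_table, mapping):
    """Inverse quantization of rows of quantized coefficients (Table 4-72)."""
    qp_scale = qp_mod >> 3
    qp_rem = qp_mod & 0x07
    return [[(tq * inv_table[qp_rem][mapping[c]]) << qp_scale
             for c, tq in enumerate(row)] for row in qcoeffs]
```

With placeholder tables fwd = [[256, 256]] * 8 and inv = [[1, 1]] * 8, the mapping [0, 1, 0, 1], and qpMod = 8, the row [10, -10, 0, 1] quantizes to [5, -5, 0, 0] and reconstructs to [10, -10, 0, 0], which shows the halving of resolution per octave and the effect of the dead zone on small coefficients.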


4.8.1.3 Fractional Quantization for BP Mode


In BP mode, the set of predicted residuals R(i, j) is quantized, using the fractional quantizer
to generate a set of quantized residuals Rq(i, j). Inverse quantization is performed on Rq(i, j) to
generate the reconstructed residuals R̂(i, j).
The forward and inverse quantization coefficients for BP mode are pre-defined in the following
matrices (all entries are unsigned 8-bit values):
• bpQuantScales
• bpInvQuantScales

From these two matrices, the forward and inverse quantization procedures are described in
Table 4-73 and Table 4-74, respectively.
The following parameters define the shifts associated with the BP mode quantization and inverse
quantization coefficients and quantization dead zone:
• bpQuantBits = 6
• bpInvQuantBits = 9
• bpQuantDeadZone = 22

where:
• Dead zone size is bpQuantDeadZone / (1 << bpQuantBits) = 22 / 64 ≈ 0.34


Table 4-73: BP Mode Fractional Forward Quantization


qpScale = qpMod >> 3
qpRem = qpMod & 0x07
q = bpQuantScales[qpRem]
for (r = 0; r < numRows; r ++) {
    for (c = 0; c < numCols; c ++) {
        shift = bpQuantBits + qpScale
        Rq(c, r) = sign (R(c, r) ) × ( (q × | R(c, r) | + (bpQuantDeadZone << qpScale) ) >> shift)
    }
}

Table 4-74: BP Mode Fractional Inverse Quantization


qpScale = qpMod >> 3
qpRem = qpMod & 0x07
iq = bpInvQuantScales[qpRem]
shift = bpInvQuantBits - qpScale
for (r = 0; r < numRows; r ++) {
    for (c = 0; c < numCols; c ++) {
        R̂(c, r) = { sign (Rq(c, r) ) × ( (iq × | Rq(c, r) | + (1 << (shift - 1) ) ) >> shift),    shift > 0
                  { (iq × Rq(c, r) ) << | shift |,                                                 otherwise
    }
}
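
A per-sample sketch of the BP mode forward and inverse quantizers follows, mainly to show the two branches of the inverse shift. The function names are hypothetical, and the scale tables are passed as arguments; the values used in the usage note (255 and 128) are placeholders chosen for the illustration, not the contents of bpQuantScales or bpInvQuantScales.

```python
BP_QUANT_BITS = 6
BP_INV_QUANT_BITS = 9
BP_QUANT_DEAD_ZONE = 22


def bp_quantize(r, qp_mod, quant_scales):
    """Forward quantization of one residual (Table 4-73)."""
    qp_scale = qp_mod >> 3
    q = quant_scales[qp_mod & 0x07]
    shift = BP_QUANT_BITS + qp_scale
    mag = (q * abs(r) + (BP_QUANT_DEAD_ZONE << qp_scale)) >> shift
    return -mag if r < 0 else mag


def bp_dequantize(rq, qp_mod, inv_quant_scales):
    """Inverse quantization of one quantized residual (Table 4-74)."""
    qp_scale = qp_mod >> 3
    iq = inv_quant_scales[qp_mod & 0x07]
    shift = BP_INV_QUANT_BITS - qp_scale
    if shift > 0:
        mag = (iq * abs(rq) + (1 << (shift - 1))) >> shift
        return -mag if rq < 0 else mag
    return (iq * rq) << abs(shift)
```

With placeholder tables [255] * 8 and [128] * 8 and qpMod = 16, a residual of 40 quantizes to 40 and reconstructs to 40. At qpMod = 80 the inverse shift becomes negative and the second branch applies.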


4.8.2 Scalar Quantization


The scalar quantizer is used by MPP mode and MPPF mode. (See Section 4.6.3 and Section 4.6.4,
respectively.) The scalar quantizer is controlled by the parameter stepSize, which determines the
number of bit-planes to code for a given sample. The stepSize parameter for each color component
(stepSizeComp) is determined by mppQp for MPP mode (see Section 4.6.3.1), and provided
directly for MPPF mode (see Section 4.6.4). minBpc = 8.

stepSizeComp[k] = { ( (mppQp - 16) >> 3) + (bitDepth - minBpc),    k == 0 || RGB/YCbCr
                  { mppStepSizeMapCo[stepSizeComp[0] ],            k is Co component
                  { mppStepSizeMapCg[stepSizeComp[0] ],            k is Cg component

Given stepSize and the current component’s bit depth, forward quantization shall be calculated
as described in Table 4-75.

Table 4-75: Scalar Quantization of Residual Sample R(i, j)

bias = { 0,                            stepSizeComp == 0
       { 1 << (stepSizeComp - 1),      otherwise
codeMin = -(1 << (bitDepth - stepSizeComp - 1) )
codeMax = (1 << (bitDepth - stepSizeComp - 1) ) - 1
R'q(i, j) = { (R(i, j) + bias) >> stepSizeComp,             R(i, j) ≥ 0
            { - ( (bias - R(i, j) ) >> stepSizeComp),       R(i, j) < 0
Rq(i, j) = clip (codeMin, codeMax, R'q(i, j) )

Inverse quantization for scalar-quantized samples is calculated simply as:

R̂(i, j) = Rq(i, j) << stepSizeComp
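
The scalar quantizer of Table 4-75 and its inverse can be sketched per sample as follows. The function names are hypothetical; the arithmetic follows the table.

```python
def scalar_quantize(r, step_size, bit_depth):
    """Quantize one residual sample with rounding and clipping (Table 4-75)."""
    bias = 0 if step_size == 0 else 1 << (step_size - 1)
    code_min = -(1 << (bit_depth - step_size - 1))
    code_max = (1 << (bit_depth - step_size - 1)) - 1
    if r >= 0:
        rq = (r + bias) >> step_size
    else:
        rq = -((bias - r) >> step_size)
    return max(code_min, min(code_max, rq))


def scalar_dequantize(rq, step_size):
    """Inverse scalar quantization: restore the dropped bit-planes as zeros."""
    return rq << step_size
```

For example, with stepSizeComp = 2 and an 8-bit component, a residual of 6 quantizes to 2 and reconstructs to 8, while a residual of 200 is clipped to codeMax = 31.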


4.9 Encoder Mode Selection


Mode selection is performed at the encoder side, after testing of all coding modes. The encoder
selects the mode with the minimum modeRdCost, subject to the following constraints:
• Rate buffer shall not overflow from encoder side.
• Rate buffer shall have at least avgBlockBits after coding current mode, to avoid underflow
from encoder side
• After coding the block, the remainder of the slice shall have at least minBlockBits available
for each block that remains within the slice
• Encoder shall disallow for the block time any mode or portion thereof (e.g., intra predictor) that
generates an ECG greater than ecgMaxBits or a syntax element greater than ssm_max_se_size

avgBlockBits denotes the number of bits that shall be transferred from the rate buffer to the
bitstream, per block time. (See Section 4.5.1.1.) minBlockBits is the worst-case cost for MPPF
mode. (See Section 4.6.4.)
Table 4-76 describes the mode selection conditions for enforcing correct rate buffer behavior
for a given modeRate. If any of these conditions are not met, the mode selection algorithm
shall disallow a coding mode for the current block time.

Table 4-76: Mode Selection Conditions for Enforcing Correct Rate Buffer Behavior for Given modeRate

xmitBits = { 0,               blkIdx < rc_init_tx_delay
           { avgBlockBits,    otherwise
conditionA: (bufferFullness + modeRate – xmitBits) ≤ rc_buffer_max_size – (rcOffset + rcOffsetInit)
conditionB: (bufferFullness + modeRate – xmitBits) ≥ avgBlockBits + chunkAdjBitsMax
conditionC: (sliceBitsRemaining – modeRate) ≥ (numBlocksRemaining × minBlockBits)

where:
• sliceBitsRemaining is the number of bits that remain within the slice
• numBlocksRemaining is the number of blocks that remain within the slice
• chunkAdjBitsMax = 8 (see Section 4.5.1.1)
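
The three conditions of Table 4-76 can be sketched as a single predicate. The function name and the keyword-argument interface are hypothetical; the parameter names mirror the identifiers used in the table.

```python
CHUNK_ADJ_BITS_MAX = 8


def mode_rate_allowed(mode_rate, *, buffer_fullness, blk_idx, avg_block_bits,
                      rc_init_tx_delay, rc_buffer_max_size, rc_offset,
                      rc_offset_init, slice_bits_remaining,
                      num_blocks_remaining, min_block_bits):
    """Return True if modeRate satisfies conditionA, conditionB, and conditionC."""
    xmit_bits = 0 if blk_idx < rc_init_tx_delay else avg_block_bits
    level = buffer_fullness + mode_rate - xmit_bits
    condition_a = level <= rc_buffer_max_size - (rc_offset + rc_offset_init)
    condition_b = level >= avg_block_bits + CHUNK_ADJ_BITS_MAX
    condition_c = (slice_bits_remaining - mode_rate) >= num_blocks_remaining * min_block_bits
    return condition_a and condition_b and condition_c
```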


Encoder mode selection shall then proceed according to the following four steps, for the set
of modes that has not been invalidated for the current block time:
1. From the following set of coding modes, select the mode that has the lowest modeRdCost:
   {Transform mode, BP mode, MPP mode}.
2. If all the modes listed in step 1 are invalidated, select the fallback mode (MPPF –or– BP-SKIP)
   that has the lowest modeRdCost.
3. If BP mode is selected as the best mode, select BP-SKIP mode instead if the following
   conditions exist:
   a. underflowPrevention is disabled.
   b. BP mode has zero quantized residual for all three components.
4. If MPP mode is selected as the best mode, select BP-SKIP mode instead if the following
   conditions exist:
   a. underflowPrevention is disabled.
   b. Chroma format is 4:4:4 or 4:2:2.
   c. modeRdCost for BP-SKIP mode is lower than the modeRdCost for MPP mode.
   d. QP ≥ (bpc << 3) – 16.
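
The four steps can be sketched as follows. This is an illustration only: the function name, the dictionary and set arguments, and the mode name strings are hypothetical. The sketch assumes that at least one fallback mode is valid whenever all primary modes are invalidated, and it does not specify tie-breaking between equal costs.

```python
def select_mode(rd_cost, valid, underflow_prevention, bp_all_zero,
                chroma_format, qp, bpc):
    """Pick the block mode from per-mode modeRdCost values and the set of valid modes."""
    # Steps 1 and 2: lowest-cost primary mode, otherwise lowest-cost fallback mode
    primary = [m for m in ("Transform", "BP", "MPP") if m in valid]
    candidates = primary or [m for m in ("MPPF", "BP-SKIP") if m in valid]
    best = min(candidates, key=lambda m: rd_cost[m])
    # Step 3: BP with all-zero quantized residuals becomes BP-SKIP
    if best == "BP" and not underflow_prevention and bp_all_zero:
        best = "BP-SKIP"
    # Step 4: MPP is replaced by BP-SKIP under the listed conditions
    if (best == "MPP" and not underflow_prevention
            and chroma_format in ("4:4:4", "4:2:2")
            and rd_cost["BP-SKIP"] < rd_cost["MPP"]
            and qp >= (bpc << 3) - 16):
        best = "BP-SKIP"
    return best
```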


4.10 Block Headers


The block header, contained in Substream 0, shall be the first syntax encoded for each block.
It contains a mode header, which designates the selected mode for the current block, and a flatness
header, which designates the flatness type. Each is described in Table 4-77. Additional syntax may
be signaled in Substream 0, depending on the block mode. This is discussed in the encoding process
section of each coding mode.

Table 4-77: Encoding Process Block Headers


Block Header Type    Field           Description
Mode                 modeSameFlag    • One-bit fixed-length field.
                                     • First field signaled for the current block.
                                     • If modeSameFlag is signaled as 1, the current block mode is the
                                       same as the previous block mode. In this case, mode header
                                       signaling is complete.
                                     • If modeSameFlag is signaled as 0, the mode is different from the
                                       previous block mode. Field curBlockMode shall be signaled in the
                                       bitstream to indicate the current block mode, as per Table 4-78.
                                       At this point, mode header signaling is complete.
                     curBlockMode    • Dependent two- or three-bit truncated binary field.
                                     • Signaled in the bitstream per Table 4-78 when modeSameFlag is
                                       signaled as 0.
Flatness             flatnessFlag    • One-bit fixed-length field.
                                     • If encoder flatness detection has assigned a flatness type to the
                                       current block, flatnessFlag shall be signaled as 1, followed by
                                       flatnessType to designate the flatness type as per Table 4-79.
                                     • If encoder flatness detection has not assigned a flatness type to
                                       the current block, flatnessFlag shall be signaled as 0 and
                                       signaling of the flatness header is complete.
                     flatnessType    • Dependent two-bit fixed-length field.
                                     • Signaled in the bitstream per Table 4-79 when flatnessFlag is
                                       signaled as 1.


Table 4-78: Mode Header Syntax


Mode Syntax
Transform 00
BP 01
MPP 10
MPPF 110
BP-SKIP 111

Table 4-79: Flatness Header Syntax


Flatness Type Syntax
Very Flat 00
Somewhat Flat 01
Complex-to-flat transition 10
Flat-to-complex transition 11
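
The block header syntax of Table 4-77 through Table 4-79 can be sketched as a bit-string builder. The function name and the dictionary keys are hypothetical, and the sketch assumes that the flatness header immediately follows the mode header; handling of the first block of a slice (where no previous mode exists) is not covered.

```python
MODE_SYNTAX = {
    "Transform": "00", "BP": "01", "MPP": "10", "MPPF": "110", "BP-SKIP": "111",
}
FLATNESS_SYNTAX = {
    "very_flat": "00", "somewhat_flat": "01",
    "complex_to_flat": "10", "flat_to_complex": "11",
}


def block_header_bits(cur_mode, prev_mode, flatness_type=None):
    """Return the mode header followed by the flatness header as a bit string."""
    if cur_mode == prev_mode:
        bits = "1"                            # modeSameFlag = 1
    else:
        bits = "0" + MODE_SYNTAX[cur_mode]    # modeSameFlag = 0, then curBlockMode
    if flatness_type is None:
        bits += "0"                           # flatnessFlag = 0
    else:
        bits += "1" + FLATNESS_SYNTAX[flatness_type]
    return bits
```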


4.11 Substream Multiplexing


The substream multiplexer shall place groups of bits (mux words) into the rate buffer, in the
order expected by the substream de-multiplexer at the decoder side. A model of the substream
de-multiplexer is included in the encoder side such that the encoder can infer the correct mux
word order. During each block time, SSM may transmit up to one mux word per substream.
If multiple mux words are transmitted during the same block time, they shall be transmitted
by numerical order of substream index.
Each substream has a balance FIFO (ssmBalanceFifo), which is used to ensure that sufficient bits
are available to construct a mux word whenever the de-multiplexer model requests one. Table 4-80
describes the two elements that comprise the de-multiplexer model.

Table 4-80: De-multiplexer Model Elements

Element                     Description
ssmSyntaxFifo               FIFO that is used to store the size of each syntax element during the
                            initial SSM delay.
ssmFunnelShifterFullness    De-multiplexer funnel shifter state, used to determine when each
                            substream will issue a request.

Figure 4-43 illustrates the overall structure of substream multiplexing, including
the four substreams. Bracket notation is used to specify a substream element
(e.g., ssmBalanceFifo[0] refers to the balance FIFO for Substream 0).

[Figure: for each substream index 0 through 3, an Entropy Encoder and Funnel Shifter block (ssm[0] through ssm[3]) is paired with ssmBalanceFifo[0] through ssmBalanceFifo[3]. The balance FIFOs connect to the Substream Multiplexer, followed by the Rate Buffer and the Bitstream. A De-multiplexer Model block is shown with the signals ssmSeSize and Requests.]

Figure 4-43: Substream Multiplexer


SSM requires a small portion of the total slice rate to be set aside to guarantee proper behavior
at the end of the slice. This quantity is provided by PPS parameter num_extra_mux_bits and is
calculated as described in Table 4-81. The portion of the slice rate that is used by the rate controller
shall be as follows:
slice_num_bits – num_extra_mux_bits

Table 4-81: PPS Parameter num_extra_mux_bits Calculation


num_extra_mux_bits = 4 × (2 × ssm_max_se_size - 2)
while ((slice_num_bits - num_extra_mux_bits) % ssm_max_se_size) {
    num_extra_mux_bits -= 1
}
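
A runnable version of the calculation in Table 4-81 is sketched below. The function name is hypothetical, and the parameter values in the usage note are arbitrary example inputs, not values taken from the Standard.

```python
def num_extra_mux_bits(slice_num_bits, ssm_max_se_size):
    """Reserve end-of-slice mux bits so the remainder is a multiple of ssm_max_se_size."""
    extra = 4 * (2 * ssm_max_se_size - 2)
    while (slice_num_bits - extra) % ssm_max_se_size:
        extra -= 1
    return extra
```

For example, with slice_num_bits = 100000 and ssm_max_se_size = 128, the initial value of 1016 is reduced to 928, leaving 99072 bits (774 × 128) for the rate controller.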

SSM is subject to a delay at the beginning of the slice, such that ssmBalanceFifo has a chance to
fill up before any mux words are transmitted. This delay is specified by ssmDelay. An additional
fixed delay of one block time is applied to Substreams 1 through 3, relative to Substream 0,
to ease timing at the de-multiplexer. This means that the de-multiplexer shall effectively receive
Substream 0 one block time earlier than the other substreams. This additional delay (ssmSkew)
is specified, as follows:
ssmSkew[ssmIdx] = { 0,    ssmIdx == 0
                  { 1,    ssmIdx > 0

Therefore, the full value of ssmDelay for each substream is calculated, as follows:
ssmDelay[ssmIdx] = ssmDelayBase + ssmSkew[ssmIdx]

where:
• ssmDelayBase = ssm_max_se_size, derived as follows:

ssmDelayBase = ⌈(2 × ssm_max_se_size – 1) / ssmMinSeSize⌉
             = ⌈(2 × ssm_max_se_size – 1) / 2⌉
             = ssm_max_se_size
Figure 4-44 illustrates ssmDelay’s impact on the substream multiplexer. For any block time less
than ssmDelay, bits shall be placed into ssmBalanceFifo; however, bits shall not be removed and
placed into the rate buffer. When the block time is greater than or equal to ssmDelay, bits shall
be placed into ssmBalanceFifo, and the de-multiplexer model shall be checked to determine
whether any mux words need to be removed from ssmBalanceFifo and placed into the rate buffer.
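The per-substream delay can be sketched as below. The sketch assumes that ssmDelayBase is the ceiling of (2 × ssm_max_se_size – 1) divided by ssmMinSeSize, which is the reading that yields the stated result ssmDelayBase = ssm_max_se_size; the function names are hypothetical.

```python
SSM_MIN_SE_SIZE = 2  # ssmMinSeSize as stated in the text


def ssm_skew(ssm_idx: int) -> int:
    # Substream 0 has no skew; Substreams 1 through 3 lag by one block time.
    return 0 if ssm_idx == 0 else 1


def ssm_delay_base(ssm_max_se_size: int) -> int:
    # Assumed reading: ceil((2 * ssm_max_se_size - 1) / ssmMinSeSize),
    # computed with integer arithmetic.
    numerator = 2 * ssm_max_se_size - 1
    return -(-numerator // SSM_MIN_SE_SIZE)


def ssm_delay(ssm_idx: int, ssm_max_se_size: int) -> int:
    return ssm_delay_base(ssm_max_se_size) + ssm_skew(ssm_idx)


def may_drain_balance_fifo(block_time: int, ssm_idx: int, ssm_max_se_size: int) -> bool:
    # Before ssmDelay, bits only accumulate in ssmBalanceFifo.
    return block_time >= ssm_delay(ssm_idx, ssm_max_se_size)
```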


[Figure 4-44 flowchart: for encoder block indices 0 through N – 1 (the SSM delay), each block time adds the syntax element to ssmBalanceFifo[ssmIdx], adds ssmSeSize[ssmIdx] to ssmSyntaxFifo[ssmIdx], and increments the block index. From block index N onward, the same two additions are followed by a mux word request check. If a mux word is requested, ssmFunnelShifterFullness[ssmIdx] += ssmMuxWord. Then ssmFunnelShifterFullness[ssmIdx] –= ssmSyntaxFifo[ssmIdx][0], and ssmSyntaxFifo[ssmIdx][0] is popped.]

Figure 4-44: Substream Multiplexing Delay

For each block time during ssmDelay:


• Syntax elements of data are placed into ssmBalanceFifo
• Syntax element sizes are placed into ssmSyntaxFifo

The substream multiplexer is updated during each block time after encoder mode
selection. The selected mode’s syntax shall consist of at least ssmMinSeSize = 2 bits and
at most ssm_max_se_size bits for each substream. The substream multiplexer shall perform
the following steps once per block time, per substream:
1 If the current block time is greater than or equal to ssmSkew, all block bits for the current
substream are added to ssmBalanceFifo. For example, during the first block time, all bits
from Substream 0 of Block 0 shall be added to ssmBalanceFifo[0]. During Block Time 1,
Substream 0 of Block 1 shall be added to ssmBalanceFifo[0], followed by Substreams 1
through 3 of Block 0 to ssmBalanceFifo[1] through ssmBalanceFifo[3]. This shall continue
for the remainder of the slice. The number of bits added to ssmBalanceFifo per block time
shall also be placed at the end of ssmSyntaxFifo.
2 After ssmBalanceFifo has been updated, the de-multiplexer model shall be checked to
determine whether any mux words shall be requested as per Table 4-82. The de-multiplexer
shall request a mux word if the funnel shifter fullness for that substream is strictly less than
ssm_max_se_size.

Table 4-82: Encoder Substream Multiplexer Check for Mux-word Requests, Substream ssmIdx

decReqMuxWord[ssmIdx] = { true,  ssmFunnelShifterFullness[ssmIdx] < ssm_max_se_size
                        { false, otherwise


3 De-multiplexer model state is updated as per Table 4-83. If a mux word is requested during
the current block time, ssmFunnelShifterFullness is incremented by ssm_max_se_size. Current
block decoding is simulated by reducing ssmFunnelShifterFullness by the value at the front of
ssmSyntaxFifo, which is then popped.

Table 4-83: Substream Multiplexer Encoder Update of De-multiplexer Model State for Substream ssmIdx

if (decReqMuxWord[ssmIdx]) {
    ssmFunnelShifterFullness[ssmIdx] += ssm_max_se_size
}
ssmFunnelShifterFullness[ssmIdx] –= ssmSyntaxFifo[ssmIdx][0]
pop ssmSyntaxFifo[ssmIdx]
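Steps 2 and 3 can be sketched for a single substream as follows. This is an illustrative model, not the normative implementation; the function name and the use of a Python deque for ssmSyntaxFifo are assumptions.

```python
from collections import deque


def update_demux_model(fullness: int, syntax_fifo: deque, ssm_max_se_size: int):
    """One block-time update of the encoder's de-multiplexer model (sketch)."""
    # Table 4-82: request a mux word when fullness is strictly below the maximum.
    requested = fullness < ssm_max_se_size
    # Table 4-83: a requested mux word adds ssm_max_se_size bits ...
    if requested:
        fullness += ssm_max_se_size
    # ... and decoding the current block consumes the front syntax element size.
    fullness -= syntax_fifo.popleft()
    return requested, fullness


# Example with assumed syntax element sizes of 2 bits per block time.
fifo = deque([2, 2, 2])
state = 0
history = []
for _ in range(3):
    req, state = update_demux_model(state, fifo, ssm_max_se_size=128)
    history.append((req, state))
```

In this example the modeled fullness peaks at 254 bits, which stays within the funnel shifter size of 2 × ssm_max_se_size – 1 = 255 bits.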

This concludes the discussion of encoder operation.


5 Decoding Process (Normative)


This section describes the operations that shall be performed by a VDC-M-compliant decoder.

5.1 Overview
C source code shall always be trusted when in conflict with the content of this Standard.
The decoder shall perform the decoding process, once per block time within a slice, as follows:
1 Substream de-multiplexer shall be responsible for requesting mux words and ensuring that the
four decoder funnel shifters always have a minimum of ssm_max_se_size bits available for
each block time.
2 Parser shall remove bits from the funnel shifters, which shall be used later during
block decoding.
Note: The decoder does not perform any testing between the different modes; instead, the mode
for each block and all associated information is contained explicitly within the syntax.
3 Rate controller’s state shall be updated.
4 Current block shall be decoded.
5 Steps 1 through 4 shall be repeated for the next block.

5.2 Substream De-multiplexing


The substream de-multiplexer shall be responsible for removing bits from the bitstream and
placing them into a set of decoder funnel shifters. Each substream shall have a funnel shifter
(ssmFunnelShifter) of fixed size, as follows:
ssmFunnelShifterSize = 2 × ssm_max_se_size – 1

Figure 5-1 illustrates the substream de-multiplexer’s overall structure. Bracket notation is used
to specify a substream element. For example, ssmFunnelShifter[0] refers to the Substream 0
funnel shifter.

[Figure 5-1 diagram: the bitstream passes through the rate buffer to the substream de-multiplexer, which feeds ssmFunnelShifter[0] through ssmFunnelShifter[3]. Each funnel shifter feeds the parser of its substream, ssm[0] through ssm[3].]

Figure 5-1: Substream De-multiplexer


During each block time, the substream de-multiplexer may request up to one mux word from each
of the four substreams. A mux word shall be requested for a substream ssmIdx if the following
condition is met:
fullness(ssmFunnelShifter[ssmIdx]) < ssm_max_se_size

It is possible for all four substreams to request a mux word during the same block time. Should this
occur, the requests shall be fulfilled in numerical order (e.g., Substream 1 would receive a mux
word from the bitstream before Substream 2). It is also possible that no requests will be made.
There shall be a delay of one block time between the transmission of Substream 0 and the
remaining substreams. This shall be captured by parameter ssmSkew, as follows:
ssmSkew[ssmIdx] = { 0, ssmIdx == 0
                  { 1, ssmIdx > 0

The mux word request progression for the four substreams shall proceed as described in Table 5-1.
Requests shall not be made for any substream if that substream’s decoder block index is less than
ssmSkew. Therefore, during Block Time 0, Substream 0 shall be the only substream to request a
mux word. During Block Time 1, Substreams 1 through 3 shall issue requests. For all other block
times, a substream shall request a mux word only if that substream’s funnel shifter fullness is
strictly less than ssm_max_se_size.

Table 5-1: Substream De-multiplexer Mux Word Requests (FSF = funnel shifter fullness)

Decoder blockIdx | Substream 0 | Substream 1 | Substream 2 | Substream 3
0 | Request mux word | – | – | –
1 | Request mux word if FSF < ssm_max_se_size | Request mux word | Request mux word | Request mux word
2 | Request mux word if FSF < ssm_max_se_size | Request mux word if FSF < ssm_max_se_size | Request mux word if FSF < ssm_max_se_size | Request mux word if FSF < ssm_max_se_size
… | Request mux word if FSF < ssm_max_se_size | Request mux word if FSF < ssm_max_se_size | Request mux word if FSF < ssm_max_se_size | Request mux word if FSF < ssm_max_se_size
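The request schedule of Table 5-1 can be sketched as follows. The function names are hypothetical, and the sketch treats an empty funnel shifter as fullness 0, which is why Substreams 1 through 3 always request during Block Time 1.

```python
def requests_mux_word(ssm_idx: int, block_idx: int, fullness: int,
                      ssm_max_se_size: int) -> bool:
    skew = 0 if ssm_idx == 0 else 1  # ssmSkew
    if block_idx < skew:
        return False  # no requests before the substream's first block
    return fullness < ssm_max_se_size


def request_order(block_idx: int, fullness: list, ssm_max_se_size: int) -> list:
    # Simultaneous requests are fulfilled in numerical substream order.
    return [i for i in range(4)
            if requests_mux_word(i, block_idx, fullness[i], ssm_max_se_size)]
```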

The bits present in the decoder funnel shifters after SSM requests have been fulfilled are necessary
for decoding the current block. At this point, the decoder parser and entropy decoder process
information directly from the SSM funnel shifters.


5.3 Syntax Parsing


VDC-M slice syntax parsing shall be handled after substream de-multiplexing is updated for the
current block time. Each of the four SSM substreams includes an instance of the parser, which shall
read data from the decoder funnel shifter. SSM shall ensure that all bits that are needed to decode
the current block are present in the decoder funnel shifters prior to parsing. Figure 5-2 through
Figure 5-6 illustrate this procedure for Transform, BP, MPP, MPPF, and BP-SKIP modes,
respectively.

[Figure 5-2 diagram: the bitstream enters the substream de-multiplexer, which feeds one funnel shifter and parser per substream. Substream 0 (parser only): mode header, flatness header, intra predictor (NFBLS). Substreams 1 through 3 (parser and entropy decoder): quantized transform coefficients.]

Figure 5-2: Transform Mode Syntax Parsing

[Figure 5-3 diagram: the bitstream enters the substream de-multiplexer, which feeds one funnel shifter and parser per substream. Substream 0 (parser only): mode header, flatness header, bpvTable, BPV for Sub-block 0. Substreams 1 through 3 (parser and entropy decoder): BPV for Sub-block 1, 2, or 3, respectively, and quantized residuals.]

Figure 5-3: BP Mode Syntax Parsing

[Figure 5-4 diagram: the bitstream enters the substream de-multiplexer, which feeds one funnel shifter and parser per substream. Substream 0: mode header, flatness header, mppStepSize, quantized residuals. Substreams 1 through 3: quantized residuals.]

Figure 5-4: MPP Mode Syntax Parsing


[Figure 5-5 diagram: the bitstream enters the substream de-multiplexer, which feeds one funnel shifter and parser per substream. Substream 0: mode header, flatness header, mppfColorSpace, quantized residuals. Substreams 1 through 3: quantized residuals.]

Figure 5-5: MPPF Mode Syntax Parsing

[Figure 5-6 diagram: the bitstream enters the substream de-multiplexer, which feeds one funnel shifter and parser per substream. Substream 0: mode header, flatness header, bpvTable, BPV for Sub-block 0. Substreams 1 through 3: BPV for Sub-block 1, 2, or 3, respectively.]

Figure 5-6: BP-SKIP Mode Syntax Parsing

In addition to the syntax parser, an entropy decoder shall be included in Substreams 1 through 3.
During entropy decoding, bits shall be removed from the funnel shifter and quantized coefficients
or residuals shall be generated. The number of entropy coding groups (ECGs) per substream shall
be determined by the chroma sampling format. Table 5-2 describes the slice syntax distribution
among the four substreams.

Table 5-2: Substream Decoder Slice Syntax


Substream Syntax Contents
0 Block header, mode-specific information.
1 Block data for Component 0 (Y/R).
2 Block data for Component 1 (Co/Cb/G).
3 Block data for Component 2 (Cg/Cr/B).


5.3.1 Block Headers


The block header, contained in Substream 0, shall be the first syntax parsed for the current block
time. The block header shall contain the following headers, further described in Table 5-3:
• Mode – Designates the selected mode for the current block
• Flatness – Designates the flatness type

Table 5-3: Decoding Process Block Headers


Header Field Description
Mode modeSameFlag • One-bit fixed-length field.
• First field parsed for the current block.
• If modeSameFlag == 1, the current block mode is the same as the previous
block mode. In this case, mode header parsing shall be complete.
• If modeSameFlag == 0, the mode is different from the previous block mode.
Field curBlockMode shall be parsed in the bitstream to indicate the current
block mode, as per Table 4-78. At this point, mode header parsing shall
be complete.
curBlockMode • Dependent two- or three-bit truncated binary field.
• Parsed in the bitstream per Table 4-78 when modeSameFlag == 0.
Flatness flatnessFlag • One-bit fixed-length field.
• If flatnessFlag == 0, the current block has no flatness type and parsing of the
flatness header shall be complete.
• If flatnessFlag == 1, flatnessType shall be parsed as per Table 4-79.
flatnessType • Dependent two-bit fixed-length field.
• Parsed in the bitstream per Table 4-79 when flatnessFlag == 1.

In addition to the block header, additional syntax may be included in Substream 0, dependent on the
block mode. (See Section 5.4 for further details.)

5.3.2 Syntax Decoding for Modes that Do Not Use EC


The following modes do not use entropy coding:
• MPP Mode
• MPPF Mode
• BP-SKIP Mode

All syntax specific to these modes can be parsed by reading fixed-length fields in Substreams 0
through 3, as described in their respective sections.


5.3.3 Syntax Decoding for Modes that Use EC


The following modes use entropy coding:
• Transform Mode
• BP Mode

To parse the ECG data, first the component skip flag ecCompSkip shall be parsed for
Components 1 and 2. (See Section 5.3.3.1.) ecCompSkip shall be disabled for Component 0
and shall not be included in the bitstream. For Transform mode, a field lastSigPos shall then
be parsed if component skip is inactive. This field shall be of length compSamplesLog2.
Following this, the entropy decoder shall process four ECGs per component. Depending on
the chroma format, component index, and the value of lastSigPos, certain ECGs can be skipped
(i.e., ecgDataActive[ecgIdx] is false). This is determined by table look-up, as described in
Section 4.7.2. For any ECG with a non-skipped data portion, the decoding process shall be
conducted as described in Section 5.3.3.2.

5.3.3.1 Component Skip Flag


For Components 1 and 2 (chroma components), a flag ecCompSkip shall be parsed from the
bitstream. The flag’s length shall depend on whether the current block is Transform or BP mode
to ensure that ssmMinSeSize is two bits, as described in Table 5-4.

Table 5-4: Component Skip Flags


Mode Component Skip Inactive Component Skip Active
Transform 0 (One bit) 11 (Two bits)
BP 0 (One bit) 1 (One bit)

If ecCompSkip is active, all quantized samples for the component shall be equal to 0, and
ecgDataActive[ecgIdx] is false for all four ECGs.
If ecCompSkip is inactive, there is at least one ECG within the component for which
ecgDataActive[ecgIdx] is true.
For any ECG in which ecgDataActive[ecgIdx] is true, a one-bit flag groupSkip shall be parsed to
indicate whether the group is skipped. If groupSkip == 1, all quantized samples in the group shall
be equal to 0, and parsing of the data portion of the ECG shall be complete. If groupSkip == 0,
the ECG shall be further parsed as described in Section 5.3.3.2.


5.3.3.2 Entropy Decoding


The entropy decoder shall process each non-skipped ECG within a component. The number of
ECGs to parse per component shall be a function of the chroma-sampling format. For 4:4:4 content,
four ECGs shall be contained within each of the three color components. The first field to be parsed
within each ECG is the bitsReq prefix, which is a unary-coded field that indicates the number of
bits per sample within the ECG. The test model implementation of decoding the bitsReq prefix
is located within DECODE_EC_GROUP_PREFIX in the C model. For Transform mode,
additional mapping shall occur between the parsed unary prefix and actual bitsReq, as described
in Table 4-61. For BP mode, the unary prefix shall be mapped directly to bitsReq.
Next, the Common-prefix entropy coding (CPEC) or Vector entropy coding (VEC) group shall be
parsed, with sign/magnitude (SM) or 2’s complement (2C) bit representation. The parser shall infer
the ECG type and bit representation, using the following set of rules:
• SM bit representation shall be used for any ECG with ecgIdx < 3
• 2C bit representation shall be used for any ECG with ecgIdx == 3
• For Transform mode, CPEC shall be used for all ECGs
• For BP mode:
    • CPEC shall be used for all ECGs when bitsReq > 2
    • VEC shall be used for all ECGs when bitsReq ≤ 2
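The inference rules above can be sketched as a small lookup. The function name and the string labels for modes, ECG types, and bit representations are assumptions made for illustration.

```python
def infer_ecg_coding(mode: str, ecg_idx: int, bits_req: int):
    """Return (ECG type, bit representation) for one ECG (sketch)."""
    # Bit representation depends only on the ECG index.
    representation = "SM" if ecg_idx < 3 else "2C"
    # ECG type depends on the block mode and, for BP mode, on bitsReq.
    if mode == "transform":
        ecg_type = "CPEC"
    elif mode == "bp":
        ecg_type = "CPEC" if bits_req > 2 else "VEC"
    else:
        raise ValueError("only Transform and BP modes use entropy coding")
    return ecg_type, representation
```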

5.3.3.3 Common-prefix Entropy Code Decoding


CPEC ECGs consist of a unary prefix (bitsReq) and N fixed-length fields composed of bitsReq bits.
The number of samples per ECG is determined by the block mode, as follows:
• Transform mode – Number of samples per ECG depends on ecgIdx and lastSigPos
(see Section 4.6.1.5)
• BP mode – Each ECG consists of exactly four samples

After bitsReq is extracted from the bitstream, a fixed number of bits shall be parsed from the
bitstream, as follows:
bits = N × bitsReq

where:
• N is the number of samples in the ECG

The entropy decoder shall convert these bits into samples, as follows:
• SM bit representation – Magnitude shall be contained within the fixed-length field and
the sign bits shall be parsed separately
• 2C bit representation – Sample shall be determined directly from the fixed-length field
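As an illustration of the 2C case, the sketch below splits N × bitsReq bits into N fixed-length fields and sign-extends each one. It assumes the bits are supplied as a string of '0' and '1' characters with the most significant bit of each field first; that input format and the function name are assumptions, not part of the Standard.

```python
def decode_cpec_2c(bits: str, bits_req: int, num_samples: int) -> list:
    """Decode N fixed-length 2's complement fields (sketch)."""
    if len(bits) != num_samples * bits_req:
        raise ValueError("expected N x bitsReq bits")
    samples = []
    for i in range(num_samples):
        field = int(bits[i * bits_req:(i + 1) * bits_req], 2)
        # Sign-extend: values with the top bit set are negative.
        if field >= 1 << (bits_req - 1):
            field -= 1 << bits_req
        samples.append(field)
    return samples


# Four 3-bit fields: 111, 000, 011, 100.
example = decode_cpec_2c("111000011100", bits_req=3, num_samples=4)
```

For SM representation the fixed-length fields would instead hold magnitudes, with the signs parsed separately as described in Section 5.3.3.6.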


5.3.3.4 Vector Entropy Code Decoding


VEC shall be used for BP mode when bitsReq is less than or equal to 2. VEC ECG decoding shall
be performed, as follows:
1 VEC parameter vecGrK shall be determined by look-up from the bit representation and
bitsReq value, as per Table 4-64.
2 VEC shall be parsed from the bitstream.
The VEC is composed of two fields – a unary prefix (vecEcPrefix) and a fixed-length suffix
(vecEcSuffix) – in which the fixed length is equal to vecGrK. The parser shall perform unary
decoding to determine the vecEcPrefix value and parse vecEcSuffix as a fixed-length field.
3 Entropy decoder shall map vecEcPrefix and vecEcSuffix to a unique value vecCodeNumber,
as follows:
vecCodeNumber = (vecEcPrefix << vecGrK) | vecEcSuffix
4 vecCodeNumber shall be mapped to the quantity vecCodeSymbol, using a predefined LUT
depending on the bit representation, component index, and bitsReq.
5 Quantized samples in the VEC shall be obtained algorithmically from vecCodeSymbol,
using the procedure described in Table 5-5.

Table 5-5: Vector Entropy Code Decoding Samples from vecCodeSymbol


bitMask = (1 << bitsReq) – 1
shift = 3 × bitsReq
thresh = 1 << (bitsReq – 1)
offset = 1 << bitsReq
for (index i in current ECG) {
    temp = (vecCodeSymbol >> shift) & bitMask
    sample[i] = (temp < thresh) ? temp : (temp – offset)
    shift –= bitsReq
}
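Steps 3 and 5 can be sketched in Python as follows. The look-up tables of steps 1 and 4 are not reproduced here, so the example passes vecCodeSymbol in directly; the function names are hypothetical, and the sketch assumes four samples per ECG, as implied by the initial shift of 3 × bitsReq.

```python
def vec_code_number(vec_ec_prefix: int, vec_ec_suffix: int, vec_gr_k: int) -> int:
    # Step 3: combine the unary prefix value and the fixed-length suffix.
    return (vec_ec_prefix << vec_gr_k) | vec_ec_suffix


def vec_samples(vec_code_symbol: int, bits_req: int) -> list:
    # Step 5 (Table 5-5): unpack four bitsReq-wide fields, first sample
    # in the most significant position.
    bit_mask = (1 << bits_req) - 1
    shift = 3 * bits_req
    thresh = 1 << (bits_req - 1)
    offset = 1 << bits_req
    samples = []
    for _ in range(4):
        temp = (vec_code_symbol >> shift) & bit_mask
        samples.append(temp if temp < thresh else temp - offset)
        shift -= bits_req
    return samples


# vecCodeSymbol 0b01110010 with bitsReq = 2 holds the fields 01, 11, 00, 10.
example = vec_samples(0b01110010, bits_req=2)
```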

5.3.3.5 Rate Buffer Stuffing Bits


If underflowPrevention is enabled, stuffing words of length rc_stuffing_bits shall be parsed from
each component for ecgIdx ∈ [1, 3]. These bits shall be included after the data portion of the ECG,
if present.

5.3.3.6 Sign Bits


The sign bits shall be parsed from ecgIdx == 3, if required. One sign bit shall be parsed for each
non-zero sample within the first three ECGs.
If the current mode is Transform mode and coeff[lastSigPos] == 0, signLastSigPos shall be parsed
from ecgIdx == 3, using one bit. (See Section 4.6.1.5.1.)


5.4 Per-mode Decoding Process


For each block, the decoder shall first parse the Substream 0 mode and flatness headers.
The decoding process shall then diverge, depending on the mode.

5.4.1 Transform Mode


The following sub-sections describe the Transform mode decoding process.

5.4.1.1 Parsing, Entropy Decoding, and Inverse Quantization


For a Transform mode block located within FBLS, this step is skipped and the intra predictor
(curIntraPredictor) shall default to intraFbls. For a Transform mode block located within
NFBLS, curIntraPredictor shall be parsed from Substream 0 as a fixed-length 3-bit field.
The remainder of Transform mode syntax consists of ECGs that are distributed among
Substreams 1 through 3. These ECGs contain quantized transform coefficients Tq(i, j).
These groups are entropy-decoded, following the procedures provided in Section 5.3.3.
After the quantized transform coefficients are entropy decoded, the coefficients are inverse
quantized, using the fractional quantizer described in Section 4.8.1.2 to produce the set of
reconstructed transform coefficients T(i, j).


5.4.1.2 Inverse Transform


The inverse transform is applied to the set of reconstructed transform coefficients T(i, j) to
generate the set of reconstructed residuals R(i, j). The inverse transform is applied in two passes,
depending on the current block component’s dimensions. (See Figure 5-7 and Table 5-6.)
Decoder implementations shall produce results that match the formulations of each of
these functions.

[Figure 5-7 diagram: 8x2 block component: T(i, j) → 2-point inverse Haar transform → Y(i, j) → 8-point inverse DCT → Z(i, j) → post-shift → R(i, j). 4x2 block component: T(i, j) → 2-point inverse Haar transform → Y(i, j) → 4-point inverse DCT → Z(i, j) → post-shift → R(i, j). 4x1 block component: T(i, j) → 4-point inverse DCT → Z(i, j) → post-shift → R(i, j).]

Figure 5-7: Inverse Discrete Cosine Transform Overview

Table 5-6: Inverse Discrete Cosine Transform Overview


Block Component | Description | Described In
8x2 (all 4:4:4 components and luma component) | Inverse Haar transform (2-point inverse DCT) for each column | Section 5.4.1.2.1
8x2 (all 4:4:4 components and luma component) | 8-point inverse DCT for each row | Section 5.4.1.2.2.1
4x2 (4:2:2 chroma component) | Inverse Haar transform (2-point inverse DCT) for each column | Section 5.4.1.2.1
4x2 (4:2:2 chroma component) | 4-point inverse DCT for each row | Section 5.4.1.2.2.2
4x1 (4:2:0 chroma component) | 4-point inverse DCT for the single row | Section 5.4.1.2.2.2

5.4.1.2.1 Vertical Inverse Transform Pass


In the first pass of the inverse transform, the 2-point inverse Haar transform is applied to the
columns of T(i, j), as described in Table 5-7. This pass is skipped if the block component is 4x1.

Table 5-7: 2-point Inverse Haar Transform


for (c = 0; c < numCols; c++) {
    Y(c, 0) = T(c, 0) + T(c, 1)
    Y(c, 1) = T(c, 0) – T(c, 1)
}
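The vertical pass of Table 5-7 can be sketched as below. The function name and the representation of the two rows as Python lists indexed by column are assumptions.

```python
def inverse_haar_columns(t_row0: list, t_row1: list):
    """2-point inverse Haar transform applied to every column (sketch)."""
    # Y(c, 0) = T(c, 0) + T(c, 1) and Y(c, 1) = T(c, 0) - T(c, 1)
    y_row0 = [a + b for a, b in zip(t_row0, t_row1)]
    y_row1 = [a - b for a, b in zip(t_row0, t_row1)]
    return y_row0, y_row1


# Two-column example with assumed coefficient values.
y0, y1 = inverse_haar_columns([3, 5], [1, 2])
```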


5.4.1.2.2 Horizontal Inverse Transform Pass


In the second pass of the inverse transform, inverse DCT shall be applied to the rows of Y(i, j),
as follows:
• 8-point inverse DCT – Used for 8x2 components (see Section 5.4.1.2.2.1)
• 4-point inverse DCT – Applied to 4x2 and 4x1 components (see Section 5.4.1.2.2.2)

5.4.1.2.2.1 8-point Inverse Discrete Cosine Transform


The 8-point inverse DCT is implemented in a butterfly structure, as illustrated in Figure 5-8 and
described in Table 5-8. Input to the 8-point inverse DCT shall be Y(i, j) from the inverse Haar
transform. Output from the 8-point inverse DCT shall be a set of reconstructed residuals Z(i, j) that
are scaled as part of the post-shift operation. (See Section 5.4.1.2.3.) The following coefficients
shall be used to calculate the 8-point inverse DCT:
• dctA8 = 17
• dctB8 = 41
• dctD8 = 22
• dctE8 = 94
• dctG8 = 111
• dctZ8 = 64
• dctShift8 = 7
• dctShift4 = 6

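[Figure 5-8 diagram: butterfly network that takes inputs Y0 through Y7, passes the even half through intermediate values ea0 to ea3 and eb0 to eb3 and the odd half through da0, da3, db0 to db3, and dc0 to dc3, and produces outputs Z0 through Z7. See Table 5-8 for the exact operations.]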

Figure 5-8: Butterfly Structure for 8-point Inverse Discrete Cosine Transform


Table 5-8: 8-point Inverse Discrete Cosine Transform Applied to Row r of Y(i, j)
// first stage (re-order)
mapping = [0, 4, 2, 6, 7, 3, 5, 1]
for (c = 0; c < numCols; c++) {
    Y(c, r) = Y(mapping[c], r)
}

// second stage, even coefficients
ea(0, r) = Y(0, r) + Y(1, r)
ea(1, r) = Y(0, r) – Y(1, r)
ea(2, r) = (dctA8 × Y(2, r) – dctB8 × Y(3, r)) >> dctShift4
ea(3, r) = (dctB8 × Y(2, r) + dctA8 × Y(3, r)) >> dctShift4

// third stage, even coefficients
eb(0, r) = ea(0, r) + ea(3, r)
eb(1, r) = ea(1, r) + ea(2, r)
eb(2, r) = ea(1, r) – ea(2, r)
eb(3, r) = ea(0, r) – ea(3, r)

// second stage, odd coefficients
da(0, r) = Y(7, r) – Y(4, r)
da(3, r) = Y(7, r) + Y(4, r)

// third stage, odd coefficients
db(0, r) = da(0, r) + Y(6, r)
db(1, r) = da(3, r) – Y(5, r)
db(2, r) = da(0, r) – Y(6, r)
db(3, r) = da(3, r) + Y(5, r)

// fourth stage, odd coefficients
dc(0, r) = (dctE8 × db(0, r) – dctZ8 × db(3, r)) >> dctShift8
dc(1, r) = (dctG8 × db(1, r) – dctD8 × db(2, r)) >> dctShift8
dc(2, r) = (dctD8 × db(1, r) + dctG8 × db(2, r)) >> dctShift8
dc(3, r) = (dctZ8 × db(0, r) + dctE8 × db(3, r)) >> dctShift8

// final stage
Z(0, r) = eb(0, r) + dc(3, r)
Z(1, r) = eb(1, r) + dc(2, r)
Z(2, r) = eb(2, r) + dc(1, r)
Z(3, r) = eb(3, r) + dc(0, r)
Z(4, r) = eb(3, r) – dc(0, r)
Z(5, r) = eb(2, r) – dc(1, r)
Z(6, r) = eb(1, r) – dc(2, r)
Z(7, r) = eb(0, r) – dc(3, r)


5.4.1.2.2.2 4-point Inverse Discrete Cosine Transform


The 4-point inverse DCT is implemented in a butterfly structure, as illustrated in Figure 5-9
and described in Table 5-9. Input to the 4-point inverse DCT shall be Y(i, j) for a 4x2 block
component and T(i, j) for a 4x1 block component. Output from the 4-point inverse DCT shall
be a set of reconstructed residuals Z(i, j) that are scaled as part of the post-shift operation. (See
Section 5.4.1.2.3.) The following coefficients shall be used to calculate the 4-point inverse DCT:
• dctA4 = 17
• dctB4 = 41
• dctShift4 = 6

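[Figure 5-9 diagram: butterfly network that takes inputs Y0 through Y3, passes them through intermediate values a0 through a3, and produces outputs Z0 through Z3. See Table 5-9 for the exact operations.]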

Figure 5-9: Butterfly Structure for 4-point Inverse Discrete Cosine Transform

Table 5-9: 4-point Inverse Discrete Cosine Transform Applied to Row r of Y(i, j)

a(0, r) = Y(0, r) + Y(2, r)
a(1, r) = Y(0, r) – Y(2, r)
a(2, r) = (dctA4 × Y(1, r) – dctB4 × Y(3, r)) >> dctShift4
a(3, r) = (dctB4 × Y(1, r) + dctA4 × Y(3, r)) >> dctShift4

Z(0, r) = a(0, r) + a(3, r)
Z(1, r) = a(1, r) + a(2, r)
Z(2, r) = a(1, r) – a(2, r)
Z(3, r) = a(0, r) – a(3, r)
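Table 5-9 translates almost line for line into Python, as sketched below for a single row. The function name is hypothetical. Python's >> on negative integers is an arithmetic (floor) shift, which this sketch assumes matches the behavior intended by the pseudocode; the C model remains the reference.

```python
DCT_A4 = 17
DCT_B4 = 41
DCT_SHIFT4 = 6


def inverse_dct4_row(y: list) -> list:
    """4-point inverse DCT butterfly for one row of four inputs (sketch)."""
    a0 = y[0] + y[2]
    a1 = y[0] - y[2]
    a2 = (DCT_A4 * y[1] - DCT_B4 * y[3]) >> DCT_SHIFT4
    a3 = (DCT_B4 * y[1] + DCT_A4 * y[3]) >> DCT_SHIFT4
    return [a0 + a3, a1 + a2, a1 - a2, a0 - a3]


# A DC-only row spreads evenly across all four outputs.
dc_only = inverse_dct4_row([10, 0, 0, 0])
```

The outputs are still at the intermediate scale of Z(i, j); the post-shift of Section 5.4.1.2.3 is applied afterwards.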


5.4.1.2.3 Post-shift
The result of the inverse transforms shall be the set of reconstructed residuals Z(i, j). The scale shall
be adjusted before the final reconstructed residuals R(i, j) can be obtained. Table 5-10 describes
this adjustment, using the following coefficients:
• dctInvShift = 8
• dctInvRound = 128

Table 5-10: Inverse Discrete Cosine Transform Post-shift


for (r = 0; r < numRows; r++) {
    for (c = 0; c < numCols; c++) {
        sign = (Z(c, r) < 0) ? –1 : +1
        R(c, r) = sign × ((|Z(c, r)| + dctInvRound) >> dctInvShift)
    }
}
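The post-shift rounds the magnitude and then restores the sign, so positive and negative inputs round symmetrically. A minimal Python sketch for a single value follows; the function name is hypothetical.

```python
DCT_INV_SHIFT = 8
DCT_INV_ROUND = 128


def post_shift(z: int) -> int:
    """Scale one inverse-transform output Z(c, r) to a residual R(c, r) (sketch)."""
    sign = -1 if z < 0 else 1
    # Round to nearest on the magnitude, then re-apply the sign.
    return sign * ((abs(z) + DCT_INV_ROUND) >> DCT_INV_SHIFT)


# Example row of assumed Z values.
residuals = [post_shift(z) for z in (128, 127, -128, -384)]
```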

5.4.1.3 Reconstruction
The intra predicted block is calculated, using the tables in Section 4.6.1.1 for the current block’s
parsed intra predictor type. The reconstructed residuals are then added to the predicted block and
clipped to the available range. If the source color space is RGB, Transform mode is calculated in
the YCoCg space. In this case, color-space conversion (CSC) is used to convert the reconstructed
samples back to RGB color space. This step is not required for YCbCr source content.


5.4.2 BP Mode
In BP mode, the partition table (bpvTable) is parsed from Substream 0 and the block prediction
vectors (BPVs) are parsed from Substreams 0 through 3. This shall occur in two steps:
1 Decoder shall parse a fixed-length four-bit field (bpvTable) from Substream 0. These four bits
shall define the partition structure of each 2x2 sub-block. For each bit in bpvTable:
• 0 signals a 2x1 partition (2x BPV)
• 1 signals a 2x2 partition (1x BPV)
The total number of BPVs for each sub-block shall be either one or two.
2 The BPV for each sub-block shall be parsed from the corresponding substream (i.e., the one
or two BPVs associated with Sub-block i shall be parsed from Substream i).

For FBLS blocks, each BPV shall be represented by five bits because only 32 search range
positions are valid in this region. For NFBLS blocks, each BPV shall be represented by six bits
(64 possible positions). For example, if the current block is located within NFBLS and bpvTable
is 1011, the BPVs shall be distributed as described in Table 5-11.

Table 5-11: Example BPV Distribution – bpvTable Value 1011


1x BPV (six bits) for Sub-block 0 in Substream 0
2x BPV (12 bits) for Sub-block 1 in Substream 1
1x BPV (six bits) for Sub-block 2 in Substream 2
1x BPV (six bits) for Sub-block 3 in Substream 3
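The distribution in Table 5-11 can be reproduced with the sketch below. It assumes bpvTable is given as a four-character string whose first character applies to Sub-block 0, which is the ordering consistent with the 1011 example; the function name is hypothetical.

```python
def bpv_bits_per_substream(bpv_table: str, is_fbls: bool) -> list:
    """Number of BPV bits carried by Substreams 0 through 3 (sketch)."""
    if len(bpv_table) != 4 or set(bpv_table) - {"0", "1"}:
        raise ValueError("bpvTable must be four bits")
    bits_per_bpv = 5 if is_fbls else 6  # 32 or 64 search range positions
    # '1' signals a 2x2 partition (one BPV), '0' a 2x1 partition (two BPVs).
    return [(1 if flag == "1" else 2) * bits_per_bpv for flag in bpv_table]


# NFBLS block with bpvTable = 1011, as in Table 5-11.
example = bpv_bits_per_substream("1011", is_fbls=False)
```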

After the decoder parses the BPVs from the bitstream, the remaining syntax for BP mode consists
of a set of ECGs distributed among Substreams 1 through 3. These ECGs shall contain quantized
prediction residuals for all three color components. Following the procedure for entropy decoding
described in Section 5.3.3, the decoder shall receive the complete set of quantized residuals.
These residuals shall be inverse quantized, using the fractional quantizer described in Section 4.8.1
to generate the reconstructed residuals.
The predicted block shall be generated from the set of received BPVs for each partition, using the
mapping between BPV index and BPV search range position described in Section 4.6.2.1. Next,
the reconstructed samples shall be generated by adding the predicted block to the reconstructed
residuals and clipping to the available range.
If the source color space is RGB, BP mode is calculated in the YCoCg color space. In this case,
CSC is used to convert the reconstructed samples back to RGB color space. This step is not
required for YCbCr source content.


5.4.3 MPP Mode


In MPP mode, two additional fields shall be parsed from Substream 0. The first field
(mppColorSpace) shall specify whether the current block is encoded in the RGB or YCoCg color
space, as follows:
curBlockColorSpace = { RGB,   mppColorSpace == 0
                     { YCoCg, mppColorSpace == 1

If the input format is YCbCr, mppColorSpace is not included in the bitstream, and
curBlockColorSpace = YCbCr. Next, the step size (mppStepSize) is parsed. mppStepSize
shall be a fixed 3-bit unsigned field if bpc == 8, or a 4-bit unsigned field if bpc > 8.
The decoder shall then parse the quantized residuals from the bitstream in Substreams 0 through 3.
Each of the four substreams contains an equal number of samples, as described in Table 4-55.
Quantized MPP residuals within the bitstream are represented as unsigned values. After
parsing this unsigned value, the decoder shall add the value minCode to each sample,
where minCode = –(1 << (mppBitsPerComp – 1)).
The number of bits for each quantized residual is determined by the component index and
mppStepSize. mppStepSizeComp denotes the step size for a given component, calculated
as a function of the source color space and mppColorSpace flag:

$$\text{mppStepSizeComp} = \begin{cases} \text{mppStepSizeMapCo[mppStepSize]}, & \text{mppColorSpace == 1, Co component} \\ \text{mppStepSizeMapCg[mppStepSize]}, & \text{mppColorSpace == 1, Cg component} \\ \text{mppStepSize}, & \text{otherwise} \end{cases}$$

The number of bits per quantized residual for a given component (mppBitsPerComp) shall then
be determined, as follows:
mppBitsPerComp = compBitDepth – mppStepSizeComp
where:
$$\text{compBitDepth} = \begin{cases} \text{bpc} + 1, & \text{YCoCg chroma components} \\ \text{bpc}, & \text{otherwise} \end{cases}$$

Reconstructed residuals shall be calculated by applying the scalar inverse quantizer (see
Section 4.8.2) to all quantized residuals. The final reconstructed pixels shall be obtained by first
calculating the midpoint in the same way as calculated on the encoder side (see Section 4.6.3),
and then adding the midpoint to the reconstructed residuals before clipping to the available range.
If mppColorSpace == 1, CSC shall be used to convert the block from YCoCg to RGB color space.
At this point, MPP block decoding shall be complete.
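
The field widths and the unsigned-to-signed conversion for MPP mode can be sketched as follows. This is an illustration under stated assumptions, not the normative decoder: the function names are hypothetical, and EXAMPLE_MAP_CO and EXAMPLE_MAP_CG are placeholder tables that stand in for mppStepSizeMapCo and mppStepSizeMapCg, whose actual contents are defined elsewhere in this Standard.

```python
# ASSUMPTION: placeholder step-size maps for illustration only. The real
# mppStepSizeMapCo / mppStepSizeMapCg tables are not reproduced here.
EXAMPLE_MAP_CO = [1, 2, 3, 4, 5, 6, 7, 8]
EXAMPLE_MAP_CG = [1, 2, 3, 4, 5, 6, 7, 8]


def mpp_step_size_field_bits(bpc):
    """Width of the mppStepSize field: 3 bits if bpc == 8, 4 bits if bpc > 8."""
    if bpc == 8:
        return 3
    if bpc > 8:
        return 4
    raise ValueError("bpc below 8 is not covered by the text")


def mpp_bits_per_comp(bpc, mpp_step_size, comp, ycocg, map_co, map_cg):
    """Bits per quantized residual for component comp (0, 1 or 2).

    ycocg is True when mppColorSpace == 1; components 1 and 2 are then
    Co and Cg, which carry one extra bit of depth.
    """
    if ycocg and comp == 1:
        step = map_co[mpp_step_size]
    elif ycocg and comp == 2:
        step = map_cg[mpp_step_size]
    else:
        step = mpp_step_size
    comp_bit_depth = bpc + 1 if (ycocg and comp in (1, 2)) else bpc
    return comp_bit_depth - step


def to_signed_residual(code, bits_per_comp):
    """Add minCode = -(1 << (bits - 1)) to the parsed unsigned value."""
    min_code = -(1 << (bits_per_comp - 1))
    return code + min_code
```

With bpc = 8 and mppStepSize = 2 in RGB, each residual occupies 6 bits, so the unsigned codes 0 through 63 map to the signed range -32 through 31.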


5.4.4 MPPF Mode


In MPPF mode for RGB source content, using the values programmed in PPS parameter
mppf_bits_per_comp, a one-bit field (mppfColorSpace) shall be parsed from Substream 0
to determine the following (see Figure 4-38 for further details):
• Color space of the MPPF samples
• Number of bits per sample for each color component (mppfBitsPerComp)

mppfBitsPerComp shall then be mapped to mppfStepSizeComp, as follows:


$$\text{bitDepth} = \begin{cases} \text{bpc} + 1, & \text{Co or Cg component} \\ \text{bpc}, & \text{otherwise} \end{cases}$$
mppfStepSizeComp[k] = bitDepth – mppfBitsPerComp[k]

After mppfStepSizeComp is determined for each component, the quantized residuals shall
be parsed from the bitstream, using the same procedure as in MPP mode. (See Section 5.4.3.)

Quantized MPPF residuals within the bitstream are represented as unsigned values. After
parsing this unsigned value, the decoder shall add the value minCode to each sample,
where minCode = - (1 << (mppfBitsPerComp – 1) ).
Reconstructed residuals shall be calculated by applying the scalar inverse quantizer (see
Section 4.8.2) to all quantized residuals. The final reconstructed pixels shall be obtained by first
calculating the midpoint in the same way as calculated on the encoder side (see Section 4.6.3),
and then adding the midpoint to the reconstructed residuals before clipping to the available range.
If mppfColorSpace is YCoCg, CSC shall be used to convert the reconstructed block from YCoCg
to RGB color space.
At this point, MPPF block decoding shall be complete.
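
The mapping from mppfBitsPerComp to mppfStepSizeComp, and the subsequent minCode offset, can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and the bits-per-component triplets used in the example are made-up values rather than values taken from the PPS parameter mppf_bits_per_comp.

```python
def mppf_step_sizes(bpc, bits_per_comp, ycocg):
    """Return mppfStepSizeComp[k] = bitDepth - mppfBitsPerComp[k] for k = 0..2.

    ycocg is True when the MPPF samples are in YCoCg; components 1 and 2
    (Co, Cg) then use bitDepth = bpc + 1, all other cases use bpc.
    """
    steps = []
    for k, bits in enumerate(bits_per_comp):
        bit_depth = bpc + 1 if (ycocg and k > 0) else bpc
        steps.append(bit_depth - bits)
    return steps


def decode_mppf_codes(codes, bits):
    """Convert parsed unsigned codes to signed quantized residuals."""
    min_code = -(1 << (bits - 1))
    return [c + min_code for c in codes]


# ASSUMPTION: example bit allocations chosen only for illustration.
ycocg_steps = mppf_step_sizes(8, [4, 3, 3], ycocg=True)
rgb_steps = mppf_step_sizes(8, [4, 4, 4], ycocg=False)
```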

5.4.5 BP-SKIP Mode


In BP-SKIP mode, the BPVs shall be parsed from Substream 0. This procedure is identical
to the BPV parsing for BP mode described in Section 5.4.2. No other syntax is needed for
BP-SKIP mode.
As in BP mode, the predicted block shall be generated from the set of received BPV for each
partition, using the mapping between BPV index and BPV search range position described
in Section 4.6.2.1.
Because there are no residuals in BP-SKIP mode, the predicted block shall also be the
reconstructed block.
If the source color space is RGB, BP-SKIP mode shall be calculated in the YCoCg color space.
In this case, CSC shall be used to convert the reconstructed samples back to RGB color space.
This step is not required for YCbCr source content.


5.5 Update RC State


The rate controller state shall be updated during each block time at the decoder. The following
three steps are performed:
1 Update the rate buffer fullness (bufferFullness, rcFullness) – see Section 4.5.1.
2 Update targetRate – see Section 4.5.2.
3 Update qp – see Section 4.5.3.

The flatness type used by QP update shall be directly parsed from the bitstream.

This concludes the discussion of decoder operation.


A Rate Buffer Guidance (Normative)

This annex provides guidance for calculating rate buffer size-related PPS parameters.
The physical rate buffer size (rc_buffer_max_size) shall be calculated from the compressed bit
rate and slice dimensions, as follows:
rc_buffer_max_size = (2 × rc_buffer_init_size) + (2 × slice_width × rc_target_rate_extra_fbls)

PPS parameter rc_buffer_init_size defines a constraint on the number of bits that can remain
within the rate buffer at the end of a slice during the encoding procedure. This means that when
encoding for Slice N is complete, there are still rc_buffer_init_size or fewer bits in the rate
buffer that need to be transmitted. The same shall be true for the start of the encoding process
for Slice N + 1. That is, at the beginning of a slice, the rate buffer shall be filled to at least
rc_buffer_init_size before Slice N + 1 transmission can begin.
First, rcBufferInitSizeTemp is calculated as a function of the slice width:

$$\text{rcBufferInitSizeTemp} = \begin{cases} 4096, & \text{slice\_width} \le 720 \\ 8192, & 720 < \text{slice\_width} \le 2048 \\ 10752, & \text{otherwise} \end{cases}$$

The delay associated with filling the rate buffer shall be denoted as rc_init_tx_delay (measured
in block times). rc_init_tx_delay is calculated, as follows:
$$\text{rc\_init\_tx\_delay} = \frac{\text{rcBufferInitSizeTemp}}{\text{avgBlockRate}}$$
where:
• avgBlockRate = bpp, because bpp is stored with four bits of precision, and there
are 16 pixels per block

Note that rc_init_tx_delay is in addition to the delay associated with filling the SSM balance
FIFOs. (See Section 4.11.) During the initial transmission delay period, bits shall be removed from
the SSM balance FIFOs and placed into the rate buffer. After the initial transmission delay period
has concluded, transmission shall begin from the rate buffer into the bitstream.
The initial buffer size shall then be modified to be a multiple of avgBlockRate, as follows:
rc_buffer_init_size = avgBlockRate × rc_init_tx_delay
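
The parameter derivation in this annex can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the quotient for rc_init_tx_delay is assumed to be rounded down (the rounding direction is an assumption of this sketch), and the example inputs are made-up values.

```python
def rc_buffer_init_size_temp(slice_width):
    """Piecewise selection of rcBufferInitSizeTemp from the slice width."""
    if slice_width <= 720:
        return 4096
    if slice_width <= 2048:
        return 8192
    return 10752


def rate_buffer_params(slice_width, avg_block_rate, rc_target_rate_extra_fbls):
    """Derive rc_init_tx_delay, rc_buffer_init_size and rc_buffer_max_size.

    avg_block_rate is the bits_per_pixel value with four fractional bits,
    which equals the average number of bits per 16-pixel block.
    """
    temp = rc_buffer_init_size_temp(slice_width)
    # ASSUMPTION: floor division; the rounding of a non-integer quotient
    # is a choice made for this sketch.
    rc_init_tx_delay = temp // avg_block_rate
    rc_buffer_init_size = avg_block_rate * rc_init_tx_delay
    rc_buffer_max_size = (2 * rc_buffer_init_size
                          + 2 * slice_width * rc_target_rate_extra_fbls)
    return rc_init_tx_delay, rc_buffer_init_size, rc_buffer_max_size


# ASSUMPTION: example inputs (1920-wide slice, 8.0 bpp stored as 128,
# rc_target_rate_extra_fbls = 2) chosen only for illustration.
delay, init_size, max_size = rate_buffer_params(1920, 128, 2)
```

For these inputs the quotient is exact (8192 / 128 = 64 block times), so the result does not depend on the rounding assumption.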


B Derivation of Parameters (Informative)

This annex provides the derivation of certain codec parameters.

B.1 RDO Weights


The encoder mode selection algorithm selects the best mode for each block from several available
coding modes, based in part on the rate distortion (RD) cost of each mode. This RD cost is
calculated from the rate and distortion of each coding mode, in addition to a Lagrangian parameter.
It is important to use a consistent measure of distortion when making this comparison. To
accomplish this for modes that are calculated in different color spaces, a set of rate-distortion
optimization (RDO) weights is calculated such that distortion in YCoCg color space can be
compared with distortion in RGB color space.
For any mode calculated in RGB color space, the distortion will be calculated directly by the sum
of absolute differences (SAD). For a mode in YCoCg color space, the RDO weights indicated
below will be used to appropriately scale the distortion. The RDO weights are calculated from the
color-space conversion (CSC) transformation matrix (YCoCg to RGB), using the Euclidean norm
of each column, as follows:
$$\text{rdoWeights} = \left[ \sqrt{3},\ \frac{\sqrt{2}}{2},\ \frac{\sqrt{3}}{2} \right]$$
Next, the weights are approximated, using fixed-point math. For example, the integer weights that
use 8-bit fractional precision become the following:
$$\text{rdoWeights} = \left[ \frac{443}{256},\ \frac{181}{256},\ \frac{222}{256} \right]$$
The final SAD in YCoCg color space will be modified as follows, using the above weights:

SAD_R = (SAD_Y × 443 + 128) >> 8

SAD_G = (SAD_Co × 181 + 128) >> 8

SAD_B = (SAD_Cg × 222 + 128) >> 8
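
The derivation above can be checked numerically with a short sketch. The matrix below is an assumption of this sketch: it is a YCoCg-to-RGB matrix with the chroma columns scaled by 1/2, chosen because its column norms reproduce the weights quoted in this section. The function names are hypothetical.

```python
import math

# ASSUMPTION: YCoCg-to-RGB matrix with half-weighted chroma columns.
# Rows are R, G, B; columns are Y, Co, Cg.
CSC_YCOCG_TO_RGB = [
    [1.0,  0.5, -0.5],
    [1.0,  0.0,  0.5],
    [1.0, -0.5, -0.5],
]


def rdo_weights(matrix):
    """Euclidean norm of each column of the CSC matrix."""
    return [math.sqrt(sum(row[c] ** 2 for row in matrix)) for c in range(3)]


def to_fixed_point(weights, frac_bits=8):
    """Round each weight to an integer with frac_bits fractional bits."""
    return [int(w * (1 << frac_bits) + 0.5) for w in weights]


def weighted_sad(sad, weight, frac_bits=8):
    """Scale a SAD value by a fixed-point weight with rounding."""
    return (sad * weight + (1 << (frac_bits - 1))) >> frac_bits


fixed = to_fixed_point(rdo_weights(CSC_YCOCG_TO_RGB))
scaled = [weighted_sad(100, w) for w in fixed]
```

Under this assumption the column norms are sqrt(3), sqrt(2)/2 and sqrt(3)/2, which round to 443, 181 and 222 at 8-bit fractional precision.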


C Guidance for Rate Buffer Size and Delays in a Practical Implementation (Informative)

This annex provides guidance for a practical implementation of the rate buffer, including delays
associated with the rate buffer and with substream multiplexing.
The rate buffer discussed within the normative section of this Standard is an idealized rate buffer;
its size is therefore the minimum required size of a practical rate buffer in a hardware implementation.
A practical rate buffer will likely grow from the idealized size to accommodate various factors,
such as horizontal blanking, pipeline delays, and so forth.
The initial transmission delay (rc_init_tx_delay) discussed within the normative section of this
Standard determines the number of block times between the compressed data first entering the
rate buffer from the SSM balance FIFOs and transmission from the rate buffer to the bitstream.
The total encoder delay of a practical implementation, at minimum, is the sum of the
following terms:
• rc_init_tx_delay
• Delay associated with filling the SSM balance FIFOs (see Section 4.11)
• Delay of one block time to account for the SSM skew between Substream 0 and the
other substreams (Substreams 1 through 3)
• Additional implementation-specific delays

Additional encoder buffering requirements that are needed to compensate for the above delays are
largely implementation-specific. However, this additional buffering must, at a minimum, include
ssmTransmitStartBufferAdj bits to reconcile the start of transmission of the SSM mux words
into the rate buffer with the constant bit rate transmission from the rate buffer.
ssmTransmitStartBufferAdj = (8 × ssm_max_se_size) – (2 × avgBlockRate)
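
The minimum encoder delay and the buffer adjustment can be sketched as follows. The function names and example values are hypothetical; the SSM balance FIFO fill delay comes from Section 4.11 and is treated here as an input rather than derived.

```python
def min_encoder_delay(rc_init_tx_delay, ssm_fill_delay, impl_delay=0):
    """Lower bound on total encoder delay, in block times.

    Sums the initial transmission delay, the SSM balance FIFO fill delay,
    one block time of SSM skew between Substream 0 and Substreams 1-3,
    and any implementation-specific delay.
    """
    ssm_skew = 1
    return rc_init_tx_delay + ssm_fill_delay + ssm_skew + impl_delay


def ssm_transmit_start_buffer_adj(ssm_max_se_size, avg_block_rate):
    """Minimum additional buffering, in bits."""
    return 8 * ssm_max_se_size - 2 * avg_block_rate


# ASSUMPTION: example inputs chosen only for illustration.
delay_blocks = min_encoder_delay(rc_init_tx_delay=64, ssm_fill_delay=2)
extra_bits = ssm_transmit_start_buffer_adj(ssm_max_se_size=128,
                                           avg_block_rate=128)
```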

In a practical encoder, the compressed bits can be distributed in one of the following, as long
as the encoding process proceeds as defined in this Standard:
• SSM balance FIFOs
• Rate buffer
• Combination of SSM balance FIFOs and rate buffer


A practical implementation of the decoder should include an initial decode delay period during
which bits accumulate within the rate buffer from the bitstream before any blocks are decoded
(i.e., removed from the rate buffer and placed into the SSM funnel shifters). To prevent an input
buffer underflow, this delay must, at a minimum, include initDecodeDelay. initDecodeDelay
is specified by the total buffer size and initial transmission delay, such that the total system
delay (totalDelay) through the encoder and decoder is a constant.

totalDelay = rc_buffer_max_size
bits_per_pixel 

initDecodeDelay = totalDelay – rc_init_tx_delay

The PPS parameter rc_init_tx_delay and the consequential delays discussed within this annex
are measured in block times. The equivalent delay in pixel clocks is the delay value multiplied
by the number of pixels per block (16) and divided by the encoder or decoder throughput
in pixels per clock. For example, if a delay value is 100 block times and the decoder throughput
is 4 pixels per clock, the equivalent delay is (100 × 16) / 4 = 400 pixel clocks.
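
The decoder-side delay and the unit conversion can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, bits_per_pixel is assumed to be the PPS value with four fractional bits (numerically the bits per 16-pixel block), and the quotient is assumed to be rounded down.

```python
PIXELS_PER_BLOCK = 16


def init_decode_delay(rc_buffer_max_size, bits_per_pixel, rc_init_tx_delay):
    """initDecodeDelay = totalDelay - rc_init_tx_delay, in block times."""
    # ASSUMPTION: floor division; the rounding of a non-integer quotient
    # is a choice made for this sketch.
    total_delay = rc_buffer_max_size // bits_per_pixel
    return total_delay - rc_init_tx_delay


def block_times_to_pixel_clocks(delay_blocks, pixels_per_clock):
    """Convert a delay in block times to pixel clocks."""
    return delay_blocks * PIXELS_PER_BLOCK // pixels_per_clock


# ASSUMPTION: example inputs chosen only for illustration.
decode_delay = init_decode_delay(24064, 128, 64)
clocks = block_times_to_pixel_clocks(100, 4)
```

The second call reproduces the worked example in the text: 100 block times at 4 pixels per clock is 400 pixel clocks.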


D Main Contributor History (Previous Versions)

Table D-1: Main Contributor History (Previous Versions)

| Company | Name | Contribution | Version |
|---|---|---|---|
| Analogix Semiconductor | Peter Halenbeck | Contributor | 1.0 |
| Analogix Semiconductor | Greg Stewart | Contributor | 1.0 |
| Avatar Tech Pubs | Trish McDermott | Technical Writer | 1.0 |
| Bitec Spain S.L. | Damian Sanchez | Contributor | 1.0 |
| Broadcom Corp. | Rick Berard | Contributor | 1.0 |
| Broadcom Corp. | Fred Walls | Contributor, Reviewer | 1.0 |
| DisplayLink Corp. | Eric Hamaker | Contributor | 1.0 |
| DisplayLink Corp. | Shumin Tian | Contributor | 1.0 |
| Extron Electronics | Alex Petrulian | Contributor, Reviewer | 1.0 |
| Hardent, Inc. | Simon Bussières | Contributor | 1.0 |
| Hardent, Inc. | Alain Legault | Contributor | 1.0 |
| Hardent, Inc. | Avrum Warshawsky | Contributor, Reviewer | 1.0 |
| MediaTek, Inc. | Li-Heng Chen | Contributor, Reviewer | 1.0 |
| MediaTek, Inc. | Tung-Hsing Wu | Contributor, Reviewer | 1.0 |
| Parade Technologies, Ltd. | Craig Wiley | Contributor | 1.0 |
| Qualcomm, Inc. | James Goel | Contributor, Reviewer, Quality Subgroup Chair | 1.0 |
| Qualcomm, Inc. | Ike Ikizyan | Contributor | 1.0 |
| Qualcomm, Inc. | Natan Jacobson | Document Editor, Primary Technical Contributor | 1.0 |
| Qualcomm, Inc. | Rajan Joshi | Primary Technical Contributor | 1.0 |
| Qualcomm, Inc. | Vijayaraghavan Thirumalai | Primary Technical Contributor | 1.0 |
| Samsung Electronics Co., Ltd. | Greg Cook | Contributor, Reviewer | 1.0 |
| Samsung Electronics Co., Ltd. | Hojun Jung | Contributor, Reviewer | 1.0 |
| Samsung Electronics Co., Ltd. | Taewoo Kim | Contributor, Reviewer | 1.0 |
| Samsung Electronics Co., Ltd. | Deoksoo Park | Contributor, Reviewer | 1.0 |
| Samsung Electronics Co., Ltd. | Dale Stolitzka | Task Group Chair, Contributor, Reviewer | 1.0 |
| Synaptics, Inc. | Bruce Chin | Contributor | 1.0 |
| Synopsys | Carlos Ferreir | Contributor | 1.0 |
| Synopsys | Pedro Miguel | Contributor | 1.0 |
| VESA | David Braun | Contributor | 1.0 |
| VESA | Bill Lempesis | Contributor | 1.0 |
| VESA | Christine Wentker | Contributor | 1.0 |
| York University | Robert Allison | Contributor | 1.0 |
| York University | Matthew Cutone | Contributor | 1.0 |
| York University | Marc Dalecki | Contributor | 1.0 |
| York University | Lesley Deas | Contributor | 1.0 |
| York University | Yuqian Hou | Contributor | 1.0 |
| York University | Aishwarya Sudhama | Contributor | 1.0 |
| York University | Laurie Wilcox | Contributor | 1.0 |
