VHDL Implementation of H264 Video Coding Standard
Corresponding Author:
Haresh Suthar,
Department of Electronics & Communications Engineering,
Parul Institute of Technology,
PO - Lomda, Ta-Waghodia, Vadodara, India.
Email: [email protected]
1. INTRODUCTION
The H.264 video coding standard has the same basic functional elements as previous standards
(MPEG-1, MPEG-2, MPEG-4 Part 2, H.261, and H.263): (1) transform for reduction of spatial correlation,
(2) quantization for bit-rate control, (3) motion-compensated prediction for reduction of temporal
correlation, and (4) entropy coding for reduction of statistical correlation. In order to improve coding
performance, the H.264 standard introduces the following important changes: (1) intra-picture prediction,
(2) a new 4x4 integer transform, (3) multiple reference pictures, (4) variable block sizes, (5) a de-blocking
filter, and (6) improved entropy coding [2].
Improved coding efficiency comes at the expense of added complexity in the coder/decoder.
H.264 uses several methods to reduce implementation complexity. A multiplier-free integer transform is
introduced, and the multiplication required by the exact transform is combined with the multiplication of
the quantization stage. Smaller quantization step sizes are allowed, improving PSNR to levels that can be
considered visually lossless.
Noisy channel conditions, as in wireless networks, obstruct perfect reception of the coded video
bit stream at the decoder. Incorrect decoding caused by lost data degrades the subjective picture quality,
and the damage propagates to subsequent blocks or pictures. H.264 provides several methods for error
resilience against network noise: parameter sets, flexible macroblock ordering, switched slices, and
redundant slices are added to the data partitioning used in previous standards. Depending on the
application, H.264 defines Profiles and Levels specifying restrictions on bit streams, as in some previous
video standards. Seven Profiles are defined to cover applications ranging from wireless networks to digital
cinema. This paper presents a VHDL design for the H.264 standard. The algorithm is designed in Xilinx
tools for H.264 and tested using the TEXTIO package of VHDL.
The paper is organized as follows. Section 2 provides an overview of the H.264 standard. Section 3
describes the algorithm and the proposed design for VHDL implementation. Section 4 presents the results
and Section 5 concludes.
2. OVERVIEW OF H.264 STANDARD
Figure 1 shows the basic block diagram of H.264, which contains the transform, quantization,
intra prediction, inter prediction, and CAVLC blocks. Source pictures and prediction residuals both have
high spatial redundancy, so the H.264 standard is based on a block-based transform for spatial redundancy
removal. H.264 uses an adaptive transform block size, 4x4 and 8x8 (High Profiles only), whereas previous
video coding standards used the 8x8 Discrete Cosine Transform (DCT). The smaller block size leads to a
significant reduction in ringing artifacts, and the 4x4 transform has the additional benefit of removing the
need for multiplications. The transform converts the spatial domain to the frequency domain [4].
H.264 extends the range of quantization step sizes QP by two additional octaves, by redefining
the scaling values (the multiplication factors MF and rescaling factors V) and allowing QP to vary from 0
to 51. In general, transform and quantization require several multiplications, resulting in high
implementation complexity. So, for simple implementation, the exact transform process is modified to
avoid the multiplications, and the transform and quantization are combined into a modified integer forward
transform with quantization and scaling [4]:
Y = round[ (Cf · X · Cf^T) ⊗ (MF / 2^qbits) ],  where qbits = 15 + floor(QP/6)    (1)

             |  1   1   1   1 |
Where:  Cf = |  2   1  -1  -2 |
             |  1  -1  -1   1 |
             |  1  -2   2  -1 |
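As a software sketch of Equation (1): the matrix product needs only additions and shifts in hardware, and the multiplication by MF absorbs the quantizer. The MF table below holds the standard's factors for QP values that are multiples of 6; the rounding offset f is an assumption (the standard leaves it to the encoder).

```python
import numpy as np

# Forward core transform matrix Cf from Equation (1)
Cf = np.array([[1,  1,  1,  1],
               [2,  1, -1, -2],
               [1, -1, -1,  1],
               [1, -2,  2, -1]])

# Standard multiplication factors MF for QP % 6 == 0
MF0 = np.array([[13107,  8066, 13107,  8066],
                [ 8066,  5243,  8066,  5243],
                [13107,  8066, 13107,  8066],
                [ 8066,  5243,  8066,  5243]])

def forward_transform_quantize(X, QP=0):
    """Combined 4x4 integer transform and quantization, Equation (1)."""
    W = Cf @ X @ Cf.T                 # core transform (shift/add only in hardware)
    qbits = 15 + QP // 6
    f = (1 << qbits) // 3             # assumed intra rounding offset
    return np.sign(W) * ((np.abs(W) * MF0 + f) >> qbits)
```

For a flat 4x4 block of value 16, only the DC coefficient survives, matching the transform's energy compaction.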
The complete inverse transform and scaling process becomes:

X' = Ci^T · (Y ⊗ V · 2^floor(QP/6)) · Ci    (2)

             |  1     1     1     1  |
Where:  Ci = |  1    1/2  -1/2   -1  |
             |  1    -1    -1     1  |
             | 1/2   -1     1   -1/2 |
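A matching sketch of Equation (2), using the standard's V rescaling factors for QP values that are multiples of 6; the final division by 64 undoes the 2^6 scaling built into the integer matrices, and the result recovers the original block up to quantization error.

```python
import numpy as np

# Inverse core transform matrix Ci from Equation (2)
Ci = np.array([[1.0,  1.0,  1.0,  1.0],
               [1.0,  0.5, -0.5, -1.0],
               [1.0, -1.0, -1.0,  1.0],
               [0.5, -1.0,  1.0, -0.5]])

# Standard rescaling factors V for QP % 6 == 0
V0 = np.array([[10, 13, 10, 13],
               [13, 16, 13, 16],
               [10, 13, 10, 13],
               [13, 16, 13, 16]])

def rescale_inverse_transform(Z, QP=0):
    """Combined rescaling and 4x4 inverse transform, Equation (2)."""
    Wp = Z * V0 * (1 << (QP // 6))         # element-wise rescaling (the "⊗" above)
    Xp = Ci.T @ Wp @ Ci                    # core inverse transform
    return np.rint(Xp / 64).astype(int)    # final scaling by 2^-6 with rounding
```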
Intra prediction predicts a block from the left and top neighbouring blocks of the same frame. If a
block or macroblock is encoded in intra mode, a prediction block is formed from previously encoded and
reconstructed blocks. There are a total of 9 optional prediction modes for each 4x4 luma block, shown in
Figure 2. Similarly, there are 4 optional modes for a 16x16 luma block and for 8x8 chroma blocks [5].
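As an illustration of how the modes work, three of the nine 4x4 modes (vertical, horizontal and DC) can be sketched as simple extrapolations from the neighbouring reconstructed pixels; the function name and interface here are assumptions for illustration only.

```python
import numpy as np

def intra4x4_predict(top, left, mode):
    """Sketch of three of the nine 4x4 luma intra modes (illustrative API).
    top, left: the 4 reconstructed neighbour pixels above / to the left."""
    top, left = np.asarray(top), np.asarray(left)
    if mode == "vertical":     # mode 0: extrapolate the top row downwards
        return np.tile(top, (4, 1))
    if mode == "horizontal":   # mode 1: extrapolate the left column rightwards
        return np.tile(left.reshape(4, 1), (1, 4))
    if mode == "dc":           # mode 2: rounded mean of the eight neighbours
        dc = (int(top.sum()) + int(left.sum()) + 4) >> 3
        return np.full((4, 4), dc)
    raise ValueError("only three modes are sketched here")
```

The encoder tries each mode and keeps the one whose residual is cheapest to code.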
In successive frames of video, most of the information is the same. So, inter prediction predicts
the current frame from previous frames and finds motion vectors. Inter prediction uses motion estimation
and motion compensation [6].
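The standard does not mandate a particular motion search; a common software baseline is an exhaustive SAD (sum of absolute differences) search over a small window, sketched below with an illustrative interface.

```python
import numpy as np

def full_search(ref, cur, top, left, rng=4):
    """Exhaustive block-matching: return (motion vector, SAD) minimising the
    sum of absolute differences within +/- rng pixels (illustrative API)."""
    bh, bw = cur.shape
    best_mv, best_sad = (0, 0), float("inf")
    for dy in range(-rng, rng + 1):
        for dx in range(-rng, rng + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + bh > ref.shape[0] or x + bw > ref.shape[1]:
                continue  # candidate block falls outside the reference frame
            sad = int(np.abs(ref[y:y+bh, x:x+bw] - cur).sum())
            if sad < best_sad:
                best_mv, best_sad = (dy, dx), sad
    return best_mv, best_sad
```

Hardware designs usually replace the exhaustive loop with fast search patterns, but the SAD criterion is the same.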
A filter can be applied to every decoded macroblock in order to reduce blocking distortion. The
de-blocking filter is applied after the inverse transform in the encoder (before reconstructing and storing the
macroblock for future predictions) and in the decoder (before reconstructing and displaying the
macroblock). The filter has two benefits: (1) block edges are smoothed, improving the appearance of
decoded images (particularly at higher compression ratios), and (2) the filtered macroblock is used for
motion-compensated prediction of further frames in the encoder, resulting in a smaller residual after
prediction [7]. Filtering is applied to vertical or horizontal edges of 4x4 blocks in a macroblock, in the
following order, as shown in Figure 3:
1. Four filters for vertical boundaries of the luma component.
2. Four filters for horizontal boundaries of the luma component.
3. Two filters for vertical boundaries of each chroma component.
4. Two filters for horizontal boundaries of each chroma component.
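For one line of samples across an edge, the normal-strength (boundary strength bS < 4) luma filter moves the two samples nearest the boundary towards each other by a clipped delta. A sketch of that core step, where tc is the QP-derived clipping threshold from the standard's tables and the 8-bit clamp is assumed:

```python
def clip3(lo, hi, v):
    """Clamp v into the range [lo, hi]."""
    return max(lo, min(hi, v))

def filter_edge(p1, p0, q0, q1, tc):
    """Core step of the normal (bS < 4) luma edge filter: p0 and q0 sit on
    either side of the block boundary and move towards each other."""
    delta = clip3(-tc, tc, (((q0 - p0) * 4) + (p1 - q1) + 4) >> 3)
    return clip3(0, 255, p0 + delta), clip3(0, 255, q0 - delta)
```

The clipping to tc is what lets the filter smooth coding artifacts without erasing real image edges.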
H.264 uses entropy coding to match a symbol to a code, based on the context characteristics.
All syntax elements, except for the residual data, are encoded by Exp-Golomb codes. To read the residual
data (quantized transform coefficients), zigzag scan (for frame coding) or alternate scan (for interlaced or
field coding) is used. For coding the residual data, a more sophisticated method called Context-based
Adaptive Variable Length Coding (CAVLC) is employed. Context-based Adaptive Binary Arithmetic
Coding (CABAC) is also available in the Main and High Profiles; CABAC has higher coding efficiency
but higher complexity compared to CAVLC. A coded H.264 stream or an H.264 file consists of a series of
coded symbols. These symbols make up the syntax and include parameters, identifiers, delimiting codes,
prediction types, differentially coded motion vectors, and transform coefficients [8][9].
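The Exp-Golomb ue(v) code used for most syntax elements is simple to generate: M leading zeros, a marker 1 bit, then the M low-order bits of v + 1, with M = floor(log2(v + 1)). A sketch:

```python
def exp_golomb_ue(v):
    """Unsigned Exp-Golomb code ue(v): M zeros, a 1, then the M low-order
    bits of v + 1, where M = floor(log2(v + 1))."""
    bits = bin(v + 1)[2:]              # binary of v+1 without the '0b' prefix
    return "0" * (len(bits) - 1) + bits
```

Small (frequent) values get short codewords, which is the point of the scheme.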
3. DESIGN AND DEVELOPMENT OF H.264 STANDARD
The basic block diagram of Figure 4 shows the integration of H.264. First of all, the YCbCr
image pixel values need to be given to the RAM. The RAM stores the data and transmits it in a form
compatible with the top module. The prediction stage fetches the data from the RAM in the form of pixel
values. Two types of prediction are done in that stage: I and P prediction. An I slice (intra-coded slice) is
coded using prediction only from decoded samples within the same slice. A P slice (predictive-coded slice)
is coded using inter-prediction from previously decoded reference pictures, using at most one motion
vector and reference index to predict the sample values of each block. The prediction requires previous
data, which is taken from the reconstruction block. Using this previous data, the prediction stage predicts
the frame and transmits the data to the transform and quantization stage.
The transform and quantization stage converts the spatial domain to the frequency domain using
the 2-D DCT-based integer transform. After that, the data is transmitted to the Context-based Adaptive
Variable Length Coding (CAVLC) block, and the inverse transform and de-quantization operations are
performed. The inverse transform and de-quantization block converts the data back to the spatial domain
for the reconstruction block.
The reconstruction block provides the feedback; for that, it stores data in RAM for future use.
This data is used in the prediction stage. The header block stores all header data and combines the headers
into a byte stream for the decoding side. The Context-based Adaptive Variable Length Coding (CAVLC)
block performs entropy coding to produce the byte stream for data transfer.
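The feedback structure described above (predict from the reconstruction, quantize the residual, rebuild the reconstruction for the next prediction) can be seen in a toy scalar form below; scalar division stands in for the 4x4 transform and quantization, and the resulting levels would go on to entropy coding. This is a conceptual sketch, not the VHDL design itself.

```python
def encode_samples(frame, ref, qstep=8):
    """Toy scalar version of the Figure 4 loop: predict from the reconstructed
    reference, quantize the residual (stand-in for transform + quantization),
    and rebuild the reconstruction that feeds the next prediction."""
    levels, recon = [], []
    for cur, prd in zip(frame, ref):
        level = round((cur - prd) / qstep)  # residual -> quantized level
        levels.append(level)                # these go on to entropy coding
        recon.append(prd + level * qstep)   # de-quant + inverse (feedback path)
    return levels, recon
```

Because the encoder predicts from its own reconstruction rather than the pristine input, encoder and decoder stay in step despite the quantization loss.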
4. SIMULATION RESULTS
The simulation result of the intra prediction module in the software tool ISIM 13.4 is shown in
Figure 5. Before the first macroblock in a line, NEWLINE is set; this initializes all pointers and resets the
prediction mode to "no adjacent blocks". NEWLINE should be set for at least 1 CLK before STROBEI
goes high. If this is the first macroblock in a slice, NEWSLICE should be set at the same time as
NEWLINE. Data is clocked in by STROBEI, the appropriate mode is selected, and the output DATAO is
generated.
Figure 6 shows the simulation result for the transform, quantization, de-quantization and inverse
transform. The input pixel value is represented in 8 bits and the reconstructed pixel value in 10 bits
because of extra carry bits.
The simulation result for Context-based Adaptive Variable Length Coding (CAVLC) is shown in Figure 7.
All modules are integrated in the TOP module, and the simulation result of the TOP module is shown in Figure 8.
The YUV 4:2:0 video format is used as the input of the whole design. The TEXTIO package
defines an input byte file which can be read character by character. First, the Y frame is read at the full
frame size, and then the U and V frames are read at a quarter of the frame size each. The frame is divided
into macroblocks and sub-macroblocks. The design algorithm takes each input macroblock and applies the
whole H.264 design to it. The output byte stream is converted to characters and written to the output file.
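For reference, the planar YUV 4:2:0 frame layout that the test bench walks through (a full-size Y plane followed by U and V planes, each at a quarter of the luma size) reads as follows in a software sketch:

```python
def read_yuv420_frame(f, width, height):
    """Read one planar YUV 4:2:0 frame from a binary stream: a full-size
    Y plane, then U and V planes each at a quarter of the luma size."""
    ysize, csize = width * height, (width * height) // 4
    y, u, v = f.read(ysize), f.read(csize), f.read(csize)
    if len(v) < csize:
        return None        # incomplete frame: end of file
    return y, u, v
```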
The Mean Square Error (MSE) and the Peak Signal-to-Noise Ratio (PSNR) are the two error
metrics used to compare image compression quality. The MSE represents the cumulative squared error
between the compressed and the original image, whereas the PSNR represents a measure of the peak error.
The lower the MSE, the lower the error; the higher the PSNR, the better the quality of the compression.
To compute the PSNR of the block, first calculate the mean squared error using the following
equation:

MSE = ( Σ_{M,N} [ I1(m,n) − I2(m,n) ]^2 ) / (M · N)    (3)
In Equation 3, M and N are the number of rows and columns in the input images, respectively.
Then the block computes the PSNR using the following equation:
PSNR = 10 · log10( R^2 / MSE )    (4)
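Equations (3) and (4) combined into a small helper (R = 255 for 8-bit data):

```python
import numpy as np

def mse_psnr(original, compressed, R=255):
    """MSE and PSNR as in Equations (3) and (4); R = 255 for 8-bit images."""
    diff = np.asarray(original, dtype=float) - np.asarray(compressed, dtype=float)
    mse = float(np.mean(diff ** 2))
    psnr = float("inf") if mse == 0 else 10 * np.log10(R ** 2 / mse)
    return mse, psnr
```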
In Equation 4, R is the maximum fluctuation in the input image data type. For example, if the
input image has a double-precision floating-point data type, then R is 1; if it has an 8-bit unsigned integer
data type, R is 255, and so on. The YUV sequence City is taken as the test input sequence, and the pixel
values are extracted from its frames using TEXTIO. One frame of the sequence is used to measure the
PSNR as in Equation 4. PSNR-Y is calculated by comparing the Y frame component of the original image
to the Y frame of the compressed video, using an algorithm developed for this calculation. The PSNR-Y of
the output sequence is shown in Figure 9.
5. CONCLUSION
This paper describes a VHDL implementation of the H.264 video coding standard; each module
of the H.264 standard is designed and tested individually, and ultimately all modules are integrated into
one design to realize the standard.
The transform and quantization design scales the input video and converts it for generating the
byte stream. The prediction stage is designed both for the same frame (intra) and for consecutive frames
(inter). CAVLC is designed for entropy coding, and the byte stream is generated as per the H.264 syntax.
The TEXTIO package of VHDL is used for testing the design. A YUV video sequence is taken
as the input of the design's test bench using TEXTIO, and the H.264 codec output video is generated.
Video quality is checked by measuring the PSNR of the output video against the original, which is 34.7 dB.
REFERENCES
[1]. Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, JVT-G050, "Draft ITU-T
recommendation and final draft international standard of joint video specification (ITU-T Rec.
H.264 / ISO/IEC 14496-10 AVC)", 2003.
[2]. Thomas Wiegand, Gary J. Sullivan, "Overview of the H.264/AVC Video Coding Standard", IEEE
Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, July 2003.
[3]. Soon-kak Kwon, A. Tamhankar, K.R. Rao, "Overview of H.264/MPEG-4 Part 10", J. Vis.
Commun. Image R., vol. 17, 2006, pp. 186-216.
[4]. Iain Richardson, "White Paper: 4x4 Transform and Quantization in H.264/AVC", Vcodex, 2002-2011.
[5]. Iain Richardson, "White Paper: H.264/AVC Intra Prediction", Vcodex, 2002-2011.
[6]. Iain Richardson, "White Paper: H.264/AVC Inter Prediction", Vcodex, 2002-2011.
[7]. Iain Richardson, "White Paper: H.264/AVC Loop Filter", Vcodex, 2002-2011.
[8]. Iain Richardson, "White Paper: H.264/AVC CAVLC", Vcodex, 2002-2011.
[9]. Iain Richardson, "White Paper: H.264/AVC CABAC", Vcodex, 2002-2011.
[10]. R. Schafer, T. Wiegand, and H. Schwarz, "The Emerging H.264/AVC Standard", EBU Technical
Review, January 2003.
BIOGRAPHIES OF AUTHORS
Jignesh R. Patel
He is pursuing his Master in Electronics & Communication Engineering at Parul Institute of
Engineering & Technology, Vadodara.
Haresh A. Suthar
He is Reader and Head of the E&C Engineering Department at Parul Institute of
Technology, Vadodara. He received his BE in Electronics and his Master in Automatic Control & Robotics
from M.S. University of Baroda. His areas of interest are embedded systems, control systems and artificial
intelligence.