Unit - 6 Fundamentals of Digital Video

The document provides an overview of fundamentals of digital video including video signals, analog video signals, digital video sampling, frame types, and intra and inter frame coding. Key topics covered include raster scanning, chroma subsampling formats, and the characterization of digital video signals.

Uploaded by

xx69dd69xx
Copyright
© © All Rights Reserved

Fundamentals of Digital Video
Syllabus

Video signal
 A video signal is any sequence of time-varying images
 A still image is a spatial distribution of intensities that remains constant with time
 A time-varying image has a spatial intensity distribution that varies with time
 A video signal is treated as a series of images called frames
 An illusion of continuous motion is obtained by changing the frames rapidly; the rate at which frames change is called the frame rate
 A real-world scene is a 3D signal changing in time, F(X,Y,Z,t)
 Here (X,Y,Z) are the 3D spatial coordinates and t is time
 Video is the projection of the dynamic scene onto the 2D camera plane, f(x,y,t)
 (x,y) is the projection of (X,Y,Z) onto the 2D plane
 For a given t, f(x,y,t) is a 2D frame
Analog Video Signals
 The most common consumer display devices for video are still analog, such as the CRT
 The three principal analog video signal formats are:
 NTSC (National Television System Committee)
 PAL (Phase Alternating Line)
 SECAM (Sequential Color with Memory)
 In all the above formats, each picture is captured by a CCD or CRT and scanned from left to right to create a sequential intensity signal
 The formats take advantage of the persistence of human vision by using an interlaced scanning pattern, in which the odd and even lines of each picture are read out in two separate scans, the odd field and the even field respectively
 This allows good reproduction of movement in the scene at a relatively low field rate: 50 fields/sec for PAL and SECAM and 60 fields/sec for NTSC
Analog Video Raster
 Raster scan is used for video capture and display
 Raster scan samples the signal in both the temporal and vertical directions
 The resulting waveform is stored in continuous 1D form
 The 3D signal p(x,y,t) is converted into a 1D signal s(t)
 Raster scanning methods: progressive scanning and interlaced scanning
Progressive scanning
 Left to right, top to bottom
 Samples in time: frames/sec
 Samples along y:lines
 Samples along x: pixels
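The scan order above can be sketched in a few lines of Python. This is only an illustration: the function name `raster_scan` and the tiny 2-line, 3-pixel frames are made up for the example, and the "signal" here is a list of discrete samples, not a continuous waveform.

```python
def raster_scan(frames):
    """Flatten a list of frames into a 1D sample sequence.

    Each frame is a list of lines (top to bottom) and each line is a
    list of pixel intensities (left to right).
    """
    signal = []
    for frame in frames:          # samples in time: frames
        for line in frame:        # samples along y: lines
            for pixel in line:    # samples along x: pixels
                signal.append(pixel)
    return signal


# Two hypothetical frames, each with 2 lines of 3 pixels.
frames = [
    [[10, 11, 12],
     [13, 14, 15]],
    [[20, 21, 22],
     [23, 24, 25]],
]

s = raster_scan(frames)
print(s)
```

The nesting order of the loops is the whole point: time is the slowest index, then the line, then the pixel within the line.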
Progressive scanning versus interlaced scanning
Progressive scanning
 A frame is formed by a single scanning pass, i.e. the electron beam continuously scans the image region from top to bottom and then returns to the top
 Progressive video captures one entire image per frame
 The entire image is displayed in 1/60th of a second
 More expensive
 Results in a clearer image and handles movement differently; mostly used in computer displays
Interlaced scanning
 Each frame is scanned in two fields and each field contains half the number of lines in a frame; this is called 2:1 interlace
 The time interval between two fields, called the field interval, is half the frame interval
 Interlaced scanning simply displays alternating sets of lines: first the even-numbered lines are displayed and then the odd-numbered lines
 The even set of lines is displayed for 1/60th of a second and then the odd set of lines is displayed for 1/60th of a second
 Each displayed set of odd-numbered lines is called a "field"; the same name is given to the even set of lines
 Because each field is displayed so quickly, we are given the illusion of a whole image, although we are only presented with half an image followed very quickly by the other half
 Less expensive
 Results in a less clear image than progressive scanning; mostly used in TV displays
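A minimal sketch of 2:1 interlace, assuming a frame is simply a list of lines numbered from 1 at the top. The helper names `split_fields` and `weave` are hypothetical and only serve this example.

```python
def split_fields(frame):
    """Split a frame (list of lines) into its odd and even fields."""
    # Lines are numbered from 1 at the top, so list index 0 is line 1.
    odd_field = frame[0::2]   # lines 1, 3, 5, ...
    even_field = frame[1::2]  # lines 2, 4, 6, ...
    return odd_field, even_field


def weave(odd_field, even_field):
    """Interleave two fields back into a full frame."""
    frame = []
    for i in range(len(odd_field) + len(even_field)):
        source = odd_field if i % 2 == 0 else even_field
        frame.append(source[i // 2])
    return frame


frame = ["line1", "line2", "line3", "line4", "line5", "line6"]
odd_field, even_field = split_fields(frame)
print(odd_field)   # ['line1', 'line3', 'line5']
print(even_field)  # ['line2', 'line4', 'line6']

# The field interval is half the frame interval.
frame_rate = 30                      # frames/sec
field_rate = 2 * frame_rate          # fields/sec
print(field_rate)                    # 60
```

Each field carries half the lines of the frame, which is why two fields are needed to present one complete picture.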
Characterization of video raster
 A raster signal is characterized by the following parameters
• Frame rate $f_{s,t}$ (frames/sec)
• Line number $f_{s,y}$ (lines/frame or lines/picture height)
• Line rate $f_l = f_{s,t} \cdot f_{s,y}$ (lines/sec)
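As a quick check of the units, the line rate is the product of frames/sec and lines/frame. The sketch below assumes nominal values: 30 frames/sec with 525 lines for NTSC, and 25 frames/sec with 625 lines for PAL. The function name `line_rate` is made up for the example.

```python
def line_rate(frame_rate, lines_per_frame):
    """Lines per second = (frames per second) * (lines per frame)."""
    return frame_rate * lines_per_frame


ntsc_lines_per_sec = line_rate(30, 525)
pal_lines_per_sec = line_rate(25, 625)
print(ntsc_lines_per_sec)  # 15750
print(pal_lines_per_sec)   # 15625

# The time available to scan one line is the reciprocal of the line rate.
ntsc_line_interval_us = 1e6 / ntsc_lines_per_sec
print(round(ntsc_line_interval_us, 1))  # about 63.5 microseconds
```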
Parameters of digital video
 Digital video is defined by the following parameters
• Frame rate $f_{s,t}$, line number $f_{s,y}$ and samples per line $f_{s,x}$
• Vertical sampling interval $\Delta y = \text{picture height} / f_{s,y}$
• Horizontal sampling interval $\Delta x = \text{picture width} / f_{s,x}$
• Number of bits per pixel $N_b$
 $N_b = 8$ bits for monochrome and $N_b = 24$ bits for colour video
 Data rate $R$ of digital video is given by
$R = f_{s,t} \cdot f_{s,x} \cdot f_{s,y} \cdot N_b$ bits/sec
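The parameters above translate directly into a small calculation. The following sketch uses an illustrative 352 x 288 picture at 30 frames/sec and a picture size of 4 x 3 arbitrary units; these numbers and the function names are assumptions chosen for the example, not values from the slides.

```python
def sampling_intervals(picture_width, picture_height, fs_x, fs_y):
    """Return (delta_x, delta_y): picture size divided by sample count."""
    delta_x = picture_width / fs_x
    delta_y = picture_height / fs_y
    return delta_x, delta_y


def data_rate(fs_t, fs_x, fs_y, n_b):
    """Raw data rate in bits/sec: frames/sec * samples/line * lines * bits."""
    return fs_t * fs_x * fs_y * n_b


delta_x, delta_y = sampling_intervals(4.0, 3.0, fs_x=352, fs_y=288)
print(delta_x, delta_y)

colour_rate = data_rate(fs_t=30, fs_x=352, fs_y=288, n_b=24)
mono_rate = data_rate(fs_t=30, fs_x=352, fs_y=288, n_b=8)
print(colour_rate)  # 72990720 bits/sec
print(mono_rate)    # 24330240 bits/sec
```

Even this small picture needs roughly 73 Mbit/sec uncompressed in colour, which is the motivation for the compression techniques covered later in the unit.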
Digital Video
 Picture information is digitized both spatially and temporally and the resultant pixel intensities are quantized
Digital Video Sampling
 Spatial sampling
 Frame size: number of pixels per frame
 Spatial sampling and quantization of a natural video signal digitizes the image plane into a two-dimensional set of digital pixels that define a digital image
 Temporal sampling
 Frame rate: number of frames per sec
 Temporal sampling of a natural video signal creates a sequence of image frames, typically used for motion pictures and television
 Most video formats use a temporal sampling rate of 24 frames per sec or above
Advantages of digital video
It permits
 Storing video on digital devices or in memory, ready to be processed (noise removal, cut and paste, and so on) and integrated into various multimedia applications
 Direct access, which makes nonlinear video editing simple
 Repeated recording without degradation of image quality
 Ease of encryption and better tolerance to channel noise
Applications of digital video
Demand for digital video is increasing in areas such as
 Video teleconferencing
 Multimedia authoring systems
 Education
 Video-on-demand systems
Chroma Subsampling
 The basic concept of subsampling is to reduce the dimension of the input video (horizontal or vertical dimension)
 This is done prior to the encoding process, thus reducing the number of pixels to be encoded
 At the receiver, the decoded images are interpolated for display
 It is one of the most elementary compression techniques and makes use of specific physiological characteristics of the human eye
 It removes subjective redundancy contained in the video data
 The concept is also used to exploit subjective redundancies contained in chrominance data, i.e., the human eye is more sensitive to changes in brightness than to changes in chromaticity
 RGB format is not preferred because the R, G, B components are correlated, so transmitting the R, G, B components separately is redundant
 To overcome this, the input image is divided into YUV components (one luminance and two chrominance components)
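To make the idea concrete, here is a rough sketch of the two steps: separating a pixel into one luminance and two chrominance values, then reducing the resolution of a chrominance plane. The slides do not give conversion coefficients; the ones below are one commonly quoted analog YUV definition and are an assumption here, as is the choice of averaging 2 x 2 blocks (plain decimation is another option).

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to (Y, U, V) using assumed coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                     # blue colour difference
    v = 0.877 * (r - y)                     # red colour difference
    return y, u, v


def subsample_2x2(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks.

    Assumes the plane has an even number of rows and columns.
    """
    out = []
    for row in range(0, len(plane), 2):
        out_row = []
        for col in range(0, len(plane[row]), 2):
            block_sum = (plane[row][col] + plane[row][col + 1] +
                         plane[row + 1][col] + plane[row + 1][col + 1])
            out_row.append(block_sum / 4)
        out.append(out_row)
    return out


# A grey pixel has (almost exactly) zero chrominance.
print(rgb_to_yuv(128, 128, 128))

# A hypothetical 2x4 chroma plane shrinks to 1x2 after subsampling.
chroma = [[1, 3, 5, 7],
          [1, 3, 5, 7]]
small = subsample_2x2(chroma)
print(small)  # [[2.0, 6.0]]
```

The luminance plane is left at full resolution; only the two chrominance planes are reduced, which is where the saving comes from.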
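4:4:4 Chroma subsampling format
 Subsampling formats are written as A:B:C
 A = width of the sample region in pixels (the number of luminance samples per row)
 B = number of pixels in the first row having their own chroma sample
 C = number of pixels in the second row having their own chroma sample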
4:2:2 Chroma subsampling format
4:1:1 Chroma subsampling format
$(C_{b0}, Y_0)\ (C_{r0}, Y_1)\ (Y_2)\ (Y_3)$
4:2:0 Chroma subsampling format
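One way to compare the formats is to count samples in a block that is A pixels wide and 2 rows high. The sketch below assumes 8 bits per sample and two chroma channels (Cb and Cr), and reads B and C as the number of chroma samples per channel in the first and second row; the function names are hypothetical.

```python
def samples_per_block(a, b, c):
    """Return (luma, chroma) sample counts for an A-wide, 2-row block."""
    luma = 2 * a            # every pixel keeps its own Y sample
    chroma = 2 * (b + c)    # Cb and Cr, each with b + c samples
    return luma, chroma


def bits_per_pixel(a, b, c, bits_per_sample=8):
    """Average number of bits per pixel for an A:B:C format."""
    luma, chroma = samples_per_block(a, b, c)
    pixels = 2 * a
    return (luma + chroma) * bits_per_sample / pixels


for name, fmt in [("4:4:4", (4, 4, 4)), ("4:2:2", (4, 2, 2)),
                  ("4:1:1", (4, 1, 1)), ("4:2:0", (4, 2, 0))]:
    print(name, bits_per_pixel(*fmt))
# 4:4:4 24.0
# 4:2:2 16.0
# 4:1:1 12.0
# 4:2:0 12.0
```

4:1:1 and 4:2:0 cost the same number of bits; they differ in whether the chroma resolution is reduced horizontally only or in both directions.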
Example
NTSC version
 525 scan lines, each having 858 pixels
 Uses 4:2:2 subsampling, so each pixel is represented by two bytes (8 bits for Y and 8 bits alternating between Cb and Cr)
 Frame rate is 30 fps
Then the NTSC data rate is given by
$525 \times 858 \times 30 \times 16\ \text{bits} \approx 216\ \text{Mbps}$
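The arithmetic can be reproduced directly. The sketch below also shows, purely for comparison, what the same raster would cost at 24 and 12 bits per pixel and how much storage one minute of the 4:2:2 stream would need; those extra figures are illustrative and not part of the original example.

```python
LINES = 525
PIXELS_PER_LINE = 858
FRAME_RATE = 30


def ntsc_rate(bits_per_pixel):
    """Raw bit rate in bits/sec for the NTSC raster described above."""
    return LINES * PIXELS_PER_LINE * FRAME_RATE * bits_per_pixel


rate_422 = ntsc_rate(16)
print(rate_422)          # 216216000 bits/sec, i.e. about 216 Mbps

rate_444 = ntsc_rate(24)
rate_420 = ntsc_rate(12)
print(rate_444)          # 324324000 bits/sec
print(rate_420)          # 162162000 bits/sec

# Storage for one minute of the 4:2:2 stream, in bytes.
bytes_per_minute = rate_422 // 8 * 60
print(bytes_per_minute)  # 1621620000 bytes, roughly 1.6 GB
```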
Frame types
 Three types of video frames: I, P and B
 I: Intra frames
 Encoded without any motion compensation
 Used as a reference for future predicted P and B frames
 Require a relatively large number of bits for encoding
 P: Predictive frames (inter frames)
 Encoded using motion-compensated prediction from a previous reference frame, which can be either an I or a P frame
 More efficient than I frames in terms of the number of bits required for encoding
 But still require more bits than B frames
 B: Bidirectional frames
 Encoded using forward as well as backward prediction, i.e. from past and future reference frames
 Require the fewest bits of the three frame types
 But incur higher computational complexity
Inter frame and Intra frame coding
 Intra frame coding
 Removing spatial redundancy is termed intra frame coding
 Spatial redundancy within the frame can be minimised using a transform (e.g. DCT)
 Inter frame coding
 Temporal redundancy between two frames is removed by inter frame coding
 It exploits the interdependencies of the video frames
 It relies on the fact that adjacent pictures in the video sequence have high temporal correlation
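A toy illustration of temporal redundancy: if two adjacent frames are nearly identical, their difference is mostly zeros and is much cheaper to describe than the frame itself. This sketch uses plain frame differencing without motion compensation, which is a simplification of what real inter frame coders do; the frames and function names are made up.

```python
def frame_difference(current, previous):
    """Residual = current frame minus previous frame, pixel by pixel."""
    return [[c - p for c, p in zip(cur_line, prev_line)]
            for cur_line, prev_line in zip(current, previous)]


def reconstruct(previous, residual):
    """Decoder side: previous frame plus residual gives the current frame."""
    return [[p + r for p, r in zip(prev_line, res_line)]
            for prev_line, res_line in zip(previous, residual)]


previous = [[50, 50, 50, 50],
            [50, 80, 80, 50],
            [50, 50, 50, 50]]
current = [[50, 50, 50, 50],
           [50, 80, 82, 50],   # only one pixel changed
           [50, 50, 50, 50]]

residual = frame_difference(current, previous)
nonzero = sum(1 for line in residual for value in line if value != 0)
print(residual)
print(nonzero)  # 1
```

Only the single changed pixel produces a non-zero residual, and the decoder can rebuild the current frame exactly from the previous frame plus that residual.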
Group of Pictures
 The frames between two successive I frames, including the leading I frame, are called a Group of Pictures (GOP)
 The existence of GOPs facilitates the implementation of features such as random access, fast forward, or fast and normal reverse playback
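The GOP definition can be expressed as a simple split of a frame-type sequence at every I frame. The sequence `IBBPBBPBB...` below is a hypothetical pattern chosen for illustration, and `split_into_gops` is a made-up helper name.

```python
def split_into_gops(frame_types):
    """Group a sequence of frame types so each group starts at an I frame."""
    gops = []
    for frame_type in frame_types:
        # Start a new group at every I frame (or at the very first frame).
        if frame_type == "I" or not gops:
            gops.append([])
        gops[-1].append(frame_type)
    return gops


sequence = "IBBPBBPBBIBBPBBPBB"
gops = split_into_gops(sequence)
for gop in gops:
    print("".join(gop))
# IBBPBBPBB
# IBBPBBPBB

# Random access: a player can jump to the first frame of any GOP,
# because an I frame is decodable without reference to other frames.
gop_start_indices = [i for i, t in enumerate(sequence) if t == "I"]
print(gop_start_indices)  # [0, 9]
```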
Video quality measures

 Subjective quality measurement


 Objective quality measurement
Subjective quality measurement
 Used to establish the performance of television systems
 Non-expert observers are shown a series of test scenes and asked to score the quality of the scenes
 Test methods can be double stimulus or single stimulus
 Double stimulus: viewers rate the change in quality between two videos (reference and impaired)
 Single stimulus (Absolute Category Rating, ACR): viewers rate the quality of just one video stream (impaired)
 Each sequence is rated individually on the ACR scale; the labels on the scale are "bad", "poor", "fair", "good", and "excellent", and they are translated to the values 1, 2, 3, 4 and 5 when calculating the MOS (Mean Opinion Score)
 Various methods for subjective quality measurement
 Double stimulus impairment scale method (Degradation Category Rating)
 Double stimulus continuous quality scale method (DSCQS)
 Single stimulus method
 Stimulus comparison method
 Single stimulus continuous quality evaluation
 Of the methods in ITU-R Recommendation BT.500-11 (ITU, 2001), the most popular is evaluation by DSCQS
 DSCQS
 The viewer evaluates a pair of short digital video sequences, called A and B, one after another
 The viewer is asked to give a score to sequences A and B on a continuous scale
 The scale is divided into five intervals of subjective quality, ranging from excellent through good, fair and poor to bad
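The MOS calculation for the single stimulus (ACR) method amounts to mapping each label to its number and averaging. The ratings in this sketch are invented for illustration, and `mean_opinion_score` is a hypothetical helper name.

```python
# ACR labels and the values they translate to.
ACR_SCALE = {"bad": 1, "poor": 2, "fair": 3, "good": 4, "excellent": 5}


def mean_opinion_score(ratings):
    """Average the numeric values of a list of ACR labels."""
    scores = [ACR_SCALE[label] for label in ratings]
    return sum(scores) / len(scores)


# Made-up ratings from five viewers for one impaired sequence.
ratings = ["good", "fair", "excellent", "good", "poor"]
mos = mean_opinion_score(ratings)
print(round(mos, 2))  # 3.6
```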
Objective quality measurement
 Methods are based on an automated, computational approach
 Metrics for objective quality measurement are as follows
 Mean Square Error (MSE)
 Peak Signal-to-Noise Ratio (PSNR)
 Mean Absolute Difference (MAD)
Objective quality measurement
 Mean Square Error
$MSE = \sigma_e^2 = \frac{1}{N} \sum_{k} \sum_{m,n} \left( f(m,n,k) - \hat{f}(m,n,k) \right)^2$
Here, $N$ is the total number of pixels in each sequence
 Peak Signal to Noise Ratio (PSNR) in dB
$PSNR = 10 \log_{10} \frac{f_{max}^2}{\sigma_e^2}$
Here, $f_{max}$ is the peak intensity value of the video signal, usually taken as 255
 Mean Absolute Difference
$MAD = \frac{1}{N} \sum_{k} \sum_{m,n} \left| f(m,n,k) - \hat{f}(m,n,k) \right|$
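The three metrics can be computed with a few lines of Python. The sketch assumes that $f$ is the original sequence and $\hat{f}$ the decoded (reconstructed) one, each stored as a list of frames, each frame a list of lines; the sample values are invented. Returning infinity for identical sequences is a convention chosen here, not something the slides specify.

```python
import math


def _pixel_pairs(original, decoded):
    """Yield (f, f_hat) for every pixel (m, n) of every frame k."""
    for frame_o, frame_d in zip(original, decoded):
        for line_o, line_d in zip(frame_o, frame_d):
            for o, d in zip(line_o, line_d):
                yield o, d


def mse(original, decoded):
    """Mean square error over all N pixels of the sequence."""
    squared = [(o - d) ** 2 for o, d in _pixel_pairs(original, decoded)]
    return sum(squared) / len(squared)


def mad(original, decoded):
    """Mean absolute difference over all N pixels of the sequence."""
    absolute = [abs(o - d) for o, d in _pixel_pairs(original, decoded)]
    return sum(absolute) / len(absolute)


def psnr(original, decoded, f_max=255):
    """Peak signal-to-noise ratio in dB."""
    error = mse(original, decoded)
    if error == 0:
        return math.inf  # identical sequences: no noise at all
    return 10 * math.log10(f_max ** 2 / error)


# One hypothetical 2x2 frame and its decoded version.
original = [[[100, 110], [120, 130]]]
decoded = [[[101, 108], [120, 133]]]

print(mse(original, decoded))   # 3.5
print(mad(original, decoded))   # 1.5
print(psnr(original, decoded))  # roughly 42.7 dB
```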
