3.1 Sampling
If sampling is performed in the time domain, the sampling rate is measured in Hz (cycles/sec). In the case of image processing, we can regard an image as a two-dimensional light-intensity function f(x, y) of spatial coordinates (x, y). Since light is a form of energy, f(x, y) must be nonnegative. To store an image in a computer, which processes data in discrete form, the image function f(x, y) must be digitized both spatially and in amplitude. Digitization of the spatial coordinates (x, y) is referred to as image sampling or spatial sampling, and digitization of the amplitude f is referred to as quantization. Moreover, for moving video images, we also have to digitize the time component; this is called temporal sampling. Digital video is thus a representation of a real-world scene, sampled spatially and temporally, with the light intensity quantized at each spatial point. A scene is sampled at an instance of time to produce a frame, which consists of the complete visual scene at that instance, or a field, which consists of the odd- or even-numbered lines of spatial samples. Figure 3-1 shows the concept of spatial and temporal sampling of videos.
Figure 3-1 Spatial and temporal sampling of video
images, each colour component is filtered and projected onto an independent 2D CCD array. The CCD array outputs analogue signals representing the intensity levels of the colour component. Sampling the signal at an instance in time produces a sampled image or frame that has specified values at a set of spatial sampling points in the form of an N × M array, as shown in the following equation:

$$f(x, y) = \begin{bmatrix} f(0, 0) & f(0, 1) & \cdots & f(0, M-1) \\ f(1, 0) & f(1, 1) & \cdots & f(1, M-1) \\ \vdots & \vdots & & \vdots \\ f(N-1, 0) & f(N-1, 1) & \cdots & f(N-1, M-1) \end{bmatrix} \tag{3.1}$$
The right image of Figure 3-2 below shows a rectangular grid overlaid on a 2D image to obtain sampled values f(x, y) at the intersection points of the grid. We may approximately reconstruct the sampled image by representing each sample as a square picture element (pixel), as shown on the left image of Figure 3-2. The visual quality of the reconstructed image is affected by the choice of the sampling points: the more sampling points we choose, the higher the resolution of the resulting sampled image. Of course, choosing more sampling points requires more computing power and storage to process and save the larger number of samples.
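As a concrete illustration, the following minimal Java sketch samples a synthetic continuous intensity function on an N × M grid. The function f and the grid spacing here are illustrative assumptions, standing in for the analogue scene and the physical spacing of the sensor.

class SpatialSampling {
    //hypothetical continuous image: a smooth intensity pattern in [0, 1]
    static double f(double x, double y) {
        return 0.5 + 0.5 * Math.sin(0.1 * x) * Math.cos(0.1 * y);
    }
    public static void main(String[] args) {
        int N = 8, M = 8;            //number of sampling points per dimension
        double dx = 2.0, dy = 2.0;   //spacing of the sampling grid
        double[][] samples = new double[N][M];
        for (int x = 0; x < N; x++)
            for (int y = 0; y < M; y++)
                samples[x][y] = f(x * dx, y * dy); //value at a grid intersection
        System.out.printf("sample at (3, 4) = %.3f%n", samples[3][4]);
    }
}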
Early silent films used anything between 16 and 24 frames per second (fps). Current television standards use a sampling rate of 25 or 30 frames per second. There are two commonly used temporal sampling techniques, progressive sampling and interlaced sampling. Progressive sampling is a frame-based sampling technique where a video signal is sampled as a series of complete frames. Film is a progressive sampling source for video. Interlaced sampling is a field-based sampling technique where the video is sampled periodically as two sample fields; half of the data in a frame (one field) is scanned at a time. To reconstruct the frame, a pair of sample fields are superimposed on each other (interlaced). In general, a field consists of either the odd-numbered or even-numbered scan lines within a frame, as shown in Figure 3-3.

Figure 3-3 Interlaced scanning: start of odd field and start of even field

An interlaced video sequence contains a sequence of fields, each of which consists of half the data of a complete frame. The interlaced sampling technique can give the appearance of smoother motion compared to the progressive sampling method when the data are sampled at the same rate. This is due to the motion blur effect of human eyes; the persistence of vision can cause images shown rapidly in sequence to appear as one. When we rapidly switch between two low-quality fields, they appear like a single high-quality image. Because of this advantage, most current video image formats, including several high-definition video standards, use interlaced techniques rather than progressive methods. A sketch of extracting the two fields of a frame is shown below.
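The following minimal Java sketch splits a frame, stored as a 2D array of samples, into its even and odd fields. The method name extractField and the array representation are our own illustrative assumptions, not from the text.

class Interlace {
    //return the even field (lines 0, 2, 4, ...) or the odd field (lines 1, 3, 5, ...)
    static int[][] extractField(int[][] frame, boolean even) {
        int h = frame.length, w = frame[0].length;
        int[][] field = new int[(h + (even ? 1 : 0)) / 2][w];
        for (int row = even ? 0 : 1, k = 0; row < h; row += 2, k++)
            field[k] = frame[row].clone(); //copy every other scan line
        return field;
    }
}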
3.1.3 Quantization
Quantization is the procedure of constraining the value of a function at a sampling point to a predetermined finite set of discrete values. Note that the original function can be either continuous or discrete. For example, if we want to specify the temperature of Los Angeles, ranging from 0°C to 50°C, to a precision of 0.1°C, we must be able to represent 501 possible values, which requires 9 bits per sample. On the other hand, if we only need a precision of 1°C, we have only 51 possible values, requiring 6 bits for the representation. For image processing, higher precision gives higher image quality but requires more bits in the representation of the samples. We will come back to this topic and discuss how to use quantization to achieve lossy image compression. The sketch below makes the idea concrete in code.
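Here is a minimal Java sketch of uniform quantization using the temperature example; the helper name quantize and its signature are illustrative assumptions.

class Quantize {
    //map a continuous value in [min, max] to one of 'levels' discrete codes
    static int quantize(double v, double min, double max, int levels) {
        double step = (max - min) / (levels - 1);
        return (int) Math.round((v - min) / step);
    }
    public static void main(String[] args) {
        double t = 23.47; //a sample temperature reading
        //0.1 degree precision: 501 levels, 9 bits (2^9 = 512 >= 501)
        System.out.println(quantize(t, 0.0, 50.0, 501)); //prints 235, i.e. 23.5
        //1 degree precision: 51 levels, 6 bits (2^6 = 64 >= 51)
        System.out.println(quantize(t, 0.0, 50.0, 51));  //prints 23
    }
}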
Color Spaces
definable colors lie in a unit cube as shown in Figure 3-4. This color space is most natural for representing computer images, in which a color specification such as (0.1, 0.8, 0.23) can be directly translated into three positive integer values, each of which is represented by one byte.
Figure 3-4 The RGB color cube (vertices include Blue [0, 0, 1], Cyan, Magenta, White, Red, and Yellow)

In this notation, a color C is written as a column vector of its three component intensities:

$$C = \begin{bmatrix} R \\ G \\ B \end{bmatrix} \tag{3.2}$$
In some other notations, authors like to consider R, G, and B as three unit vectors like the three spatial unit vectors i, j, and k. Just as a spatial vector v can be expressed as v = xi + yj + zk, any color is expressed as C = rR + gG + bB, and the red, green, and blue intensities are specified by the values of r, g, and b respectively. In our notation here, R, G, and B represent the intensity values of the color components. Suppose we have two colors C1 and C2 given by

$$C_1 = \begin{bmatrix} R_1 \\ G_1 \\ B_1 \end{bmatrix}, \qquad C_2 = \begin{bmatrix} R_2 \\ G_2 \\ B_2 \end{bmatrix}$$

Does it make sense to add these two colors to produce a new color C? For instance, consider
$$C = C_1 + C_2 = \begin{bmatrix} R_1 + R_2 \\ G_1 + G_2 \\ B_1 + B_2 \end{bmatrix}$$

You may immediately notice that the sum of two components may give a value larger than 1, which lies outside the color cube and thus does not represent any color. Just as adding two points in space is illegitimate, we cannot arbitrarily combine two colors. A linear combination of colors makes sense only if the sum of the coefficients is equal to 1. Therefore, we can have

$$C = \alpha_1 C_1 + \alpha_2 C_2 \quad \text{where} \quad 0 \le \alpha_1, \alpha_2 \quad \text{and} \quad \alpha_1 + \alpha_2 = 1 \tag{3.3}$$
In this way, we can guarantee that the resulting components will always lie within the color cube, as each value will never exceed one. For example,

$$R = \alpha_1 R_1 + \alpha_2 R_2 \le \alpha_1 \cdot 1 + \alpha_2 \cdot 1 = 1$$

which implies R ≤ 1. The linear combination of colors described by Equation (3.3) is called color blending. A small code sketch of this operation follows.
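Here is a minimal Java sketch of the blending of Equation (3.3) for two RGB colors; the method name blend and the array representation of a color are illustrative assumptions.

class Blend {
    //blend two RGB colors with coefficients a1 and a2 = 1 - a1 (Equation 3.3)
    static double[] blend(double[] c1, double[] c2, double a1) {
        double a2 = 1.0 - a1; //the coefficients must sum to 1
        return new double[] {
            a1 * c1[0] + a2 * c2[0],  //R
            a1 * c1[1] + a2 * c2[1],  //G
            a1 * c1[2] + a2 * c2[2]   //B
        };
    }
    public static void main(String[] args) {
        double[] red  = { 1.0, 0.0, 0.0 };
        double[] blue = { 0.0, 0.0, 1.0 };
        double[] c = blend(red, blue, 0.3); //30% red, 70% blue
        System.out.printf("(%.2f, %.2f, %.2f)%n", c[0], c[1], c[2]);
    }
}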
The YUV color space describes a color by a luminance component Y and two color differences U and V:

$$Y = k_r R + k_g G + k_b B, \qquad U = B - Y, \qquad V = R - Y \tag{3.4}$$

with

$$0 \le k_r, k_g, k_b \quad \text{and} \quad k_r + k_g + k_b = 1 \tag{3.5}$$
Note that equations (3.4) and (3.5) imply that 0 ≤ Y ≤ 1 if the R, G, B components lie within the unit color cube. However, U and V can be negative. Typically,

kr = 0.299, kg = 0.587, kb = 0.114        (3.6)
which are the values used in some TV standards. For convenience, in the forthcoming discussions we always assume that 0 ≤ R, G, B ≤ 1 unless otherwise stated. The complete description of an image is specified by Y (the luminance component) and the two color differences (chrominance) U and V. If the image is black-and-white, U = V = 0. Note that we do not need another difference (G − Y) for the green component, because that would be redundant: we can consider (3.4) as three equations in the three unknowns R, G, B, so we can always solve for the three unknowns and recover R, G, B; a fourth equation is not necessary. It seems that there is no advantage of using YUV over RGB to represent an image, as both systems require three components to specify an image sample. However, as we mentioned earlier, human eyes are less sensitive to color than to luminance. Therefore, we can represent the U and V components with a lower resolution than Y, and the reduction in the amount of data representing the chrominance components will not have an obvious effect on visual quality. Representing chroma with fewer bits than luma is a simple but effective way of compressing an image.
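To make the recovery step explicit, solving (3.4) for the three unknowns gives the following inversion, a short derivation from the definitions above (not stated in this form in the text):

$$B = Y + U, \qquad R = Y + V, \qquad G = \frac{Y - k_r R - k_b B}{k_g} = Y - \frac{k_r}{k_g} V - \frac{k_b}{k_g} U$$

With the typical values of (3.6), the ratios are kr/kg ≈ 0.509 and kb/kg ≈ 0.194.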
If we use the ITU standard values kb = 0.114, kr = 0.299, kg = 1 − kb − kr = 0.587 in (3.7) and (3.8), we obtain the following commonly used conversion equations:

Y = 0.299R + 0.587G + 0.114B
Cb = 0.564(B − Y) + 0.5
Cr = 0.713(R − Y) + 0.5        (3.9)

R = Y + 1.402Cr − 0.701
G = Y − 0.714Cr − 0.344Cb + 0.529
B = Y + 1.772Cb − 0.886

From (3.7), it is obvious that 0 ≤ Y ≤ 1. It turns out that the chrominance components Cb and Cr defined in (3.7) also always lie within the range [0, 1]. We prove this for the case of Cb. From (3.7), we have
$$\begin{aligned} C_b &= 0.564(B - Y) + 0.5 \\ &= 0.564(B - 0.299R - 0.587G - 0.114B) + 0.5 \\ &= 0.564(0.886B - 0.299R - 0.587G) + 0.5 \\ &\ge 0.564(0 - 0.299 - 0.587) + 0.5 \\ &\approx -0.5 + 0.5 = 0 \end{aligned}$$

Thus

Cb ≥ 0        (3.10)
Also,

$$\begin{aligned} C_b &= 0.564(0.886B - 0.299R - 0.587G) + 0.5 \\ &\le 0.564(0.886 \times 1 - 0 - 0) + 0.5 \\ &\approx 0.5 + 0.5 = 1 \end{aligned}$$

Thus

Cb ≤ 1        (3.11)

Combining (3.10) and (3.11), we have

0 ≤ Cb ≤ 1        (3.12)

Similarly,

0 ≤ Cr ≤ 1        (3.13)

In summary, we have the following situation: if

0 ≤ R, G, B ≤ 1

then

0 ≤ Y, Cb, Cr ≤ 1        (3.14)

Note that the converse is not true. That is, if 0 ≤ Y, Cb, Cr ≤ 1, it does not imply 0 ≤ R, G, B ≤ 1. Knowing this helps us in the implementation of the conversion from RGB to YCbCr and vice versa. We mentioned earlier that the eye can only resolve about 200 different intensity levels of each of the RGB components. Therefore, we can quantize each of the RGB components in the interval [0, 1] to 256 values, from 0 to 255, which can be represented by one byte of storage without any loss of visual quality. In other words, one byte (an 8-bit unsigned integer) is enough to represent all the values of each RGB component. When we convert from RGB to YCbCr, it likewise requires only one 8-bit unsigned integer to represent each YCbCr component. This implies that all conversions can be done efficiently in integer arithmetic, which we shall discuss below.
floating-point implementation. The Java programs presented in this book are mainly for illustration of concepts. In most cases, error checking is omitted and some variable values are hard-coded.
import java.io.*;

class Rgbyccf {
    public static void main(String[] args) {
        //0 <= R, G, B <= 1, sample values
        double R = 0.3, G = 0.7, B = 0.2, Y, Cb, Cr;
        System.out.printf("\nOriginal R, G, B:\t%f, %f, %f", R, G, B);

        Y = 0.299 * R + 0.587 * G + 0.114 * B;
        Cb = 0.564 * (B - Y) + 0.5;
        Cr = 0.713 * (R - Y) + 0.5;
        System.out.printf("\nConverted Y, Cb, Cr:\t%f, %f, %f", Y, Cb, Cr);

        //recovering R, G, B
        R = Y + 1.402 * Cr - 0.701;
        G = Y - 0.714 * Cr - 0.344 * Cb + 0.529;
        B = Y + 1.772 * Cb - 0.886;
        System.out.printf("\nRecovered R, G, B:\t%f, %f, %f\n\n", R, G, B);
    }
}
The recovered R, G, and B values differ slightly from the original ones due to rounding errors in the computation and in the binary representation of the numbers.
To carry out the conversions in integer arithmetic, we first scale the coefficients of (3.9) by 2^16 = 65536 and round them to integers:

0.299 × 2^16 ≈ 19595, 0.587 × 2^16 ≈ 38470, 0.114 × 2^16 ≈ 7471        (3.15)

so that the luminance equation becomes

2^16 Y = 19595R + 38470G + 7471B        (3.16)

At the same time, we quantize the R, G, and B values from [0, 1] to 0, 1, ..., 255, which can be done by multiplying the floating-point values by 255. We also need to quantize the shifting constants 0.5, 0.701, 0.529, and 0.886 of (3.9) using the same rule, multiplying them by 255:

0.5 × 255 ≈ 128
0.701 × 255 ≈ 179
0.529 × 255 ≈ 135
0.886 × 255 ≈ 226        (3.17)
Actually, representing each RGB component with integer values 0 to 255 is the natural way for a modern computer to handle color data: each pixel has three components (R, G, and B), and each component value is saved as an 8-bit unsigned number. As shown in (3.9), in floating-point representation the Cb component is given by

Cb = 0.564(B − Y) + 0.5

After quantization, it becomes

Cb = 0.564(B − Y) + 128        (3.18)

Multiplying (3.18) by 2^16, we obtain

2^16 Cb = 36962(B − Y) + 128 × 2^16        (3.19)

The corresponding equation for Cr is

2^16 Cr = 46727(R − Y) + 128 × 2^16        (3.20)
As R, G, and B have become integers, we can carry out the calculations using integer multiplications and then divide the result by 2^16. In binary arithmetic, dividing a value by 2^16 is the same as shifting the value right by 16 bits. Therefore, from (3.16), (3.19) and (3.20), the calculations of Y, Cb, and Cr using integer arithmetic can be carried out with the following piece of Java code:

Y = (19595 * R + 38470 * G + 7471 * B) >> 16;
Cb = (36962 * (B - Y) >> 16) + 128;        (3.21)
Cr = (46727 * (R - Y) >> 16) + 128;
One should note that the sum of the coefficients in calculating Y is 2^16 (i.e. 19595 + 38470 + 7471 = 65536 = 2^16), corresponding to the requirement kr + kg + kb = 1 in the floating-point representation. The constraints of (3.14) and the requirement 0 ≤ R, G, B ≤ 255 imply that in our integer representation,

0 ≤ Y ≤ 255
0 ≤ Cb ≤ 255        (3.22)
0 ≤ Cr ≤ 255

In (3.9) the R component is obtained from Y and Cr:

R = Y + 1.402Cr − 0.701

In integer arithmetic, this becomes

2^16 R = 2^16 Y + 91881Cr − 2^16 × 179        (3.23)

The value of R is obtained by dividing (3.23) by 2^16, as shown below in Java code:

R = Y + (91881 * Cr >> 16) - 179;        (3.24)
We can obtain similar equations for G and B. Combining all these, the equations of (3.9), expressed in integer arithmetic and in Java code, take the following form:

Y = (19595 * R + 38470 * G + 7471 * B) >> 16;
Cb = (36962 * (B - Y) >> 16) + 128;
Cr = (46727 * (R - Y) >> 16) + 128;        (3.25)
R = Y + (91881 * Cr >> 16) - 179;
G = Y - ((46793 * Cr + 22544 * Cb) >> 16) + 135;
B = Y + (116129 * Cb >> 16) - 226;

In (3.25), it is obvious that a 32-bit integer is large enough to hold any intermediate calculation. Program Listing 3-2 below shows its implementation. The program generates the outputs shown below.
/* Rgbycci.java
 * Simple program to demonstrate conversion from RGB to YCbCr and vice
 * versa using ITU-R recommendation BT.601, and integer arithmetic.
 * Since Java does not have data type "unsigned char", we use "int".
 * Compile: $javac Rgbycci.java
 * Execute: $java Rgbycci
 */
import java.io.*;

/* Note:
 * 2^16 = 65536
 * kr = 0.299 = 19595 / 2^16
 * kg = 0.587 = 38470 / 2^16
 * kb = 0.114 = 7471 / 2^16
 * 0.5 = 128 / 255
 * 0.564 = 36962 / 2^16
 * 0.713 = 46727 / 2^16
 * 1.402 = 91881 / 2^16
 * 0.701 = 179 / 255
 * 0.714 = 46793 / 2^16
 * 0.344 = 22544 / 2^16
 * 0.529 = 135 / 255
 * 1.772 = 116129 / 2^16
 * 0.886 = 226 / 255
 */
class Rgbycci {
    public static void main(String[] args) {
        int R, G, B;    //RGB components
        int Y, Cb, Cr;  //YCbCr components

        //some sample values for demo
        R = 252; G = 120; B = 3;

        //convert from RGB to YCbCr
        Y = (19595 * R + 38470 * G + 7471 * B) >> 16;
        Cb = (36962 * (B - Y) >> 16) + 128;
        Cr = (46727 * (R - Y) >> 16) + 128;
        System.out.printf("\nOriginal RGB & corresponding YCbCr values:");
        System.out.printf("\n\tR = %6d, G  = %6d, B  = %6d", R, G, B);
        System.out.printf("\n\tY = %6d, Cb = %6d, Cr = %6d", Y, Cb, Cr);

        //convert from YCbCr to RGB
        R = Y + (91881 * Cr >> 16) - 179;
        G = Y - ((22544 * Cb + 46793 * Cr) >> 16) + 135;
        B = Y + (116129 * Cb >> 16) - 226;
        System.out.printf("\n\nRecovered RGB values:");
        System.out.printf("\n\tR = %6d, G  = %6d, B  = %6d\n\n", R, G, B);
    }
}
Original RGB & corresponding YCbCr values:
	R =    252, G  =    120, B  =      3
	Y =    146, Cb =     47, Cr =    203

Recovered RGB values:
	R =    251, G  =    120, B  =      3
Again, some precision has been lost when we recover R, G, and B from the converted Y, Cb, and Cr values. This is due to the right shifts in the calculations, which are essentially truncation operations. Because of rounding or truncation errors, the recovered R, G, and B values may not lie within the range [0, 255]. To remedy this, we can have a function that checks each recovered value: if the value is smaller than 0, we set it to 0, and if it is larger than 255, we set it to 255. For example,

if ( R < 0 ) R = 0;
else if ( R > 255 ) R = 255;

However, this check is not necessary when we convert from RGB to YCbCr. This is because from (3.14) we know that we always have 0 ≤ Y, Cb, Cr ≤ 1. For any real number a with 0 ≤ a ≤ 1 and any positive integer I,

0 ≤ Round(aI) ≤ Round(I) = I

and similarly

0 ≤ Truncate(aI) ≤ I

This implies that after quantization and rounding, we always have 0 ≤ Y, Cb, Cr ≤ 255.
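Wrapped as a reusable method, the check might look like the following minimal Java sketch; the method name clamp is our own choice, not from the text.

class ClampUtil {
    //constrain a recovered component to the valid 8-bit range [0, 255]
    static int clamp(int v) {
        if (v < 0) return 0;
        if (v > 255) return 255;
        return v;
    }
    //usage example: R = clamp(Y + (91881 * Cr >> 16) - 179);
}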
that people use to measure video quality: subjective tests, where human subjects are asked to assess or rank the images, and objective tests, which compute the distortions between the original and processed video sequences.
Subjective quality measurement asks human subjects to rank the quality of a video based on their own perception and understanding of quality. For example, a viewer can be asked to rate the quality on a 5-point scale, with ratings ranging from bad to excellent, as shown in Figure 3-6.

100  Excellent: Imperceptible
 80  Good: Perceptible
 60  Fair: Slightly Annoying
 40  Poor: Annoying
 20  Bad: Very Annoying
  0

Figure 3-6 Example of video quality assessment scale used in subjective tests

Very often, a viewer's perception of a video is affected by many factors, like the viewing environment, the lighting conditions, display size and resolution, the viewing distance, the state of mind of the viewer, whether the material is interesting to the viewer, and how the viewer interacts with the visual scene. It is not uncommon that the same viewer observing the same video at different times under different environments may give significantly different evaluations of its quality. For example, it has been shown that subjective quality ratings of the same video sequence are usually higher when it is accompanied by good quality sound, which may lower the evaluator's ability to detect impairments. Also, viewers tend to give higher ratings to images with higher contrast or more colorful scenes, even though objective testing shows that they have larger distortions in comparison to the originals. Nevertheless, subjective quality assessment still remains the most reliable method of measuring video quality. It is also the most efficient method to test the performance of components like video codecs, human vision models, and objective quality assessment metrics.
quality by continuously adjusting a side slider on the DSCQS scale, ranging from bad to excellent. The slider value is sampled periodically, every 1 to 2 seconds. Using this method, differences between alternative transmission configurations can be analyzed in a more informative manner. However, as the assessor has to adjust the slider from time to time, she may be distracted and thus the rating may be compromised. Also, because of the recency or memory effect, it is quite difficult for the assessor to consistently detect momentary changes in quality, leading to stability and reliability problems in the results.
$$MSE = \frac{1}{NM} \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} (X_{ij} - Y_{ij})^2 \tag{3.27}$$
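As a concrete illustration, the following minimal Java sketch computes PSNR from the MSE of (3.27) for two 8-bit greyscale frames, assuming the standard definition PSNR = 10 log10(255² / MSE); the class and method names are illustrative.

class Psnr {
    //PSNR in dB between an original frame x and a processed frame y,
    //both stored as N x M arrays of 8-bit sample values
    static double psnr(int[][] x, int[][] y) {
        int n = x.length, m = x[0].length;
        double mse = 0.0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < m; j++) {
                double d = x[i][j] - y[i][j];
                mse += d * d;
            }
        mse /= (double) (n * m);               //Equation (3.27)
        if (mse == 0.0) return Double.POSITIVE_INFINITY; //identical frames
        return 10.0 * Math.log10(255.0 * 255.0 / mse);
    }
}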
Though PSNR is a straightforward metric to calculate, it cannot describe distortions as perceived by a complex and multi-dimensional system like the human visual system (HVS), and thus fails to give good evaluations in many cases. For example, a viewer may be interested in an object in an image but not in its background. If the background is heavily distorted, the viewer may still rate the image as being of high quality, while the PSNR measure would indicate that the image is of poor quality. The limitations of this metric have led recent research in image processing to focus on developing metrics that resemble the response of real human viewers. Many approaches have been proposed, but none of them has been accepted as a standard alternative to subjective evaluation. The search for a good, widely accepted objective test for images will remain a research topic for some time.