17ec741 - Multimedia Information Representation - Module 2
MODULE – 2:
TEXT BOOK:
1. Multimedia Communications: Applications, Networks, Protocols and
Standards, Fred Halsall, Pearson Education, Asia, Second Indian reprint 2002.
Page No - 1
Prepared by RAJA G V – Dept. of ECE, Sri Sairam College of Engineering, Anekal
Regulation – 2017(CBCS Scheme) MULTIMEDIA COMMUNICATION – 17EC741
INTRODUCTION:
All types of multimedia information are processed and stored within the computer
in a digital form.
Textual information: consists of strings of characters entered through the
keyboard. Codeword: each character is represented by a unique combination of a
fixed number of bits. A complete text can hence be represented by a string of
codewords.
Images: computer-generated graphical images are made up of a mix of lines, circles,
squares, and so on, each represented in a digital form. Ex.: a line is represented
by the start and end coordinates of the line, each coordinate being defined in the
form of a pair of digital values relative to the complete image.
Audio and video: microphones and video cameras produce electrical signals
whose amplitude varies continuously with time, the amplitude indicating the
magnitude of the sound wave or image intensity at that instant.
Analog signal: a signal whose amplitude varies continuously with time. In order
to store and process such media types in a computer, it is necessary to convert
the time-varying analog signal into a digital form.
Conversely, for the output of speech and audio through, for example, loudspeakers,
and for the display of digitized images on, for example, computer monitors, the
stored digital values of these media types must be converted back into a
corresponding time-varying analog form on output from the computer.
For a particular media type, conversion of the analog signal into a digital form
is carried out by an electrical circuit known as a signal encoder, which performs
the following steps:
1. Sampling: the amplitude of the analog signal is sampled at repetitive time
intervals.
2. Quantization: the amplitude of each sample is converted into a
corresponding digital value.
Conversion of the stored digital samples of a particular media type back into
their corresponding time-varying analog form is performed by an electrical
circuit known as a signal decoder.
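The two encoder steps above can be sketched in a few lines of Python. This is a minimal illustrative model only, not a description of any specific hardware encoder; the signal is assumed to be normalized to the range -1 to +1, and the function name and parameters are invented for the sketch:

```python
import math

def encode(signal, duration_s, sample_rate_hz, n_bits):
    """Sketch of a signal encoder: sample the analog signal at repetitive
    time intervals, then quantize each sample to a fixed number of bits."""
    n_samples = int(duration_s * sample_rate_hz)
    levels = 2 ** n_bits                  # number of quantization intervals
    step = 2.0 / levels                   # interval size for a signal in [-1, 1]
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz            # 1. Sampler: amplitude at time t
        amplitude = signal(t)
        q = int((amplitude + 1.0) / step) # 2. Quantizer: map amplitude to a level
        codes.append(min(q, levels - 1))  # clamp the top edge of the range
    return codes

# Example: a 1 kHz tone sampled at 8 ksps with 3-bit quantization
tone = lambda t: math.sin(2 * math.pi * 1000 * t)
codes = encode(tone, duration_s=0.001, sample_rate_hz=8000, n_bits=3)
```

The decoder would perform the reverse mapping, converting each code back to the nominal amplitude of its quantization interval.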
All media types associated with the various multimedia applications are stored
and processed within a computer in an all-digital form. Hence the different media
types can be readily integrated together, and the resulting integrated bitstream
can be transmitted over a single all-digital communication network.
Speech: the sounds humans produce, which are converted into electrical signals
by a microphone, are made up of a range of sinusoidal signals varying in
frequency between about 50 Hz and 10 kHz. For music the range of signals is
wider, extending up to between 15 kHz and 20 kHz, which is comparable with the
limits of the sensitivity of the ear.
When an analog signal is transmitted through a network, the bandwidth of the
transmission channel (the range of frequencies the channel will pass) must be
greater than or equal to the bandwidth of the signal. If the channel bandwidth
is less than the signal bandwidth, some low and/or high frequency components
will be lost, thereby degrading the quality of the received signal. Such a
channel is called a bandlimiting channel, as shown in the figure below.
Encoder design:
Sampling Rate:
Nyquist sampling theorem: states that for an accurate representation of a
time-varying analog signal, its amplitude must be sampled at a minimum rate
that is equal to or greater than twice the highest sinusoidal frequency
component present in the signal. This minimum rate is known as the Nyquist
rate and is normally expressed in Hz or, more correctly, samples per second (sps).
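For example, applying the theorem to the speech and music bandwidths quoted earlier:

```python
def nyquist_rate(f_max_hz):
    """Minimum sampling rate (samples per second) for a signal whose
    highest sinusoidal frequency component is f_max_hz."""
    return 2 * f_max_hz

# Speech band-limited to 10 kHz and music extending to 20 kHz
speech_rate = nyquist_rate(10_000)   # 20,000 sps
music_rate = nyquist_rate(20_000)    # 40,000 sps
```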
Quantization intervals:
To represent the amplitudes of the set of analog samples exactly in digital
form would require an infinite number of binary digits. When a finite number
of digits is used, each sample can be represented by only a corresponding
number of discrete levels.
The figure below shows the effect of using a finite number of bits.
Ex.: here, 3 bits are used to represent each sample, including a sign bit,
resulting in 4 positive and 4 negative quantization intervals, the two
magnitude bits being determined by the particular quantization interval the
analog input signal is in at the time of each sample.
Quantization error: the difference between the actual signal amplitude and
the corresponding nominal amplitude.
The figure below shows the quantization error values, shown expanded.
Usually the error values vary randomly from sample to sample; quantization
error is thus also known as quantization noise.
Noise: a term used in electrical circuits to refer to a signal whose amplitude
varies randomly with time.
The smallest amplitude of the signal relative to its peak amplitude is the
influencing factor in the choice of the number of quantization intervals for a
particular signal.
With high-fidelity music it is important to be able to hear very quiet passages
without any distortion created by quantization noise.
Dynamic range, D, of a signal is the ratio of its peak amplitude to its
minimum amplitude. D is normally quantified on a logarithmic scale in
decibels (dB).
When determining the quantization intervals and the number of bits to be used,
it is necessary to ensure that the level of quantization noise relative to the
smallest signal amplitude is acceptable.
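On the logarithmic scale, the usual definition is D = 20 log10(Vmax/Vmin) dB, which can be sketched as:

```python
import math

def dynamic_range_db(v_peak, v_min):
    """Dynamic range D = 20 * log10(Vmax / Vmin) decibels."""
    return 20 * math.log10(v_peak / v_min)

# Each extra quantization bit doubles the number of intervals, adding
# 20 * log10(2), i.e. about 6 dB, of dynamic range per bit.
d = dynamic_range_db(10_000, 1)   # a 10,000:1 amplitude ratio gives 80 dB
```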
Decoder design:
Since analog signals are stored, processed, and transmitted in digital form,
prior to their output they must normally be converted back into their analog
form.
Ex.: loudspeakers are driven by an analog current signal.
The signal decoder is the electronic circuit that performs this conversion
from digital to analog form.
TEXT:
Unformatted text:
ASCII (American Standard Code for Information Interchange) is one of the most
widely used character sets.
The table shows the binary codewords used to represent each character; each
character is represented by a unique 7-bit binary codeword.
The use of 7 bits means there are 128 (2^7) alternative characters, and the
codeword used to identify each character is obtained by combining the
corresponding column (bits 7-5) and row (bits 4-1) bits together.
Bit 7 is the MSB and bit 1 the LSB; thus the codeword for uppercase M is 1001101.
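The column/row construction can be checked in a few lines, using Python's built-in character codes (which coincide with ASCII in this range):

```python
def ascii_codeword(ch):
    """7-bit ASCII codeword: column bits (7-5) followed by row bits (4-1)."""
    code = ord(ch)
    assert 0 <= code < 128, "7 bits allow only 128 characters"
    return format(code, "07b")

m = ascii_codeword("M")   # column 100, row 1101 -> "1001101"
```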
Printable characters: the collection of normal alphabetic, numeric, and
punctuation characters. The full ASCII set also includes a number of control
characters, including:
1. Format control characters: BS (backspace), LF (linefeed), CR (carriage
return), SP (space), DEL (delete), ESC (escape), and FF (formfeed).
2. Information separators: FS (file separator), RS (record separator).
3. Transmission control characters: SOH (start-of-heading), STX (start-of-text),
ETX (end-of-text), ACK (acknowledge), NAK (negative acknowledge),
SYN (synchronous idle), and DLE (data link escape).
Fig. (b) tabulates the character set which is a supplementary version of that
in Fig. (a).
Formatted text:
Hypertext:
Hypertext is a linked set of pages stored in a server, accessed and viewed
using a browser (a client program).
The browser can run either on the same computer on which the server software
is running or, more usually, on a separate remote computer.
Home page: associated with each set of linked pages; it comprises a form of
index to the set of pages linked to it, each of which has a hyperlink
entry point associated with it.
Hyperlinks: usually take the form of underlined text strings. The user
initiates the access and display of a particular page by pointing and clicking
the mouse on the appropriate string/link.
Each link is associated with the textual name of the link, the related
format-control information for its display, and a unique network-wide name
known as a URL (uniform resource locator).
URL comprises a number of logical parts including:
Hypermedia: other media types, such as sound and video clips, can also be
included; the terms hypermedia and hypertext are often used interchangeably
when referring to pages created in HTML.
A hyperlink is specified by giving both the URL where the required page is
located and the textual name of the link.
Ex.: the specification of a hyperlink to a page containing 'Further details'
would have the form <A HREF="URL">Further details</A>.
Images:
Graphics:
A range of software packages and programs is available for the creation of
computer graphics.
They provide easy-to-use tools to create graphics composed of all kinds of
visual objects, including lines, arcs, squares, rectangles, circles, ovals,
diamonds, stars, and so on, as well as any form of hand-drawn (normally
referred to as freeform) object produced by drawing the desired shape on the
screen by moving a cursor symbol with the mouse.
Facilities are also provided to edit these objects, for example to change
their shape, size, or color, and to introduce complete predrawn images, either
previously created by the author of the graphic or clip art (selected from a
gallery of images that comes with the package).
Attributes: each object has a number of attributes associated with it. They include:
1. Its shape - a line, a circle, a square, and so on.
2. Its size - in terms of the pixel positions of its border coordinates.
3. The color of its border.
All objects are drawn on the screen by the user simply specifying the name of
the object and its attributes, including its color fill and shadow effect if
required; a set of more basic lower-level commands is then used to determine
both the pixel locations involved and the color that should be assigned to
each pixel.
The representation of a complete graphic is analogous to the structure of a
program written in a high-level programming language.
There are two forms of representation of a computer graphic:
1. The high-level version (similar to the source code of a high-level program).
2. The actual pixel image of the graphic (similar to the byte string, generally
in bit-map format).
A graphic can be transferred over a network in either form.
The high-level program form is much more compact: it requires less memory to
store the image and less bandwidth for its transmission, but the destination
must be able to interpret the various high-level commands.
The bit-map form is used to avoid this requirement; there are a number of
standardized forms of bit-map representation, such as:
1. GIF (Graphics Interchange Format)
2. TIFF (Tagged Image File Format)
SRGP (Simple Raster Graphics Package) converts the high-level language form
into a pixel-image form.
Digitized documents:
Scanner associated with the fax machine operated by scanning each complete
page from left to right to produce a sequence of scan lines that start at the top
of the page and end at the bottom vertical resolution of scanning procedure is
either 3.85 or 7.7 lines/mm which is equivalent to approximately 100 or 200
lines/inch.
As each line is scanned output of the scanner is digitized to a resolution of
approximately 8pels with fax machines/mm.
Fax machines use just a single binary digit to represent each pel:
1. 0 for white pel
2. 1 for black pel
Figure below. Shows digital representation of the scanned page.
A typical page thus produces a stream of about two million bits.
The printer in the receiving fax machine then reproduces the original image by
printing out the received bitstream at a similar resolution.
The use of a single binary digit per pel means that fax machines are best
suited to scanning bitonal (black-and-white) images, such as printed documents
comprising mainly textual information.
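The "about two million bits" figure can be checked with a quick calculation, here assuming an A4 page (210 x 297 mm, an assumption, since the text does not state the page size) scanned at the lower resolutions quoted above:

```python
# Assumed A4 page size; resolutions taken from the text
page_w_mm, page_h_mm = 210, 297
lines = int(3.85 * page_h_mm)        # vertical: 3.85 lines/mm -> 1143 lines
pels_per_line = 8 * page_w_mm        # horizontal: ~8 pels/mm, 1 bit per pel
total_bits = lines * pels_per_line   # roughly two million bits per page
```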
Digitized pictures:
With scanners used for digitizing continuous-tone monochromatic images (such
as a printed picture or scene), normally more than a single bit is used to
digitize each pel.
Ex.: good-quality black-and-white pictures can be obtained using 8 bits/pel.
This yields 256 different levels of gray per element, varying between white
and black, which gives substantially increased picture quality over a
facsimile image when reproduced.
For color images, in order to understand the digitization format used it is
necessary to understand the principles of how color is produced and how the
picture tubes used in computer monitors (on which the images are eventually
displayed) operate.
Color Principles:
Studies have shown that the human eye sees just a single color when a
particular set of three primary colors is mixed and displayed simultaneously.
The color gamut is the whole spectrum of colors that can be produced by mixing
different proportions of the three primary colors red (R), green (G), and
blue (B).
Fig. (a) below shows this mixing technique, which is called additive color mixing.
Additive color mixing: black is produced when all three primary colors are
zero. This is particularly useful for producing a color image on a black
surface, as is the case in display applications.
Subtractive color mixing: complementary to additive color mixing; it produces
a similar range of colors. As Fig. (b) shows, in subtractive color mixing
white is produced when the three chosen primary colors cyan (C), magenta (M),
and yellow (Y) are all zero. These colors are particularly useful for
producing a color image on a white surface, as in printing applications.
Raster-scan principles:
In practice each pixel is a spot that merges with its neighbors, so that when
viewed from a sufficient distance a continuous color image is seen.
Television picture tubes are designed to display moving images; the
persistence of the light/color produced by the phosphor is designed to decay
very quickly, so continuous refresh of the screen is needed.
For a moving image, the light signals associated with each frame change to
reflect the motion that has taken place during the time required to scan the
preceding frame. For a static/still image, the same set of light signals is
used for each frame.
Frame refresh rate: must be high enough to ensure that the eye is not aware
that the display is continuously being refreshed.
Flicker is caused by a low refresh rate: the previous image fades from the
eye's retina before the following image is displayed. To avoid flicker, a
refresh rate of at least 50 times per second is required.
In practice, the frame refresh rate is determined by the frequency of the
mains electricity supply, which is 60 Hz in North and South America and most
of Asia, and 50 Hz in Europe and a number of other countries.
Current picture tubes operate in an analog mode, i.e., the amplitude of each
of the three color signals varies continuously as each line is scanned.
In the case of digital television, the digitized pictures stored within the
computer memory have their color signals in digital form: they comprise a
string of pixels, with a fixed number of pixels per scan line.
To display the stored image, the pixels that make up each line are read from
memory in time-synchronism with the scanning process and converted into a
continuously varying analog form by means of a DAC.
Video RAM: a separate block of memory used to store the pixel image, i.e., the
area of computer memory that holds the string of pixels that make up the
image. The pixel image must be accessed continuously as each line is scanned.
Graphics programs: need to write the pixel images into the video RAM whenever
either selected pixels or the total image changes.
The figure shows the architecture and the various steps involved in creating a
high-level version of the image interactively, using either the keyboard or a
mouse.
Display controller: the part of the program that interprets sequences of
display commands and converts them into displayed objects by writing the
appropriate pixel values into the video RAM (also known as the
frame/display/refresh buffer).
Video controller: a hardware subsystem that reads the pixel values stored in
the video RAM in time-synchronism with the scanning process and converts each
set of pixel values into the equivalent set of R, G, and B analog signals for
output to the display.
Pixel depth:
Aspect ratio: the ratio of the width to the height of the display screen.
Europe: three color standards exist:
1. PAL: UK
2. CCIR: Germany
3. SECAM: France
PAL, CCIR, and SECAM all use 625 scan lines; some lines carry picture
information and some carry control information, so not all lines are
displayed on the screen.
The number of visible lines per frame gives the vertical resolution in terms
of pixels, i.e., 480 for an NTSC monitor and 576 with the other three standards.
The figure below shows the square lattice structure in diagrammatic form.
To produce a square picture without distortion on a screen with a 4/3 aspect
ratio, displaying a square of N x N pixels requires:
1. 640 pixels (480 x 4/3) per line with an NTSC monitor
2. 768 pixels (576 x 4/3) per line with a European monitor
The memory required to store a single digital image can be high, varying from
307.2 kbytes for an image displayed on a VGA screen with 8 bits/pixel through
to approximately 2.36 Mbytes for an SVGA (super VGA) screen with 24 bits/pixel,
as shown in the table below.

Standard   Resolution         Number of colors   Memory/frame (bytes)
VGA        640 x 480 x 8      256                307.2 kB
XGA        640 x 480 x 16     64k                614.4 kB
           1024 x 768 x 8     256                786.432 kB
SVGA       800 x 600 x 16     64k                960 kB
           1024 x 768 x 8     256                786.432 kB
           1024 x 768 x 24    16M                2359.296 kB
Problem: Derive the time to transmit the following digitized images at both
64 kbps and 1.5 Mbps:
(a) a 640 x 480 x 8 VGA-compatible image,
(b) a 1024 x 768 x 24 SVGA-compatible image.
Solution:
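The solution is a direct bits-divided-by-rate calculation; a worked sketch (uncompressed images, with the full channel rate assumed available):

```python
def transmit_time_s(width, height, bits_per_pixel, bit_rate_bps):
    """Time to send one uncompressed frame over a channel of the given rate."""
    return (width * height * bits_per_pixel) / bit_rate_bps

# (a) 640 x 480 x 8 VGA image = 2,457,600 bits
vga_64k = transmit_time_s(640, 480, 8, 64_000)        # 38.4 s
vga_15m = transmit_time_s(640, 480, 8, 1_500_000)     # ~1.64 s

# (b) 1024 x 768 x 24 SVGA image = 18,874,368 bits
svga_64k = transmit_time_s(1024, 768, 24, 64_000)     # ~294.9 s
svga_15m = transmit_time_s(1024, 768, 24, 1_500_000)  # ~12.6 s
```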
The figure above shows a typical arrangement used to capture and store a
digital image produced by a scanner or a digital camera (a still-image camera
or a video camera).
Once each image/frame has been captured and stored on the image sensor(s), the
charge stored at each photosite location is read and digitized.
With a CCD, the set of charges on the matrix of photosites is read a single
row at a time; each of the photosites is read on a row-by-row basis.
Once in the readout register, the charge at each photosite position is shifted
out, amplified, and digitized using an ADC.
For a low-resolution image (640 x 480 pixels) with a pixel depth of 24 bits
(8 bits each for R, G, and B), the amount of memory required to store each
image is 921,600 bytes.
If the output is directed to a computer, the bit map can be loaded straight
into the frame buffer ready to be displayed.
If multiple images of this size are to be stored within the camera, they
normally need to be stored in a compressed form.
PCM speech:
Prior to the input signal being sampled and converted into a digital form by
the ADC, it is passed through a compression circuit, which effectively
compresses the amplitude range of the input signal.
CD-quality audio:
The discs used in CD players and CD-ROMs are digital storage devices for
stereophonic music and more general multimedia information streams. The CD-DA
(CD digital audio) standard is associated with these devices.
Music has an audible bandwidth of from 15 Hz through 20 kHz, so the minimum
sampling rate is 40 ksps.
In the standard, the actual rate used is higher than this for the following
reasons:
1. To allow for imperfections in the bandlimiting filter used.
2. So that the resulting bit rate is compatible with one of the higher
transmission channel bit rates available with public networks.
The sampling rate used is 44.1 ksps, which means the signal is sampled at
approximately 22.7 microsecond intervals.
A high number of bits per sample (16) can be used, since the bandwidth of a
recording channel on a CD is large.
The recording of stereophonic music requires two separate channels, so the
total bit rate required is double that for mono.
Hence, bit rate per channel = sampling rate x bits per sample
= 44.1 x 10^3 x 16 = 705.6 kbps (for mono)
Total bit rate = 2 x 705.6 kbps = 1.411 Mbps (for stereo)
CD-ROMs, which are widely used for the distribution of multimedia titles, also
use this same bit rate. To reduce the access delay, multiples of this rate are
used within a computer.
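The figures above can be verified directly:

```python
sampling_rate = 44_100        # samples per second per channel
bits_per_sample = 16

mono_bps = sampling_rate * bits_per_sample   # 705,600 bps = 705.6 kbps
stereo_bps = 2 * mono_bps                    # 1,411,200 bps ~= 1.411 Mbps
sample_interval_us = 1e6 / sampling_rate     # ~22.7 microseconds per sample
```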
It is not feasible to interactively access a 30 s portion of a multimedia
title over a 64 kbps channel; even with a 1.5 Mbps channel the time is too
high for interactive purposes.
Synthesized audio:
Although once digitized any form of audio can be stored within a computer, the
amount of memory required to store a digitized audio waveform can be very
large, even for relatively short passages.
Synthesized audio can be defined as sound generated from electronic signals of
different frequencies. Sound can be synthesized by the use of sound
synthesizers, which use different programmed algorithms to generate different
waveforms.
Synthesized audio is hence often used in multimedia applications, since the
amount of memory required is between two and three orders of magnitude less
than that required to store the equivalent digitized waveform version.
It is also much easier to edit synthesized audio and to mix several passages
together.
The figure below shows the components that make up an audio synthesizer.
General form of a MIDI message: a status byte followed by data bytes 1, 2, ..., N.
MIDI Audio:
Rather than recording the sound of a note, MIDI software creates data about
each note as it is played on a MIDI keyboard (or another MIDI device)—which
note it is, how much pressure was used on the keyboard to play the note, how
long it was sustained, and how long it takes for the note to decay or fade away.
Ex.: This information, when played back through a MIDI device, allows the note
to be reproduced exactly. Because the quality of the playback depends upon
the end user’s MIDI device rather than the recording, MIDI is device dependent.
The sequencer software quantizes your score to adjust for timing
inconsistencies (a great feature for those who can’t keep the beat), and it may
also print a neatly penned copy of your score to paper.
An advantage of structured data such as MIDI is the ease with which you can
edit the data.
Scenario: Let’s say you have a piece of music being played on a honky-tonk
piano, but your client decides he wants the sound of a soprano saxophone
instead. If you had the music in digitized audio, you would have to re-record
and redigitize the music. When it is in MIDI data, however, there is a value that
designates the instrument to be used for playing back the music. To change
instruments, you just change that value.
Instruments that you can synthesize are identified by a General MIDI numbering
system that ranges from 0 to 127, as shown in the table below.
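The instrument-swap scenario above amounts to changing a single value, which can be sketched as raw MIDI bytes. Per the MIDI specification, a Program Change message is the status byte 0xC0 ORed with the channel number, followed by one data byte; the 0-based General MIDI program numbers used here (3 for honky-tonk piano, 64 for soprano sax) are the standard assignments:

```python
def program_change(channel, program):
    """Build the two-byte MIDI Program Change message that selects the
    instrument used to play back subsequent notes on a channel."""
    assert 0 <= channel <= 15 and 0 <= program <= 127
    return bytes([0xC0 | channel, program])

# Swap the honky-tonk piano (General MIDI program 3) for a soprano sax
# (program 64) on channel 0 -- only this value changes, not the note data.
msg = program_change(0, 64)
```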
Video:
Broadcast television:
Scanning Sequence:
Each field is refreshed alternately at 60/50 fields per second and hence the
resulting frame refresh rate is only 30/25 frames per second.
In this way, a refresh rate equivalent to 60/50 frames per second is achieved
but with only half the transmission bandwidth.
Color signals:
1. Brightness: It represents the amount of energy that stimulates the eye and
varies on a gray scale from black (lowest) through to white. It is independent of
the color of the source.
2. Hue: It represents the actual color of the source, each color has a
frequency/wavelength and the eye determines the color from that.
3. Saturation: this represents the strength or vividness of the color.
Chrominance components:
All color television systems use this same basic principle to represent the
coloration of a source; there are only small differences between the systems
in terms of the magnitudes used for the two chrominance signals.
The reason for this is the constraint that the bandwidth of the transmission
channel for color broadcasts must be the same as that used for monochrome.
Thus, in order to fit the Y, Cb and Cr signals in the same bandwidth, the three
signals must be combined together for transmission. The resulting signal is
then known as the composite video signal.
If the two color difference signals are transmitted at their original magnitudes,
the amplitude of the luminance signal can become greater than that of the
equivalent monochrome signal. This leads to a degradation in the quality of the
monochrome picture and hence is unacceptable.
Solution:
1. To overcome this effect, the magnitudes of the two color difference
signals are both scaled down.
2. Since they have different levels of luminance associated with them, the
scaling factor used for each signal is different.
The two color difference signals are referred to by different symbols in each
system:
PAL system: Cb and Cr are referred to as U and V respectively, each with its
own scaling factor.
NTSC system: the two color difference signals are combined to form two
different signals referred to as I and Q, again with their own scaling factors.
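The numerical scaling factors did not survive in these notes; as a hedged sketch, the commonly quoted PAL coefficients (an assumption here, taken from the standard definitions rather than from this text) give:

```python
def pal_yuv(r, g, b):
    """RGB (each 0..1) to PAL Y, U, V using the commonly quoted coefficients.
    Y is the luminance; U and V are the scaled-down color difference
    signals (B - Y) and (R - Y)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.493 * (b - y)   # scaled B - Y
    v = 0.877 * (r - y)   # scaled R - Y
    return y, u, v

# Pure white: maximum luminance, zero color difference
y, u, v = pal_yuv(1.0, 1.0, 1.0)
```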
Signal bandwidth:
The bandwidth of the transmission channel used for color broadcasts must be
the same as that used for a monochrome broadcast.
Thus, for transmission, the two chrominance signals must occupy the same
bandwidth as that of the luminance signal.
Most of the energy associated with the luminance signal is in the lower part of
its frequency spectrum.
Thus, in order to minimize the level of interference between the luminance and
two chrominance signals following steps are followed:
1. The chrominance signals are transmitted in the upper part of the
luminance frequency spectrum using two separate subcarriers.
2. To restrict the bandwidth used to the upper part of the spectrum, a smaller
bandwidth is used for both chrominance signals.
3. Although the two chrominance subcarriers have the same frequency, they are
90 degrees out of phase with each other.
NTSC system: the eye is more responsive to the I signal than to the Q signal.
Hence, to maximize the use of the available bandwidth while at the same time
minimizing the level of interference with the luminance signal, the I signal
has a modulated bandwidth of about 2 MHz and the Q signal a bandwidth of about
1 MHz.
The baseband spectrum of a color television signal in the PAL system is shown
in the figure below.
PAL system: the larger luminance bandwidth (about 5.5 MHz relative to 4.2 MHz
for NTSC) allows both the U and V chrominance signals to have the same
modulated bandwidth, which is about 3 MHz.
As shown in the figures above, the audio/sound signal is transmitted using one
or more separate subcarriers, all of which are just outside the luminance
signal bandwidth. Typically, the main audio subcarrier is used for mono sound
and the auxiliary subcarriers for stereo sound.
When these signals are added to the baseband video signal, the composite
signal is called the complex baseband signal.
Digital video:
In most multimedia applications the video signals need to be digital form since
it then becomes possible to store them in the memory of the computer and to
readily edit and integrate them with other media type.
For transmission reasons the three component signals have to be combined for
analog television broadcasts, with digital television it is more usual to digitize
the three component signals separately prior to transmission.
The above is done to enable editing and other operations readily performed.
Since the three component signals are treated separately in digital
transmission, in principle it is possible simply to digitize the three RGB signals
make up the picture.
Disadvantage of this approach:
The same resolution in terms of sampling rate and bits per sample must be
used for all three RGB signals.
However, the eye is less sensitive to color than to luminance, which means
that the two chrominance signals can tolerate a reduced resolution relative to
that used for the luminance signal.
Thus, by using the luminance and two color difference signals instead of the
RGB signals, we can achieve significant savings in terms of:
1. Resulting bit rate.
2. Transmission bandwidth.
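The size of this saving can be illustrated with a short calculation (a minimal sketch in Python; the helper name is illustrative, and the 13.5 MHz / 6.75 MHz sampling rates are those of the 4:2:2 digitization discussed later):

```python
# Compare the bit rate of digitizing R, G, B directly (equal resolution
# for all three) with digitizing Y plus two half-rate chrominance
# signals, all at 8 bits per sample.

def bit_rate_mbps(*sampling_rates_mhz, bits_per_sample=8):
    """Total bit rate in Mbps: sum of sampling rate x bits over all components."""
    return sum(rate * bits_per_sample for rate in sampling_rates_mhz)

rgb = bit_rate_mbps(13.5, 13.5, 13.5)   # R, G and B all at full resolution
ycc = bit_rate_mbps(13.5, 6.75, 6.75)   # Y at 13.5 MHz, Cb/Cr at 6.75 MHz

print(rgb)  # 324.0
print(ycc)  # 216.0 -> one third less than digitizing RGB directly
```

Digitizing Y, Cb and Cr thus needs 216 Mbps instead of the 324 Mbps required for RGB at equal resolution, a saving of one third in both bit rate and, correspondingly, transmission bandwidth.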
Digitization of video signals has been carried out in television studios for
many years, for example to perform conversions from one video format into
another.
To standardize this process, the International Telecommunications Union
Radiocommunications Branch (ITU-R), formerly known as the Consultative
Committee for International Radiocommunications (CCIR), defined a
standard for the digitization of video pictures known as Recommendation
CCIR-601.
A number of variants of this standard have been defined for use in other
application domains such as digital television broadcasting, video telephony,
and videoconferencing. Collectively these are known as digitization formats.
They all exploit the fact that the two chrominance signals can tolerate a
reduced resolution relative to that used for the luminance signal.
4:2:2 Format:
In this format the luminance signal (Y) is sampled at 13.5 MHz (720 samples
per visible line) and the two chrominance signals (Cb and Cr) at half this
rate, 6.75 MHz (360 samples per visible line), all with 8 bits per sample;
all three signals retain the full vertical resolution. The name reflects the
4:2:2 ratio of Y, Cb and Cr samples along each line.
Problem: Derive the bit rate and the memory requirements to store each
frame that result from the digitization of both a 525- and a 625-line system
assuming a 4:2:2 format. Also find the total memory required to store a
1.5-hour movie/video.
Solution
525-line system: The number of samples per line is 720 and the number of
visible lines is 480. Hence the resolutions of the luminance (Y) and two
chrominance (Cb and Cr) signals are:
Y = 720 x 480
Cb = Cr = 360 x 480
Bit rate: Line sampling rate is fixed at 13.5 MHz for Y and 6.75 MHz for both
Cb and Cr, all with 8 bits per sample.
Thus,
Bit rate = 13.5 x 10^6 x 8 + 2 x (6.75 x 10^6 x 8) = 216 Mbps
Memory required: Memory required per line = 720 x 8 + 360 x 8 + 360 x 8
= 11520 bits or 1440 bytes
Hence memory per frame, each of 480 lines = 480 x 11520
= 5.5296 Mbits or 691.2 kbytes
And memory to store 1.5 hours assuming 60 frames per second:
= 691.2 x 60 x 1.5 x 3600kbytes
= 223.9488Gbytes
625-line system: The number of samples per line is 720 and the number of
visible lines is 576. Hence the resolutions of the luminance (Y) and two
chrominance (Cb and Cr) signals are:
Y = 720 x 576
Cb = Cr = 360 x 576
Bit rate: Line sampling rate is fixed at 13.5 MHz for Y and 6.75 MHz for both
Cb and Cr, all with 8 bits per sample.
Thus,
Bit rate = 13.5 x 10^6 x 8 + 2 x (6.75 x 10^6 x 8) = 216 Mbps
Memory required: Memory required per line = 720 x 8 + 2 (360 x 8)
= 11520 bits or 1440 bytes
Hence memory per frame, each of 576 lines = 576 x 11520
= 6.63552 Mbits or 829.44 kbytes
and memory to store 1.5 hours assuming 50 frames per second:
= 829.44 x 50 x 1.5 x 3600 kbytes
= 223.9488 Gbytes
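The 4:2:2 arithmetic above can be cross-checked with a small script (a sketch; the function names are illustrative):

```python
# Per-frame memory and total storage for the 4:2:2 format, in which the
# two chrominance signals have half the horizontal resolution of Y.

def frame_kbytes(samples_per_line, visible_lines, bits=8):
    """Memory per frame in kbytes for Y + Cb + Cr at 4:2:2 resolution."""
    y = samples_per_line * visible_lines * bits
    c = (samples_per_line // 2) * visible_lines * bits  # one chrominance signal
    return (y + 2 * c) / 8 / 1000   # bits -> bytes -> kbytes

def movie_gbytes(frame_kb, frames_per_second, hours):
    """Total storage in Gbytes for a movie of the given length."""
    return frame_kb * frames_per_second * hours * 3600 / 1e6

print(frame_kbytes(720, 480))                         # 691.2  (525-line)
print(frame_kbytes(720, 576))                         # 829.44 (625-line)
print(movie_gbytes(frame_kbytes(720, 480), 60, 1.5))  # ~223.9488 Gbytes
print(movie_gbytes(frame_kbytes(720, 576), 50, 1.5))  # ~223.9488 Gbytes
```

Both line systems need the same total storage because 691.2 x 60 = 829.44 x 50 = 41472 kbytes per second in either case.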
4:2:0 format
A derivative of the 4:2:2 format, used in digital video broadcast
applications.
Gives good picture quality.
Uses the same set of chrominance samples for two consecutive lines.
Interlaced scanning is used since it is intended for broadcast applications.
The absence of chrominance samples in alternate lines is the origin of the
term 4:2:0.
The position of the three sample instants per frame are as shown in Figure below.
Thus it has same luminance resolution as the 4:2:2 format but half the
chrominance resolution:
525-line system: Y = 720 x 480, Cb = Cr = 360 x 240
625-line system: Y = 720 x 576, Cb = Cr = 360 x 288
The bit rate in both systems with this format is:
Bit rate = 13.5 x 10^6 x 8 + 2 x (3.375 x 10^6 x 8) = 162 Mbps
This is the worst-case bit rate, since it includes samples generated during
the retrace times when the beam is switched off.
To avoid flicker effects with the chrominance signals, the receiver uses the
same chrominance values from the sampled lines for the missing lines.
With large-screen televisions, flicker effects are often reduced further by the
receiver storing the incoming digitized signals of each field in a memory buffer.
Problem: Derive the bit rate and the memory requirements to store each
frame that result from the digitization of both a 525 and 625-line system
assuming a 4:2:0 format. Also find the total memory required to store a
1.5-hour movie/video. (8 marks)
Solution
525-line system: The number of samples per line is 720 and the number of
visible lines is 480. Hence the resolutions of the luminance (Y) and two
chrominance (Cb and Cr) signals are:
Y = 720 x 480
Cb = Cr = 360 x 240
Bit rate: Line sampling rate is fixed at 13.5 MHz for Y and 3.375 MHz for both
Cb and Cr, all with 8 bits per sample.
Thus,
Bit rate = 13.5 x 10^6 x 8 + 2 x (3.375 x 10^6 x 8) = 162 Mbps
Memory required:
Memory required per frame for Y = 720 x 480 x 8 = 2.7648 Mbits
Memory required per frame for Cb and Cr = 360 x 240 x 8 x 2 = 1.3824 Mbits
Total Memory required per frame = Y + Cb + Cr
= 2.7648M + 1.3824M
= 4.1472 Mbits or 518.4 kbytes
625-line system: The number of samples per line is 720 and the number of
visible lines is 576. Hence the resolutions of the luminance (Y) and two
chrominance (Cb and Cr) signals are:
Y = 720 x 576
Cb = Cr = 360 x 288
Bit rate: Line sampling rate is fixed at 13.5 MHz for Y and 3.375 MHz for both Cb
and Cr all with 8 bits per sample.
Thus,
Bit rate = 13.5 x 10^6 x 8 + 2 x (3.375 x 10^6 x 8) = 162 Mbps
Memory required:
Memory required per frame for Y = 720 x 576 x 8 = 3.31776 Mbits
Memory required per frame for Cb and Cr = 360 x 288 x 8 x 2 = 1.65888 Mbits
Total Memory required per frame = Y + Cb + Cr
= 3.31776M + 1.65888M
= 4.97664 Mbits or 622.08 kbytes
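As a cross-check on the 4:2:0 per-frame arithmetic (a sketch; the function name is illustrative):

```python
# Per-frame memory for the 4:2:0 format: full-resolution Y, with the
# chrominance signals subsampled by 2 both horizontally and vertically.

def frame_420_kbytes(width, height, bits=8):
    y = width * height * bits                # luminance samples
    c = (width // 2) * (height // 2) * bits  # one chrominance signal
    return (y + 2 * c) / 8 / 1000            # bits -> bytes -> kbytes

print(frame_420_kbytes(720, 480))   # 518.4  (525-line system)
print(frame_420_kbytes(720, 576))   # 622.08 (625-line system)
```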
HDTV formats:
SIF format:
Has been found to give a picture quality comparable with that obtained with
video cassette recorders (VCRs).
It uses half the spatial resolution of the 4:2:0 format in both the horizontal
and vertical directions, a technique known as subsampling.
It also uses half the refresh rate (the temporal resolution) of the 4:2:0
format.
Thus frame refresh rate is:
For 525-line system: 30Hz
For 625-line system: 25Hz
Thus the total resolution is:
o 525-line system: Y = 360 x 240, Cb = Cr = 180 x 120
o 625-line system: Y = 360 x 288, Cb = Cr = 180 x 144
The worst-case bit rate in both systems with this format is:
Bit rate = 6.75 x 10^6 x 8 + 2 x (1.6875 x 10^6 x 8) = 81 Mbps
At the receiver, the missing samples are estimated by interpolating between
each pair of values that are sent.
This digitization format is known as 4:1:1.
Problem: Derive the bit rate and the memory requirements to store each
frame that result from the digitization of both a 525- and a 625-line system
assuming a 4:1:1 format. Also find the total memory required to store a
1.5-hour movie/video.
Solution
525-line system: The number of samples per line is 360 and the number of
visible lines is 240. Hence the resolutions of the luminance (Y) and two
chrominance (Cb and Cr) signals are:
Y = 360 x 240
Cb = Cr = 180 x 120
Bit rate: Line sampling rate is fixed at 6.75 MHz for Y and 1.6875 MHz for
both Cb and Cr, all with 8 bits per sample.
Thus,
Bit rate = 6.75 x 10^6 x 8 + 2 x (1.6875 x 10^6 x 8) = 81 Mbps
Memory required:
Memory required per frame for Y = 360 x 240 x 8 = 691.2 kbits
Memory required per frame for Cb and Cr = 180 x 120 x 8 x 2 = 345.6 kbits
Total Memory required per frame = Y + Cb + Cr
= 691.2 k + 345.6 k
= 1.0368 Mbits or 129.6 kbytes
625-line system: The number of samples per line is 360 and the number of
visible lines is 288. Hence the resolutions of the luminance (Y) and two
chrominance (Cb and Cr) signals are:
Y = 360 x 288
Cb = Cr = 180 x 144
Bit rate: Line sampling rate is fixed at 6.75 MHz for Y and 1.6875 MHz for
both Cb and Cr, all with 8 bits per sample.
Thus,
Bit rate = 6.75 x 10^6 x 8 + 2 x (1.6875 x 10^6 x 8) = 81 Mbps
Memory required:
Memory required per frame for Y = 360 x 288 x 8 = 829.44 kbits
Memory required per frame for Cb and Cr = 180 x 144 x 8 x 2 = 414.72 kbits
Total Memory required per frame = Y + Cb + Cr
= 829.44 k + 414.72 k
= 1.24416 Mbits or 155.52 kbytes
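The worst-case bit rates of the formats covered so far all follow the same pattern and can be summarized in a few lines (a sketch; the function name is illustrative, and the values follow the line sampling rates used in these notes):

```python
# Worst-case bit rate implied by the line sampling rates, with 8 bits
# per sample and two chrominance signals at the same (lower) rate.

def worst_case_mbps(y_rate_mhz, c_rate_mhz, bits=8):
    return y_rate_mhz * bits + 2 * (c_rate_mhz * bits)

print(worst_case_mbps(13.5, 6.75))    # 216.0 -> 4:2:2
print(worst_case_mbps(13.5, 3.375))   # 162.0 -> 4:2:0
print(worst_case_mbps(6.75, 1.6875))  # 81.0  -> SIF / 4:1:1
```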
CIF:
The common intermediate format (CIF) combines the spatial resolution of the
625-line SIF (Y = 360 x 288, Cb = Cr = 180 x 144) with the 30 Hz refresh rate
of the 525-line system. Its worst-case bit rate is the same as that of SIF,
81 Mbps.
We can deduce from this that, to convert to the CIF, a 525-line system needs a
line-rate conversion and a 625-line system a frame-rate conversion.
A number of higher-resolution derivatives of the CIF have also been defined,
since there are a number of different types of videoconferencing applications,
including those that involve a linked set of desktop PCs and those that
involve a linked set of videoconferencing studios.
Most desktop applications use switched circuits; a typical bit rate is that of
a single 64 kbps ISDN channel. For linking videoconferencing studios, however,
dedicated circuits are normally used that comprise multiple 64 kbps channels.
As the bit rate of these circuits is much higher (typically four or sixteen
64 kbps channels), a higher-resolution version of the basic CIF can be used to
improve the quality of the video.
QCIF:
The quarter CIF (QCIF) format has been defined for use in video telephony
applications.
It is derived from the CIF.
Uses half the spatial resolution of CIF in both the horizontal and vertical
directions.
The temporal resolution is either half or a quarter of that of CIF.
Spatial resolution: Y = 180 x 144, Cb = Cr = 90 x 72
Temporal resolution is either 15 or 7.5 Hz.
The worst-case bit rate with this format is:
The positions of the three sampling instants per frame are as shown in Figure
below and, as we can see, it has the same 4:1:1 digitization format as CIF.
The QCIF is intended for use in video telephony applications that involve a
single switched 64 kbps channel.
In addition, there are lower-resolution versions of the QCIF, intended for use
in applications that use lower bit rate channels such as that provided by a
modem and the PSTN. These lower-resolution versions are known as sub-QCIF
(S-QCIF).
Although the sampling matrix appears sparse, in practice only a small screen
(or a small window of a larger screen) is normally used for video telephony,
and hence the total set of samples may occupy all the pixel positions on the
screen or window.
PC video:
A number of multimedia applications that involve live video use a window on
the screen of a PC monitor for display purposes.
Ex.: desktop video telephony, videoconferencing, and video-in-a-window.
For multimedia applications that involve mixing live video with other
information on a PC screen, the line sampling rate is normally modified in
order to obtain the required horizontal resolution: 640 (480 x 4/3) pixels per
line with a 525-line PC monitor and 768 (576 x 4/3) pixels per line with a
625-line PC monitor.
To achieve the necessary resolution with a 525-line monitor, the line sampling
rate is reduced from 13.5 MHz to 12.2727 MHz, while for a 625-line monitor the
line sampling rate must be increased from 13.5 MHz to 14.75 MHz.