Unit2 Ece MMC 6th Sem

The document discusses multimedia communication, focusing on the representation and processing of multimedia information, including audio and video signals. It explains concepts such as codewords, signal encoding and decoding, sampling rates, quantization, and the design of encoders and decoders. Additionally, it covers various types of text, images, and audio, detailing their formats and transmission methods.

Uploaded by

Geetha

Multimedia communication

[Bec613A]

Multimedia Information Representation


Module-2

Jayaprasad K M
Assistant Professor,
ECE Department
Coorg Institute of Technology
Pages 79-135, Fred Halsall, “Multimedia Communications”
Introduction
▪ Codeword: a fixed number of bits representing a symbol from a set of symbols, e.g. ASCII code, fax run-length code

▪ Signal encoder + signal decoder = audio-video CODEC (coder-decoder)

▪ The CODEC performs the conversion using a set of codewords

[Figure: Host - Network - Host. Data (or signal) at the source host is converted into a signal (or data) for transfer across the network and converted back into data (or signal) at the destination host.]
Multimedia Information Representation

Multimedia information is stored and processed within a computer in a digital form

Codeword: Combination of a fixed number of bits that represents each character, in the case of textual information

Analogue signal: Signal whose amplitude (magnitude of the sound/image intensity) varies continuously with time

Signal encoder: Electrical circuit used for the conversion of an analogue signal into a digital form

Signal decoder: Electrical circuit that converts stored digitized samples into time-varying analogue form
Analogue Signals

• As mentioned earlier, the amplitude of the signal varies continuously with time
• Fourier analysis can be used to show that any time-varying signal is made up of a possibly infinite number of single-frequency sinusoidal components
• The range of frequencies of the sinusoidal components that make up the signal is called the signal bandwidth
• Speech bandwidth: 50 Hz – 10 kHz
• Music bandwidth: 15 Hz – 20 kHz
Analogue Signals – Signal Properties

• To transmit an analogue signal through a network, the bandwidth of the transmission channel should be equal to or greater than the signal bandwidth

• If the bandwidth of the channel is less than the signal bandwidth, then the channel is called a bandlimiting channel
Encoder Design

• The encoder consists of a bandlimiting filter and an analogue-to-digital converter (ADC), the latter comprising a sample-and-hold circuit and a quantizer
• Bandlimiting filter: Removes the selected higher frequency components from the source signal
• Sample-and-hold circuit: Samples the amplitude of the filtered signal at regular intervals and holds the sampled amplitude between samples
• Quantizer: Converts the samples into their corresponding binary form
Encoder Design – Data representation

• The most significant bit of the codeword represents the sign of the sample
• A binary 0 indicates a positive value and a binary 1 indicates a
negative value
• The signal must be sampled at a much higher rate than the
maximum rate of change of the signal amplitude
• The number of quantization levels should be as large as
possible to represent the signal accurately
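The sign-and-magnitude convention above can be sketched in a few lines of Python. This is only an illustration: the function names and the default of 3 magnitude bits are assumptions chosen for the example, not part of the original slides.

```python
def encode_sample(level: int, magnitude_bits: int = 3) -> str:
    """Return a sign-and-magnitude codeword: sign bit first, then the magnitude."""
    max_magnitude = (1 << magnitude_bits) - 1
    if abs(level) > max_magnitude:
        raise ValueError("level does not fit in the magnitude field")
    sign_bit = "1" if level < 0 else "0"  # 0 = positive, 1 = negative
    return sign_bit + format(abs(level), f"0{magnitude_bits}b")


def decode_sample(codeword: str) -> int:
    """Invert encode_sample: the first bit is the sign, the rest the magnitude."""
    magnitude = int(codeword[1:], 2)
    return -magnitude if codeword[0] == "1" else magnitude


# Hypothetical quantized sample levels used only for illustration.
codewords = [encode_sample(v) for v in (0, 4, 7, 3, -4, -5, -3, 5)]
```

With these assumed levels, +4 becomes 0100 and -5 becomes 1101.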
Sampling Rate

Nyquist sampling theorem: To obtain an accurate representation of a time-varying analogue signal, its amplitude must be sampled at a minimum rate that is equal to or greater than twice the highest sinusoidal frequency component that is present in the signal

The Nyquist rate is expressed either in Hz or, more correctly, in samples per second (sps)

Antialiasing filter: Another name for the bandlimiting filter, since it passes only those frequency components that the chosen sampling rate can represent without aliasing (those up to half the sampling rate)
Alias signal generation due to undersampling

In practice the transmission channel often has a lower bandwidth than the source signal

To avoid distortion, the source signal is first passed through the bandlimiting filter (BLF), which is designed to pass only those frequency components that are within the channel bandwidth
This avoids alias signals caused by undersampling
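A short Python sketch can make undersampling concrete. It uses the standard frequency-folding relation for a single sinusoid; the function names and the example frequencies are assumptions for illustration.

```python
def nyquist_rate(f_max_hz: float) -> float:
    """Minimum sampling rate (sps) for a signal whose highest component is f_max_hz."""
    return 2 * f_max_hz


def alias_frequency(f_signal_hz: float, f_sample_hz: float) -> float:
    """Apparent frequency of a sinusoid of f_signal_hz after sampling at f_sample_hz."""
    folded = f_signal_hz % f_sample_hz
    # Components above half the sampling rate fold back into the lower half.
    return min(folded, f_sample_hz - folded)


# A 6 kHz component sampled at 8 ksps is undersampled and appears at 2 kHz.
undersampled = alias_frequency(6000, 8000)
# A 3 kHz component is below half the sampling rate and is unchanged.
preserved = alias_frequency(3000, 8000)
```

Removing the 6 kHz component with the bandlimiting filter before sampling is what prevents the spurious 2 kHz alias.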
Quantization Intervals
• Representing the analogue samples exactly would require an infinite number of digits, so only a finite number of levels is used
• In the example, three bits are used to represent each sample (1 bit for the sign and two bits to represent the magnitude)
• If Vmax is the maximum positive and negative signal amplitude and n is the number of binary bits used, then the quantization interval, q, is defined as q = 2Vmax / 2^n
• A signal anywhere within a quantization interval will be represented by the same binary codeword
• Each codeword corresponds to a nominal amplitude at the centre of the corresponding quantization interval
• Therefore a difference of up to ±q/2 from the actual signal level can be present. This difference is known as the quantization error
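The relation q = 2Vmax / 2^n and the ±q/2 error bound can be checked numerically. The sketch below assumes a uniform quantizer whose intervals are indexed from -Vmax upwards. That indexing is an assumption for illustration and differs from the sign-and-magnitude codewords used in the slides.

```python
def quantization_interval(v_max: float, n_bits: int) -> float:
    """q = 2 * Vmax / 2**n."""
    return 2 * v_max / 2 ** n_bits


def quantize(sample: float, v_max: float, n_bits: int) -> tuple[int, float]:
    """Return (interval index, nominal amplitude at the centre of that interval)."""
    q = quantization_interval(v_max, n_bits)
    levels = 2 ** n_bits
    index = int((sample + v_max) // q)
    index = min(max(index, 0), levels - 1)  # clamp samples at the range limits
    nominal = -v_max + (index + 0.5) * q
    return index, nominal


q = quantization_interval(1.0, 3)        # 8 intervals across -1 V .. +1 V
index, nominal = quantize(0.3, 1.0, 3)
error = 0.3 - nominal                    # quantization error for this sample
```

For Vmax = 1 V and n = 3, q is 0.25 V, so no sample within range is more than 0.125 V away from its nominal level.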
Quantization noise polarity

• Quantization error is the difference between the actual signal amplitude and the corresponding nominal amplitude (also known as quantization noise, since the error values vary randomly)
Dynamic Range

• With high-fidelity music it is important to be able to hear very quiet passages without any distortion created by quantization noise

• Dynamic range is defined as the ratio of the maximum signal amplitude to the minimum signal amplitude, expressed in decibels:
D = 20 log10 (Vmax/Vmin) dB
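As a quick numerical check of the formula, the following Python sketch evaluates D for two amplitude ratios. The function name and the example ratios are assumptions for illustration.

```python
import math


def dynamic_range_db(v_max: float, v_min: float) -> float:
    """D = 20 * log10(Vmax / Vmin), in dB."""
    return 20 * math.log10(v_max / v_min)


ratio_1000 = dynamic_range_db(1000.0, 1.0)  # amplitude ratio of 1000:1
ratio_2 = dynamic_range_db(2.0, 1.0)        # each doubling adds about 6 dB
```

Because each doubling of the amplitude ratio adds about 6 dB, each extra bit of resolution extends the dynamic range of a linear quantizer by roughly the same amount.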
Decoder Design

Encoder + Decoder = Codec

A signal decoder is an electronic circuit that converts the digitized samples back into their analogue form prior to output, using a digital-to-analogue converter (DAC) followed by a low-pass filter

Low-pass filter: Passes only those frequency components that were passed by the bandlimiting filter in the encoder
[Figure: Signal encoder and decoder. Encoder: analogue input signal → bandlimiting filter → sampler (sample-and-hold, driven by the sampling clock) → quantizer → encoder. Decoder: DAC → low-pass filter. The waveforms show the example sample values 0, 4, 7, 3, -4, -5, -3, 5 encoded as 0 000, 0 100, 0 111, 0 011, 1 100, 1 101, 1 011, 0 101 (1-bit sign and 3-bit amplitude magnitude).]
Text

Unformatted text: Known as plain text; enables pages to be created which comprise strings of fixed-sized characters from a limited character set

Formatted text: Known as rich text; enables pages to be created which comprise strings of characters of different styles, sizes and shapes, with tables, graphics and images inserted at appropriate points

Hypertext: Enables an integrated set of documents (each comprising formatted text) to be created which have defined linkages between them
Unformatted Text – The basic ASCII character set

• The American Standard Code for Information Interchange (ASCII) is one of the most widely used character sets; the table includes the binary codewords used to represent each character (7-bit binary code)
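The 7-bit codewords can be reproduced with a few lines of Python. The helper name below is an assumption for illustration.

```python
def ascii_codewords(text: str) -> list[str]:
    """Return the 7-bit ASCII codeword of each character in text."""
    codewords = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            raise ValueError(f"{ch!r} is not in the basic ASCII character set")
        codewords.append(format(code, "07b"))
    return codewords


example = ascii_codewords("Aa")
```

Here 'A' (decimal 65) maps to 1000001 and 'a' (decimal 97) maps to 1100001.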
Unformatted Text – Supplementary set of Mosaic characters

• The characters in columns 010/011 and 110/111 are replaced with the set of mosaic characters; these are then used, together with the various uppercase characters illustrated, to create relatively simple graphical images
Unformatted Text – Examples of Videotext/Teletext

Although in practice the total page is made up of a matrix of symbols and characters which all have the same size, some simple graphical symbols and text of larger sizes can be constructed by the use of groups of the basic symbols
Formatted Text

Formatted text is produced by most word processing packages and is used extensively in the publishing sector for the preparation of papers, books, magazines, journals and so on.
Documents of mixed type (characters of different styles, fonts, shapes etc.) are possible.

Format control characters are used

Hypertext – Electronic Document in hypertext

• Hypertext can be used to create an electronic version of documents with the index, descriptions of departments, courses on offer, library, and other facilities all written in hypertext as pages with various defined hyperlinks
Hypertext – Electronic Document in hypertext

• An example of a hypertext language is HTML, which is used to describe how the contents of a document are presented on a printer or a display; other mark-up languages are PostScript, SGML (Standard Generalized Markup Language), TeX and LaTeX
Images

• Images include computer-generated images (referred to as computer graphics or simply graphics) and digitized images of both documents and pictures
• All types of images are displayed in the form of a two-dimensional
matrix of individual picture elements (pixels or pels), but
represented differently within the computer memory (file)
• Each type of these images is created differently
Images - Graphics
• Different software packages and programs are available for the creation of computer graphics
• Easy-to-use tools create graphics such as lines, circles, arcs, ovals and diamonds, as well as free-form objects
• A paint brush or the mouse can be used to create the shapes required
• Textual images, precreated tables, graphs, digitized pictures and photographs can be included
• Objects can be made to appear in layers
• Shadows can be added to give a 3D effect
Graphics

• VGA is a common type of display that consists of a matrix of 640 horizontal pixels by 480 vertical pixels with, for example, 8 bits per pixel, which allows each pixel to have one of 256 different colours
Graphics

• Colouring a solid block with the same colour is known as rendering

• All objects are made up of a series of lines that are connected to each other; what appears as a curved line is, in practice, a series of short lines, each made up of a string of pixels

• Each object has a number of attributes associated with it. These include its shape, its size in terms of pixel positions, the colour of its border, etc.
Graphics - Conclusions
• There are two forms of representation
• High-level representation (similar to a source code of a program) –
requires less memory to store the image and less bandwidth for
transmission
• Actual pixel image of the graphic (similar to low-level machine code and generally known as bit-map format), e.g. GIF (Graphics Interchange Format), TIFF (Tagged Image File Format)
• A graphic can be transferred over the network in either form
• A software called SRGP (simple raster graphics package) - used to
convert high-level form into a pixel-image form
Digitized Documents- Fax Principles
1. Document Placement
•The user places a paper document face down on the fax machine’s
scanning area.

2. Line-by-Line Scanning
•The machine uses a CIS (Contact Image Sensor) or CCD (Charge-
Coupled Device) scanner to capture the image.
•A light source (LED or fluorescent lamp) illuminates the paper.
•A moving sensor scans one line at a time, converting the light into
an electrical signal.

3. Conversion to Digital Data


•The reflected light is converted into grayscale or black & white
pixels.
•The intensity of light determines pixel values (dark areas absorb
more light, appearing as black; white areas reflect more light).
•The image is stored as a bitmap (raster) format.
Digitized Documents- Fax Principles
4. Encoding & Compression
•The scanned data is compressed using Modified Huffman (MH) or
JBIG compression to reduce transmission size.
•Black and white pixels are represented as binary data (0s and 1s).

5. Transmission Over Phone Line


•The encoded data is converted into analog audio signals and sent
via a telephone network.
•The receiving fax machine decodes the signals back into a digital
image.

6. Printing on the Receiver’s End


•The receiving fax machine reconstructs the image line by line.
•A thermal printer or laser printer prints the document.
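The compression step above relies on the fact that a scanned line consists of long runs of identical pels. The Python sketch below shows only that run-length stage. It is a simplification: Modified Huffman additionally maps each run length to a variable-length codeword from standard tables, which are not reproduced here, and the function names are assumptions for illustration.

```python
from itertools import groupby


def run_lengths(scan_line: list[int]) -> list[tuple[int, int]]:
    """Return (pel value, run length) pairs; 0 = white pel, 1 = black pel."""
    return [(pel, len(list(run))) for pel, run in groupby(scan_line)]


def expand(runs: list[tuple[int, int]]) -> list[int]:
    """Rebuild the scan line from its runs, as the receiving machine would."""
    line = []
    for pel, length in runs:
        line.extend([pel] * length)
    return line


line = [0, 0, 0, 1, 1, 0]
runs = run_lengths(line)
```

A mostly white line of 1000+ pels collapses into a handful of runs, which is why this representation reduces the amount of data to transmit.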
Digitized Documents- Digitization format

• Fax machines use a single binary digit to represent each pel, a 0 for a white pel and a 1 for a black pel. Hence the digital representation of a scanned page produces a stream of about 2 million bits
• A single binary digit per pel means fax machines are best suited to bitonal (black-and-white) images
Colour Derivative Principles – additive colour mixing (R + G + B)
• Black is produced when all three primary colours (R,G,B) are zero.
• Useful for producing a colour image on a black surface as is the case in display
applications
Digitised Pictures- Subtractive colour mixing

• White is produced when the three chosen primary colours cyan, magenta and yellow are all zero

• Useful for producing a colour image on a white surface, as is the case in printing applications
Digitized Pictures- Television/computer monitor
principles

• The picture tubes used in most television sets operate using what is known as a
raster-scan; this involves a finely-focussed electron beam being scanned over the
Digitized Pictures- Raster Scan

• Progressive scanning is performed by repeating the scanning operation that starts at the top left corner of the screen and ends at the bottom right corner, followed by the beam being deflected back again to the top left corner
Digitized Pictures- Pixel format on each scan line
• The set of three related colour-sensitive phosphors associated with each pixel is called a phosphor triad; the typical arrangement of the triads on each scan line is shown in the figure
Raster Scan Display Architecture
• The video controller is a hardware subsystem that reads the pixel values stored in the video RAM (VRAM) in time-synchronism with the scanning process and, for each set of pixel values, converts these into the equivalent set of red, green and blue analogue signals for output to the display
Digitized Pictures – Concepts

Frame: Each complete set of horizontal scan lines (either 525 for North & South America and most of Asia, or 625 for Europe and other countries)

Flicker: Caused by the previous image fading from the eye retina before the following image is displayed, which happens when the refresh rate is too low (to avoid this a refresh rate of at least 50 times per second is required)

Pixel depth: Number of bits per pixel, which determines the range of different colours that can be produced

Colour Look-up Table (CLUT): Table that stores a selected subset of colours; each pixel value is used as an address to a location in the table, reducing the amount of memory required to store an image
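The memory saving from a CLUT can be estimated with a short calculation. The Python sketch below assumes a 640 x 480 image, 24-bit colours and an 8-bit index per pixel. These figures and the function names are assumptions chosen for illustration.

```python
def image_bits_direct(width: int, height: int, colour_bits: int) -> int:
    """Memory needed when every pixel stores its full colour value."""
    return width * height * colour_bits


def image_bits_with_clut(width: int, height: int,
                         index_bits: int, colour_bits: int) -> int:
    """Memory needed when every pixel stores an index into a colour table."""
    table_bits = (2 ** index_bits) * colour_bits  # the look-up table itself
    return width * height * index_bits + table_bits


direct = image_bits_direct(640, 480, 24)
indexed = image_bits_with_clut(640, 480, 8, 24)
```

Under these assumptions the indexed image needs about one third of the memory, at the cost of being limited to 256 simultaneous colours.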
Digitized Pictures

Aspect Ratio: This is the ratio of the screen width to the screen height
(television tubes and PC monitors have an aspect ratio of 4/3 and wide screen
television is 16/9)
Digitized Pictures – Screen Resolutions

NTSC = 525 lines per frame (480 visible)

PAL, CCIR, SECAM = 625 lines per frame (576 visible)

Example display resolutions: VGA (640x480x8), XGA (1024x768x8) and SVGA (1024x768x24)
Digitized Pictures – Colour Image Capture: Schematic
• Typical arrangement that is used to capture and store a digital image produced by a scanner or a digital camera (either a still camera or a video camera)
Digitized Pictures – Colour Image Capture: Schematic
• RGB signal generation alternatives
Audio
▪ Typical Audio Types
▪ Speech signal for interpersonal application such as (video) telephony
▪ Music-quality audio such as CD-on-demand & broadcast TV
▪ synthesizer
▪ microphone
▪ loudspeaker
Basics on Audio Signals
1. Human speech: 50 Hz - 10 kHz (4 kHz in a plain-old-telephone system)
- 2 x 10K or 2 x 8K sps → monaural (mono) speech
- (2 x 10K) x 2 or (2 x 8K) x 2 sps → stereophonic speech
- ideally, 12 bits/sample

2. Human audible music: 15 Hz - 20 kHz
- 2 x 20K sps → monaural (mono) music
- (2 x 20K) x 2 sps → stereophonic music
- ideally, 16 bits/sample
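The sampling rates and sample sizes above translate directly into bit rates. The Python sketch below is a minimal illustration; the function name is an assumption, and the sampling rate is taken as exactly twice the maximum frequency.

```python
def pcm_bit_rate(max_freq_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Bit rate in bits per second when sampling at twice the maximum frequency."""
    sampling_rate = 2 * max_freq_hz
    return sampling_rate * bits_per_sample * channels


speech_mono = pcm_bit_rate(10_000, 12)               # 20 ksps x 12 bits
speech_stereo = pcm_bit_rate(10_000, 12, channels=2)
music_mono = pcm_bit_rate(20_000, 16)                # 40 ksps x 16 bits
music_stereo = pcm_bit_rate(20_000, 16, channels=2)
```

This gives 240 kbps for mono speech and 1.28 Mbps for stereophonic music.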
PCM Speech(1)
▪ Human Voice over PSTN
▪ 200 Hz - 3.4 kHz bandlimiting channel, i.e. less than 4 kHz
▪ 8K (2 x 4K) sps, 8 bits/sample: ITU-T G.711 (PCM) recommendation
▪ Companding (compressing/expanding)
▪ 1-bit: polarity, 3-bit: segment code, 4-bit: quantization code

Pure PCM signals → Compander (compressor/expander) → Enhanced PCM signals

▪ Pure PCM: equal (linear) interval quantization and the same level of quantization error. Irrespective of the magnitude of the input signal, the same error level is produced for both low (quiet) signals and high (loud) signals
▪ Enhanced PCM: non-linear (unequal) interval quantization, with narrower intervals for smaller amplitude signals

Why companding?
Because the human ear is more sensitive to noise on quiet signals than it is on loud signals. Hence the effect of quantization noise (error) can be reduced with companding
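The 8-bit codeword layout (1-bit polarity, 3-bit segment code, 4-bit quantization code) can be illustrated by splitting a byte into its three fields. The Python sketch below covers only the field layout. Real G.711 A-law and mu-law encoders also invert certain bits before transmission, which is not modelled here, and the function name is an assumption.

```python
def split_codeword(byte: int) -> tuple[int, int, int]:
    """Split an 8-bit codeword into (polarity, segment code, quantization code)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("codeword must fit in 8 bits")
    polarity = (byte >> 7) & 0x1      # most significant bit
    segment = (byte >> 4) & 0x7       # next 3 bits select the segment
    quantization = byte & 0xF         # last 4 bits select the level in the segment
    return polarity, segment, quantization


fields = split_codeword(0b1_011_0101)
```

The 3-bit segment code selects one of 8 segments per polarity and the 4-bit quantization code one of 16 levels within it.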
PCM Speech(2)
▪ Companding example: 5 bits per sample (1-bit polarity, 2-bit segment code, and 2-bit quantization code)
[Figure: quantization characteristic over the input range -V to +V. Positive signals have polarity bit 1 and negative signals polarity bit 0; on each side the segment codes run from 00 nearest zero to 11 at the largest amplitude, and each segment is subdivided by the 2-bit quantization code. Figure labels: "Linear quantization intervals", "Narrower intervals for smaller amplitude".]
PCM Speech(3)
▪ Companding example: 5 bits per sample (1-bit polarity, 2-bit segment code, and 2-bit quantization code)
[Figure: quantization characteristic over the input range -V to +V, with polarity bit 1 for positive signals and polarity bit 0 for negative signals, 2-bit segment codes on each side and a 2-bit quantization code within each segment. Figure labels: "Linear quantization intervals", "Wider intervals for smaller amplitude".]
CD-Quality Audio

• Human audible bandwidth: 15 Hz - 20 kHz → 40 ksps minimum sampling rate

• In CD-ROMs a higher rate of 44.1 ksps with 16 bits/sample is used

• bit rate per channel = sampling rate x bits per sample
• = 44.1 x 10^3 x 16 = 705.6 kbps
• total rate required for stereophonic music
• = 2 x 705.6 kbps = 1.411 Mbps
• storage capacity for a 1 hour CD-ROM title
• = 1.411 x 10^6 x 60 x 60 / 8 bytes = 634.95 Mbytes
• downloading this over a 10 Mbps network link takes (634.95 x 10^6 x 8)/(10 x 10^6) ≈ 508 s ≈ 8.5 min
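The CD-quality arithmetic can be reproduced step by step in Python. The variable names are assumptions for illustration. The results differ slightly from the slide figures because the slide rounds the stereo rate to 1.411 Mbps before multiplying.

```python
SAMPLING_RATE = 44_100      # samples per second
BITS_PER_SAMPLE = 16
CHANNELS = 2                # stereophonic
LINK_RATE = 10_000_000      # 10 Mbps network link

channel_bit_rate = SAMPLING_RATE * BITS_PER_SAMPLE    # bits per second, one channel
stereo_bit_rate = CHANNELS * channel_bit_rate         # bits per second, both channels
storage_bytes = stereo_bit_rate * 60 * 60 // 8        # one hour of audio, in bytes
download_seconds = storage_bytes * 8 / LINK_RATE      # transfer time over the link
download_minutes = download_seconds / 60
```

Without intermediate rounding, one hour of stereo audio needs 635.04 Mbytes and about 8.5 minutes to download.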
Synthesized Audio
• Digitized audio requires a large amount of memory, whereas synthesized audio
1) requires 2 or 3 orders of magnitude less memory
2) is much easier to edit and allows several passages to be mixed together

• An audio/sound synthesizer: computer + keyboard + a set of sound generators + interfaces for instruments (e.g. electric guitar)

• MIDI (Musical Instrument Digital Interface): standard I/O interface
• Messages (status byte + data bytes)
• Connectors, cables and electrical signals
2.6 Video (Motion): Broadcast TV
Video Applications

▪ Entertainment: Broadcast TV, VCR/DVD Recordings


▪ Interpersonal: Video Telephony & Videoconferencing
▪ Interactive: Video Clips on PC Windows

▪ Scanning Sequences: Interlaced Scanning
▪ To minimize the amount of transmission bandwidth, a frame is divided into two halves called fields
e.g. a 525-line system with a field rate of 60 fields/sec:
- 262.5 odd lines at 60 fields/sec
- 262.5 even lines at 60 fields/sec
→ In effect, 525 lines at a frame refresh rate of 30 frames/sec
Broadcast TV(2)
▪ Color Signals
▪ Three properties of a color
- Brightness, Hue (Tint) & Saturation

▪ Color production: an equation of R, G, and B phosphors


- 0.299 R + 0.587 G + 0.114 B where, 0.299+0.587+0.114=1
▪ Luminance refers to the brightness of a source; the hue and the saturation are called its chrominance characteristics
-say, luminance Ys = 0.299 Rs + 0.587 Gs + 0.114 Bs
Ys: magnitude of luminance signal
Rs, Gs, Bs: magnitudes of three major colors
▪ Two color difference signals: Blue chrominance Cb and Red chrominance Cr
- Cb = Bs-Ys, Cr = Rs -Ys
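The luminance and colour difference relations can be evaluated directly. The Python sketch below assumes R, G and B values normalized to the range 0 to 1; the function names are assumptions for illustration.

```python
def luminance(r: float, g: float, b: float) -> float:
    """Ys = 0.299 Rs + 0.587 Gs + 0.114 Bs."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def colour_difference(r: float, g: float, b: float) -> tuple[float, float]:
    """Return (Cb, Cr) with Cb = Bs - Ys and Cr = Rs - Ys."""
    y = luminance(r, g, b)
    return b - y, r - y


white_y = luminance(1.0, 1.0, 1.0)
white_cb, white_cr = colour_difference(1.0, 1.0, 1.0)
red_cb, red_cr = colour_difference(1.0, 0.0, 0.0)
```

For white (R = G = B) both colour difference signals are zero, which is why a monochrome receiver can use the luminance signal alone.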

Broadcast TV(3)
▪ Chrominance Components
▪ Composite Video Signal for Transmission
- The Ys, Cb and Cr signals are combined together, and the colour difference signals are scaled down before transmission
▪ In PAL (Phase Alternating Line)
- Y = 0.299 R + 0.587 G + 0.114 B
- U(Cb) = 0.493(B-Y) = -0.147R - 0.289G + 0.437B
- V(Cr) = 0.877(R-Y) = 0.615R - 0.515G - 0.100B
▪ In NTSC (National Television System Committee)
- Y = 0.299 R + 0.587 G + 0.114 B
- I(Cb) = 0.74(R-Y) - 0.27(B-Y) = 0.599R - 0.276G - 0.324B
- Q(Cr) = 0.48(R-Y) + 0.41(B-Y) = 0.212R - 0.522G + 0.311B
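The expanded PAL coefficients follow from substituting Y into the scaled colour difference signals. The Python sketch below checks this by evaluating U and V for each unit primary; the function name is an assumption, and R, G, B are taken as normalized values.

```python
def pal_yuv(r: float, g: float, b: float) -> tuple[float, float, float]:
    """Return (Y, U, V) using U = 0.493 (B - Y) and V = 0.877 (R - Y)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.493 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


# Feeding in one unit primary at a time yields that primary's coefficients.
_, u_r, v_r = pal_yuv(1.0, 0.0, 0.0)
_, u_g, v_g = pal_yuv(0.0, 1.0, 0.0)
_, u_b, v_b = pal_yuv(0.0, 0.0, 1.0)
```

Rounded to three decimal places these give U = -0.147R - 0.289G + 0.437B and V = 0.615R - 0.515G - 0.100B.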
Digital Video
➢ Advantages of DV
▪ Easy to store in computer
▪ Easy to edit and integrate with other types
▪ Easy to digitize three RGB component signals
▪ The eye is less sensitive to colour resolution than it is to luminance resolution. Hence, the two chrominance signals can tolerate a reduced resolution
▪ Transmission bandwidth is reduced by using the luminance and two colour difference signals, instead of the RGB signals directly
▪ CCIR-601 Recommendations: standard for the digitization of video pictures
Digital Video
▪ 4:2:2 format (CCIR-601)
▪ Recommendation for use in TV studios
▪ The three component (analogue) video signals may have bandwidths of
▪ up to 6 MHz for the luminance ⇒ 12 Msps
▪ less than 3 MHz for each of the two chrominance signals ⇒ 6 Msps
▪ In reality, 13.5 Msps is used for luminance and 6.75 Msps for each of the two chrominance signals
▪ In the NTSC (525-line) system, total line sweep time 63.56 μsec = retrace time 11.56 μsec + active line sweep time 52 μsec
▪ In the PAL (625-line) system, total line sweep time 64 μsec = retrace time 12 μsec + active line sweep time 52 μsec
▪ Orthogonal sampling: samples per active line
- Luminance: 52 x 10^-6 x 13.5 x 10^6 = 702 samples/line (in reality, 720 samples/line)
- Chrominance: 52 x 10^-6 x 6.75 x 10^6 = 351 samples/line (in reality, 360 samples/line)
▪ 4 Y samples for every 2 Cb and 2 Cr samples (4:2:2)

Digital Video
▪ 4:2:2 Format Bit Rate & Storage (NTSC 525-line)
▪ The number of active (visible) lines: 480
▪ The number of samples per line: 720
→ Resolution of luminance Y = 720 x 480
Two chrominance signals Cb = Cr = 360 x 480
▪ Line sampling rate: 13.5 Msps for Y and 6.75 Msps for both Cb and Cr
▪ Bits per sample: 8 bits
→ Bit rate = 13.5 x 10^6 x 8 + 2(6.75 x 10^6 x 8) = 216 Mbps
→ Bits per line = 720 x 8 + 2(360 x 8) = 11.52 Kbits
→ Bits per frame = 480 x 11.52 Kbits = 5.5296 Mbits
→ Storage for 1.5 hrs of video assuming a 60 Hz refresh rate = 5.5296 x 10^6 x 60 x 1.5 x 3600 / 8 bytes = 223.9488 GBytes
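The 4:2:2 bit rate and storage figures can be reproduced with integer arithmetic. In the Python sketch below the variable names are assumptions for illustration, and the storage figure counts only the active lines and samples, as in the slide.

```python
LUMA_RATE = 13_500_000      # Y samples per second
CHROMA_RATE = 6_750_000     # samples per second for each of Cb and Cr
BITS_PER_SAMPLE = 8

# Bit rate while sampling: one luminance and two chrominance streams.
bit_rate = LUMA_RATE * BITS_PER_SAMPLE + 2 * CHROMA_RATE * BITS_PER_SAMPLE

# Stored picture: 720 Y and 2 x 360 chrominance samples on each of 480 lines.
bits_per_line = 720 * BITS_PER_SAMPLE + 2 * 360 * BITS_PER_SAMPLE
bits_per_frame = 480 * bits_per_line

# 1.5 hours of video at 60 frames per second, converted from bits to bytes.
seconds = int(1.5 * 3600)
storage_bytes = bits_per_frame * 60 * seconds // 8
```

The result is 216 Mbps while sampling and roughly 224 GBytes for 1.5 hours, which is why compression is needed in practice.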
Digital Video
▪ 4:2:0 Format
▪ used in digital broadcast applications
▪ interlaced scanning with the absence of chrominance samples in alternate lines
▪ 525-line system
▪ Y = 720 x 480 (the same as 4:2:2 format), Cb = Cr = 360 x 240
▪ 625-line system
▪ Y = 720 x 576, Cb = Cr = 360 x 288
▪ bit rate: 13.5 x 10^6 x 8 + 2(3.375 x 10^6 x 8) = 162 Mbps
▪ HDTV Format
▪ used in High-Definition Television (four times the bit rate)
▪ 4/3: 1440 x 1152 pixels (50/60 Hz refresh rate); 16/9 wide-screen: 1920 x 1152 pixels (25/30 Hz), with 1080 visible lines per frame
Digital Video
▪ SIF (Source Intermediate Format), 4:1:1 Format
▪ used in Video Cassette Recorders (VCRs)
▪ progressive (non-interlaced) scanning since it is intended for storage applications
▪ Half of 4:2:0 format: “Subsampling & Temporal Resolution”
▪ 525-line system
▪ Y = 360 x 240, Cb = Cr = 180 x 120
▪ 625-line system
▪ Y = 360 x 288, Cb = Cr = 180 x 144
▪ bit rate
▪ 6.75 x 10^6 x 8 + 2(1.6875 x 10^6 x 8) = 81 Mbps
Digital Video
▪ CIF (Common Intermediate Format), 4:1:1 format
▪ used in videoconferencing applications
▪ spatial resolution of the SIF 625-line system plus temporal resolution of the SIF 525-line system
▪ Y = 360 x 288, Cb = Cr = 180 x 144
▪ refresh rate: 30 Hz
▪ bit rate: 6.75 x 10^6 x 8 + 2(1.6875 x 10^6 x 8) = 81 Mbps
▪ many variants for videoconferencing using desktop PCs or ISDN/PSTN
▪ typically 4 or 16 x 64 kbps channels are used
▪ 4CIF: Y = 720 x 576, Cb = Cr = 360 x 288
▪ 16CIF: Y = 1440 x 1152, Cb = Cr = 720 x 576
Digital Video
▪ QCIF (Quarter CIF), 4:1:1 Format
▪ used in video telephony applications
▪ half the spatial resolution of CIF and either half or quarter the temporal resolution of CIF
▪ Y = 180 x 144, Cb = Cr = 90 x 72
▪ refresh rate: 15 or 7.5 Hz
▪ bit rate:
3.375 x 10^6 x 8 + 2(0.84375 x 10^6 x 8) = 40.5 Mbps
▪ a lower-resolution version, sub-QCIF (SQCIF), is typically used for a single 64 kbps ISDN channel or the PSTN with modems
▪ Y = 128 x 96, Cb = Cr = 64 x 48
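The bit rates quoted for the different digitization formats all follow the same pattern: one luminance stream plus two chrominance streams at 8 bits per sample. The Python sketch below applies that pattern to the sampling rates given in the slides; the function name is an assumption for illustration.

```python
def component_bit_rate_mbps(luma_msps: float, chroma_msps: float,
                            bits_per_sample: int = 8) -> float:
    """Bit rate in Mbps for one luminance and two chrominance sample streams."""
    return (luma_msps + 2 * chroma_msps) * bits_per_sample


rate_422 = component_bit_rate_mbps(13.5, 6.75)      # 4:2:2
rate_420 = component_bit_rate_mbps(13.5, 3.375)     # 4:2:0
rate_sif = component_bit_rate_mbps(6.75, 1.6875)    # SIF and CIF
rate_qcif = component_bit_rate_mbps(3.375, 0.84375) # QCIF
```

Each step down the list halves the sampling rates of the previous format, so the QCIF rate works out to half the SIF/CIF rate.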
Digital Video
▪ PC Video Digitization

Digitization Format   System     Spatial Resolution                       Temporal Resolution
4:2:2                 525-line   Y = 640 x 480, Cb = Cr = 320 x 240       60 Hz
4:2:2                 625-line   Y = 768 x 576, Cb = Cr = 384 x 288       50 Hz
SIF                   525-line   Y = 320 x 240, Cb = Cr = 160 x 240       30 Hz
SIF                   625-line   Y = 384 x 288, Cb = Cr = 192 x 144       25 Hz
CIF                   -          Y = 384 x 288, Cb = Cr = 192 x 144       30 Hz
QCIF                  -          Y = 192 x 144, Cb = Cr = 96 x 72         15/7.5 Hz

- Video capture board or software required
- All PC monitors use “progressive (non-interlaced) scanning”
