
Because learning changes everything.


Chapter 11

Multimedia
Data Communications and
Networking, With TCP/IP
protocol suite
Sixth Edition
Behrouz A. Forouzan

© 2022 McGraw Hill, LLC. All rights reserved. Authorized only for instructor use in the classroom.
No reproduction or further distribution permitted without the prior written consent of McGraw Hill, LLC.
Chapter 11: Outline

11.1 Compression

11.2 Multimedia Data

11.3 Multimedia in the Internet

11.4 Real-Time Interactive Protocols



11-1 COMPRESSION

In this section, we discuss compression, which plays
a crucial role in multimedia communication due to
the large volume of data exchanged. In compression,
we reduce the volume of data to be exchanged. We
can divide compression into two broad categories:
lossless and lossy compression.



11.1.1 Lossless Compression

In lossless compression, the integrity of the data is preserved
because the compression and decompression algorithms are exact
inverses of each other: no part of the data is lost in the process.
Lossless compression methods are normally used when we cannot
afford to lose any data. For example, we must not lose data when
we compress a text file or an application program. Lossless
compression is also applied as the last step in some lossy
compression procedures to further reduce the size of the data.



Run-Length Coding

Run-length coding, sometimes referred to as run-length encoding
(RLE), is the simplest method of removing redundancy. It can be
used to compress data made of any combination of symbols. The
method replaces a repeated sequence, run, of the same symbol with
two entities: a count and the symbol itself.
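As an illustrative sketch (the pair-based output format is our own convention, not part of the method's definition), run-length coding and its exact inverse can be written in a few lines of Python:

```python
def rle_encode(data):
    """Replace each run of a repeated symbol with a (count, symbol) pair."""
    runs = []
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        runs.append((j - i, data[i]))  # the count and the symbol itself
        i = j
    return runs

def rle_decode(runs):
    """Invert the encoding exactly: no part of the data is lost."""
    return "".join(symbol * count for count, symbol in runs)

print(rle_encode("AAAABBBCA"))  # [(4, 'A'), (3, 'B'), (1, 'C'), (1, 'A')]
```

Because decoding reverses encoding exactly, the method is lossless.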



Figure 11.1 A version of run-length coding to compress binary
patterns



Dictionary Coding

There is a group of compression methods based on creation of a
dictionary (array) of strings in the text. The idea is to encode
common sequences of characters instead of encoding each
character separately. The dictionary is created as the message is
scanned, and if a sequence of characters that is an entry in the
dictionary is found in the message, the code (index) of that entry is
sent instead of the sequence. The one we discuss here was invented
by Lempel and Ziv and refined by Welch. It is referred to as
Lempel-Ziv-Welch (LZW).



Table 11.1 LZW encoding
LZWEncoding (message)
{
    Initialize (Dictionary);
    char = Input (first character);
    S = char;                            // S is the encodable sequence
    while (more characters in message)
    {
        char = Input (next character);
        if ((S + char) is in Dictionary)     // S is not the encodable sequence
        {
            S = S + char;
        }
        else                                 // S is the encodable sequence
        {
            addToDictionary (S + char);
            Output (index of S in Dictionary);
            S = char;
        }
    }
    Output (index of S in Dictionary);
}
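The pseudocode in Table 11.1 translates almost line for line into Python. In this sketch we initialize the dictionary from the message's own alphabet, which is one possible convention (a real implementation would agree on the initial dictionary in advance):

```python
def lzw_encode(message):
    # Initialize the dictionary with the single characters of the alphabet.
    dictionary = {ch: i for i, ch in enumerate(sorted(set(message)))}
    s = message[0]                      # S is the encodable sequence
    code = []
    for char in message[1:]:
        if s + char in dictionary:      # S is not yet the encodable sequence
            s += char
        else:                           # S is the encodable sequence
            dictionary[s + char] = len(dictionary)
            code.append(dictionary[s])
            s = char
    code.append(dictionary[s])
    return code

print(lzw_encode("ABABBA"))  # [0, 1, 2, 3]
```

With alphabet {A: 0, B: 1}, the entries AB, BA, and ABB are added as the message is scanned, so later repetitions are sent as single indexes.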



Example 11.1

Let us show how a short message can be encoded with the LZW
algorithm of Table 11.1 (Figure 11.2). The dictionary is initialized
with the single characters of the alphabet. Each time the longest
encodable sequence S is found, the index of S in the dictionary is
output and the sequence S + char is added to the dictionary as a
new entry.



Figure 11.2 Example 11.1



Example 11.2

Let us show how the code in Example 11.1 can be decoded and the
original message recovered (Figure 11.3). The box called PreC
holds the codeword from the previous iteration, which is not
needed in the pseudocode, but needed here to better show the
process. Note that in this example there is only the special case in
which the codeword is not in the dictionary. The new entry for the
dictionary needs to be made from the string and the first character
in the string. The output is also the same as the new entry.



Figure 11.3 Example 11.2



Table 11.2 LZW decoding
LZWDecoding (code)
{
Initialize (Dictionary);
C = Input (first codeword);
Output (Dictionary [C]);
while (more codewords in code)
{
S = Dictionary[C];
C = Input (next codeword);
if (C is in Dictionary) // Normal case
{
addToDictionary (S + firstSymbolOf (Dictionary[C]));
Output (Dictionary [C]);
}
else // Special case
{
addToDictionary (S + firstSymbolOf (S));
Output (S + firstSymbolOf (S));
}
}
}
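A matching Python sketch of Table 11.2 shows how the decoder rebuilds the same dictionary one codeword behind the encoder, including the special case in which a codeword is not yet in the dictionary (the `alphabet` parameter mirrors the encoder's initialization convention):

```python
def lzw_decode(code, alphabet):
    # Rebuild the dictionary exactly as the encoder built it.
    dictionary = {i: ch for i, ch in enumerate(sorted(alphabet))}
    c = code[0]
    out = [dictionary[c]]
    for nxt in code[1:]:
        s = dictionary[c]
        if nxt in dictionary:                        # normal case
            dictionary[len(dictionary)] = s + dictionary[nxt][0]
            out.append(dictionary[nxt])
        else:                                        # special case
            dictionary[len(dictionary)] = s + s[0]
            out.append(s + s[0])
        c = nxt
    return "".join(out)

print(lzw_decode([0, 1, 2, 3], "AB"))  # "ABABBA"
```

The special branch handles a codeword the encoder created in the very iteration it used it; the new entry is the string plus its own first character, as Example 11.2 describes.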



Huffman Coding

When we encode data as binary patterns, we normally use a fixed
number of bits for each symbol. To compress data, we can consider
the frequency of symbols and the probability of their occurrence in
the message. Huffman coding assigns shorter codes to symbols that
occur more frequently and longer codes to those that occur less
frequently.



Figure 11.4 Huffman tree



Table 11.3 Coding table

Symbol   Code        Symbol   Code        Symbol   Code

A        00          C        011         E        11
B        010         D        10
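Because no code in Table 11.3 is a prefix of another, a decoder can scan the bit stream left to right and emit a symbol as soon as the buffered bits match a code. An illustrative sketch using the table above:

```python
codes = {"A": "00", "B": "010", "C": "011", "D": "10", "E": "11"}  # Table 11.3

def huffman_encode(message):
    return "".join(codes[s] for s in message)

def huffman_decode(bits):
    # Prefix-free codes make left-to-right scanning unambiguous.
    inverse = {c: s for s, c in codes.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

encoded = huffman_encode("BADE")
print(encoded)                  # "010001011" (010 + 00 + 10 + 11)
print(huffman_decode(encoded))  # "BADE"
```

Note that the frequent symbols (A, D, E) use 2 bits while the rare ones (B, C) use 3, which is exactly how Huffman coding saves space.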



Figure 11.5 Encoding and decoding in Huffman coding



Arithmetic Coding

In the previous compression methods, each symbol or sequence of
symbols is encoded separately. In arithmetic coding, introduced by
Rissanen and Langdon in 1981, the entire message is mapped to a
small interval inside [0,1). The small interval is then encoded as a
binary pattern. Arithmetic coding is based on the fact that we can
have an infinite number of small intervals inside the half-open
interval [0,1). Each of these small intervals can represent one of
the possible messages we can make using a finite set of symbols.



Figure 11.6 Arithmetic coding



Table 11.4 Arithmetic encoding

ArithmeticEncoding (message)
{
currentInterval = [0,1);
while (more symbols in the message)
{
s = Input (next symbol);
divide currentInterval into subintervals
subInt = subinterval related to s
currentInterval = subInt
}
Output (bits related to the currentInterval)
}



Example 11.3

For the sake of simplicity, let us assume that our set of symbols is S
= {A, B, ∗}, in which the asterisk is the terminating symbol. We
assign probability of occurrence for each symbol as

PA = 0.4        PB = 0.5        P* = 0.1

Figure 11.7 shows how we find the interval and the code related to
the short message "BBAB*".
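The interval narrowing of Table 11.4 can be traced numerically for this example. In the sketch below, the subinterval boundaries for A, B, and * follow from the probabilities above (the cumulative order A, B, * is our assumption):

```python
# Cumulative probability ranges inside [0, 1) for each symbol (Example 11.3).
ranges = {"A": (0.0, 0.4), "B": (0.4, 0.9), "*": (0.9, 1.0)}

def arithmetic_interval(message):
    low, high = 0.0, 1.0                 # currentInterval = [0, 1)
    for s in message:
        width = high - low
        lo_s, hi_s = ranges[s]
        # Shrink currentInterval to the subinterval related to s.
        low, high = low + width * lo_s, low + width * hi_s
    return low, high

low, high = arithmetic_interval("BBAB*")
print(low, high)  # approximately 0.685 and 0.69
```

Any real number inside the final interval, such as 0.687, identifies the whole message.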



Figure 11.7 Example 11.3



Table 11.5 Arithmetic decoding

ArithmeticDecoding (code)
{
c = Input (code)
num = find real number related to code
currentInterval = [0,1);
while (true)
{
divide the currentInterval into subintervals;
subInt = subinterval related to num;
Output (symbol related to subInt);
if (symbol is the terminating symbol) return;
currentInterval = subInt;
}
}
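The decoding loop of Table 11.5 can be sketched the same way, repeatedly finding the subinterval that contains the received number until the terminating symbol appears (same illustrative probabilities as Example 11.3):

```python
ranges = {"A": (0.0, 0.4), "B": (0.4, 0.9), "*": (0.9, 1.0)}

def arithmetic_decode(num):
    low, high = 0.0, 1.0
    out = []
    while True:
        width = high - low
        for symbol, (lo_s, hi_s) in ranges.items():
            # Find the subinterval of currentInterval that contains num.
            if low + width * lo_s <= num < low + width * hi_s:
                out.append(symbol)
                low, high = low + width * lo_s, low + width * hi_s
                break
        if out[-1] == "*":       # terminating symbol
            return "".join(out)

print(arithmetic_decode(0.687))  # "BBAB*"
```

The sketch assumes `num` lies inside [0, 1) and that the message ends with the terminating symbol, as in Example 11.4.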



Example 11.4

Figure 11.8 shows how we use the decoding process to decode the
message in Example 11.3. Note that the hand shows the position of
the number in the corresponding interval.



Figure 11.8 Example 11.4



11.1.2 Lossy Compression

Lossless compression has limits on the amount of compression.
However, in some situations, we can sacrifice some accuracy to
increase the compression rate. Although we cannot afford to lose
information in text compression, we can afford it when we are
compressing images, video, and audio. For example, human vision
cannot detect some small distortions that can result from lossy
compression of an image. In this section, we discuss a few ideas
behind lossy compression.



Predictive Coding

Predictive coding is used when we digitize an analog signal. We
discussed pulse code modulation (PCM) as a technique that
converts an analog signal to a digital signal, using sampling. After
sampling, each sample needs to be quantized to create binary
values. Compression can be achieved in the quantization step by
using predictive coding.
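Delta modulation, the simplest form of predictive coding (Figure 11.9), can be sketched as follows. The encoder transmits a single bit per sample: 1 if the sample lies above the running staircase approximation, 0 otherwise. The step size `delta` and the sample values are illustrative choices:

```python
def dm_encode(samples, delta=1.0):
    """Delta modulation: one bit per sample against a staircase approximation y."""
    y = 0.0
    bits = []
    for x in samples:
        bit = 1 if x > y else 0          # go up or down by one fixed step
        y += delta if bit else -delta
        bits.append(bit)
    return bits

def dm_decode(bits, delta=1.0):
    """Rebuild the same staircase from the received bits."""
    y, out = 0.0, []
    for bit in bits:
        y += delta if bit else -delta
        out.append(y)
    return out

bits = dm_encode([0.5, 1.2, 2.0, 1.5, 0.8])
print(bits)             # [1, 1, 0, 1, 0]
print(dm_decode(bits))  # [1.0, 2.0, 1.0, 2.0, 1.0]
```

A step that is too small causes slope overload; one that is too large causes granular noise (Figure 11.11).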



Figure 11.9 Encoding and decoding in delta modulation



Figure 11.10 Reconstruction of quantization of xn − xn−1 versus xn − yn−1



Figure 11.11 Slope overload and granular noise



Transform Coding

In transform coding, a mathematical transformation is applied to
the input signal to produce the output signal. The transformation
needs to be invertible, to allow the original signal to be recovered.
The transformation changes the signal representation from one
domain to another (time domain to frequency domain, for
example), which results in reducing the number of bits in encoding.



Figure 11.12 One-dimensional DCT



Figure 11.13 Formulas for one-dimensional forward and inverse
transformation



Example 11.5

Figure 11.14 shows the transformation matrix for N = 4. As the
figure shows, the first row has four equal values, but the other rows
have alternate positive and negative values. When each row is
multiplied by the source data matrix, we expect that the positive
and negative values result in values close to zero if the source data
items are close to each other. This is what we expect from the
transformation: to show that only some values in the source data
are important and most values are redundant.
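A sketch of the one-dimensional DCT makes this concrete. The transformation matrix below is a common formulation (the exact constants should be checked against Figure 11.13): a source block whose values are all equal produces a single nonzero coefficient in the first position and near-zero values everywhere else.

```python
import math

def dct_matrix(N):
    """Transformation matrix T for the one-dimensional DCT (common formulation)."""
    T = []
    for i in range(N):
        c = math.sqrt(1.0 / N) if i == 0 else math.sqrt(2.0 / N)
        T.append([c * math.cos((2 * j + 1) * i * math.pi / (2 * N))
                  for j in range(N)])
    return T

def dct(block):
    """Multiply each row of T by the source data, as in Example 11.5."""
    N = len(block)
    T = dct_matrix(N)
    return [sum(T[i][j] * block[j] for j in range(N)) for i in range(N)]

out = dct([20, 20, 20, 20])
print(out[0], all(abs(v) < 1e-9 for v in out[1:]))  # 40.0 True
```

The alternating positive and negative values in rows 2 through 4 cancel when the source values are close to each other, which is exactly the redundancy the transform exposes.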



Figure 11.14 Example 11.5



Figure 11.15 Two-dimensional DCT



Figure 11.16 Formulas for forward and inverse two-dimensional
DCT



11-2 MULTIMEDIA DATA

Today, multimedia data consists of text, images,
video, and audio, although the definition is changing
to include futuristic media types.



11.2.1 Text

The Internet stores a large amount of text that can be downloaded
and used. One often refers to plaintext (a linear form) and
hypertext (a nonlinear form) of textual data. Text stored in the
Internet uses a character set, such as Unicode, to represent the
symbols of the underlying language. To store a large amount of
textual data, the text can be compressed using one of the lossless
compression methods we discussed earlier.



11.2.2 Image

In multimedia parlance, an image (or a still image as it is often
called) is the representation of a photograph, a fax page, or a
frame in a moving picture.



Digital Image

To use an image, it first must be digitized. Digitization in this case
means to represent an image as a two-dimensional array of dots,
called pixels. Each pixel then can be represented as a number of
bits, referred to as the bit depth. In a black-and-white image, the
bit depth = 1. In a gray picture, one normally uses a bit depth of 8
with 256 levels. In a color image, the image is normally divided
into three channels, with each channel representing one of the three
primary colors of red, green, or blue (RGB).



Example 11.6

The following shows the time required to transmit an image of
1280 × 720 pixels using a transmission rate of 100 kbps.

a. Using a black-and-white image with a bit depth of 1,

Transmission time = (1280 × 720 × 1) / 100,000 ≈ 9 seconds

b. Using a gray image with a bit depth of 8,

Transmission time = (1280 × 720 × 8) / 100,000 ≈ 74 seconds

c. Using a color image with a bit depth of 24,

Transmission time = (1280 × 720 × 24) / 100,000 ≈ 221 seconds



Image Compression: JPEG

Although there are both lossless and lossy compression algorithms
for images, in this section we discuss the lossy compression method
called JPEG. The Joint Photographic Experts Group (JPEG)
standard provides lossy compression that is used in most
implementations. The JPEG standard can be used for both color
and gray images. However, for simplicity, we discuss only the
grayscale pictures; the method can be applied to each of the three
channels in a color image.



Figure 11.17 Compression in each channel of JPEG



Figure 11.18 Three different quantization matrices



Figure 11.19 Reading the table



Example 11.7

To show the idea of JPEG compression, we use a block of a gray
image in which the value of each pixel is 20. We have used a
Java program to transform, quantize, and reorder the values in
zigzag sequence; we have shown the encoding (Figure 11.20).
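A common way to generate the zigzag scan order mentioned above is to walk the diagonals of constant i + j, alternating direction. The Python sketch below uses a small 3 × 3 block for illustration (JPEG itself works on 8 × 8 blocks):

```python
def zigzag_order(n):
    """Coordinates of an n×n block in zigzag order: diagonals of constant
    i + j, alternating direction on each diagonal."""
    return sorted(
        ((i, j) for i in range(n) for j in range(n)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )

# After quantization the nonzero values cluster at the top-left, so the
# zigzag scan ends in one long run of zeros (easy to run-length encode).
block = [[5, 3, 0], [2, 0, 0], [0, 0, 0]]
print([block[i][j] for i, j in zigzag_order(3)])  # [5, 3, 2, 0, 0, 0, 0, 0, 0]
```

Gathering the zeros into a single trailing run is what makes the final lossless step (run-length coding) so effective.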



Figure 11.20 Example 11.7: uniform gray scale



Example 11.8

As the second example, we have a block that changes gradually;
there is no sharp change between the values of neighboring pixels.
We still get a lot of zero values, as shown in Figure 11.21.



Figure 11.21 Example 11.8: Gradient gray scale



Image Compression: GIF

The JPEG standard uses images in which each pixel is represented
as 24 bits (8 bits for each primary color). This means that each
pixel can be one of 2^24 (16,777,216) colors. For example, a
magenta pixel, which is made of red and blue components (but
contains no green component), is represented as the integer
(FF00FF)₁₆.



11.2.3 Video

Video is composed of multiple frames; each frame is one image.
This means that a video file requires a high transmission rate.



Digitizing Video

A video consists of a sequence of frames. If the frames are
displayed on the screen fast enough, we get an impression of
motion. The reason is that our eyes cannot distinguish the rapidly
flashing frames as individual ones. There is no standard number of
frames per second; in North America 30 frames per second is
common (Europe uses 25). However, to avoid a condition known as
flickering, a frame needs to be refreshed. The TV industry repaints
each frame twice. This means 60 frames need to be sent, or, if there
is memory at the receiving site, 30 frames with each frame
repainted from memory.



Example 11.9

Let us show the transmission rate for some video standards:

a. Color broadcast television takes 720 × 480 pixels per frame,
30 frames per second, and 24 bits per pixel. The transmission
rate without compression is as shown below.

720 × 480 × 30 × 24 = 248,832,000 bps ≈ 249 Mbps

b. High-definition color broadcast television takes 1920 × 1080
pixels per frame, 30 frames per second, and 24 bits per pixel.
The transmission rate without compression is as shown below.

1920 × 1080 × 30 × 24 = 1,492,992,000 bps ≈ 1.5 Gbps
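Both calculations use the same formula, sketched here as an illustrative helper:

```python
def video_rate_bps(width, height, fps=30, bits_per_pixel=24):
    """Uncompressed video rate: pixels per frame × frames/s × bits per pixel."""
    return width * height * fps * bits_per_pixel

sd = video_rate_bps(720, 480)      # standard broadcast television
hd = video_rate_bps(1920, 1080)    # high-definition television
print(sd, hd)  # 248832000 1492992000
```

The sixfold jump from SD to HD explains why compression such as MPEG is indispensable for video transmission.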



Video Compression: MPEG

Motion Picture Experts Group (MPEG) is a method to compress
video. In principle, a motion picture is a rapid flow of a set of
frames, where each frame is an image. In other words, a frame is a
spatial combination of pixels, and a video is a temporal
combination of frames that are sent one after another. Compressing
video, then, means spatially compressing each frame and
temporally compressing a set of frames.



Figure 11.22 MPEG frames



11.2.4 Audio

Audio (sound) signals are analog signals that need a medium to
travel; they cannot travel through a vacuum. The speed of sound
in air is about 330 m/s (740 mph). The audible frequency range
for normal human hearing is from about 20 Hz to 20 kHz, with
maximum audibility around 3300 Hz.



Digitizing Audio

To be able to provide compression, audio analog signals are
digitized using an analog-to-digital converter. The analog-to-
digital conversion consists of two processes: sampling and
quantizing. A digitizing process known as pulse code modulation
(PCM) was discussed before. This process involved sampling an
analog signal, quantizing the sample, and coding the quantized
values as streams of bits. Voice signal is sampled at the rate of
8,000 samples per second with 8 bits per sample; the result is a
digital signal of 8,000 * 8 = 64 kbps. Music is sampled at 44,100
samples per second with 16 bits per sample; the result is a digital
signal of 44,100 * 16 = 705.6 kbps for monaural and 1.411 Mbps
for stereo.
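These rates all follow from samples per second × bits per sample × channels; a one-line sketch:

```python
def audio_rate_bps(samples_per_sec, bits_per_sample, channels=1):
    """Digital audio rate produced by PCM sampling and quantization."""
    return samples_per_sec * bits_per_sample * channels

print(audio_rate_bps(8_000, 8))        # 64000 bps: telephone voice
print(audio_rate_bps(44_100, 16))      # 705600 bps: monaural music
print(audio_rate_bps(44_100, 16, 2))   # 1411200 bps: stereo music
```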



Audio Compression

Both lossy and lossless compression algorithms are used in audio
compression. Lossless audio compression allows one to preserve
an exact copy of the audio files; it has a small compression ratio of
about 2 and is mostly used for archival and editing purposes. Lossy
algorithms provide far greater compression ratios (5 to 20) and are
used in mainstream consumer audio devices. Lossy algorithms
sacrifice a little bit of quality, but substantially reduce space and
bandwidth requirements. For example, on a CD, one can fit one
hour of high fidelity music, 2 hours of music using lossless
compression, or 8 hours of music compressed with a lossy
technique.



Figure 11.23 Threshold of audibility



11-3 MULTIMEDIA IN THE INTERNET

We can divide audio and video services into three
broad categories: streaming stored audio/video,
streaming live audio/video, and interactive
audio/video. Streaming means a user can listen (or
watch) the file after the downloading has started.



11.3.1 Streaming Stored Audio/Video

In the first category, streaming stored audio/video, the files are
compressed and stored on a server. A client downloads the files
through the Internet. This is sometimes referred to as on-demand
audio/video. Examples of stored audio files are songs, symphonies,
books on tape, and famous lectures. Examples of stored video files
are movies, TV shows, and music video clips. We can say that
streaming stored audio/video refers to on-demand requests for
compressed audio/video files.



First Approach: Using a Web Server

A compressed audio/video file can be downloaded as a text file.
The client (browser) can use the services of HTTP and send a GET
message to download the file. The Web server can send the
compressed file to the browser. The browser can then use a help
application, normally called a media player, to play the file. Figure
11.24 shows this approach.



Figure 11.24 Using a Web server



Second Approach: Using a Web Server with Meta File

In another approach, the media player is directly connected to the
Web server for downloading the audio/video file. The Web server
stores two files: the actual audio/video file and a metafile that
holds information about the audio/video file. Figure 11.25 shows
the steps in this approach.



Figure 11.25 Using a Web server with a metafile



Third Approach: Using a Media Server

The problem with the second approach is that the browser and the
media player both use the services of HTTP. HTTP is designed to
run over TCP. This is appropriate for retrieving the metafile, but
not for retrieving the audio/video file. The reason is that TCP
retransmits a lost or damaged segment, which is counter to the
philosophy of streaming. We need to dismiss TCP and its error
control; we need to use UDP. However, HTTP, which accesses the
Web server, and the Web server itself are designed for TCP; we
need another server, a media server. Figure 11.26 shows the concept.



Figure 11.26 Using a media server



Fourth Approach: Using a Web Server with RTSP

The Real-Time Streaming Protocol (RTSP) is a control protocol
designed to add more functionalities to the streaming process.
Using RTSP, we can control the playing of audio/video. RTSP is an
out-of-band control protocol that is similar to the second
connection in FTP. Figure 11.27 shows a media server and RTSP.



Figure 11.27 Using a media server and RTSP



Example: Video on Demand

Video On Demand (VOD) allows viewers to select a video from a
large number of available videos and watch it interactively: pause,
rewind, fast forward, etc. A viewer may watch the video in real time
or she may download the video into her computer, portable media
player, or to a device such as a digital video recorder (DVR) and
watch it later. Cable TV, satellite TV, and IPTV providers offer both
pay-per-view and free content VOD streaming. Many other
companies, such as Amazon video and video rental companies such
as Blockbuster video, also provide VOD. Internet television is an
increasingly popular form of video on demand.



11.3.2 Streaming Live Audio/Video

In the second category, streaming live audio/video, a user listens to
broadcast audio and video through the Internet. Good examples of
this type of application are Internet radio and Internet TV.



Example: Internet Radio

Internet radio or web radio is a webcast of audio broadcasting
service that offers news, sports, talk, and music via the Internet. It
involves a streaming medium that is accessible from anywhere in
the world. Web radio is offered via the Internet but is similar to
traditional broadcast media: it is noninteractive and cannot be
paused or replayed like on-demand services.



Example: Internet Television

Internet television or ITV allows viewers to choose the show they
want to watch from a library of shows. The primary models for
Internet television are streaming Internet TV or selectable video on
an Internet location.



Example: IPTV

Internet protocol television (IPTV) is the next-generation
technology for delivering real time and interactive television.
Instead of the TV signal being transmitted via satellite, cable, or
terrestrial routes, the IPTV signal is transmitted over the Internet.
Note that IPTV differs from ITV. Internet TV is created and
managed by service providers that cannot control the final
delivery; it is distributed via the existing infrastructure of the open
Internet. IPTV, on the other hand, is highly managed to provide
guaranteed quality of service over a complex and expensive
network. The network for IPTV is engineered to ensure efficient
delivery of large amounts of multicast video traffic and HDTV
content to subscribers.



11.3.3 Real-Time Audio/Video

In the third category, interactive audio/video, people use the
Internet to interactively communicate with one another. The
Internet phone or voice over IP is an example of this type of
application. Video conferencing is another example that allows
people to communicate visually and orally.



Characteristics

Before discussing the protocols used in this class of applications,
we discuss some characteristics of real-time audio/video
communication.



Figure 11.28 Time relationship



Figure 11.29 Jitter



Figure 11.30 Timestamp



Figure 11.31 Playback buffer



Figure 11.32 The time line of packets



Example of a Real Time Application: Skype

Skype (abbreviation of the original project Sky peer-to-peer) is a
peer-to-peer VoIP application software that was originally
developed by Ahti Heinla, Priit Kasesalu, and Jaan Tallinn, who
had also originally developed Kazaa (a P2P file-sharing
application software). The application allows registered users who
have audio input and output devices on their PCs to make free PC-
to-PC voice calls to other registered users over the Internet.



11-4 REAL-TIME INTERACTIVE PROTOCOLS

After discussing the three approaches to using
multimedia through the Internet, we now concentrate
on the last one, which is the most interesting: real-
time interactive multimedia. This application has
evoked a lot of attention in the Internet society and
several application-layer protocols have been
designed to handle it.



Figure 11.33 Schematic diagram of a real-time multimedia
system



11.4.1 Rationale for New Protocols

We discussed the protocol stack for general Internet applications in
Chapter 2. In this section, we want to show why we need some new
protocols to handle interactive real-time multimedia applications
such as audio and video conferencing.



Application Layer

It is clear that we need to develop some application-layer protocols
for interactive real-time multimedia because the nature of audio
conferencing and video conferencing is different from some
applications, such as file transfer and electronic mail, which we
discussed in Chapter 2. Several proprietary applications have been
developed by the private sector, and more and more applications
are appearing in the market every day. Some of these applications,
such as MPEG audio and MPEG video, use some standards
defined for audio and video data transfer. There is no specific
standard that is used by all applications, and there is no specific
application protocol that can be used by everyone.



Transport Layer

The lack of a single standard and the general features of
multimedia applications raise some questions about the transport-
layer protocol to be used for all multimedia applications. The two
common transport-layer protocols, UDP and TCP, were developed
at the time when no one even thought about the use of multimedia
in the Internet. Can we use UDP or TCP as a general transport-
layer protocol for real-time multimedia applications? To answer
this question, we first need to think about the requirements for this
type of multimedia application and then see if either UDP or TCP
can respond to these requirements.



Table 11.6 Capability of UDP or TCP to handle real-time data

Requirements                                                      UDP    TCP

1. Sender-receiver negotiation for selecting the encoding type    No     No
2. Creation of packet stream                                      No     No
3. Source synchronization for mixing different sources            No     No
4. Error control                                                  No     Yes
5. Congestion control                                             No     Yes
6. Jitter removal                                                 No     No
7. Sender identification                                          No     No



11.4.2 RTP

Real-time Transport Protocol (RTP) is the protocol designed to
handle real-time traffic on the Internet. RTP does not have a
delivery mechanism (multicasting, port numbers, and so on); it
must be used with UDP. RTP stands between UDP and the
multimedia application. The literature and standards treat RTP as
the transport protocol (not a transport-layer protocol) that can be
thought of as located in the application layer
(see Figure 11.34).



Figure 11.34 RTP location in the TCP/IP protocol suite



RTP Packet Format

Before we discuss how RTP can help the multimedia applications,
let us discuss its packet format. We can then relate the functions of
the fields with the requirements we discussed in the previous
section. Figure 11.35 shows the format of the RTP packet header.
The format is very simple and general enough to cover all real-time
applications. An application that needs more information adds it to
the beginning of its payload.



Figure 11.35 RTP packet header format



Table 11.7 Payload types

Type   Application    Type    Application    Type   Application

0      PCMμ audio     7       LPC audio      15     G728 audio
1      1016           8       PCMA audio     26     Motion JPEG
2      G721 audio     9       G722 audio     31     H.261
3      GSM audio      10-11   L16 audio      32     MPEG1 video
5-6    DVI4 audio     14      MPEG audio     33     MPEG2 video



UDP Port

Although RTP is itself a transport-layer protocol, the RTP packet is
not encapsulated directly in an IP datagram. Instead, RTP is
treated like an application program and is encapsulated in a UDP
user datagram. However, unlike other application programs, no
well-known port is assigned to RTP. The port can be selected on
demand with only one restriction: The port number must be an
even number. The next number (an odd number) is used by the
companion of RTP, Real-time Transport Control Protocol (RTCP),
which we will discuss in the next section.
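The even/odd pairing rule can be captured in a small helper; the port range used here is an arbitrary illustrative choice, not something mandated by RTP:

```python
import random

def choose_rtp_rtcp_ports(low=16_384, high=32_766):
    """Pick an even port for RTP; RTCP uses the next (odd) port number."""
    rtp = random.randrange(low, high, 2)   # step 2 keeps the choice even
    return rtp, rtp + 1

rtp_port, rtcp_port = choose_rtp_rtcp_ports()
assert rtp_port % 2 == 0 and rtcp_port == rtp_port + 1
print(rtp_port, rtcp_port)
```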



11.4.3 RTCP

RTP allows only one type of message, one that carries data from
the source to the destination. To really control the session, we need
more communication between the participants in a session. Control
communication in this case is assigned to a separate protocol
called Real-time Transport Control Protocol (RTCP).



RTCP Packet

After discussing the main functions and purpose of RTCP, let us
discuss its packets. Figure 11.36 shows five common packet types.
The number next to each box defines the numeric value of each
packet. We need to mention that more than one RTCP packet can
be packed as a single payload for UDP because the RTCP packets
are smaller than RTP packets.



Figure 11.36 RTCP packet types



UDP Port

RTCP, like RTP, does not use a well-known UDP port. It uses a
temporary port. The UDP port chosen must be the number
immediately following the UDP port selected for RTP, which makes
it an odd-numbered port.



Bandwidth Utilization

RTCP packets are sent not only by the active senders, but also by the passive receivers, whose number is normally greater than that of the active senders. This means that if RTCP traffic is not controlled, it may get out of hand. To control the situation, RTCP uses a mechanism that limits its traffic to a small portion (normally 5 percent) of the bandwidth used in the session (for both RTP and RTCP). A larger part of this small percentage, x percent, is then assigned to the RTCP packets generated by the passive receivers; the smaller part, (1 - x) percent, is assigned to the RTCP packets generated by the active senders.

© McGraw Hill, LLC 100


Example 11.10

Let us assume that the total bandwidth allocated for a session is 1 Mbps. RTCP traffic gets only 5 percent of this bandwidth, which is 50 Kbps. If there are only 2 active senders and 8 passive receivers, it is natural that each sender or receiver gets only 5 Kbps. If the average size of an RTCP packet is 5 Kbits, then each sender or receiver can send only 1 RTCP packet per second. Note that we need to consider the packet size at the data-link layer.
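The arithmetic of Example 11.10 can be checked with a short Python function. The even per-member split matches the example; the actual RFC 3550 rules divide the RTCP share between senders and receivers in a 25/75 ratio, which this sketch deliberately ignores.

```python
def rtcp_report_interval(session_bw_bps, members, avg_pkt_bits,
                         rtcp_fraction=0.05):
    """Seconds between RTCP packets per member, assuming an even split.

    session_bw_bps : total session bandwidth in bits per second
    members        : active senders + passive receivers
    avg_pkt_bits   : average RTCP packet size in bits
    """
    rtcp_bw = session_bw_bps * rtcp_fraction  # 5% reserved for RTCP
    per_member = rtcp_bw / members            # each member's share
    return avg_pkt_bits / per_member          # time to "pay" for one packet

# Example 11.10: 1 Mbps session, 2 senders + 8 receivers, 5-Kbit packets
interval = rtcp_report_interval(1_000_000, 10, 5_000)  # about 1 second
```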

© McGraw Hill, LLC 101


Requirement Fulfillment

As we promised, let us see how the combination of RTP and RTCP can respond to the requirements of an interactive real-time multimedia application. A digital audio or video stream, a sequence of bits, is divided into chunks. Each chunk has a predefined boundary that distinguishes it from the previous chunk and the next one. A chunk is encapsulated in an RTP packet, which defines a specific encoding (payload type), a sequence number, a timestamp, a synchronization source (SSRC) identifier, and one or more contributing source (CSRC) identifiers.
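The fixed part of this header is only 12 bytes and is easy to construct with Python's struct module. The sketch below assumes no padding, no header extension, and no CSRC list (so the CSRC count is zero); the field widths follow the RTP layout in RFC 3550.

```python
import struct

def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=0):
    """Pack a minimal 12-byte RTP header (version 2, no CSRCs)."""
    byte0 = 2 << 6                                 # V=2, P=0, X=0, CC=0
    byte1 = (marker << 7) | (payload_type & 0x7F)  # M bit + payload type
    return struct.pack("!BBHII",                   # network byte order
                       byte0, byte1,
                       seq & 0xFFFF,               # 16-bit sequence number
                       timestamp & 0xFFFFFFFF,     # 32-bit timestamp
                       ssrc & 0xFFFFFFFF)          # 32-bit SSRC identifier

# One 20-ms PCMU audio chunk: payload type 0, 160 samples per chunk
header = pack_rtp_header(payload_type=0, seq=1, timestamp=160,
                         ssrc=0x12345678)
```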

© McGraw Hill, LLC 102


11.4.4 Session Initiation Protocol (SIP)

We discussed how to use the Internet for audio-video conferencing. Although RTP and RTCP can be used to provide these services, one component is missing: a signaling system required to call the participants. The Session Initiation Protocol (SIP) is a protocol devised by the IETF to be used in conjunction with RTP/RTCP.

© McGraw Hill, LLC 103


Communicating Parties

One difference that we may have noticed between interactive real-time multimedia applications and other applications is the communicating parties. In an audio or video conference, the communication is between humans, not devices. In HTTP or FTP, for example, the client needs to find the IP address of the server (using DNS) before communication; there is no need to find a person before communicating. In SMTP, the sender of an e-mail delivers the message to the receiver's mailbox on an SMTP server, with no control over when the message will be picked up. In an audio or video conference, however, the caller needs to find the callee.

© McGraw Hill, LLC 104


Addresses

In regular telephone communication, a telephone number identifies the sender, and another telephone number identifies the receiver. SIP is much more flexible: an e-mail address, an IP address, a telephone number, and other types of addresses can be used to identify the sender and receiver. However, the address needs to be in a SIP format (also called a scheme). Figure 11.37 shows some common formats.
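A minimal parser for the simplest of these formats, the sip:user@host form, can be written with a regular expression. The pattern is a deliberate simplification; the full grammar in RFC 3261 also allows ports, parameters, and telephone-number schemes.

```python
import re

# Simplified pattern: only the bare sip:user@host form, no port or parameters
SIP_URI = re.compile(r"^sip:(?P<user>[^@\s]+)@(?P<host>[A-Za-z0-9.\-]+)$")

def parse_sip_address(uri):
    """Split a simple SIP address into its user and host parts."""
    match = SIP_URI.match(uri)
    if match is None:
        raise ValueError(f"not a simple SIP URI: {uri!r}")
    return match.group("user"), match.group("host")

user, host = parse_sip_address("sip:bob@example.com")  # ("bob", "example.com")
```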

© McGraw Hill, LLC 105


Figure 11.37 SIP formats


© McGraw Hill, LLC 106


Messages

SIP is a text-based protocol like HTTP. SIP, like HTTP, uses messages. Messages in SIP are divided into two broad categories: requests and responses. The format of both message categories closely resembles that of HTTP messages.
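As a sketch of that text format, the helper below assembles a request from a start line, header lines, a blank line, and a body. The header names passed by the caller are illustrative; a real INVITE must also carry fields such as Via, CSeq, Call-ID, and Max-Forwards as specified in RFC 3261.

```python
def build_sip_request(method, request_uri, headers, body=""):
    """Assemble a SIP request: start line, headers, blank line, body."""
    fields = dict(headers)                       # don't mutate the caller's dict
    fields.setdefault("Content-Length", str(len(body.encode())))
    lines = [f"{method} {request_uri} SIP/2.0"]  # start (request) line
    lines += [f"{name}: {value}" for name, value in fields.items()]
    return "\r\n".join(lines) + "\r\n\r\n" + body  # blank line ends the header

invite = build_sip_request(
    "INVITE", "sip:bob@example.com",
    {"From": "sip:alice@example.com", "To": "sip:bob@example.com"})
```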

© McGraw Hill, LLC 107


First Scenario: Simple Session

In the first scenario, we assume that Alice needs to call Bob and
the communication uses the IP addresses of Alice and Bob as the
SIP addresses. We can divide the communication into three
modules: establishing, communicating, and terminating. Figure
11.38 shows a simple session using SIP.

© McGraw Hill, LLC 108


Figure 11.38 SIP simple session


© McGraw Hill, LLC 109


Second Scenario: Tracking the Callee

What happens if Bob is not sitting at his terminal? He may be away from his system or at another terminal. He may not even have a fixed IP address if DHCP is being used. SIP has a mechanism (similar to the one in DNS) that finds the IP address of the terminal at which Bob is sitting. To perform this tracking, SIP uses the concept of registration. SIP defines some servers as registrars. At any moment a user is registered with at least one registrar server; this server knows the IP address of the callee.

© McGraw Hill, LLC 110


Figure 11.39 Tracking the callee


© McGraw Hill, LLC 111


SDP Message Format and SDP Protocol

As we discussed before, the SIP request and response messages are divided into four sections: a start (request or status) line, the header, a blank line, and the body. Since the blank line needs no further explanation, let us briefly describe the format of the other sections.
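The body of an INVITE typically carries a Session Description Protocol (SDP) payload describing the media. The sketch below emits a minimal SDP body in the RFC 4566 field order (v, o, s, c, t, m, a); the address, session name, and codec values are placeholders, not values mandated by SIP.

```python
def build_sdp(session_name, media_port, origin_ip="192.0.2.1",
              payload_type=0, codec="PCMU/8000"):
    """Build a minimal SDP body advertising one audio stream over RTP."""
    return "\r\n".join([
        "v=0",                                           # protocol version
        f"o=- 0 0 IN IP4 {origin_ip}",                   # origin
        f"s={session_name}",                             # session name
        f"c=IN IP4 {origin_ip}",                         # connection address
        "t=0 0",                                         # unbounded timing
        f"m=audio {media_port} RTP/AVP {payload_type}",  # media description
        f"a=rtpmap:{payload_type} {codec}",              # codec mapping
    ]) + "\r\n"
```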

© McGraw Hill, LLC 112


11.4.5 H.323

H.323 is a standard designed by the ITU to allow telephones on the public telephone network to talk to computers connected to the Internet. Figure 11.40 shows the general architecture of H.323 for audio, but it can also be used for video.

© McGraw Hill, LLC 113


Figure 11.40 H.323 architecture


© McGraw Hill, LLC 114


Protocols

H.323 uses a number of protocols to establish and maintain voice (or video) communication. Figure 11.41 shows these protocols. H.323 uses G.711 or G.723.1 for compression. It uses a protocol named H.245, which allows the parties to negotiate the compression method. Protocol Q.931 is used for establishing and terminating connections. Another protocol, called H.225, or Registration/Administration/Status (RAS), is used for registration with the gatekeeper.

© McGraw Hill, LLC 115


Figure 11.41 H.323 protocols


© McGraw Hill, LLC 116


Operation

Let us use a simple example to show the operation of a telephone communication using H.323. Figure 11.42 shows the steps used by a terminal to communicate with a telephone.

© McGraw Hill, LLC 117


Figure 11.42 H.323 example


© McGraw Hill, LLC 118

