Chap5 VoIP and Multimedia over IP
Digital voice
Elements of sampling theory; PCM and TDM; classical PSTN architecture; quality, codecs, technical issues.
Software (a softphone) allows voice calls from one PC to another. Voice is converted to IP packets at the PC or at a PSTN gateway. Works best on broadband connections. Carried like normal data traffic by the ISP (best effort).
Phone-to-Phone VoIP
Use an existing telephone with an Analogue Telephone Adapter (ATA), or use an IP phone. Both connect to the broadband modem. The called party may be another VoIP user or, via a gateway, a traditional PSTN customer.
Many traditional PSTN calls are carried as VoIP for part of their path. For efficiency reasons they may travel with other IP traffic. This differs from normal VoIP: it is invisible to the end customer and runs over a private IP network, not the Internet, with controlled quality.
Corporate VoIP
Growing popularity. Inter-site voice is carried as IP over a leased line or VPN (a huge money saving, especially for international communications). Additionally, a single desk wiring infrastructure (LAN) may carry both data and voice (as VoIP). Voice may stay as IP or be converted to PSTN.
Digital voice
Sampling
One sample every 125 µs (8 kHz); 8 bits per sample x 8 kHz = 64 kbps
Nyquist's theorem
To restore without loss a signal whose highest frequency (cutoff frequency) is fc, we need to sample it at a frequency of at least 2fc. Sampling at 8 kHz, we lose frequency components above 4 kHz. For music audio this is not acceptable (MP3: 44.1 kHz).
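To illustrate the theorem, the sketch below computes the frequency at which a pure tone would appear after sampling. The function name `alias_frequency` is hypothetical, and the example assumes no anti-aliasing filter is applied before sampling (a real telephone channel filters out components above 4 kHz first).

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Apparent frequency of a pure tone f_hz once sampled at fs_hz."""
    f = f_hz % fs_hz           # sampling cannot distinguish f from f + k*fs
    return min(f, fs_hz - f)   # fold back into the range [0, fs/2]

# A 3 kHz tone is below fs/2 = 4 kHz and is preserved at 8 kHz sampling.
preserved = alias_frequency(3000, 8000)
# A 5 kHz tone is above 4 kHz and becomes indistinguishable from 3 kHz.
aliased = alias_frequency(5000, 8000)
print(preserved, aliased)
```

Both tones yield the same samples, which is why components above half the sampling rate cannot be restored.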
Quantization
Non-uniform quantization: we give more levels (more precision) to small values => better SNR for quiet signals.
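The effect of giving more levels to small values can be sketched with the continuous µ-law companding formula. This is an assumption for illustration: the text above does not say which companding law is used, and real G.711 codecs use a segmented approximation rather than this exact formula. The helper names and the 127-level quantizer are hypothetical.

```python
import math

MU = 255  # assumption: the mu value commonly used for 8-bit mu-law

def compress(x: float) -> float:
    """Map x in [-1, 1] to [-1, 1], stretching small amplitudes."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def expand(y: float) -> float:
    """Inverse of compress()."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)

def quantize(v: float, levels: int = 127) -> float:
    """Uniform quantizer with `levels` steps on each side of zero."""
    return round(v * levels) / levels

x = 0.01  # a quiet sample
uniform_error = abs(x - quantize(x))
companded_error = abs(x - expand(quantize(compress(x))))
print(uniform_error, companded_error)
```

For this quiet sample the companded path gives a much smaller error than plain uniform quantization with the same number of levels.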
DS0: a single digital voice channel in the PCM system. One sample is sent synchronously every 125 µs, for a total throughput of 64 kbps.
Analog only at the local loop. ADC and DAC at the toll office (a word about echo here).
E1 synchronously multiplexes 32 DS0 channels, but channels 0 and 16 are reserved for framing and signaling (30 voice channels remain). The sample rate must remain 8000 samples (bytes) per second for all channels => the frame time must always remain 125 µs (at all levels: DS2, DS3, etc.). E1 bit rate: 8 bits x 32 / 125 µs = 2.048 Mbps
Ex: T1 = 24 DS0
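The bit-rate arithmetic can be checked with a few lines. The function name `line_rate_bps` is hypothetical, and the single framing bit per T1 frame is an assumption taken from the standard 193-bit T1 frame; it is not stated above.

```python
FRAMES_PER_SECOND = 8000  # one frame every 125 microseconds

def line_rate_bps(channels: int, bits_per_sample: int = 8,
                  framing_bits: int = 0) -> int:
    """Bits per frame times frames per second."""
    return (channels * bits_per_sample + framing_bits) * FRAMES_PER_SECOND

ds0 = line_rate_bps(1)                   # a single voice channel
e1 = line_rate_bps(32)                   # 32 time slots, 30 of them voice
t1 = line_rate_bps(24, framing_bits=1)   # 24 channels + 1 framing bit
print(ds0, e1, t1)
```

Because every level keeps the 125 µs frame time, a higher level only changes the number of bits per frame, never the frame rate.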
Simply what we have seen so far: 8000 one-byte samples per second, 64 kbps, encoded in the DS0 format.
Many encoding algorithms and corresponding codecs have been invented and standardized, all seeking to reduce bandwidth with a minor (or no) loss of quality.
Voice Quality
5: Perfect. Like face-to-face conversation or radio reception.
4: Fair. Imperfections can be perceived, but sound still clear.
3: Annoying.
2: Very annoying. Nearly impossible to communicate.
1: Impossible to communicate.
Objective
Delay
Sources of delay
Jitter
What is Jitter?
Varying queue sizes introduce jitter.
Different paths introduce jitter, but load-balancing algorithms work at the micro-flow level, not at the packet level (coarse grain).
TCP transport is not suitable: congestion window management increases jitter, and retransmitted packets arrive too late and will be dropped anyway.
VoIP uses RTP over UDP.
Solution
Jitter buffer at the receiver: deliberately add some delay to ensure continuous playback.
Jitter Buffer
The bigger the playback delay, the more jitter the buffer can absorb. It is important to have low jitter from the network; otherwise we need a large jitter buffer delay.
Packets arriving too late are deliberately dropped by the receiver. A loss rate of up to 5% is tolerable.
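The trade-off between playback delay and late-packet loss can be sketched with a fixed playout delay. The function name, the 20 ms packet spacing and the network delays are all made-up values for illustration; real receivers often adapt the delay dynamically.

```python
def playout(packets, playout_delay_ms):
    """Split packets into played and dropped for a fixed playout delay.

    packets: list of (send_time_ms, arrival_time_ms) tuples.
    A packet is played at send_time + playout_delay; it is dropped
    if it arrives after that deadline.
    """
    played, dropped = [], []
    for index, (send_ms, arrival_ms) in enumerate(packets):
        deadline_ms = send_ms + playout_delay_ms
        (played if arrival_ms <= deadline_ms else dropped).append(index)
    return played, dropped

# Hypothetical trace: one packet every 20 ms, varying network delay.
network_delays_ms = [50, 60, 95, 55, 120]
trace = [(20 * i, 20 * i + d) for i, d in enumerate(network_delays_ms)]

print(playout(trace, 60))    # small buffer: two late packets are dropped
print(playout(trace, 100))   # bigger buffer: only one packet is dropped
```

A larger playout delay lowers the loss rate but adds to the end-to-end delay heard by the users.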
Transport mechanism designed for soft real-time applications. The most important feature is timestamping, which is used to:
Manage jitter. Synchronize multiple streams, e.g. audio and image in a movie.
RTP
P: padding (to a multiple of 4 bytes)
X: extension header present (unused)
M: app-specific marker (ex: start of video frame)
Payload Type: specifies type and encoding algorithm (ex: G.711 voice)
Sequence number: detects loss of packets
Timestamp: jitter management and synchronization of streams
SSRC: identifies a stream (many streams, such as video and audio, may be multiplexed in one RTP session)
CSRC combined with CC: permits many sources to be mixed (CC is the count and CSRC is the list). The mixer would be the SSRC.
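The fields above can be extracted from the fixed 12-byte RTP header with Python's `struct` module. This is a minimal sketch based on the standard header layout; the function name `parse_rtp_header` and the sample values are hypothetical, and header extensions are not parsed.

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed RTP header and the optional CSRC list."""
    if len(packet) < 12:
        raise ValueError("RTP header needs at least 12 bytes")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                      # number of CSRC entries
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("packet too short for its CSRC list")
    csrc = list(struct.unpack("!%dI" % cc, packet[12:end]))
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),     # P bit
        "extension": bool(b0 & 0x10),   # X bit
        "cc": cc,
        "marker": bool(b1 & 0x80),      # M bit
        "payload_type": b1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrc": csrc,
        "payload": packet[end:],
    }

# Hypothetical packet: version 2, payload type 0, sequence 1, timestamp 160.
sample = struct.pack("!BBHII", 0x80, 0x00, 1, 160, 0x1234) + b"\x00" * 4
header = parse_rtp_header(sample)
print(header["version"], header["payload_type"], header["sequence"])
```

A receiver would use `sequence` to detect loss and `timestamp` to schedule playback.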
H.323 architecture
1. PC discovers Gatekeeper (broadcast discovery request packet)
2. Gatekeeper responds
3. Client registers in the Gatekeeper zone (RAS protocol, H.225)
4. Client requests bandwidth (this is to manage call admission and QoS!)
5. Client establishes TCP session with Gatekeeper for signalling
6. Client sends SETUP message (Q.931 signalling)
7. Gatekeeper contacts Gateway and sends CALL_PROCEEDING to client
8. Gateway calls destination number
9. Callee rings, Q.931 ALERT message (GW to client)
10. Callee responds, Q.931 CONNECT message (GW to client)
11. H.245 messages to negotiate capabilities (codec, video, teleconf)
12. Data flow proceeds using RTP
13. Call terminates (callee hangs up), GW alerts client (Q.931)
14. Client releases bandwidth to Gatekeeper (RAS message)
SIP
The Session Initiation Protocol
LOOKUP and REPLY methods are not specified in SIP (free implementation).
The REGISTER method allows clients to register with the proxy and inform it of their location.
Another option is to use SIP peer-to-peer (P2P): caller and callee directly exchange INVITE/OK/ACK through a TCP connection and then start exchanging data.
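As an illustration of what the caller sends first, the sketch below builds a minimal SIP INVITE request as text. All addresses, the tag, the branch and the Call-ID are made-up values, the function name `build_invite` is hypothetical, and the message has an empty body, whereas a real INVITE usually carries a session description.

```python
def build_invite(caller: str, callee: str, caller_host: str,
                 call_id: str, branch: str, tag: str) -> str:
    """Return a minimal INVITE request with an empty body."""
    lines = [
        f"INVITE sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/TCP {caller_host};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag={tag}",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Length: 0",
    ]
    # Header lines end with CRLF; an empty line separates headers and body.
    return "\r\n".join(lines) + "\r\n\r\n"

invite = build_invite(
    caller="alice@example.org",       # hypothetical caller
    callee="bob@example.net",         # hypothetical callee
    caller_host="pc1.example.org",    # hypothetical host
    call_id="a84b4c76e66710",         # made-up identifier
    branch="z9hG4bK776asdhds",        # made-up transaction id
    tag="1928301774",                 # made-up dialog tag
)
print(invite)
```

The callee would answer with a 200 OK response, and the caller would confirm with an ACK request before media starts flowing.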
SIP Methods
JPEG -- DCT
JPEG
The order in which the quantized values are transmitted.
Run-length encoding: exploiting repetitions.
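The two steps can be sketched as follows, assuming the usual zigzag scan over the block, which groups the many zero-valued high-frequency coefficients together. The function names and the sample block are hypothetical, and the run-length code here is a simplified (value, count) scheme, not the exact entropy coding that JPEG applies.

```python
def zigzag_order(n: int = 8):
    """(row, col) positions of an n x n block in zigzag order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Walk the anti-diagonals; odd diagonals go down, even ones go up.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

def run_length_encode(values):
    """Collapse consecutive repeats into [value, count] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(run) for run in runs]

# Hypothetical quantized block: only three low-frequency values are non-zero.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0] = 12, 3, -2

scanned = [block[r][c] for r, c in zigzag_order(8)]
encoded = run_length_encode(scanned)
print(encoded)
```

After the scan, the 61 trailing zeros form a single run, which is where the compression gain comes from.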
I (Intracoded) frames: Self-contained JPEG-encoded still pictures.
P (Predictive) frames: Block-by-block difference with the last frame.
B (Bidirectional) frames: Differences between the last and next frame.
D (DC-coded) frames: Block averages used for fast forward.