Voice Capacity Enhancement

Uploaded by

Sandeep Kadam
Voice Capacity Enhancement

from GSM to UMTS


In the late 1950s and early 1960s, following the invention of the transistor, it was realised
that telephone conversations could be transmitted as digital data, which brings many benefits
such as easy multiplexing and switching. Human speech contains energy at frequencies from
about 50 Hz up to 15 kHz, but most of the energy in a speech signal lies in the range of
about 300 Hz to 3000 Hz. If we pass a speech signal through a band-pass filter with a pass
band of 300 to 3000 Hz, the result is still perfectly intelligible. With anything narrower,
the quality is too low for a comfortable conversation. So for ordinary human speech, we
filter the analogue signal to between 300 and 4000 Hz and then (because of Nyquist) sample
at 8000 Hz. (The filter actually has a corner frequency of about 3300 Hz; we allow an extra
700 Hz because no filter is ideal and there is some response above 3300 Hz.) So for the last
40 or so years, a plain voice signal has had a sampling rate of 8 kHz (8000 samples per
second). Each sample also happens to be 8 bits, which gives 64 kbit per second. As a result,
64 kbit/s is the basic digital signal rate in telecommunications!
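
The arithmetic in the paragraph above can be checked with a short sketch. The function names are illustrative only; the 4000 Hz filter edge and 8 bits per sample are the figures quoted in the text.

```python
def nyquist_sample_rate_hz(max_signal_hz: float) -> float:
    """Minimum sampling rate: twice the highest frequency kept by the filter."""
    return 2 * max_signal_hz


def pcm_bit_rate_bps(sample_rate_hz: float, bits_per_sample: int) -> float:
    """Bit rate of a PCM stream: samples per second times bits per sample."""
    return sample_rate_hz * bits_per_sample


sample_rate = nyquist_sample_rate_hz(4000)   # filter upper edge from the text
bit_rate = pcm_bit_rate_bps(sample_rate, 8)  # 8 bits per sample
print(sample_rate, bit_rate)
```

The same function gives 104000 bit/s for 13-bit samples, which is the linear PCM rate quoted later for the GSM codec input.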

Anything wider than 300 to 3300 Hz gives better sound quality (of course), but then needs
a higher data rate. For speech it is unnecessary. Nowadays we use clever digital vocoders
(voice coder/decoder, an audio codec designed specifically for the human voice) which can
operate at a much lower bit rate and still give good results (for human speech). These work
by measuring certain parameters of the speech signal and then synthesising an artificial
replica at the other end. They are able to mimic human speech sounds well, but can't
reproduce other sounds nicely! But this doesn't matter since we use the telephone for voice,
not hi-fi music.
PCM Encoder

Input analogue speech -> Band Limited Filter -> Sampling (8K samples per second) ->
Compressor (A-Law or U-Law) -> Analogue to Digital Converter -> Output 64 kbps PCM

• A-Law and U-Law (µ-law) are companding techniques (the 2 PCM options) which compress
  the band-limited signal at the encoder and expand it again after DTA (digital-to-analogue)
  conversion at the receiver
• Without A-Law, a data rate of 13 bits * 8 kHz = 104 kbps would be required for
  intelligible speech communication. The 8-bit A-Law technique reduces the data rate to
  64 kbps = 8 bits * 8 kHz
• It effectively reduces the dynamic range of the signal, thereby increasing the coding
  efficiency
• GSM and other cellular technologies use advanced speech encoding techniques, not PCM
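
As a rough illustration of companding, the sketch below uses the continuous µ-law curve with µ = 255 followed by a uniform 8-bit quantizer. This is an assumption made for brevity: real telephone PCM uses a segmented, bit-exact approximation of this curve, and the function names here are hypothetical.

```python
import math

MU = 255  # mu-law parameter commonly used with 8-bit PCM


def mu_law_compress(x: float, mu: int = MU) -> float:
    """Logarithmically compress x in [-1, 1] to a value in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: int = MU) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


def encode(x: float) -> int:
    """Compress, then quantize uniformly to a signed 8-bit code (-127..127)."""
    return max(-127, min(127, round(mu_law_compress(x) * 127)))


def decode(code: int) -> float:
    """Rescale the 8-bit code and expand it back to a linear sample."""
    return mu_law_expand(code / 127)


for x in (0.01, 0.1, 0.5, 1.0):
    print(x, encode(x), decode(encode(x)))
```

Small amplitudes are stretched over many codes before quantization (0.01 already maps to roughly 0.23 on the compressed scale), which is why 8 companded bits can stand in for a wider linear word.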
Voice Coding
• Stage 1: Speech coding/digital encoding
– Speech coding converts analogue speech into digital data at a given rate,
maintaining acceptable speech quality while mapping fewer bits to each
digitized voice sample
– Speech data is sampled and quantized before each sample is mapped
– The oldest technique, used in all telephone exchanges, is PCM, which
provides speech data at 64 kbps. Later techniques were designed to
decrease the speech rate because of the limited bandwidth of the air
interface standards of wireless technologies such as GSM, CDMA, LTE
and more. The reduction in speech codec data rate should not impact
the quality of the speech; this is the utmost priority of every
speech codec.
– GSM's original Full Rate codec uses a 13 kbps speech data rate with the
RPE-LTP technique. The speech codecs available in GSM include FR (13 kbps),
HR (5.6 kbps), EFR (12.2 kbps) and AMR (Adaptive Multi Rate). AMR provides
rates from 4.75 to 12.2 kbps; each lower rate uses fewer bits for encoding
– CDMA uses various speech codec rates such as
8.55 kbps/9.6 kbps/13.3 kbps with the CELP speech codec.
Voice Coding

• Stage 2: Channel coding

– Used to provide coding gain, defined as the reduction in Eb/Io needed to
achieve a specified error probability.
– The channel coding process usually falls into 2 phases: BLOCK
CODES and CONVOLUTIONAL CODES
Speech Transmission
Speech Transmission Overview: When we speak into the microphone of a GSM phone, the speech is
converted to a digital signal with a resolution of 13 bits, sampled at a rate of 8 kHz. This 104 kbps stream
forms the input to all the GSM speech codecs.
In practice, however, 2 basic variations of 64 kbps PCM are commonly used:
µ-law, the standard used in North America, and A-law, the standard used in Europe. These methods use
logarithmic compression to achieve the equivalent of 13 to 14 bits of linear PCM quality using only 8 bits
per sample.
The codec analyses the voice and builds up a bit-stream composed of a number of parameters
that describe aspects of the voice. The output rate of the codec depends on its type (see Table 1), with a
range of between 4.75 kbit/s and 13 kbit/s.
Codec                       Input (kbps)   Compression ratio   Output (kbps)   Code Type
Old Telephone/PSTN          104            1.62                64              PCM (A law / U law)
Full Rate                   104            8                   13.00           RPE-LTP
Enhanced Full Rate (EFR)    104            8.5                 12.20           ACELP
Half Rate                   104            18.6                5.60            VSELP
AMR 12.2                    104            8.5                 12.20           ACELP
AMR 10.2                    104            10.2                10.20           ACELP
AMR 7.95                    104            13.1                7.95            ACELP
AMR 7.4                     104            14.1                7.40            ACELP
AMR 6.7                     104            15.5                6.70            ACELP
AMR 5.9                     104            17.6                5.90            ACELP
AMR 5.15                    104            20.2                5.15            ACELP
AMR 4.75                    104            21.9                4.75            ACELP
Table 1: Different codec types
Note: Basic digital signal rate in telecommunications = 8 bits per sample * 8 kHz = 64 kbps PCM signal;
in PCM each sample is represented by 8 bits.
Voice Codecs
FR Codec: The Full Rate coder is the original GSM coder. 260 speech bits are transmitted
per user in each 20 ms frame, so Rate = 260/20 ms = 13 kbps.
EFR Codec: The Enhanced Full Rate coder provides a speech service that improves on the
original Full Rate speech coding whilst using the same air interface bandwidth. 244
speech bits are transmitted per user in each 20 ms frame, so Rate = 244/20 ms = 12.2 kbps.
The BTS receives transcoded speech over the A-bis interface from the BSC. At this point
the speech is organized into its individual logical channels by the BTS. These logical
channels of information are then channel coded before being transmitted over the air
interface.
The transcoded speech information is received in frames, each containing 260 bits. The
speech bits are grouped into three classes of sensitivity to errors (class 1a, class 1b
and class 2), depending on their importance to the intelligibility of speech.
The PCM Analogue to Digital Converter used for GSM has the following characteristics:
8000 samples -->> 1 sec
160 samples  -->> 20 ms (20 ms frame as per GSM 06.60)
1 sample     -->> 13 bits
160 samples  -->> 160 * 13 = 2080 bits
However, 160 samples -->> 244 bits (GSM 06.60)
Hence Compression = 2080/244 = 8.5

With the A-law companding technique, only 8 bits instead of 13 are needed to encode each
speech sample, which increases the encoding efficiency.
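
The frame figures above follow from three constants, as the sketch below shows. The constant and function names are illustrative; the 260-bit and 244-bit frame sizes are the FR and EFR values quoted in this section.

```python
SAMPLE_RATE_HZ = 8000        # samples per second
FRAME_MS = 20                # speech frame duration
LINEAR_BITS_PER_SAMPLE = 13  # resolution of the GSM analogue to digital converter


def samples_per_frame() -> int:
    return SAMPLE_RATE_HZ * FRAME_MS // 1000


def raw_bits_per_frame() -> int:
    return samples_per_frame() * LINEAR_BITS_PER_SAMPLE


def codec_rate_kbps(coded_bits_per_frame: int) -> float:
    """Bits per millisecond is numerically the same as kbit/s."""
    return coded_bits_per_frame / FRAME_MS


def compression_ratio(coded_bits_per_frame: int) -> float:
    return raw_bits_per_frame() / coded_bits_per_frame


for name, bits in (("FR", 260), ("EFR", 244)):
    print(name, codec_rate_kbps(bits), round(compression_ratio(bits), 1))
```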
EFR Codec

Class 1a: Three parity bits are derived from the 50 class 1a bits. Transmission errors within
these bits are catastrophic to speech intelligibility; the parity bits therefore let the
speech decoder detect uncorrectable errors within the class 1a bits. If there are class 1a
bit errors, the whole block is usually ignored.

Class 1b: The 132 class 1b bits are not parity checked, but are fed together with the class
1a and parity bits to a convolutional encoder. Four tail bits are added which set the
registers to a known state for decoding purposes.

Class 2: The 78 least sensitive bits are not protected at all.


The 244-bit EFR frame is given some preliminary coding to build it up to 260 bits before
the same channel coding as Full Rate is applied.
The encoded speech now occupies 456 bits but is still transmitted in 20 ms, thus raising the
transmission rate to 22.8 kbit/s.
Coding Rate = 189/378 = ½, where 189 = 50 class 1a + 3 parity + 132 class 1b + 4 tail bits;
the 378 coded bits plus the 78 unprotected class 2 bits give 456 bits.
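
The bit accounting above can be traced with a minimal sketch. Several things here are assumptions rather than part of the slides: the generator polynomials (G0 = 1 + D^3 + D^4, G1 = 1 + D + D^3 + D^4, as recalled from the GSM 05.03 full rate speech channel), the simple concatenation order of the bits, and the fact that the parity bits are passed in rather than computed. The real scheme also reorders the class 1 bits before encoding.

```python
from typing import List

CLASS_1A, PARITY, CLASS_1B, TAIL, CLASS_2 = 50, 3, 132, 4, 78


def conv_encode_half_rate(bits: List[int]) -> List[int]:
    """Rate 1/2 convolutional encoder with a 4-stage shift register.

    Assumed taps: G0 = 1 + D^3 + D^4, G1 = 1 + D + D^3 + D^4.
    """
    out: List[int] = []
    reg = [0, 0, 0, 0]  # reg[0] is the D^1 delay, reg[3] is the D^4 delay
    for u in bits:
        out.append(u ^ reg[2] ^ reg[3])           # G0 output bit
        out.append(u ^ reg[0] ^ reg[2] ^ reg[3])  # G1 output bit
        reg = [u] + reg[:3]                       # shift the new bit in
    return out


def protect_frame(class_1a: List[int], parity: List[int],
                  class_1b: List[int], class_2: List[int]) -> List[int]:
    """Simplified assembly: encode class 1 + parity + tail, append class 2 uncoded."""
    assert (len(class_1a), len(parity), len(class_1b), len(class_2)) == (
        CLASS_1A, PARITY, CLASS_1B, CLASS_2)
    protected = class_1a + parity + class_1b + [0] * TAIL  # 189 bits
    return conv_encode_half_rate(protected) + class_2      # 378 + 78 bits


frame = protect_frame([1] * 50, [0, 0, 0], [0] * 132, [1] * 78)
print(len(frame), len(frame) / 20)  # coded bits per frame, kbit/s over 20 ms
```

The four zero tail bits flush the four-stage register, so every coded block ends with the encoder in the all-zero state.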
EFR Codec
Mapping with Burst

The speech information for one 20 ms speech block is divided over 8 GSM bursts.
This ensures that if bursts are lost due to interference over the air interface, the speech
can still be accurately reproduced.
Thus the 456 bits are transmitted in 8 Normal Bursts, with a 57-bit payload in each.
Read CP02 (Moto doc) to understand this concept better.

Fig: GSM Normal Burst
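
The split of one coded block over 8 bursts can be sketched as follows. The rule used here (bit k goes to sub-block k mod 8) is a deliberate simplification chosen for illustration; the actual GSM interleaver also permutes bits inside each burst and shares every burst between two consecutive speech blocks.

```python
from typing import List, Sequence


def split_into_bursts(coded_bits: Sequence[int], n_bursts: int = 8) -> List[List[int]]:
    """Simplified interleaver: bit k is carried by sub-block k mod n_bursts."""
    assert len(coded_bits) % n_bursts == 0
    return [list(coded_bits[b::n_bursts]) for b in range(n_bursts)]


def merge_from_bursts(sub_blocks: List[List[int]]) -> List[int]:
    """Inverse of split_into_bursts."""
    n = len(sub_blocks)
    merged = [0] * sum(len(block) for block in sub_blocks)
    for b, block in enumerate(sub_blocks):
        merged[b::n] = block
    return merged


positions = list(range(456))            # use bit positions as stand-in payload
bursts = split_into_bursts(positions)
print(len(bursts), len(bursts[0]))      # number of sub-blocks, bits in each
print(bursts[3][:5])                    # positions that would be hit if burst 3 is lost
```

Because the positions carried by any one burst are 8 bits apart, a lost burst produces isolated errors in the reassembled block instead of one long run, which is the kind of error a convolutional decoder handles best.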


AMR Codec
AMR Codec: This is one of the most important innovations for GSM. AMR is used not only
in GSM but also in EDGE and UMTS networks. It is designed to work with
• GSM Full Rate [1 User / Time Slot in each Radio Channel]
• GSM Half Rate [2 Users / Time Slot in each Radio Channel]
AMR uses multiple voice encoding rates, each with a different level of error control.
AMR dynamically responds to different radio conditions, using the most effective mode
of operation at each moment in time.
If the radio conditions are bad, source coding is reduced and channel coding
increased. This improves the quality and the robustness of the network connection
while sacrificing some voice clarity.
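
The adaptation idea can be sketched as a threshold lookup on the measured C/I. The threshold values below are purely hypothetical placeholders chosen for illustration, not figures from any specification or network, and real link adaptation also applies hysteresis so the mode does not flap between neighbours.

```python
from typing import List, Tuple

# (minimum C/I in dB, AMR speech rate in kbps); thresholds are HYPOTHETICAL
HYPOTHETICAL_MODE_TABLE: List[Tuple[float, float]] = [
    (13.0, 12.2),
    (10.0, 10.2),
    (7.0, 7.4),
    (4.0, 5.9),
]
MOST_ROBUST_MODE_KBPS = 4.75  # lowest AMR rate, leaves the most room for channel coding


def select_amr_mode(c_over_i_db: float) -> float:
    """Pick the highest speech rate whose (hypothetical) C/I threshold is met."""
    for threshold_db, rate_kbps in HYPOTHETICAL_MODE_TABLE:
        if c_over_i_db >= threshold_db:
            return rate_kbps
    return MOST_ROBUST_MODE_KBPS


for ci in (15.0, 8.5, 2.0):
    print(ci, select_amr_mode(ci))
```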
AMR Benefits
• Greater spectral efficiency
• Better voice quality throughout the cell, especially at cell edges and deep inside
  buildings, and increased overall coverage
• The potential of operating with toll-quality voice in half rate mode, which reduces
  network costs
• AMR's dynamic capability allows it to compensate for the higher error rates arising from
  techniques such as tighter frequency re-use and higher fractional loading, which
  inherently force the mobile to operate at lower C/I

AMR FR Modes              Payload/Frame   Sampling Frequency   AMR HR Modes
12.2 kbps = 244/20 ms     244             8000 samples/sec     NA
10.2 kbps = 204/20 ms     204             8000 samples/sec     NA
7.95 kbps = 159/20 ms     159             8000 samples/sec     Same as FR
7.4 kbps = 148/20 ms      148             8000 samples/sec     Same as FR
6.7 kbps = 134/20 ms      134             8000 samples/sec     Same as FR
5.9 kbps = 118/20 ms      118             8000 samples/sec     Same as FR
5.15 kbps = 103/20 ms     103             8000 samples/sec     Same as FR
4.75 kbps = 95/20 ms      95              8000 samples/sec     Same as FR
Understanding AMR Better
Full Rate Channel: gross bit rate of the channel / Time slot = 22.8 kbps
22.8 kbps = Voice information (speech coding) + Error Control (channel coding)

Half Rate Channel: gross bit rate of the channel / Time slot = 11.4 kbps
11.4 kbps = Voice information + Error Control (channel coding)
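
Taking the split above at face value, the error control share of each channel is simply the gross rate minus the speech rate. The sketch below tabulates it for the AMR modes listed in this document (the half rate channel has no 12.2 or 10.2 mode); the dictionary and function names are illustrative.

```python
GROSS_KBPS = {"FR": 22.8, "HR": 11.4}

AMR_MODES_KBPS = {
    "FR": [12.2, 10.2, 7.95, 7.4, 6.7, 5.9, 5.15, 4.75],
    "HR": [7.95, 7.4, 6.7, 5.9, 5.15, 4.75],
}


def error_control_kbps(channel: str, speech_kbps: float) -> float:
    """Gross channel rate minus speech coding rate, rounded to 2 decimals."""
    if speech_kbps not in AMR_MODES_KBPS[channel]:
        raise ValueError(f"{speech_kbps} kbps is not an AMR {channel} mode")
    return round(GROSS_KBPS[channel] - speech_kbps, 2)


for channel, modes in AMR_MODES_KBPS.items():
    for rate in modes:
        print(channel, rate, error_control_kbps(channel, rate))
```

On a full rate channel the 4.75 mode leaves about 18 kbps for protection against about 10.6 kbps at 12.2, which is the trade between voice clarity and robustness described earlier.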

AMR utilizes the following methods to reduce bandwidth usage during silent periods:
• DTX (Discontinuous Transmission)
• VAD (Voice Activity Detection)
• CNG (Comfort Noise Generation)
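
How the three mechanisms fit together can be sketched with a toy energy detector. This is not the AMR VAD algorithm: the energy threshold is a hypothetical value, and the single "SID" (silence descriptor) frame sent at the start of each pause is a simplification of how comfort noise parameters reach the receiver.

```python
from typing import List, Sequence

HYPOTHETICAL_ENERGY_THRESHOLD = 1e-3  # illustrative value only


def frame_energy(samples: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in samples) / len(samples)


def is_speech(samples: Sequence[float],
              threshold: float = HYPOTHETICAL_ENERGY_THRESHOLD) -> bool:
    """Toy VAD: a frame counts as speech when its energy exceeds the threshold."""
    return frame_energy(samples) > threshold


def dtx_schedule(frames: Sequence[Sequence[float]]) -> List[str]:
    """Label each frame: 'SPEECH' is sent, 'SID' carries comfort noise
    parameters at the start of a pause, 'NO_TX' means the transmitter is off."""
    schedule: List[str] = []
    previous_was_speech = True
    for frame in frames:
        if is_speech(frame):
            schedule.append("SPEECH")
            previous_was_speech = True
        elif previous_was_speech:
            schedule.append("SID")
            previous_was_speech = False
        else:
            schedule.append("NO_TX")
    return schedule


loud = [0.5] * 160     # one 20 ms frame at 8000 samples/sec
quiet = [0.001] * 160
print(dtx_schedule([loud, quiet, quiet, quiet, loud]))
```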

Fig: Source Coding and Channel Coding
