ELEC1011 Communications: Source Coding
Rob Maunder
Communication schemes
• The source encoder converts the information into a format that is suitable for transmission. For example:
– Multiplexing combines several signals into one. e.g. the left and right channels
of stereo audio, the red, green and blue components of component video or the
audio and video of a television signal.
– Low Pass Filtering (LPF) to limit the bandwidth of the signal in order to avoid
aliasing or to reduce the amount of spectrum required. (Lecture 6)
– Analogue-to-Digital Conversion (ADC) if we want to use digital modulation to
transmit an analogue signal. This uses sampling and quantisation, as sketched below. (Lecture 1)
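The sampling and quantisation performed by an ADC can be sketched in a few lines of Python. This is an illustrative sketch only: the tone frequency, sampling rate and 3-bit quantiser resolution below are assumed values, not ones taken from the lecture.

```python
import numpy as np

fs = 8000                          # assumed sampling rate in Hz
t = np.arange(0, 0.01, 1 / fs)     # 10 ms of sampling instants
x = np.sin(2 * np.pi * 440 * t)    # sampling: an assumed 440 Hz analogue tone

bits = 3                           # assumed quantiser resolution
step = 2 / 2 ** bits               # uniform step size over the range [-1, 1)
x_q = np.clip(np.round(x / step) * step, -1, 1 - step)  # quantisation

print(x[:4])                       # the exact sample values
print(x_q[:4])                     # the nearest of the 8 quantisation levels
```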
Redundancy
• For example, suppose you wanted your friend to know when your evening class
starts. You could say “My evening class starts at 7pm”. Here, the “pm” part is
redundant. It wouldn’t be an evening class if it started at 7am. The message can be
shortened to “My evening class starts at 7” without losing any of the information.
Compression
Compression removes redundancy so that the same information can be conveyed using fewer bits.
Quantifying information
• For example, a message that conveys the result of flipping a coin contains k = 1
bit of information. This is because there are N = 2 (equally likely) outcomes of
flipping a coin (heads and tails) and N = 2 values that k = log2(N ) = 1 bit can
have (0 and 1).
• If somebody asked me which season I was born in, my reply would contain k = 2 bits
of information. This is because there are N = 4 (equally likely) replies that I could
give (Winter, Spring, Summer, Autumn) and N = 4 values that k = log2(N ) = 2
bits can have (00, 01, 10 and 11).
• A message that conveys the result of throwing an N = 6-sided die would contain
k = log2(N) ≈ 2.58 bits of information. A message doesn't have to contain an
integer number of bits of information!
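These examples can be checked directly by evaluating k = log2(N); a minimal sketch:

```python
from math import log2

# Information content k = log2(N) of a message having N equally
# likely possibilities, for the three examples above.
for source, N in [("coin flip", 2), ("season of birth", 4), ("6-sided die", 6)]:
    print(f"{source}: k = log2({N}) = {log2(N):.2f} bits")
```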
Entropy
The entropy H of a source is equal to the expected (i.e. average) information content
of its messages.
H = Σ_{i=1}^{N} p_i k_i = Σ_{i=1}^{N} p_i log2(1/p_i)
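As a numerical check, here is the entropy formula evaluated for the sum of two dice, the same N = 11 message possibilities used in the arithmetic coding example later in these notes:

```python
from math import log2

# p_i for the sums 2 to 12 of two dice: 1/36, 2/36, ..., 6/36, ..., 1/36
p = [n / 36 for n in [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]]

# Entropy H = sum over i of p_i * log2(1/p_i)
H = sum(p_i * log2(1 / p_i) for p_i in p)
print(f"H = {H:.4f} bits per message")   # about 3.27 bits per message
```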
[Figure: a variable length coding example for the sums 2 to 12 of throwing two dice.]
Huffman coding
• The example on the previous slides is a Huffman code, which is a special type of
variable length code.
• The design of Huffman codes is not within the scope of ELEC1011, but their use is.
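As a sketch of how a given Huffman code is used, the following encodes and decodes with a fixed codeword table; the codewords are the c_i^Huff from the exercise at the end of these notes. Decoding relies on the prefix property: no codeword is a prefix of another.

```python
code = {"Hamilton": "0", "Button": "10", "Schumacher": "110", "Alonso": "111"}

def encode(messages):
    return "".join(code[m] for m in messages)

def decode(bits):
    inverse = {c: m for m, c in code.items()}
    messages, buffer = [], ""
    for b in bits:
        buffer += b
        if buffer in inverse:                 # the prefix property makes
            messages.append(inverse[buffer])  # every match unambiguous
            buffer = ""
    return messages

bits = encode(["Schumacher", "Hamilton", "Button"])
print(bits)            # 110010
print(decode(bits))    # ['Schumacher', 'Hamilton', 'Button']
```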
Arithmetic coding
• The coding efficiency of Huffman coding is limited because it has to use an integer
number of bits l_i to represent each message possibility i.
• Divide the number line into N portions, where the ith portion has a width equal to
the probability p_i of the ith message possibility. For example, the N = 11 sums of throwing two dice:
Sum i    p_i     Portion of the number line
2        1/36    [0.0000000000, 0.0277777778)
3        2/36    [0.0277777778, 0.0833333333)
4        3/36    [0.0833333333, 0.1666666667)
5        4/36    [0.1666666667, 0.2777777778)
6        5/36    [0.2777777778, 0.4166666667)
7        6/36    [0.4166666667, 0.5833333333)
8        5/36    [0.5833333333, 0.7222222222)
9        4/36    [0.7222222222, 0.8333333333)
10       3/36    [0.8333333333, 0.9166666667)
11       2/36    [0.9166666667, 0.9722222222)
12       1/36    [0.9722222222, 1.0000000000)
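The repeated narrowing of the number line can be sketched as follows; exact fractions avoid rounding error, and the three-message sequence at the end is an illustrative choice, not the sequence encoded on the following slides.

```python
from fractions import Fraction

# p_i for the sums 2 to 12 of two dice, as in the table above
p = {s: Fraction(6 - abs(s - 7), 36) for s in range(2, 13)}

def narrow(lower, width, message):
    """Select the portion of [lower, lower + width) belonging to one message."""
    for s in range(2, message):
        lower += width * p[s]            # skip the portions below the message
    return lower, width * p[message]

lower, width = Fraction(0), Fraction(1)
for message in [2, 7, 4]:                # an assumed message sequence
    lower, width = narrow(lower, width, message)
print(float(lower), float(lower + width))  # the identified decimal range
```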
• Find the shortest binary fraction that represents a number in the identified decimal
range.
• This gives 1/2^7 + 1/2^8 + 1/2^11 + 1/2^18 + 1/2^20 + 1/2^21 + 1/2^22 + 1/2^26 + 1/2^27 + 1/2^30 + 1/2^31 = 0.0122125386.
• Since the binary fraction will always start with "0." we only transmit the bits after
the binary point, i.e. the bit sequence 0000001100100000010111000110011.
[Figure: the binary fraction is built one bit at a time. At each step, tentatively add the next power of 1/2 to the running total; if the total remains below the upper limit of the desired range then output a bit value of 1 and keep the addition, otherwise output a bit value of 0 and discard it, then branch to the next bit position. The successive approximations converge on the range 0.0122125382 to 0.0122125386.]
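One way to carry out the bit-by-bit search sketched in the figure, assuming the desired range is half-open at the top; the example range in the last two lines is an illustrative value, not the one from the slides.

```python
from fractions import Fraction

def shortest_binary_fraction(low, high):
    """Greedily build a binary fraction that lands inside [low, high)."""
    bits, total, n = "", Fraction(0), 0
    while total < low:                      # not yet inside the desired range
        n += 1
        candidate = total + Fraction(1, 2 ** n)
        if candidate < high:                # keep the bit: still below the
            total, bits = candidate, bits + "1"   # upper limit of the range
        else:
            bits += "0"                     # discard the bit
    return bits, total

bits, value = shortest_binary_fraction(Fraction(1, 100), Fraction(11, 1000))
print(bits, float(value))   # 0000001011 0.0107421875
```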
Arithmetic decoding
Step 1 Prepend "0." to the received bit sequence and convert the resulting binary
fraction into a decimal fraction.
Step 2 Divide a number line from 0 to 1 into portions according to the message
probabilities.
Step 3 Repeatedly select the portion having the range into which the decimal fraction
falls and divide the portion according to the message probabilities. Output the
messages that correspond to the selected portions and stop once the required
number of messages have been output (this assumes that the required number is
known to the receiver).
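A sketch of these decoding steps for the two-dice example, with Step 1 assumed to be the conversion of the received bits back into a decimal fraction; the bit sequence in the last line is an illustrative input that decodes to the sums 2, 7 and 4.

```python
from fractions import Fraction

# p_i for the sums 2 to 12 of two dice
p = {s: Fraction(6 - abs(s - 7), 36) for s in range(2, 13)}

def decode(bits, count):
    # Step 1 (assumed): convert the bits after the binary point into a fraction
    value = sum(Fraction(1, 2 ** (n + 1)) for n, b in enumerate(bits) if b == "1")
    # Steps 2 and 3: repeatedly select and subdivide the containing portion
    messages, low, width = [], Fraction(0), Fraction(1)
    for _ in range(count):
        for s in range(2, 13):
            portion = width * p[s]
            if value < low + portion:   # the fraction falls in this portion
                messages.append(s)
                width = portion         # zoom in on the selected portion
                break
            low += portion
    return messages

print(decode("00000011001", 3))   # [2, 7, 4]
```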
Exercise
Four friends, Hamilton, Button, Schumacher and Alonso, have a race every fortnight.
Hamilton tends to win most often and Alonso tends to win least often, as specified
by the probability p_i provided for each racer i in the table below. Some source coding
schemes have been devised to transmit the victor of a sequence of races.
i            p_i     c_i^FLC   c_i^Huff
Hamilton     0.5     00        0
Button       0.25    01        10
Schumacher   0.125   10        110
Alonso       0.125   11        111
3. Determine the coding efficiencies R associated with the fixed length codewords
c_i^FLC and the Huffman codewords c_i^Huff.
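A quick way to check the answer numerically, assuming the usual definition of coding efficiency R as the entropy H divided by the average codeword length:

```python
from math import log2

p = {"Hamilton": 0.5, "Button": 0.25, "Schumacher": 0.125, "Alonso": 0.125}
flc = {"Hamilton": "00", "Button": "01", "Schumacher": "10", "Alonso": "11"}
huff = {"Hamilton": "0", "Button": "10", "Schumacher": "110", "Alonso": "111"}

H = sum(pi * log2(1 / pi) for pi in p.values())      # entropy: 1.75 bits
for name, code in [("FLC", flc), ("Huffman", huff)]:
    avg = sum(p[m] * len(code[m]) for m in p)        # average codeword length
    print(f"{name}: average length {avg}, R = {H / avg}")
```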
Exercise continued
5. Determine the bit sequences that result when the sequence of victors
[Schumacher, Hamilton, Button, Hamilton] is represented using the fixed length
codewords c_i^FLC, the Huffman codewords c_i^Huff and an arithmetic code.
6. Determine the sequence of victors that is represented by the bit sequence 00110001,
which was obtained using the fixed length codewords c_i^FLC.
7. Draw a binary tree for the Huffman codewords c_i^Huff and use it to determine the
sequence of victors that is represented by the bit sequence 11111000.
8. Determine the sequence of four victors that is represented by the bit sequence
010111, which was obtained using an arithmetic code.