Chapter 2 Data Encoding Techniques
Encoding is the process of converting the data or
a given sequence of characters, symbols, alphabets
etc., into a specified format, for the secured
transmission of data.
Decoding is the reverse process of encoding which
is to extract the information from the converted
format.
Data Encoding
Encoding is the process of using various patterns of
voltage or current levels to represent 1s and 0s of the
digital signals on the transmission link.
The common types of line encoding are Unipolar, Polar,
Bipolar, and Manchester.
Encoding Techniques
The data encoding technique is divided into the following
types, depending upon the type of data conversion.
Analog data to Analog signals − The modulation
techniques such as Amplitude Modulation AM, Frequency
Modulation FM and Phase Modulation PM of analog signals,
fall under this category. [Assignment for G1]
Analog data to Digital signals − This process can be
termed as digitization, which is done by Pulse Code
Modulation PCM. Hence, it is nothing but digital modulation.
Sampling and quantization are the important factors in this.
Delta Modulation gives a better output than PCM. [Assignment
for G2]
Digital data to Analog signals − The modulation
techniques such as Amplitude Shift Keying ASK, Frequency
Shift Keying FSK, Phase Shift Keying PSK, etc., fall under this
category. [Assignment G3]
Digital data to Digital signals − These are discussed in this
section. There are several ways to map digital data to digital
signals. Some of them are NRZ encoding, biphase encoding,
and block encoding. [Assignment for G4 on block encoding].
2-1 DIGITAL-TO-DIGITAL CONVERSION
Line Coding
Converting a string of 1’s and 0’s
(digital data) into a sequence of
signals that denote the 1’s and 0’s.
For example a high voltage level
(+V) could represent a “1” and a
low voltage level (0 or -V) could
represent a “0”.
Figure 2.1 Line coding and decoding
Mapping Data symbols
onto Signal levels
Relationship between data
rate and signal rate
The data rate defines the number of bits
sent per second (bps). It is often referred to
as the bit rate.
The signal rate is the number of signal
elements sent in a second and is
measured in bauds. It is also referred to
as the modulation rate.
Goal is to increase the data rate whilst
reducing the baud rate.
Figure 2.2 Signal element versus data element
Data rate and Baud rate
The baud or signal rate can be expressed
as:
S = c × N × (1/r) baud, where
N is the data rate (bps)
c is the case factor (worst, best, or average case)
Note: c = 1/2 for the average case, since the worst case is 1 and the
best case is 0
r is the number of data elements carried by
each signal element
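The formula above can be checked with a few lines of Python. This is only a sketch: the function name `signal_rate` and the 1 Mbps figure are illustrative assumptions, not taken from the slides.

```python
def signal_rate(data_rate_bps, c=0.5, r=1.0):
    """Return the signal rate S = c * N * (1/r) in baud.

    data_rate_bps is N, c is the case factor, and r is the number of
    data elements carried per signal element.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    return c * data_rate_bps / r


# Average case (c = 1/2) with one data element per signal element (r = 1).
print(signal_rate(1_000_000))  # 500000.0 baud
```

Doubling r halves the baud rate for the same data rate, which is the goal stated earlier: more bits per second for fewer signal elements per second.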
Example 2.1
Solution
We assume that the average value of c is 1/2 . The baud
rate is then
Considerations for choosing a good signal
element referred to as line encoding
Baseline wandering - a receiver will evaluate
the average power of the received signal
(called the baseline) and use that to determine
the value of the incoming data elements.
If the incoming signal does not vary over a long period
of time, the baseline will drift and thus cause errors in
detection of incoming data elements.
A good line encoding scheme will prevent long runs of
fixed amplitude.
Line encoding C/Cs
DC components - when the voltage
level remains constant for long
periods of time, there is an increase in
the low frequencies of the signal. Most
channels are bandpass and may not
support the low frequencies.
This will require the removal of the dc
component of a transmitted signal.
Line encoding C/Cs
Self synchronization - the clocks at
the sender and the receiver must
have the same bit interval.
If the receiver clock is faster or slower
it will misinterpret the incoming bit
stream.
Figure 2.3 Effect of lack of synchronization
Example 2.2
Line encoding C/Cs
Error detection - errors occur during
transmission due to line impairments.
Some codes are constructed such that when
an error occurs it can be detected. For
example: a particular signal transition is not
part of the code. When it occurs, the receiver
will know that a symbol error has occurred.
It is good practice to add extra bits to the transmitted data for error
detection (and possibly correction).
Line encoding
Noise and interference immunity -
there are line encoding techniques
that make the transmitted signal
“immune” to noise and interference.
This means the signal resists corruption
in the first place, which is stronger than
detecting errors after they occur.
Encoding/decoding complexity: a more complex
scheme costs more to implement.
Line encoding C/Cs
Complexity - the more robust and
resilient the code, the more
complex it is to implement and the
price is often paid in baud rate or
required bandwidth.
Figure 2.4 Line coding schemes
Unipolar
All signal levels are on one side of the time
axis - either above or below
NRZ - Non Return to Zero scheme is an
example of this code.
The signal level does not return to zero during
a symbol transmission.
Scheme is prone to baseline wandering
and DC components. It has no
synchronization or any error detection. It is
simple but costly in power consumption.
Figure 2.5 Unipolar NRZ scheme
Polar - NRZ
The voltages are on both sides of the time
axis.
Polar NRZ scheme can be implemented
with two voltages. E.g. +V for 1 and -V for
0.
There are two versions:
NRZ - Level (NRZ-L) - positive voltage for one
symbol and negative for the other
NRZ - Inversion (NRZ-I) - the change or lack of
change in polarity determines the value of a
symbol. E.g. a “1” symbol inverts the polarity; a
“0” does not.
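The two polar NRZ versions can be sketched as follows. The mapping of +V to 1 and -V to 0 follows the example above; the starting level for NRZ-I and the function names are assumptions made for illustration. Levels are given as +1 and -1, one value per bit.

```python
def nrz_l(bits):
    """NRZ-L: the level itself encodes the bit (+1 for '1', -1 for '0')."""
    return [+1 if b == "1" else -1 for b in bits]


def nrz_i(bits, start_level=+1):
    """NRZ-I: a '1' inverts the current level, a '0' leaves it unchanged.

    start_level is the assumed line level before the first bit.
    """
    level = start_level
    out = []
    for b in bits:
        if b == "1":
            level = -level
        out.append(level)
    return out


print(nrz_l("01001110"))  # [-1, 1, -1, -1, 1, 1, 1, -1]
print(nrz_i("01001110"))  # [1, -1, -1, -1, 1, -1, 1, 1]
```

A long run of 0s gives a constant level in both versions. A long run of 1s does so only in NRZ-L, because each 1 forces a transition in NRZ-I.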
Figure 2.6 Polar NRZ-L and NRZ-I schemes
Example 2.3
Solution
The average signal rate is S = c × N × (1/r) = 1/2 × N × 1 =
500 kbaud. The minimum bandwidth for this average
baud rate is Bmin = S = 500 kHz.
Note c = 1/2 for the avg. case as worst case is 1 and best
case is 0
Polar - RZ
The Return to Zero (RZ) scheme uses
three voltage values. +, 0, -.
Each symbol has a transition in the
middle. Either from high to zero or from
low to zero.
This scheme has more signal transitions
(two per symbol) and therefore requires a
wider bandwidth.
No DC components or baseline wandering.
Self synchronization - transition indicates
symbol value.
More complex, as it uses three voltage levels.
It has no error detection capability.
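A minimal sketch of polar RZ, representing each bit as two half-bit samples. The choice of a positive pulse for 1 and a negative pulse for 0 is an assumption, as is the function name.

```python
def polar_rz(bits):
    """Polar RZ: first half of the bit is +1 ('1') or -1 ('0'),
    second half always returns to 0."""
    out = []
    for b in bits:
        out += [+1 if b == "1" else -1, 0]
    return out


print(polar_rz("010"))  # [-1, 0, 1, 0, -1, 0]
```

Each bit produces two signal elements, which is why RZ needs more bandwidth than NRZ for the same data rate.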
Figure 2.7 Polar RZ scheme
Polar - Biphase: Manchester and Differential Manchester
Manchester coding consists of combining the NRZ-L and
RZ schemes.
Every symbol has a level transition in the middle: from high
to low or low to high. Uses only two voltage levels.
Differential Manchester coding consists of combining the
NRZ-I and RZ schemes.
Every symbol has a level transition in the middle, but
whether there is also a transition at the beginning of the
symbol depends on the symbol value.
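Both biphase schemes can be sketched with two half-bit samples per bit. The polarity conventions here are assumptions, since the opposite conventions are also in use: for Manchester, 1 is low-to-high and 0 is high-to-low; for differential Manchester, a 0 adds a transition at the start of the bit and a 1 does not.

```python
def manchester(bits):
    """Manchester: '1' -> low then high, '0' -> high then low (assumed)."""
    out = []
    for b in bits:
        out += [-1, +1] if b == "1" else [+1, -1]
    return out


def differential_manchester(bits, start_level=+1):
    """Differential Manchester with a mandatory mid-bit transition.

    A '0' also inverts the level at the start of the bit (assumed
    convention). start_level is the assumed level before the first bit.
    """
    level = start_level
    out = []
    for b in bits:
        if b == "0":
            level = -level      # extra transition at the bit boundary
        out.append(level)
        level = -level          # transition in the middle of every bit
        out.append(level)
    return out


print(manchester("10"))                # [-1, 1, 1, -1]
print(differential_manchester("010"))  # [-1, 1, 1, -1, 1, -1]
```

In both outputs the two halves of every bit differ, which is what gives the receiver a clock edge in each bit interval.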
Figure 2.8 Polar biphase: Manchester and differential Manchester schemes
Bipolar - AMI and
Pseudoternary
The code uses three voltage levels (+, 0, -) to
represent the symbols themselves (not as
transitions to zero, as in RZ).
The voltage level for one symbol is 0,
and the other symbol alternates between + and -.
Bipolar Alternate Mark Inversion (AMI) -
the “0” symbol is represented by zero
voltage and the “1” symbol alternates
between +V and -V.
Pseudoternary is the reverse of AMI: the “1” symbol is
represented by zero voltage and the “0” symbol alternates
between +V and -V.
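The two bipolar schemes differ only in which symbol alternates, so one helper can serve both. This is a sketch; the helper name `bipolar` and the choice of +1 as the first pulse polarity are assumptions.

```python
def bipolar(bits, mark, first_polarity=+1):
    """Encode `mark` bits as alternating +1/-1 pulses and all other bits as 0."""
    polarity = first_polarity
    out = []
    for b in bits:
        if b == mark:
            out.append(polarity)
            polarity = -polarity
        else:
            out.append(0)
    return out


def ami(bits):
    """AMI: '0' is zero voltage, '1' alternates between +V and -V."""
    return bipolar(bits, mark="1")


def pseudoternary(bits):
    """Pseudoternary: '1' is zero voltage, '0' alternates between +V and -V."""
    return bipolar(bits, mark="0")


print(ami("010010"))            # [0, 1, 0, 0, -1, 0]
print(pseudoternary("010010"))  # [1, 0, -1, 1, 0, -1]
```

Because successive pulses cancel, the running sum of the levels never moves more than one pulse away from zero, which is why these codes have no DC component.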
Figure 2.9 Bipolar schemes: AMI and pseudoternary
Bipolar C/Cs
It is a better alternative to NRZ.
Has no DC component or baseline
wandering.
Has no self-synchronization,
because long runs of “0”s result
in no signal transitions.
No error detection.
Assignment for G5
2.2 Digital Data Communication Techniques
DATA TRANSMISSION MODES
Transmission of digital data through a transmission medium
can be performed either in serial or in parallel mode.
In the serial mode, one bit is sent per clock tick, whereas in
parallel mode multiple bits are sent per clock tick.
Serial transmission is further divided into asynchronous and
synchronous subclasses, as shown in the figure.
Asynchronous and Synchronous Transmission
Receiver samples the medium at the center of each bit
time.
Transmitter’s and receiver’s clocks may not be precisely
aligned.
In Synchronous Transmission,
data is sent in the form of blocks or frames, and transmission is of the full-duplex type.
Synchronization between sender and receiver is compulsory.
There are no gaps between data units, so it is more efficient and more reliable than
asynchronous transmission for transferring large amounts of data.
Examples: chat rooms, telephone conversations, video conferencing
In Asynchronous Transmission,
data is sent in the form of bytes or characters.
Transmission is of the half-duplex type.
Start bits and stop bits are added to the data.
It does not require synchronization.
Examples: email, forums, and letters
Asynchronous and Synchronous Transmission (2)
Multiple-bit Error
Burst Error
More than one consecutive bit is corrupted in the received frame.
Error Detection
Vertical Redundancy Check (VRC): adds a parity bit to every
data unit so that the total number of 1s becomes even (for even
parity checking) or odd (for odd parity checking).
Even Parity Check: the sender computes a parity bit for each
data block:
1 is added as the parity bit if the data block has an odd
number of 1s.
0 is added as the parity bit if the data block has
an even number of 1s.
This makes the total number of 1s even, which is why the procedure is
known as even parity checking.
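The even parity rule above can be sketched in a few lines. The function names and the sample data unit are illustrative assumptions.

```python
def even_parity_bit(data):
    """Return '1' if data has an odd number of 1s, otherwise '0'."""
    return "1" if data.count("1") % 2 else "0"


def add_even_parity(data):
    """Sender side: append the parity bit to the data unit."""
    return data + even_parity_bit(data)


def check_even_parity(codeword):
    """Receiver side: accept only if the total number of 1s is even."""
    return codeword.count("1") % 2 == 0


sent = add_even_parity("1100001")     # three 1s, so the parity bit is 1
print(sent)                           # 11000011
print(check_even_parity(sent))        # True
print(check_even_parity("01000011"))  # False: one bit flipped
print(check_even_parity("00000011"))  # True: two bits flipped, undetected
```

The last line shows the limitation of a single parity bit: flipping two bits leaves the count of 1s even, so the check still passes.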
Error Detection
Even Parity Check
Disadvantage:
This method detects single-bit errors
(and, more generally, any odd number of
bit errors), but it fails when an even
number of bits are corrupted.
Longitudinal Redundancy Check(LRC)
LRC is also known as 2-D parity check. A block of bits is divided into a
table (matrix) of rows and columns.
To detect errors, a redundant row of bits is added to the
block, and the whole block is transmitted to the receiver. The receiver uses
this redundant row to detect errors. After checking the data for
errors, the receiver accepts the data and discards the redundant row
of bits.
Example:
If a block of 32 bits is to be transmitted, it is divided into a matrix of
four rows and eight columns, as shown in the following
figure:
Longitudinal Redundancy Check(LRC)
In this matrix of bits, a parity bit (odd or even) is calculated for each column. This
means 32 data bits plus 8 redundant bits are transmitted to the receiver. When the
data reaches the destination, the receiver uses the LRC to detect errors.
Example: Suppose the 32-bit data plus LRC being transmitted is hit by a
burst error of length 5 and some bits are corrupted, as shown in the following
figure:
The LRC received by the destination does not match the LRC newly computed from the corrupted data.
The destination therefore knows that the data is erroneous and discards it.
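The column-parity calculation can be sketched as below, using even parity. The four 8-bit rows and the position of the burst are made-up sample values, not the contents of the lost figure, and `lrc` is a hypothetical function name.

```python
def lrc(rows):
    """Return the even-parity bit of every column as a bit string."""
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")
    return "".join(
        "1" if sum(row[i] == "1" for row in rows) % 2 else "0"
        for i in range(width)
    )


rows = ["11100111", "11011101", "00111001", "10101001"]
sent_lrc = lrc(rows)
print(sent_lrc)  # 10101010

# Burst error of length 5: bits 3 to 7 of the second row are flipped.
corrupted = ["11100111", "11100011", "00111001", "10101001"]
print(lrc(corrupted))              # 10010100
print(lrc(corrupted) == sent_lrc)  # False, so the block is discarded
```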
Longitudinal Redundancy Check(LRC)
Disadvantage :
The main problem with LRC is that it cannot detect an error
if two bits in one data unit are damaged and two bits in exactly
the same positions in another data unit are also damaged.
Example: data 110011 010101 is changed to
010010 110100.
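The example above can be verified directly: the first and last bits are flipped in both data units, so every column parity is unchanged. This is a small sketch with a hypothetical helper name, `column_parity`, using even parity.

```python
def column_parity(units):
    """Even-parity bit for each column of equally sized data units."""
    return "".join(
        str(sum(int(unit[i]) for unit in units) % 2)
        for i in range(len(units[0]))
    )


sent = ["110011", "010101"]
received = ["010010", "110100"]   # bits 1 and 6 flipped in both units

print(column_parity(sent))      # 100110
print(column_parity(received))  # 100110, identical, so the error is missed
```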
Checksum
Checksum is an error detection method that divides
the data into segments of equal size, adds the segments using 1's
complement arithmetic, and transmits the complemented sum
with the data. The receiver repeats the process, and a result of
all zeros indicates that the data is correct.
The data is first divided into k segments of m bits each.
At the sender's side, all segments are added
using 1's complement arithmetic. The sum is then complemented to
obtain the checksum.
The checksum segment is transmitted along with the data
segments.
At the receiver's side, all received segments (including the checksum) are added
using 1's complement arithmetic, and the sum is
complemented.
The received data is accepted only if the result is
0. If the result is not 0, the data is discarded.
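The sender and receiver steps above can be sketched as follows, with segments held as integers. The segment width m = 8, the four sample segments, and the function names are assumptions for illustration.

```python
def ones_complement_sum(segments, m):
    """Add m-bit segments, wrapping any carry back into the low bits."""
    mask = (1 << m) - 1
    total = 0
    for seg in segments:
        total += seg
        total = (total & mask) + (total >> m)   # end-around carry
    return total


def make_checksum(segments, m):
    """Sender: complement of the 1's complement sum."""
    return ~ones_complement_sum(segments, m) & ((1 << m) - 1)


def is_valid(received, m):
    """Receiver: sum everything (data + checksum), complement, expect 0."""
    return make_checksum(received, m) == 0


data = [0b10011001, 0b11100010, 0b00100100, 0b10000100]
checksum = make_checksum(data, 8)
print(format(checksum, "08b"))            # 11011010
print(is_valid(data + [checksum], 8))     # True

damaged = [0b10011000] + data[1:] + [checksum]
print(is_valid(damaged, 8))               # False
```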
Checksum: Example
Checksum
Disadvantages: The checksum does not
detect an error if one or more bits of one
sub-unit are corrupted and the
corresponding bits of opposite value
in another sub-unit are also
corrupted. The error goes undetected
because the sum of the columns is not
affected by the corrupted bits.
Cyclic Redundancy Check-CRC
The checksum scheme uses addition, whereas CRC uses
binary division. A bit sequence, commonly known as the cyclic
redundancy check, is appended to the end of the data unit, so that
the resulting data unit is exactly divisible by a second,
predetermined binary number.
At the receiver's side, the incoming data unit is divided
by the same number. The data unit is accepted as
correct only if the remainder of this division
is zero. A non-zero remainder shows that the data is not correct, so
the data unit must be discarded.
Disadvantages: Cyclic Redundancy Check may lead to overflow
of data.
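The binary (modulo-2) division behind CRC can be sketched on bit strings. The data word 100100 and the divisor 1101 are illustrative values, not taken from the slides, and the function names are hypothetical.

```python
def mod2_remainder(bits, divisor):
    """Return the remainder of modulo-2 (XOR) long division."""
    work = list(bits)
    n = len(divisor)
    for i in range(len(work) - n + 1):
        if work[i] == "1":                      # divisor "goes into" this position
            for j in range(n):
                work[i + j] = "0" if work[i + j] == divisor[j] else "1"
    return "".join(work[-(n - 1):])


def crc_encode(data, divisor):
    """Sender: append n-1 zeros, divide, and attach the remainder."""
    padded = data + "0" * (len(divisor) - 1)
    return data + mod2_remainder(padded, divisor)


def crc_check(codeword, divisor):
    """Receiver: accept only if the remainder is all zeros."""
    return "1" not in mod2_remainder(codeword, divisor)


codeword = crc_encode("100100", "1101")
print(codeword)                          # 100100001
print(crc_check(codeword, "1101"))       # True
print(crc_check("100101001", "1101"))    # False: one bit flipped
```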
Cyclic Redundancy Check-CRC-Examples
Error Correction
Error Correction codes are used to detect and correct the errors
when data is transmitted from the sender to the receiver.
Error Correction can be handled in two ways:
Backward error correction: Once the error is discovered, the
receiver requests the sender to retransmit the entire data unit.
Forward error correction: In this case, the receiver uses the
error-correcting code which automatically corrects the errors.
A single additional bit can detect the error, but cannot correct
it.
To correct an error, one has to know its exact position.
For example, to correct a single-bit
error in a seven-bit data unit, the error correction code must
determine which one of the seven bits is in error.
To achieve this, we have to add some additional redundant
bits.
Error Correction
Suppose r is the number of redundant bits and d is the number of
data bits. The number of redundant bits r can be calculated
using the formula:
$2^r \geq d + r + 1$
For example, if d is 4, then the smallest value of r that satisfies the
above relation is 3.
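The smallest r satisfying the relation can be found by simple search. This is a sketch; `redundant_bits` is a hypothetical name.

```python
def redundant_bits(d):
    """Smallest r such that 2**r >= d + r + 1."""
    r = 0
    while 2 ** r < d + r + 1:
        r += 1
    return r


print(redundant_bits(4))  # 3
print(redundant_bits(7))  # 4
```

For d = 4, r = 2 fails because 4 < 7, while r = 3 gives 8 >= 8.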
Hamming code determines the position of the bit that is in error. It
can be applied to data units of any length and uses the relationship
between data bits and redundant bits.
Hamming Code
Parity bits: The bit which is appended to the original data of binary
bits so that the total number of 1s is even or odd.
Even parity: if the total number of 1s is even, then the value of the
parity bit is 0. If the total number of 1s is odd, then the
value of the parity bit is 1.
Odd parity: if the total number of 1s is even, then the value of the parity
bit is 1. If the total number of 1s is odd, then the value of the parity bit is 0.
Hamming Code
Algorithm of Hamming code:
The d information bits are combined with the r redundant bits to form a unit of d+r bits.
The location of each of the (d+r) bits is assigned a decimal value.
The r bits are placed in positions 1, 2, ..., $2^{k-1}$.
At the receiving end, the parity bits are recalculated. The decimal value
of the parity bits determines the position of an error.
Relationship between error position and binary number.
Hamming Code
Determining the position of the redundant bits
The number of redundant bits is 3. The three bits are represented
by r1, r2, r4. The positions of the redundant bits
correspond to powers of 2. Therefore, their
positions are $2^0$, $2^1$, $2^2$, i.e. 1, 2, 4.
The position of r1 = 1
The position of r2 = 2
The position of r4 = 4
Representation of Data on the addition of parity bits:
Hamming Code
Determining r1 bit
The r1 bit is calculated by performing a parity check on the bit positions
whose binary representation includes 1 in the first position.
We observe from the above figure that the bit positions that include 1
in the first position are 1, 3, 5, 7. Now, we perform the even-parity
check at these bit positions. The total number of 1s at these bit positions
corresponding to r1 is even; therefore, the value of the r1 bit is 0.
Determining r2 bit
The r2 bit is calculated by performing a parity check on the bit positions
whose binary representation includes 1 in the second position.
We observe from the above figure that the bit positions that include 1
in the second position are 2, 3, 6, 7. Now, we perform the even-parity
check at these bit positions. The total number of 1s at these bit positions
corresponding to r2 is odd; therefore, the value of the r2 bit is 1.
Hamming Code
Determining r4 bit
The r4 bit is calculated by performing a parity check on the bit
positions whose binary representation includes 1 in the third
position.
We observe from the above figure that the bit positions that
include 1 in the third position are 4, 5, 6, 7. Now, we perform the
even-parity check at these bit positions. The total number of 1s at
these bit positions corresponding to r4 is even; therefore, the
value of the r4 bit is 0.
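The three parity calculations can be sketched as one loop over positions 1, 2 and 4. The data word 1010, placed at positions 7, 6, 5, 3, is an assumption: it is not stated in the text, but it reproduces the values r1 = 0, r2 = 1, r4 = 0 derived above. The function name is hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits with even parity at positions 1, 2 and 4.

    data is a 4-character bit string; its leftmost bit goes to position 7.
    The result is written from position 7 down to position 1.
    """
    word = [0] * 8                       # index 0 unused; positions 1..7
    for pos, bit in zip((7, 6, 5, 3), data):
        word[pos] = int(bit)
    for p in (1, 2, 4):
        # Positions whose binary representation has the bit p set.
        covered = [pos for pos in range(1, 8) if pos & p and pos != p]
        word[p] = sum(word[pos] for pos in covered) % 2
    return "".join(str(word[pos]) for pos in range(7, 0, -1))


print(hamming74_encode("1010"))  # 1010010
```

Reading the output from right to left gives r1 = 0 at position 1, r2 = 1 at position 2, and r4 = 0 at position 4.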
Hamming Code
Data transferred is given below:
Suppose the 4th bit is changed from 0 to 1 at the receiving end,
then parity bits are recalculated.
R1 bit: The bit positions of the r1 bit are 1,3,5,7
Hamming Code
R4 bit: The bit positions of r4 bit are 4,5,6,7.
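The receiver's recalculation can be sketched as below. Each failed parity check contributes its position value (1, 2 or 4), and the total points at the bit in error. The received word 1011010 is an assumed example: the codeword 1010010 with the bit at position 4 flipped from 0 to 1. The function name is hypothetical.

```python
def hamming74_correct(received):
    """Return (error_position, corrected_word) for a 7-bit received word.

    received is written from position 7 down to position 1.
    error_position is 0 when all three even-parity checks pass.
    """
    word = [0] + [int(b) for b in reversed(received)]   # word[pos], pos 1..7
    syndrome = 0
    for p in (1, 2, 4):
        if sum(word[pos] for pos in range(1, 8) if pos & p) % 2:
            syndrome += p                 # this parity group fails
    if syndrome:
        word[syndrome] ^= 1               # flip the bit in error
    return syndrome, "".join(str(word[pos]) for pos in range(7, 0, -1))


print(hamming74_correct("1011010"))  # (4, '1010010')
print(hamming74_correct("1010010"))  # (0, '1010010')
```

Here the checks over positions 1, 3, 5, 7 and 2, 3, 6, 7 pass, and only the check over positions 4, 5, 6, 7 fails, so the error position is 4.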
Assignment
Assignment for G6
Implement VRC and LRC using C with appropriate menu
driven program
Assignment for G7
Implement checksum and CRC using C with appropriate
menu driven program
Assignment for G8
Briefly discuss Encapsulation, Decapsulation and the
roles of each of the OSI model layers. Also, list
networking devices, equipment and common protocols
in each layer.
Exam:Dec 10