Coding Systems Student Notes
Coding Systems
A coding system is the set of patterns of 0s and 1s used to represent
characters.
The three most popular coding schemes used to represent data are ASCII, EBCDIC
and Unicode.
Bits    Codes
1       2
2       4
3
4
5
6
Exercise:
1. How many codes can be represented using 5 bits?
2. How many bits do I need to represent the 10 different digits?
3. How many bits do I need to represent the 10 fingers together with the 10 toes?
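The pattern in the table (each extra bit doubles the number of codes) can be checked with a short Python sketch. The function names codes_for_bits and bits_for_codes are made up for this illustration, and the sample values are deliberately different from the exercise questions.

```python
def codes_for_bits(bits: int) -> int:
    """Number of distinct 0/1 patterns available with `bits` bits (2 to the power bits)."""
    if bits < 0:
        raise ValueError("bits must not be negative")
    return 2 ** bits


def bits_for_codes(codes: int) -> int:
    """Smallest number of bits whose patterns can label `codes` different items."""
    if codes < 1:
        raise ValueError("codes must be at least 1")
    # (codes - 1).bit_length() is the smallest n with 2**n >= codes (for codes > 1).
    return max(1, (codes - 1).bit_length())


print(codes_for_bits(7))   # 128
print(codes_for_bits(8))   # 256
print(bits_for_codes(26))  # 5 bits are enough for 26 letters
```

The second function rounds up: 26 letters do not fit in 4 bits (16 codes), so 5 bits (32 codes) are needed and some codes are left unused.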
Characters
Characters include the digits 0 to 9 (the numeric characters), letters (the
alphabetic characters) and punctuation marks and other symbols (the special
characters). The numeric and alphabetic characters together are called
alphanumeric characters. Some examples of characters are a, b, c, A, B, C, 0, 1,
2, 9, &, $ and *.
Character Code
Each individual character is represented by a single code number, the character
code. Character codes are the binary patterns used to represent the character
set, a list of all the characters that a computer can process and store. Any
textual data is stored as a sequence of these codes. When the data is displayed
or printed, each code is converted into the appropriate shape.
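As a rough illustration of text being stored as a sequence of codes, the Python sketch below turns a short string into its code numbers and 8-bit patterns and back again. It relies on Python's built-in ord and chr, which work with Unicode code points; for the characters used here these match the ASCII codes. The helper names are invented for this example.

```python
def to_codes(text: str) -> list[int]:
    """Return the character code of every character in the text."""
    return [ord(ch) for ch in text]


def to_bit_patterns(text: str, width: int = 8) -> list[str]:
    """Return each character code written as a binary pattern of `width` bits."""
    return [format(ord(ch), f"0{width}b") for ch in text]


def from_codes(codes: list[int]) -> str:
    """Convert a sequence of character codes back into text."""
    return "".join(chr(code) for code in codes)


print(to_codes("Cab 9"))         # [67, 97, 98, 32, 57]
print(to_bit_patterns("A9"))     # ['01000001', '00111001']
print(from_codes([67, 97, 98]))  # Cab
```

Note that the space is a character too (code 32), and that the digit 9 is stored as code 57, not as the number 9.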
Page 1 of 3 K Aquilina
Student Notes Theory
Character Sets
Different types of computer have slightly different character sets. Many character sets
also have codes for control characters. These non-printing characters are used for
special purposes.
Examples of control characters are end-of-record and end-of-file markers in a file,
carriage return and line feed for a printer, begin and end transmission for a modem, and
cursor movement on a screen. Over the years, different computer designers have used
different sets of codes for representing characters, which has led to great difficulty in
transferring information from one computer to another. Most computers nowadays use
internationally agreed character sets. Unless the coding scheme (character set) is
standard, it will not be possible to exchange textual data between computers.
ASCII
The 128 different combinations that can be represented in 7 bits are plenty to allow for
all the letters of the English alphabet, the digits and the special symbols. An eighth bit
was later added, which allowed an extra 128 characters to be represented. The extra 128
combinations are used for symbols such as Ç, ü, è, ©, ®, Æ, etc.
In ASCII, the codes for the alphabetic characters follow their relative positions in the
alphabet. This ordering is known as the collating sequence; thus, sorting textual items
can be reduced to sorting the corresponding character codes. Also, in ASCII, uppercase
characters, lowercase characters and digits are each grouped together, so it is easy to
map between upper and lower case characters.
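The two ideas above (sorting by code and mapping between cases) can be sketched in Python. The names CASE_OFFSET, to_upper_ascii and sort_by_codes are chosen for this illustration only; in ASCII each lowercase letter is exactly 32 codes above its uppercase partner.

```python
# Distance between a lowercase letter and its uppercase partner in ASCII.
CASE_OFFSET = ord("a") - ord("A")  # 97 - 65 = 32


def to_upper_ascii(ch: str) -> str:
    """Map a lowercase ASCII letter to uppercase by subtracting the offset."""
    if "a" <= ch <= "z":
        return chr(ord(ch) - CASE_OFFSET)
    return ch  # anything else is left unchanged


def sort_by_codes(words: list[str]) -> list[str]:
    """Sort words by comparing their lists of character codes."""
    return sorted(words, key=lambda word: [ord(ch) for ch in word])


print(CASE_OFFSET)                                     # 32
print("".join(to_upper_ascii(ch) for ch in "ascii7"))  # ASCII7
print(sort_by_codes(["pear", "Apple", "banana", "42"]))
# ['42', 'Apple', 'banana', 'pear']
```

Because the digits come before the uppercase letters, and the uppercase letters before the lowercase ones, "42" sorts first and "Apple" sorts before "banana". Sorting by codes is therefore not quite the same as dictionary order when the cases are mixed.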
The characters with codes 0 through 31 are known as control characters (because
historically they were used to control teletype operations). They are referred to by their
abbreviations (CR for carriage return, LF for linefeed, ESC for escape, and so on) or by
the word "Ctrl" followed by a corresponding letter (meaning the letter produced by
adding 64 to the control code). For example, the control character with ASCII code 7 is
known as BEL or Ctrl-G.
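The "add 64" rule can be tried out with a few lines of Python. The function name ctrl_name is hypothetical and only used for this sketch.

```python
def ctrl_name(code: int) -> str:
    """Return the "Ctrl-" name of an ASCII control code (0 to 31)."""
    if not 0 <= code <= 31:
        raise ValueError("control codes lie between 0 and 31")
    # Adding 64 moves the code into the range that starts at '@' (64), 'A' (65), ...
    return "Ctrl-" + chr(code + 64)


print(ctrl_name(7))   # Ctrl-G  (BEL)
print(ctrl_name(10))  # Ctrl-J  (LF, line feed)
print(ctrl_name(13))  # Ctrl-M  (CR, carriage return)
print(ctrl_name(27))  # Ctrl-[  (ESC, escape)
```

Not every control code maps to a letter: codes 0 and 27 to 31 land on symbols such as @ and [.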
EBCDIC
The Extended Binary Coded Decimal Interchange Code is an 8-bit code which therefore
permits 2^8 = 256 distinct characters. This code allowed for the coding of international
characters. EBCDIC has a wider range of control characters than ASCII.
The important difference between the two coding systems lies in the 8-bit combinations
assigned to represent the various alphabetic, numeric, and special characters. When
using 8-bit ASCII, you will notice that the bit patterns chosen for the characters
differ from those used in EBCDIC. For example, let's look at the characters D, P and 3 in
both EBCDIC and ASCII to see how they compare.
Character   D           P           3
EBCDIC      1100 0100   1101 0111   1111 0011
ASCII       0100 0100   0101 0000   0011 0011
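The table can be reproduced with Python's built-in codecs. EBCDIC exists in several variants (code pages); this sketch assumes the "cp037" code page that ships with Python, which agrees with the table for these three characters. The helper name bit_pattern is invented for this example.

```python
def bit_pattern(ch: str, encoding: str) -> str:
    """Return the 8-bit pattern of one character, split into two groups of four."""
    (byte,) = ch.encode(encoding)  # each of these characters encodes to one byte
    bits = format(byte, "08b")
    return bits[:4] + " " + bits[4:]


for ch in "DP3":
    # "cp037" is an assumed EBCDIC variant; other code pages may differ.
    print(ch, bit_pattern(ch, "cp037"), bit_pattern(ch, "ascii"))
# D 1100 0100 0100 0100
# P 1101 0111 0101 0000
# 3 1111 0011 0011 0011
```

The same character has a different bit pattern in each scheme, which is why text must be converted when it moves between an EBCDIC system and an ASCII system.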