Ch01 - Lecture 5 ASCII and UNI Codes

The document discusses ASCII and Unicode character encoding systems, detailing how ASCII uses 7 bits to represent 128 characters and how Unicode provides a unique code point for each character across multiple languages. It also explains the concept of parity bits for error detection in data communication and introduces Gray codes to minimize errors in binary counting. The document highlights the importance of these encoding systems in ensuring accurate data representation and communication.


ASCII and UNICODES

Dr. Khursheed Aurangzeb


ASCII Character Code

 The standard binary code for the alphanumeric
characters is called ASCII (American Standard Code for
Information Interchange)
 It uses seven bits to code 128 characters, as shown in
Table 1-5. The seven bits of the code are designated
by B1 through B7, with B7 being the most significant
bit
 Note that the most significant three bits of the code
determine the column of the table and the least
significant four bits the row of the table
ASCII Character Code

 The letter A, for example, is represented in ASCII as
1000001 (column 100, row 0001)
 The ASCII code contains 94 characters that can be
printed and 34 nonprinting characters used for various
control functions
 The printing characters consist of the 26 uppercase
letters, the 26 lowercase letters, the 10 numerals, and
32 special printable characters such as %, @, and $
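The column/row split and the character counts above can be checked with a short Python sketch. The helper name `ascii_column_row` is made up for illustration. The count assumes that space (0x20) and DEL (0x7F) are counted with the 34 nonprinting codes, which is what makes the totals come out to 94 + 34 = 128.

```python
import string

def ascii_column_row(ch):
    """Split a 7-bit ASCII code into (column, row) bit strings."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    column = code >> 4       # most significant three bits: B7 B6 B5
    row = code & 0b1111      # least significant four bits: B4 B3 B2 B1
    return format(column, "03b"), format(row, "04b")

print(format(ord("A"), "07b"))   # 1000001
print(ascii_column_row("A"))     # ('100', '0001')

# Assumption: the printing characters are codes 0x21 through 0x7E,
# so space and DEL fall among the nonprinting codes.
printing = [chr(c) for c in range(0x21, 0x7F)]
counts = {
    "uppercase": sum(c in string.ascii_uppercase for c in printing),
    "lowercase": sum(c in string.ascii_lowercase for c in printing),
    "numerals": sum(c in string.digits for c in printing),
    "special": sum(c in string.punctuation for c in printing),
}
print(counts, len(printing))
```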
ASCII Codes
ASCII Codes Cont’d…
ASCII Character Code

 ASCII is a 7-bit code, but most computers manipulate an
8-bit quantity as a single unit called a byte
 Therefore, ASCII characters most often are stored one
per byte, with the most significant bit set to 0
 The extra bit is sometimes used for specific purposes,
e.g., some printers recognize an additional 128 8-bit
characters, with the most significant bit set to 1, that
let the printer produce additional symbols, such as
those from the Greek alphabet or characters with accent
marks as used in languages other than English
Unicode

 Unicode was developed as an industry standard for
providing a common representation of symbols and
ideographs for most of the world’s languages
 By providing a standard representation for different
languages, Unicode removes the need to convert
between different character sets and eliminates the
conflicts that arise from using the same numbers for
different character sets
 Unicode provides a unique number called a code point
for each character, as well as a unique name
Unicode

 There are several standard encodings of the code points
that range from 8 to 32 bits (1 to 4 bytes)
 For example, UTF-8 (UCS Transformation Format, where
UCS stands for Universal Character Set) is a variable-length
encoding that uses from 1 to 4 bytes for each code point
 UTF-16 is a variable-length encoding that uses either 2 or 4
bytes for each code point, while UTF-32 is a fixed-length
encoding that uses 4 bytes for every code point
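As a rough illustration of these byte counts, the sketch below uses Python's built-in codecs. The helper `encoded_lengths` and the four sample characters are chosen here for illustration and do not come from the slides.

```python
def encoded_lengths(ch):
    """Return the byte counts of one character in UTF-8, UTF-16, and UTF-32."""
    # The "-be" codecs are used so that no byte-order mark is added.
    return (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-be")),
        len(ch.encode("utf-32-be")),
    )

# Sample characters: A, plus-minus sign, euro sign, and an emoji.
for ch in ["A", "\u00b1", "\u20ac", "\U0001F600"]:
    u8, u16, u32 = encoded_lengths(ch)
    print(f"U+{ord(ch):04X}: UTF-8 {u8} B, UTF-16 {u16} B, UTF-32 {u32} B")
```

UTF-8 grows from 1 to 4 bytes across these samples, UTF-16 stays at 2 bytes until the last sample needs 4, and UTF-32 is always 4 bytes.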
UTF-8 Encoding for Unicode Code Points
Unicode

 A common notation for referring to a code point is the
characters “U+” followed by the four to six
hexadecimal digits of the code point.
 For example, U+0030 is the character “0”, named
Digit Zero.
 The first 128 code points of Unicode, from U+0000 to
U+007F, correspond to the ASCII characters.
 The Unicode codespace spans over a million code points
(U+0000 to U+10FFFF), and the standard covers more than
a hundred scripts worldwide.
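The U+ notation, the character names, and the ASCII overlap can be explored with Python's standard `unicodedata` module. This is a minimal sketch, and the helper name `describe` is made up for illustration.

```python
import unicodedata

def describe(ch):
    """Format a character as 'U+XXXX NAME' using its Unicode code point and name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

print(describe("0"))   # U+0030 DIGIT ZERO
print(describe("T"))   # U+0054 LATIN CAPITAL LETTER T

# The first 128 code points encode to the same single byte values as ASCII.
same_as_ascii = all(chr(cp).encode("ascii") == bytes([cp]) for cp in range(128))
print(same_as_ascii)   # True
```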
Unicode

 To illustrate the UTF-8 encoding, consider a couple of
examples. The code point U+0054, Latin capital letter T,
“T”, is in the range of U+0000 0000 to U+0000 007F.
 So it would be encoded with one byte with a value of
(01010100)2.
 The code point U+00B1, plus-minus sign, “±”, is in the
range of U+0000 0080 to U+0000 07FF.
 So, it would be encoded with two bytes with a value of
(11000010 10110001)2.
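The two worked examples can be reproduced with a hand-written encoder. This is a sketch, not a replacement for a library codec: the function names `utf8_encode` and `bits` are made up for illustration, and the three- and four-byte cases follow the standard UTF-8 bit layout rather than anything stated on the slides.

```python
def utf8_encode(cp):
    """Encode one Unicode code point (an int) as UTF-8 bytes."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if cp <= 0x7F:        # 1 byte:  0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:      # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])

def bits(data):
    """Show each byte as an 8-bit binary string."""
    return " ".join(format(b, "08b") for b in data)

print(bits(utf8_encode(0x0054)))   # 01010100
print(bits(utf8_encode(0x00B1)))   # 11000010 10110001
```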
Parity Bit

 To detect errors in data communication and
processing, an additional bit is sometimes added
to a binary code word to define its parity

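 A parity bit is the extra bit included to make the
total number of 1s in the resulting code word
either even or odd. For example, the ASCII code for
A, 1000001, has two 1s, so with the parity bit placed
in the leftmost position it becomes 01000001 for even
parity and 11000001 for odd parity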
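A minimal Python sketch of parity generation and checking follows. The function names are made up for illustration, and putting the parity bit in the leftmost position is an assumption made here; other placements work the same way.

```python
def add_parity(code_word, parity="even"):
    """Prepend a parity bit to a string of 0s and 1s."""
    ones = code_word.count("1")
    if parity == "even":
        bit = ones % 2            # make the total number of 1s even
    elif parity == "odd":
        bit = (ones + 1) % 2      # make the total number of 1s odd
    else:
        raise ValueError("parity must be 'even' or 'odd'")
    return str(bit) + code_word

def parity_ok(word, parity="even"):
    """Check a received word (parity bit included) against the agreed parity."""
    return word.count("1") % 2 == (0 if parity == "even" else 1)

a = "1000001"                       # 7-bit ASCII code for A
print(add_parity(a, "even"))        # 01000001
print(add_parity(a, "odd"))         # 11000001

sent = add_parity(a, "even")
corrupted = "1" + sent[1:]          # flip the leftmost bit in transit
print(parity_ok(sent), parity_ok(corrupted))   # True False
```

A single parity bit detects any odd number of flipped bits. An even number of flips leaves the parity unchanged and goes unnoticed.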
Binary Codes

 As we count up or down using binary codes, the number
of bits that change from one binary value to the next
varies
 This is illustrated by the binary code for the octal digits
on the left in Table 1-7
 As we count from 000 up to 111 and “roll over” to 000,
the number of bits that change between the binary
values ranges from 1 to 3
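The count of changing bits can be tabulated directly. In this sketch, `bits_changed` and `changes_when_counting` are illustrative names, not taken from the slides.

```python
def bits_changed(a, b):
    """Number of bit positions in which two integers differ."""
    return bin(a ^ b).count("1")

def changes_when_counting(n_bits=3):
    """Bit changes for each step 0->1, 1->2, ..., including the rollover to 0."""
    size = 1 << n_bits
    return [bits_changed(v, (v + 1) % size) for v in range(size)]

for v, n in enumerate(changes_when_counting()):
    nxt = (v + 1) % 8
    print(f"{v:03b} -> {nxt:03b}: {n} bit(s) change")
```

The steps 011 to 100 and 111 to 000 change all three bits, while every step out of an even value changes only the last bit.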
Problem with Binary Codes

 For many applications, multiple bit changes as the
circuit counts are not a problem.
 There are applications, however, in which a change
of more than one bit when counting up or down can
cause serious problems.
 This is illustrated by the binary code for the octal
digits in the table on the next slide.
 One such problem is illustrated by the optical shaft-
angle encoder shown in the figure on the coming slides

Gray Codes
Optical Shaft-Angle Encoder Gray Codes
Gray Codes

 The encoder is a disk attached to a rotating shaft for
measurement of the rotational position of the shaft.
 The disk contains areas that are clear for binary 1
and opaque for binary 0.
 An illumination source is placed on one side of the
disk, and optical sensors, one for each of the bits to
be encoded, are placed on the other side of the disk.
 When a clear region lies between the source and a
sensor, the sensor responds to the light with a
binary 1 output.

Gray Codes

 When an opaque region lies between the
source and the sensor, the sensor responds to
the dark with a binary 0.
 The rotating shaft, however, can be in any
angular position. For example, suppose that
the shaft and disk are positioned so that the
sensors lie right at the boundary between 011
and 100.
 In this case, sensors in positions B2, B1, and
B0 have the light partially blocked.
Gray Codes

 In such a situation, it is unclear whether the
three sensors will see light or dark.
 As a consequence, each sensor may produce
either a 1 or a 0.
 Thus, the resulting encoded binary number for
a value between 3 and 4 may be 000, 001, 010,
011, 100, 101, 110, or 111.
 Either 011 or 100 will be satisfactory in this
case, but the other six values are clearly
erroneous!
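The ambiguity at the 011/100 boundary can be enumerated in a few lines. This sketch assumes a simple model in which every sensor whose bit differs between the two neighbouring codes may read either 0 or 1. The function name `possible_readings` is made up for illustration.

```python
from itertools import product

def possible_readings(before, after):
    """All bit strings the sensors might report on the boundary
    between two adjacent code words (given as equal-length strings)."""
    # A bit that is the same on both sides is read reliably;
    # a bit that differs may come out as either 0 or 1.
    options = [(a,) if a == b else ("0", "1") for a, b in zip(before, after)]
    return ["".join(bits) for bits in product(*options)]

readings = possible_readings("011", "100")
erroneous = [r for r in readings if r not in ("011", "100")]
print(readings)         # all eight 3-bit values
print(len(erroneous))   # 6
```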
Gray Codes

 To see the solution to this problem, notice that in
those cases in which only a single bit changes when
going from one value to the next or previous value,
this problem cannot occur.
 For example, if the sensors lie on the boundary
between 2 and 3, the resulting code is either 010 or
011, either of which is satisfactory.
 If we change the encoding of the values 0 through
7 such that only one bit value changes as we count
up or down (including rollover from 7 to 0), then the
encoding will be satisfactory for all positions.
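One encoding with this single-bit-change property is the binary-reflected Gray code, which can be computed as n XOR (n shifted right by one). The sketch below assumes that particular construction; the slides' own table is not reproduced in this text, so treat the sequence as one valid choice rather than the only one. All function names are made up for illustration.

```python
def binary_to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def gray_sequence(n_bits=3):
    """Gray code words for 0 .. 2**n_bits - 1 as bit strings."""
    return [format(binary_to_gray(v), f"0{n_bits}b") for v in range(1 << n_bits)]

def single_bit_steps(codes):
    """True if each code differs from the next (with rollover) in exactly one bit."""
    return all(
        sum(a != b for a, b in zip(codes[i], codes[(i + 1) % len(codes)])) == 1
        for i in range(len(codes))
    )

codes = gray_sequence(3)
print(codes)                     # ['000', '001', '011', '010', '110', '111', '101', '100']
print(single_bit_steps(codes))   # True

plain = [format(v, "03b") for v in range(8)]
print(single_bit_steps(plain))   # False
```

With this encoding, a sensor array sitting on any boundary, including the one between 7 and 0, can only report one of the two neighbouring values.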
