CHAPTER 2 : Data Representation
DIGITAL NUMBER SYSTEMS
Decimal
- deca = ten
- MSD : most significant digit
- LSD : least significant digit
- the various positions relative to the decimal point carry weights that can be expressed as
powers of 10.
Binary
- 10 different voltage levels are difficult to implement, so only two levels (0 and 1) are used
- MSB , LSB
- positional values and powers of two
- use of subscripts
- binary point
Octal number
- base 8
- octal point
Hexadecimal ( base 16 )
- 0 --> 9 and A B C D E F
NUMBER CONVERSIONS
Decimal to Binary
• Power of 2 table method
• Repeated Division Method
• the number is repeatedly divided by 2 and the remainders are recorded
• last remainder = MSB
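The repeated division method can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation; the function name `decimal_to_binary` is made up for this example and handles non-negative integers only.

```python
def decimal_to_binary(n: int) -> str:
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)       # quotient carries on, remainder is recorded
        remainders.append(str(r))  # the first remainder recorded is the LSB
    # The last remainder is the MSB, so the list is read in reverse.
    return "".join(reversed(remainders))


print(decimal_to_binary(25))  # 11001
```

For 25 the remainders come out as 1, 0, 0, 1, 1; reading them from last to first gives 11001.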
Binary to Decimal
• by summing together the weights of the various positions in the binary number
• Double-dabble method: starting from the MSB, double the running total and add the next bit
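Both binary-to-decimal approaches can be compared in a short Python sketch. The function names `sum_of_weights` and `double_dabble` are illustrative choices, and the input is assumed to be a string of 0s and 1s with no binary point.

```python
def sum_of_weights(bits: str) -> int:
    """Add up the power-of-two weight of every position that holds a 1."""
    total = 0
    for position, bit in enumerate(reversed(bits)):  # position 0 is the LSB
        total += int(bit) * 2 ** position
    return total


def double_dabble(bits: str) -> int:
    """Walk from the MSB: double the running total, then add the next bit."""
    total = 0
    for bit in bits:
        total = total * 2 + int(bit)
    return total


print(sum_of_weights("11011"))  # 27
print(double_dabble("11011"))   # 27
```

For 11011 the weights are 16 + 8 + 0 + 2 + 1 = 27, and the doubling sequence is 1, 3, 6, 13, 27.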
Decimal to Octal
• same repeated division method as with binary
• but using a factor of 8
Octal to Decimal
• same method by summing up the weights of the positional values of the number.
• power of eights
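The two octal conversions above can be sketched the same way. The helper names `decimal_to_octal` and `octal_to_decimal` are hypothetical, and only non-negative integers are assumed.

```python
def decimal_to_octal(n: int) -> str:
    """Repeated division by 8; remainders are read from last to first."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, r = divmod(n, 8)
        digits.append(str(r))
    return "".join(reversed(digits))


def octal_to_decimal(digits: str) -> int:
    """Sum each digit times its positional weight, a power of 8."""
    return sum(int(d) * 8 ** i for i, d in enumerate(reversed(digits)))


print(decimal_to_octal(266))    # 412
print(octal_to_decimal("372"))  # 250
```

Checking by hand: 372 in octal is 3 x 64 + 7 x 8 + 2 x 1 = 250.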
Octal to Binary
• Using 3-bit substitution: each octal digit is replaced by its 3-bit binary equivalent.
• Same process for fractions
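As a rough sketch, 3-bit substitution is a one-line mapping in Python. The name `octal_to_binary` is an assumption for this example; a "." in the input is treated as the octal point and passed through as the binary point.

```python
def octal_to_binary(octal: str) -> str:
    """Replace every octal digit with its 3-bit binary group."""
    return "".join(
        "." if ch == "." else format(int(ch, 8), "03b")  # e.g. 7 -> 111
        for ch in octal
    )


print(octal_to_binary("472"))  # 100111010
print(octal_to_binary("5.4"))  # 101.100
```

Here 4, 7 and 2 become 100, 111 and 010. An invalid digit such as 8 makes `int(ch, 8)` raise a `ValueError`.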
Binary to Octal
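• Reverse the process above: starting at the binary point, take the bits in groups of 3
(moving left for the integer part) and convert each group to one octal digit.
• If the leftmost group has fewer than 3 bits, pad it with 0s on the left.
• Same for fractions, but group rightwards from the binary point and pad with 0s on the right.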
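The grouping step can be illustrated with a small Python sketch. It assumes the groups are formed outward from the binary point, with zeros padded on the left of the integer part and on the right of the fraction; `binary_to_octal` is a name chosen for this example.

```python
def binary_to_octal(binary: str) -> str:
    """Group bits in threes around the binary point and convert each group."""
    integer, _, fraction = binary.partition(".")
    # Pad so that both parts split evenly into 3-bit groups.
    integer = "0" * (-len(integer) % 3) + integer      # zeros on the left
    fraction = fraction + "0" * (-len(fraction) % 3)   # zeros on the right

    def groups(bits: str) -> str:
        return "".join(str(int(bits[i:i + 3], 2)) for i in range(0, len(bits), 3))

    result = groups(integer) or "0"
    if fraction:
        result += "." + groups(fraction)
    return result


print(binary_to_octal("11010110"))  # 326
print(binary_to_octal("10.1"))      # 2.4
```

In the first call, 11010110 is padded to 011 010 110, which reads as 3, 2, 6.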
Decimal to Hex
• By repeated division with a factor of 16 and recording the remainders
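A minimal Python sketch of repeated division by 16 follows. The lookup string `HEX_DIGITS` and the function name `decimal_to_hex` are illustrative; remainders from 10 to 15 map to the letters A to F.

```python
HEX_DIGITS = "0123456789ABCDEF"


def decimal_to_hex(n: int) -> str:
    """Repeated division by 16; each remainder becomes one hex digit."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, r = divmod(n, 16)
        digits.append(HEX_DIGITS[r])  # remainder 10 -> 'A', ..., 15 -> 'F'
    return "".join(reversed(digits))  # last remainder is the most significant


print(decimal_to_hex(423))  # 1A7
```

For 423 the remainders are 7, 10 (A) and 1, so the result read from last to first is 1A7.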
Hex to Decimal
• same process: sum each digit multiplied by its positional weight, which is a power of 16.
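As a quick illustration of the weighted sum, assuming an integer hex string with no hex point (the name `hex_to_decimal` is made up for this sketch):

```python
def hex_to_decimal(hex_digits: str) -> int:
    """Sum each hex digit times its power-of-16 weight."""
    total = 0
    for position, ch in enumerate(reversed(hex_digits)):  # position 0 is the LSD
        total += int(ch, 16) * 16 ** position              # 'A'..'F' count as 10..15
    return total


print(hex_to_decimal("2AF"))  # 687
```

By hand: 2 x 256 + 10 x 16 + 15 x 1 = 512 + 160 + 15 = 687.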
Hex to Binary
• Like octal, each hex digit is converted into its 4 bit binary equivalent.
• 4-bit substitution
Binary to Hex
• like octal, by grouping the bits in fours starting from the binary point (the rightmost bit
of an integer) and padding the leftmost group with 0s.
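Both hex and binary directions can be sketched together in Python. The names `hex_to_binary` and `binary_to_hex` are assumptions for this example, and only integer values (no binary point) are handled.

```python
def hex_to_binary(hex_digits: str) -> str:
    """4-bit substitution: every hex digit becomes a 4-bit group."""
    return "".join(format(int(ch, 16), "04b") for ch in hex_digits)


def binary_to_hex(bits: str) -> str:
    """Group bits in fours from the right, padding the left with zeros."""
    bits = "0" * (-len(bits) % 4) + bits
    return "".join(
        format(int(bits[i:i + 4], 2), "X") for i in range(0, len(bits), 4)
    )


print(hex_to_binary("9F2"))         # 100111110010
print(binary_to_hex("1110100110"))  # 3A6
```

In the second call, the ten bits are padded to 0011 1010 0110, which reads as 3, A, 6.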
Representing unsigned integers in Binary
• always positive or 0
• an n-bit binary pattern can represent 2^n distinct integers => 0 to 2^n - 1
• analogously in decimal => n digits => 0 to 10^n - 1
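The range formula is easy to check for a few common widths. This is a small sketch; `unsigned_range` is a hypothetical helper name.

```python
def unsigned_range(n_bits: int) -> tuple[int, int]:
    """Smallest and largest unsigned integers that fit in n_bits."""
    return 0, 2 ** n_bits - 1


for n in (4, 8, 16):
    low, high = unsigned_range(n)
    print(f"{n} bits: {2 ** n} patterns, from {low} to {high}")
```

For example, 8 bits give 256 patterns covering 0 to 255.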
Binary Addition
• 0+0=0
• 0+1=1
• 1+0=1
• 1 + 1 = 10 (sum 0, carry 1)
• 1 + 1 + 1 = 11 (sum 1, carry 1)
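The addition rules above extend to multi-bit numbers by carrying from right to left. The following is a minimal sketch operating on bit strings, with `add_binary` as an illustrative name.

```python
def add_binary(a: str, b: str) -> str:
    """Add two unsigned binary strings column by column, LSB first."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # align the columns
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry  # 0, 1, 2 or 3
        result.append(str(total % 2))            # sum bit for this column
        carry = total // 2                       # carry into the next column
    if carry:
        result.append("1")                       # final carry becomes the new MSB
    return "".join(reversed(result))


print(add_binary("1011", "0110"))  # 10001
```

This corresponds to 11 + 6 = 17 in decimal.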
Character Representation
• alphanumeric codes
• 26 uppercase and 26 lowercase English letters, 10 numeric digits, 7 punctuation marks,
and 20 to 40 other special characters such as @ # %
ASCII code
• standard : 7-bit code => 128 possible codes => also represents control functions such as
RETURN and LINEFEED
• extended ASCII : 8-bit code => 256 codes.
• A -> Z => 65 -> 90 in decimal , 101 -> 132 in octal
• 0 -> 9 => 48 -> 57 in decimal , 060 -> 071 in octal
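The code values listed above can be checked with Python's built-in `ord`. The helper `ascii_codes` is a made-up name for this sketch and rejects anything outside the 7-bit range.

```python
def ascii_codes(ch: str) -> tuple[int, str, str]:
    """Return the decimal, octal and 7-bit binary code of an ASCII character."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    return code, format(code, "03o"), format(code, "07b")


for ch in ("A", "Z", "0", "9"):
    print(ch, ascii_codes(ch))
```

For "A" this gives 65 in decimal, 101 in octal and 1000001 in binary.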
ISCII code ( Indian Standard Code for Information Interchange )
1. by Bureau of Indian Standards in 1991
2. 256 codes
3. used by all GIST products, IBM PC-DOS, ILK etc.
4. mandatory for The ECI and Land Records Project
5. support for Devanagari, Gurmukhi, Gujarati, Oriya, Bengali, Assamese, Telugu etc.
6. easy transliteration between scripts because of their similarity
Unicode
• universal character set
• several languages -> more bits needed -> not enough room in the 1-byte approach
• no standardisation between languages -> the same code is used for different characters
• format : U+<codepoint number> e.g: U+0041
• UTF-8 -> variable length encoding => uses a different number of octets (bytes) for
different ranges of code points to save memory
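• 1 octet: 0xxxxxxx (U+0000 to U+007F, the ASCII range)
• 2 octets: 110xxxxx 10xxxxxx (U+0080 to U+07FF)
• 3 octets: 1110xxxx 10xxxxxx 10xxxxxx (U+0800 to U+FFFF)
• 4 octets: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (U+10000 to U+10FFFF)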
• UTF-32 : fixed length encoding scheme => uses exactly 4 bytes
• UTF-16 : variable length encoding => using 2 or 4 bytes
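The three encoding schemes can be compared using Python's built-in `str.encode`. In this sketch the helper `encoded_sizes` is a hypothetical name, and the big-endian codec variants (`utf-16-be`, `utf-32-be`) are used so that no byte order mark is counted.

```python
def encoded_sizes(ch: str) -> dict:
    """Code point label and byte count of one character in each encoding."""
    return {
        "code point": f"U+{ord(ch):04X}",
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }


samples = [
    "A",            # Latin capital letter A
    "\u00e9",       # Latin small letter e with acute
    "\u0905",       # Devanagari letter A
    "\U0001F600",   # an emoji outside the 16-bit range
]
for ch in samples:
    print(encoded_sizes(ch))
```

UTF-8 needs 1, 2, 3 and 4 bytes for these four characters, UTF-16 needs 2, 2, 2 and 4, and UTF-32 always needs 4.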