A Tutorial - Data Representation

This document discusses different number systems used in computing such as binary, decimal, hexadecimal and their conversions. It explains that computers use binary while hexadecimal is used as a compact shorthand for binary. Different data types like integers can be represented using a fixed number of bits and their representation depends on the chosen bit-length and format. The document also discusses the Rosetta Stone which helped decipher Egyptian hieroglyphs by providing the same text in three scripts including Greek.

Uploaded by

isaacwylliam
Copyright
© © All Rights Reserved

A Tutorial on Data Representation
Integers, Floating-point Numbers, and Characters
1. Number Systems
Human beings use decimal (base 10) and duodecimal (base 12) number systems for counting and
measurements (probably because we have 10 fingers and two big toes). Computers use the binary (base
2) number system, as they are made from binary digital components (known as transistors) operating
in two states - on and off. In computing, we also use the hexadecimal (base 16) or octal (base 8) number
systems, as a compact form for representing binary numbers.
1.1 Decimal (Base 10) Number System
The decimal number system has ten symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9, called digits. It uses positional
notation. That is, the least-significant digit (right-most digit) is of the order of 10^0 (units or ones),
the second right-most digit is of the order of 10^1 (tens), the third right-most digit is of the order
of 10^2 (hundreds), and so on. For example,

735 = 7×10^2 + 3×10^1 + 5×10^0

We shall denote a decimal number with an optional suffix D if ambiguity arises.


1.2 Binary (Base 2) Number System
The binary number system has two symbols: 0 and 1, called bits. It also uses positional notation, for
example,

10110B = 1×2^4 + 0×2^3 + 1×2^2 + 1×2^1 + 0×2^0

We shall denote a binary number with a suffix B. Some programming languages denote binary
numbers with prefix 0b (e.g., 0b1001000), or prefix b with the bits quoted (e.g., b'10001111').
A binary digit is called a bit. Eight bits are called a byte (why an 8-bit unit? Probably because 8=2^3).
1.3 Hexadecimal (Base 16) Number System
The hexadecimal number system uses 16 symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F, called hex
digits. It also uses positional notation, for example,

A3EH = 10×16^2 + 3×16^1 + 14×16^0

We shall denote a hexadecimal number (in short, hex) with a suffix H. Some programming languages
denote hex numbers with prefix 0x (e.g., 0x1A3C5F), or prefix x with the hex digits quoted
(e.g., x'C3A4D98B').
Each hexadecimal digit is also called a hex digit. Most programming languages accept
lowercase 'a' to 'f' as well as uppercase 'A' to 'F'.
Computers use the binary system in their internal operations, as they are built from binary digital
electronic components. However, writing or reading a long sequence of binary bits is cumbersome
and error-prone. The hexadecimal system is used as a compact form or shorthand for binary bits. Each
hex digit is equivalent to 4 binary bits, i.e., shorthand for 4 bits, as follows:
Hex   Binary   Decimal
0H    0000B    0D
1H    0001B    1D
2H    0010B    2D
3H    0011B    3D
4H    0100B    4D
5H    0101B    5D
6H    0110B    6D
7H    0111B    7D
8H    1000B    8D
9H    1001B    9D
AH    1010B    10D
BH    1011B    11D
CH    1100B    12D
DH    1101B    13D
EH    1110B    14D
FH    1111B    15D
1.4 Conversion from Hexadecimal to Binary
Replace each hex digit by its 4 equivalent bits, for example,

A3C5H = 1010 0011 1100 0101B


102AH = 0001 0000 0010 1010B
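
The digit-by-digit replacement above can be sketched in a few lines of Python (Python is used here only for illustration; the helper name hex_to_binary is hypothetical and not part of the tutorial):

```python
def hex_to_binary(hex_digits):
    """Replace each hex digit by its 4-bit group, separated by spaces."""
    groups = []
    for digit in hex_digits:
        value = int(digit, 16)                # 0..15 for '0'..'F'
        groups.append(format(value, "04b"))   # zero-padded to 4 bits
    return " ".join(groups)

print(hex_to_binary("A3C5"))  # 1010 0011 1100 0101
print(hex_to_binary("102A"))  # 0001 0000 0010 1010
```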

1.5 Conversion from Binary to Hexadecimal


Starting from the right-most bit (least-significant bit), replace each group of 4 bits by the equivalent
hex digit (pad the left-most bits with zeros if necessary), for example,

1001001010B = 0010 0100 1010B = 24AH


10001011001011B = 0010 0010 1100 1011B = 22CBH

It is important to note that hexadecimal number provides a compact form or shorthand for
representing binary bits.
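
As a rough sketch of the grouping rule in this section, the following Python function pads the bit string on the left and maps each group of 4 bits to one hex digit. The name binary_to_hex is an assumption made for this example.

```python
def binary_to_hex(bits):
    """Group bits in fours from the right and map each group to a hex digit."""
    padding = (-len(bits)) % 4            # zeros needed to reach a multiple of 4
    padded = "0" * padding + bits
    hex_digits = []
    for start in range(0, len(padded), 4):
        group = padded[start:start + 4]
        hex_digits.append(format(int(group, 2), "X"))
    return "".join(hex_digits)

print(binary_to_hex("1001001010"))      # 24A
print(binary_to_hex("10001011001011"))  # 22CB
```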
1.6 Conversion from Base r to Decimal (Base 10)
Given an n-digit base r number: d(n-1) d(n-2) d(n-3) ... d3 d2 d1 d0 (base r), the decimal equivalent is
given by:

d(n-1) × r^(n-1) + d(n-2) × r^(n-2) + ... + d1 × r^1 + d0 × r^0

For example,

A1C2H = 10×16^3 + 1×16^2 + 12×16^1 + 2 = 41410 (base 10)


10110B = 1×2^4 + 1×2^2 + 1×2^1 = 22 (base 10)
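
The weighted sum can be written out directly in code. This is a minimal sketch, assuming digits above 9 are written as letters (A=10, B=11, and so on); the names DIGITS and to_decimal are hypothetical.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_decimal(digits, radix):
    """Evaluate d(n-1)*r^(n-1) + ... + d1*r^1 + d0*r^0."""
    total = 0
    for power, symbol in enumerate(reversed(digits.upper())):
        digit = DIGITS.index(symbol)      # numeric value of this symbol
        if digit >= radix:
            raise ValueError(f"digit {symbol!r} is not valid in base {radix}")
        total += digit * radix ** power
    return total

print(to_decimal("A1C2", 16))  # 41410
print(to_decimal("10110", 2))  # 22
```

Python's built-in int("A1C2", 16) gives the same result; the loop is spelled out only to mirror the formula.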

1.7 Conversion from Decimal (Base 10) to Base r


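Repeatedly divide by r and record each remainder until the quotient is 0; the remainders, read in
reverse order, are the digits in base r. For example,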

To convert 261D to hexadecimal:


261/16 => quotient=16 remainder=5
16/16 => quotient=1 remainder=0
1/16 => quotient=0 remainder=1 (quotient=0 stop)
Hence, 261D = 105H

The above procedure is actually applicable to conversion between any 2 base systems. For example,

To convert 1023(base 4) to base 3:


1023(base 4)/3 => quotient=25D remainder=0
25D/3 => quotient=8D remainder=1
8D/3 => quotient=2D remainder=2
2D/3 => quotient=0 remainder=2 (quotient=0 stop)
Hence, 1023(base 4) = 2210(base 3)
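
A minimal Python sketch of the repeated division/remainder procedure, assuming a non-negative integer and a target radix between 2 and 16 (the name from_decimal is hypothetical):

```python
DIGITS = "0123456789ABCDEF"

def from_decimal(value, radix):
    """Convert a non-negative integer to a digit string in the given radix."""
    if value == 0:
        return "0"
    remainders = []
    while value > 0:
        value, remainder = divmod(value, radix)   # quotient and remainder
        remainders.append(DIGITS[remainder])
    return "".join(reversed(remainders))          # remainders in reverse order

print(from_decimal(261, 16))            # 105
print(from_decimal(int("1023", 4), 3))  # 2210
```

The second call first reads 1023(base 4) as the integer 75D, which is the value being divided by 3 in the worked example.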

1.8 General Conversion between 2 Base Systems with Fractional Part
1. Separate the integral and the fractional parts.
2. For the integral part, divide by the target radix repeatedly, and collect the remainders in reverse
order.
3. For the fractional part, multiply the fractional part by the target radix repeatedly, and collect
the integral parts in the same order.

Example 1:

Convert 18.6875D to binary


Integral Part = 18D
18/2 => quotient=9 remainder=0
9/2 => quotient=4 remainder=1
4/2 => quotient=2 remainder=0
2/2 => quotient=1 remainder=0
1/2 => quotient=0 remainder=1 (quotient=0 stop)
Hence, 18D = 10010B
Fractional Part = .6875D
.6875*2=1.375 => whole number is 1
.375*2=0.75 => whole number is 0
.75*2=1.5 => whole number is 1
.5*2=1.0 => whole number is 1
Hence .6875D = .1011B
Therefore, 18.6875D = 10010.1011B

Example 2:

Convert 18.6875D to hexadecimal


Integral Part = 18D
18/16 => quotient=1 remainder=2
1/16 => quotient=0 remainder=1 (quotient=0 stop)
Hence, 18D = 12H
Fractional Part = .6875D
.6875*16=11.0 => whole number is 11D (BH)
Hence .6875D = .BH
Therefore, 18.6875D = 12.BH
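
The three steps can be combined into one routine. The sketch below is only an illustration under stated assumptions: the input is a non-negative decimal string, exact fractions are used instead of floating point to avoid rounding, and the fractional digits are capped because many fractions do not terminate in the target base. The name convert_with_fraction and the parameter max_fraction_digits are hypothetical.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"

def convert_with_fraction(decimal_text, radix, max_fraction_digits=12):
    value = Fraction(decimal_text)            # exact value, no float rounding
    integral = value.numerator // value.denominator
    fractional = value - integral

    # Integral part: repeated division, remainders collected in reverse order.
    int_digits = []
    while integral > 0:
        integral, remainder = divmod(integral, radix)
        int_digits.append(DIGITS[remainder])
    int_part = "".join(reversed(int_digits)) or "0"

    # Fractional part: repeated multiplication, whole parts collected in order.
    frac_digits = []
    while fractional != 0 and len(frac_digits) < max_fraction_digits:
        fractional *= radix
        whole = fractional.numerator // fractional.denominator
        frac_digits.append(DIGITS[whole])
        fractional -= whole

    return int_part + ("." + "".join(frac_digits) if frac_digits else "")

print(convert_with_fraction("18.6875", 2))   # 10010.1011
print(convert_with_fraction("18.6875", 16))  # 12.B
```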

1.9 Exercises (Number Systems Conversion)


1. Convert the following decimal numbers into binary and hexadecimal numbers:
a. 108
b. 4848
c. 9000
2. Convert the following binary numbers into hexadecimal and decimal numbers:
a. 1000011000
b. 10000000
c. 101010101010
3. Convert the following hexadecimal numbers into binary and decimal numbers:
a. ABCDE
b. 1234
c. 80F
4. Convert the following decimal numbers into binary equivalent:
a. 19.25D
b. 123.456D
Answers: You could use the Windows' Calculator (calc.exe) to carry out number system
conversion, by setting it to the scientific mode. (Run "calc" ⇒ Select "View" menu ⇒ Choose
"Programmer" or "Scientific" mode.)
1. 1101100B, 1001011110000B, 10001100101000B, 6CH, 12F0H, 2328H.
2. 218H, 80H, AAAH, 536D, 128D, 2730D.
3. 10101011110011011110B, 1001000110100B, 100000001111B, 703710D, 4660D, 2063D.
4. 10011.01B, 1111011.01110100...B (the fractional part of 123.456D does not terminate in binary).
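
If a calculator is not at hand, the integer exercises can also be checked with Python's built-in conversions. This is just one possible way to verify the answers, not part of the original tutorial.

```python
# Exercise 1: decimal -> binary and hexadecimal
for n in (108, 4848, 9000):
    print(n, format(n, "b"), format(n, "X"))

# Exercise 2: binary -> hexadecimal and decimal
for text in ("1000011000", "10000000", "101010101010"):
    print(text, format(int(text, 2), "X"), int(text, 2))

# Exercise 3: hexadecimal -> binary and decimal
for text in ("ABCDE", "1234", "80F"):
    print(text, format(int(text, 16), "b"), int(text, 16))
```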

2. Computer Memory & Data Representation


A computer uses a fixed number of bits to represent a piece of data, which could be a number, a
character, or others. An n-bit storage location can represent up to 2^n distinct entities. For example, a
3-bit memory location can hold one of these eight binary patterns: 000, 001, 010, 011, 100, 101, 110,
or 111. Hence, it can represent at most 8 distinct entities. You could use them to represent numbers
0 to 7, numbers 8881 to 8888, characters 'A' to 'H', or up to 8 kinds of fruits like apple, orange,
banana; or up to 8 kinds of animals like lion, tiger, etc.
Integers, for example, can be represented in 8-bit, 16-bit, 32-bit or 64-bit. You, as the programmer,
choose an appropriate bit-length for your integers. Your choice will impose a constraint on the range
of integers that can be represented. Besides the bit-length, an integer can be represented in
various representation schemes, e.g., unsigned vs. signed integers. An 8-bit unsigned integer has a
range of 0 to 255, while an 8-bit signed integer has a range of -128 to 127 - both representing 256
distinct numbers.
It is important to note that a computer memory location merely stores a binary pattern. It is entirely
up to you, as the programmer, to decide on how these patterns are to be interpreted. For example,
the 8-bit binary pattern "0100 0001B" can be interpreted as an unsigned integer 65, or an ASCII
character 'A', or some secret information known only to you. In other words, you have to first decide
how to represent a piece of data in a binary pattern before the binary patterns make sense. The
interpretation of a binary pattern is called data representation or encoding. Furthermore, it is important
that the data representation schemes are agreed upon by all the parties, i.e., industrial standards
need to be formulated and strictly followed.
Once you have decided on the data representation scheme, certain constraints, in particular the precision
and range, will be imposed. Hence, it is important to understand data representation to
write correct and high-performance programs.
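
To make the point about interpretation concrete, here is a small Python sketch that reads the same 8-bit pattern 0100 0001B in two ways. The variable names are illustrative only.

```python
pattern = 0b01000001               # the 8-bit pattern 0100 0001B
raw = bytes([pattern])             # the same pattern stored as a single byte

as_unsigned = int.from_bytes(raw, "big")   # read as an unsigned integer
as_ascii = raw.decode("ascii")             # read as an ASCII character

print(as_unsigned)  # 65
print(as_ascii)     # A
```

The byte itself never changes; only the rule used to read it does.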
Rosetta Stone and the Decipherment of Egyptian Hieroglyphs
Egyptian hieroglyphs were used by the ancient Egyptians since 4000BC. Unfortunately,
since 500AD, no one could read the ancient Egyptian hieroglyphs any longer, until the re-discovery of the
Rosetta Stone in 1799 by Napoleon's troops (during Napoleon's Egyptian invasion) near the town of
Rashid (Rosetta) in the Nile Delta.
The Rosetta Stone is inscribed with a decree issued in 196BC on behalf of King Ptolemy V. The decree
appears in three scripts: the upper text is Ancient Egyptian hieroglyphs, the middle portion Demotic
script, and the lowest Ancient Greek. Because it presents essentially the same text in all three scripts,
and Ancient Greek could still be understood, it provided the key to the decipherment of the Egyptian
hieroglyphs.

The moral of the story is unless you know the encoding scheme, there is no way that you can decode
the data.

Reference and images: Wikipedia.

3. Integer Representation
Integers are whole numbers or fixed-point numbers with the radix point fixed after the
least-significant bit. They are in contrast to real numbers or floating-point numbers, where the position of the
radix point varies. It is important to take note that integers and floating-point numbers are treated
differently in computers. They have different representations and are processed differently (e.g.,
floating-point numbers are processed in a so-called floating-point processor). Floating-point
numbers will be discussed later.
Computers use a fixed number of bits to represent an integer. The commonly-used bit-lengths for
integers are 8-bit, 16-bit, 32-bit or 64-bit. Besides bit-lengths, there are two representation schemes
for integers:
1. Unsigned Integers: can represent zero and positive integers.
2. Signed Integers: can represent zero, positive and negative integers. Three representation
schemes have been proposed for signed integers:
a. Sign-Magnitude representation
b. 1's Complement representation
c. 2's Complement representation

You, as the programmer, need to decide on the bit-length and representation scheme for your
integers, depending on your application's requirements. Suppose that you need a counter for
counting a small quantity from 0 up to 200; you might choose the 8-bit unsigned integer scheme as
there are no negative numbers involved.

3.1 n-bit Unsigned Integers


Unsigned integers can represent zero and positive integers, but not negative integers. The value of
an unsigned integer is interpreted as "the magnitude of its underlying binary pattern".
Example 1: Suppose that n=8 and the binary pattern is 0100 0001B, the value of this unsigned
integer is 1×2^0 + 1×2^6 = 65D.
Example 2: Suppose that n=16 and the binary pattern is 0001 0000 0000 1000B, the value of this
unsigned integer is 1×2^3 + 1×2^12 = 4104D.
Example 3: Suppose that n=16 and the binary pattern is 0000 0000 0000 0000B, the value of this
unsigned integer is 0.
An n-bit pattern can represent 2^n distinct integers. An n-bit unsigned integer can represent integers
from 0 to (2^n)-1, as tabulated below:
n     Minimum   Maximum
8     0         (2^8)-1  (=255)
16    0         (2^16)-1 (=65,535)
32    0         (2^32)-1 (=4,294,967,295) (9+ digits)
64    0         (2^64)-1 (=18,446,744,073,709,551,615) (19+ digits)
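
The range formula and the three examples can be reproduced with a short Python sketch. The helper names unsigned_range and unsigned_value are assumptions made for this illustration.

```python
def unsigned_range(n):
    """Smallest and largest values of an n-bit unsigned integer."""
    return 0, 2 ** n - 1

def unsigned_value(pattern):
    """Value of a binary pattern such as '0100 0001' read as unsigned."""
    return int(pattern.replace(" ", ""), 2)

for n in (8, 16, 32, 64):
    low, high = unsigned_range(n)
    print(f"{n:>2}-bit: {low} to {high:,}")

print(unsigned_value("0100 0001"))            # 65
print(unsigned_value("0001 0000 0000 1000"))  # 4104
```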
3.2 Signed Integers
Signed integers can represent zero, positive integers, as well as negative integers. Three
representation schemes are available for signed integers:
1. Sign-Magnitude representation
2. 1's Complement representation
3. 2's Complement representation
In all the above three schemes, the most-significant bit (msb) is called the sign bit. The sign bit is used
to represent the sign of the integer - with 0 for positive integers and 1 for negative integers.
The magnitude of the integer, however, is interpreted differently in different schemes.
3.3 n-bit Signed Integers in Sign-Magnitude Representation
In sign-magnitude representation:
• The most-significant bit (msb) is the sign bit, with a value of 0 representing a positive integer and 1
representing a negative integer.
• The remaining n-1 bits represent the magnitude (absolute value) of the integer. The absolute
value of the integer is interpreted as "the magnitude of the (n-1)-bit binary pattern".
Example 1 : Suppose that n=8 and the binary representation is 0 100 0001B.
Sign bit is 0 ⇒ positive
Absolute value is 100 0001B = 65D
Hence, the integer is +65D
Example 2 : Suppose that n=8 and the binary representation is 1 000 0001B.
Sign bit is 1 ⇒ negative
Absolute value is 000 0001B = 1D
Hence, the integer is -1D
Example 3 : Suppose that n=8 and the binary representation is 0 000 0000B.
Sign bit is 0 ⇒ positive
Absolute value is 000 0000B = 0D
Hence, the integer is +0D
Example 4 : Suppose that n=8 and the binary representation is 1 000 0000B.
Sign bit is 1 ⇒ negative
Absolute value is 000 0000B = 0D
Hence, the integer is -0D

The drawbacks of sign-magnitude representation are:


1. There are two representations (0000 0000B and 1000 0000B) for the number zero, which could
lead to inefficiency and confusion.
2. Positive and negative integers need to be processed separately.
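
A minimal Python sketch of sign-magnitude decoding follows. It returns the sign and the magnitude separately so that the two zeros stay distinguishable; the name decode_sign_magnitude is hypothetical.

```python
def decode_sign_magnitude(pattern):
    """Return (sign, magnitude) for a pattern such as '1 000 0001'."""
    bits = pattern.replace(" ", "")
    sign = -1 if bits[0] == "1" else 1     # msb is the sign bit
    magnitude = int(bits[1:], 2)           # remaining n-1 bits
    return sign, magnitude

print(decode_sign_magnitude("0 100 0001"))  # (1, 65)
print(decode_sign_magnitude("1 000 0001"))  # (-1, 1)
print(decode_sign_magnitude("0 000 0000"))  # (1, 0)
print(decode_sign_magnitude("1 000 0000"))  # (-1, 0)
```

The last two calls show the first drawback: two different patterns both decode to a magnitude of zero.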
3.4 n-bit Signed Integers in 1's Complement Representation
In 1's complement representation:
• Again, the most significant bit (msb) is the sign bit, with a value of 0 representing positive integers
and 1 representing negative integers.
• The remaining n-1 bits represent the magnitude of the integer, as follows:
o for positive integers, the absolute value of the integer is equal to "the magnitude of the
(n-1)-bit binary pattern".
o for negative integers, the absolute value of the integer is equal to "the magnitude of
the complement (inverse) of the (n-1)-bit binary pattern" (hence called 1's complement).
Example 1 : Suppose that n=8 and the binary representation is 0 100 0001B.
Sign bit is 0 ⇒ positive
Absolute value is 100 0001B = 65D
Hence, the integer is +65D
Example 2 : Suppose that n=8 and the binary representation is 1 000 0001B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 000 0001B, i.e., 111 1110B = 126D
Hence, the integer is -126D
Example 3 : Suppose that n=8 and the binary representation is 0 000 0000B.
Sign bit is 0 ⇒ positive
Absolute value is 000 0000B = 0D
Hence, the integer is +0D
Example 4 : Suppose that n=8 and the binary representation is 1 111 1111B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 111 1111B, i.e., 000 0000B = 0D
Hence, the integer is -0D
Again, the drawbacks are:
1. There are two representations (0000 0000B and 1111 1111B) for zero.
2. The positive integers and negative integers need to be processed separately.
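
The 1's complement rule can be sketched in Python as follows. As an illustrative choice, the function returns the sign and magnitude separately so that +0 and -0 remain visible; the name decode_ones_complement is an assumption.

```python
def decode_ones_complement(pattern):
    """Return (sign, magnitude) for a 1's complement pattern."""
    bits = pattern.replace(" ", "")
    low_bits = int(bits[1:], 2)                 # the (n-1)-bit pattern
    if bits[0] == "0":
        return 1, low_bits
    mask = (1 << (len(bits) - 1)) - 1           # n-1 ones, e.g. 111 1111B
    return -1, low_bits ^ mask                  # invert the n-1 bits

print(decode_ones_complement("0 100 0001"))  # (1, 65)
print(decode_ones_complement("1 000 0001"))  # (-1, 126)
print(decode_ones_complement("0 000 0000"))  # (1, 0)
print(decode_ones_complement("1 111 1111"))  # (-1, 0)
```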

3.5 n-bit Signed Integers in 2's Complement Representation
In 2's complement representation:
• Again, the most significant bit (msb) is the sign bit, with a value of 0 representing positive integers
and 1 representing negative integers.
• The remaining n-1 bits represent the magnitude of the integer, as follows:
o for positive integers, the absolute value of the integer is equal to "the magnitude of the
(n-1)-bit binary pattern".
o for negative integers, the absolute value of the integer is equal to "the magnitude of
the complement of the (n-1)-bit binary pattern plus one" (hence called 2's complement).
Example 1 : Suppose that n=8 and the binary representation is 0 100 0001B.
Sign bit is 0 ⇒ positive
Absolute value is 100 0001B = 65D
Hence, the integer is +65D
Example 2 : Suppose that n=8 and the binary representation is 1 000 0001B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 000 0001B plus 1, i.e., 111 1110B + 1B = 127D
Hence, the integer is -127D
Example 3 : Suppose that n=8 and the binary representation is 0 000 0000B.
Sign bit is 0 ⇒ positive
Absolute value is 000 0000B = 0D
Hence, the integer is +0D
Example 4 : Suppose that n=8 and the binary representation is 1 111 1111B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 111 1111B plus 1, i.e., 000 0000B + 1B = 1D
Hence, the integer is -1D
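
A rough Python sketch of the 2's complement decoding rule, following the "complement plus one" wording of this section (the function name decode_twos_complement is hypothetical):

```python
def decode_twos_complement(pattern):
    """Signed value of a 2's complement pattern such as '1 000 0001'."""
    bits = pattern.replace(" ", "")
    low_bits = int(bits[1:], 2)                 # the (n-1)-bit pattern
    if bits[0] == "0":
        return low_bits
    mask = (1 << (len(bits) - 1)) - 1           # n-1 ones, e.g. 111 1111B
    magnitude = (low_bits ^ mask) + 1           # complement, then add one
    return -magnitude

print(decode_twos_complement("0 100 0001"))  # 65
print(decode_twos_complement("1 000 0001"))  # -127
print(decode_twos_complement("0 000 0000"))  # 0
print(decode_twos_complement("1 111 1111"))  # -1
```

For a negative pattern this gives the same value as reading all n bits as unsigned and subtracting 2^n (for example, 1000 0001B is 129, and 129 - 256 = -127).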

3.6 Computers use 2's Complement Representation for Signed Integers
We have discussed three representations for signed integers: sign-magnitude, 1's complement
and 2's complement. Computers use 2's complement in representing signed integers. This is
because:
1. There is only one representation for the number zero in 2's complement, instead of two
representations in sign-magnitude and 1's complement.
2. Positive and negative integers can be treated together in addition and subtraction. Subtraction
can be carried out using the "addition logic".
Example 1: Addition of Two Positive Integers: Suppose that n=8, 65D + 5D = 70D
  65D →   0100 0001B
   5D → + 0000 0101B
          0100 0110B → 70D (OK)

Example 2: Subtraction is treated as Addition of a Positive and a Negative Integer: Suppose that
n=8, 65D - 5D = 65D + (-5D) = 60D
  65D →   0100 0001B
  -5D → + 1111 1011B
          0011 1100B → 60D (discard carry - OK)

Example 3: Addition of Two Negative Integers: Suppose that n=8, -65D - 5D = (-65D) + (-5D) = -70D
 -65D →   1011 1111B
  -5D → + 1111 1011B
          1011 1010B → -70D (discard carry - OK)
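
The three additions above can be reproduced with ordinary unsigned addition followed by discarding the carry. This is a sketch for n=8 only; the names N, MASK, encode, decode and add are assumptions for the illustration.

```python
N = 8
MASK = (1 << N) - 1                  # 1111 1111B for n=8

def encode(value):
    """8-bit 2's complement pattern of an in-range value, as an int 0..255."""
    return value & MASK

def decode(pattern):
    """Signed value of an 8-bit 2's complement pattern."""
    return pattern - (1 << N) if pattern & (1 << (N - 1)) else pattern

def add(a, b):
    """Add the two patterns as unsigned numbers and discard the carry."""
    return (encode(a) + encode(b)) & MASK

for a, b in ((65, 5), (65, -5), (-65, -5)):
    result = add(a, b)
    print(f"{a} + ({b}) -> {result:08b}B = {decode(result)}D")
```

The same add function serves all three cases, which is the sense in which subtraction reuses the "addition logic".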

Because of the fixed precision (i.e., fixed number of bits), an n-bit 2's complement signed integer has a
certain range. For example, for n=8, the range of 2's complement signed integers is -128 to +127.
During addition (and subtraction), it is important to check whether the result exceeds this range, in
other words, whether overflow or underflow has occurred.
Example 4: Overflow: Suppose that n=8, 127D + 2D = 129D (overflow - beyond the range)
 127D →   0111 1111B
   2D → + 0000 0010B
          1000 0001B → -127D (wrong)

Example 5: Underflow: Suppose that n=8, -125D - 5D = -130D (underflow - below the range)
-125D →   1000 0011B
  -5D → + 1111 1011B
          0111 1110B → +126D (wrong)
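
One common way to detect these cases, shown here as a sketch rather than as the tutorial's own method, is to check whether two operands of the same sign produce a result of the opposite sign. The function name add_with_overflow_check is hypothetical.

```python
def add_with_overflow_check(a, b, n=8):
    """Return (wrapped_result, overflowed) for n-bit 2's complement addition."""
    low, high = -(1 << (n - 1)), (1 << (n - 1)) - 1
    if not (low <= a <= high and low <= b <= high):
        raise ValueError("operands must fit in n bits")
    mask = (1 << n) - 1
    raw = ((a & mask) + (b & mask)) & mask              # add, discard carry
    wrapped = raw - (1 << n) if raw & (1 << (n - 1)) else raw
    # Out of range only if both operands share a sign and the result does not.
    overflowed = (a < 0) == (b < 0) and (wrapped < 0) != (a < 0)
    return wrapped, overflowed

print(add_with_overflow_check(127, 2))    # (-127, True)
print(add_with_overflow_check(-125, -5))  # (126, True)
print(add_with_overflow_check(65, 5))     # (70, False)
```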

For more information, visit
https://www3.ntu.edu.sg/home/ehchua/programming/java/DataRepresentation.html
