
DR. MEGHNAD SAHA INSTITUTE OF TECHNOLOGY, HALDIA


DEBHOG, HALDIA, PURBA MEDINIPORE, PIN-721657
DEPARTMENT OF COMPUTER SCIENCE & TECHNOLOGY
2ND YEAR-3RD SEMESTER
E-CONTENTS: COMPUTER ORGANIZATION AND ARCHITECTURE
UNIT 2: INSTRUCTION STRUCTURE AND ADDRESSING MODES, NUMBER REPRESENTATION

DEVELOPED BY
GITIKA MAITY, LECTURER IN CST
DEPARTMENT OF COMPUTER SCIENCE & TECHNOLOGY
DR. MEGHNAD SAHA INSTITUTE OF TECHNOLOGY, HALDIA
DEBHOG, HALDIA, PURBA MEDINIPORE, PIN-721657
Unit: 2
Name of the Topics: Instruction structure and addressing modes, Number Representation
2.1 Instruction Format. 0, 1, 2, 3 address instruction. Execution steps of a typical instruction through different parts of CPU and memory.
2.2 Different addressing modes with example.
2.3 Representation of Integers in Computer system.
2.4 Representation of Floating point numbers in computer system.
2.5 Biased exponent, IEEE format for single and double precision numbers.
2.1 Instruction Format. 0,1,2,3 address instruction.
Execution steps of a typical instruction through different parts of
CPU and memory.

A computer performs tasks on the basis of the instructions provided to it. An instruction is made up of groups of bits called fields. Since everything in a computer is represented in 0s and 1s, each field carries a different piece of information, on the basis of which the CPU decides what to perform. The most common fields are:
• Operation field, which specifies the operation to be performed, such as addition.
• Address field, which contains the location of the operand, i.e., a register or a memory location.
• Mode field, which specifies how the operand is to be found.
An instruction can be of various lengths depending upon the number of addresses it contains. On the basis of the number of address fields, CPU organizations are generally of three types:
1. Single accumulator organization
2. General register organization
3. Stack organization
• In the first organization, operations are performed using a special register called the accumulator.
• In the second, multiple registers are used for computation.
• In the third organization, operations work on a stack, because of which the computational instructions do not contain any address field.

To illustrate the influence of the number of addresses on computer programs, we will evaluate the arithmetic statement X = (A + B) * (C + D) using zero-, one-, two-, and three-address instructions. We will use the symbols ADD, SUB, MUL, and DIV for the four arithmetic operations; MOV for the transfer-type operation; and LOAD and STORE for transfers between memory and the AC register. We will assume that the operands are in memory addresses A, B, C, and D, and that the result must be stored in memory at address X.

THREE-ADDRESS INSTRUCTIONS

Computers with three-address instruction formats can use each address field to
specify either a processor register or a memory operand. The program in
assembly language that evaluates X = (A + B) * (C + D) is shown below, together
with comments that explain the register transfer operation of each instruction.
ADD R1, A, B R1 ← M [A] + M [B]
ADD R2, C, D R2 ← M [C] + M [D]
MUL X, R1, R2 M [X] ← R1 * R2
It is assumed that the computer has two processor registers, R1 and R2. The
symbol M [A] denotes the operand at memory address symbolized by A.
The advantage of the three-address format is that it results in short programs
when evaluating arithmetic expressions.
The disadvantage is that the binary-coded instructions require too many bits to
specify three addresses.
An example of a commercial computer that uses three-address instructions is the
Cyber 170. The instruction formats in the Cyber computer are restricted to either
three register address fields or two register address fields and one memory
address field.

TWO-ADDRESS INSTRUCTIONS

Two address instructions are the most common in commercial computers. Here
again each address field can specify either a processor register or a memory word.
The program to evaluate X = (A + B) * (C + D) is as follows:
MOV R1, A R1 ← M [A]
ADD R1, B R1 ← R1 + M [B]
MOV R2, C R2 ← M [C]
ADD R2, D R2 ← R2 + M [D]
MUL R1, R2 R1 ← R1* R2
MOV X, R1 M [X] ← R1
The MOV instruction moves or transfers the operands to and from memory and
processor registers. The first symbol listed in an instruction is assumed to be
both a source and the destination where the result of the operation is transferred.

ONE-ADDRESS INSTRUCTIONS

One-address instructions use an implied accumulator (AC) register for all data
manipulation. For multiplication and division there is a need for a second
register. However, here we will neglect the second register and assume that the AC contains the result of all operations. The program to evaluate X = (A + B) * (C + D) is
LOAD A AC ← M [A]
ADD B AC ← AC + M [B]
STORE T M [T] ← AC
LOAD C AC ← M [C]
ADD D AC ← AC + M [D]
MUL T AC ← AC * M [T]
STORE X M [X] ← AC
All operations are done between the AC register and a memory operand. T is the
address of a temporary memory location required for storing the intermediate
result.

ZERO-ADDRESS INSTRUCTIONS
A stack-organized computer does not use an address field for the instructions
ADD and MUL. The PUSH and POP instructions, however, need an address field
to specify the operand that communicates with the stack. The following program
shows how X = (A + B) * (C + D) will be written for a stack organized computer.
(TOS stands for top of stack)
PUSH A TOS ← A
PUSH B TOS ← B
ADD TOS ← (A + B)
PUSH C TOS ← C
PUSH D TOS ← D
ADD TOS ← (C + D)
MUL TOS ← (C + D) * (A + B)
POP X M [X] ← TOS
To evaluate arithmetic expressions in a stack computer, it is necessary to convert the expression into reverse Polish notation; for example, (A + B) * (C + D) becomes AB+ CD+ *. The name “zero-address” is given to this type of computer because of the absence of an address field in the computational instructions.
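
The same stack-machine behaviour can be sketched in a few lines of Python. This is only an illustrative simulation, not a description of real stack hardware, and the sample values assigned to A, B, C and D below are assumptions.

# Sketch of a stack machine evaluating X = (A + B) * (C + D),
# i.e. the reverse Polish form A B + C D + *.
# Memory addresses are modelled as dictionary keys; the values are assumed.
memory = {"A": 2, "B": 3, "C": 4, "D": 5, "X": 0}
stack = []

program = [
    ("PUSH", "A"), ("PUSH", "B"), ("ADD", None),
    ("PUSH", "C"), ("PUSH", "D"), ("ADD", None),
    ("MUL", None), ("POP", "X"),
]

for opcode, address in program:
    if opcode == "PUSH":                 # push a memory operand onto the stack
        stack.append(memory[address])
    elif opcode == "POP":                # pop TOS and store it in memory
        memory[address] = stack.pop()
    elif opcode == "ADD":                # zero-address: both operands come from the stack
        b, a = stack.pop(), stack.pop()
        stack.append(a + b)
    elif opcode == "MUL":
        b, a = stack.pop(), stack.pop()
        stack.append(a * b)

print(memory["X"])                       # (2 + 3) * (4 + 5) = 45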

Instruction Cycle
An instruction cycle, also known as the fetch-decode-execute cycle, is the basic operational process of a computer. This process is repeated continuously by the CPU from boot-up to shut-down of the computer.
Following are the steps that occur during an instruction cycle:

1. Fetch the Instruction


The instruction is fetched from the memory address stored in the PC (Program Counter) and placed in the instruction register (IR). At the end of the fetch operation, the PC is incremented by 1 so that it points to the next instruction to be executed.

2. Decode the Instruction


The instruction in the IR is interpreted by the decoder, which determines the operation to be performed and the operands involved.

3. Read the Effective Address


If the instruction uses an indirect address, the effective address is read from memory. Otherwise, the operand is read directly, as in the case of an immediate-operand instruction.

4. Execute the Instruction


The control unit passes the information in the form of control signals to the functional units of the CPU. The result generated is stored in main memory or sent to an output device. The cycle then repeats with the fetch of the next instruction; in this way the instruction cycle is repeated continuously.
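
The four steps above can be mimicked with a toy Python loop. The two-field instruction format, the tiny one-address program and the data values below are assumptions made purely for illustration.

# Toy fetch-decode-execute loop: each instruction is an (opcode, operand) pair.
# The program computes AC = M[0] + M[1], stores the result in M[2], then halts.
data_memory = [10, 20, 0]
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)]

pc = 0          # Program Counter
ac = 0          # Accumulator
running = True

while running:
    ir = program[pc]          # 1. Fetch: copy the instruction addressed by PC into IR
    pc += 1                   #    ...and increment PC to point to the next instruction
    opcode, operand = ir      # 2. Decode: split the instruction into opcode and address
    if opcode == "LOAD":      # 3./4. Read the operand and execute
        ac = data_memory[operand]
    elif opcode == "ADD":
        ac += data_memory[operand]
    elif opcode == "STORE":
        data_memory[operand] = ac
    elif opcode == "HALT":
        running = False

print(data_memory[2])         # 30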

2.2 Different addressing modes with example.


The operation field of an instruction specifies the operation to be performed.
This operation will be executed on some data which is stored in computer
registers or the main memory. The way any operand is selected during the
program execution is dependent on the addressing mode of the instruction.
The purposes of using addressing modes are as follows:
1. To give programming versatility to the user.
2. To reduce the number of bits in the addressing field of the instruction.

Types of Addressing Modes
Below we have discussed the different types of addressing modes one by one:
• Immediate Mode
In this mode, the operand is specified in the instruction itself. An immediate-mode instruction has an operand field rather than an address field.
For example: ADD 7, which says add 7 to the contents of the accumulator; 7 is the operand here.

• Register Mode
In this mode the operand is stored in a register, and this register is present in the CPU. The instruction has the address of the register where the operand is stored.

Advantages
• Shorter instructions and faster instruction fetch.
• Faster access to the operand(s), since no memory reference is needed.
Disadvantages
• Very limited address space.
• Using multiple registers helps performance, but it complicates the instructions.

• Register Indirect Mode
In this mode, the instruction specifies the register whose contents give the address of the operand in memory. Thus, the register contains the address of the operand rather than the operand itself.

• Auto Increment/Decrement Mode
In this mode the register is incremented or decremented after or before its value is used.

• Direct Addressing Mode
In this mode, the effective address of the operand is present in the instruction itself.
• A single memory reference is needed to access the data.
• No additional calculation is needed to find the effective address of the operand.
For example: ADD R1, 4000 - here 4000 is the effective address of the operand.
NOTE: The effective address is the location where the operand is present.

• Indirect Addressing Mode
In this mode, the address field of the instruction gives the address where the effective address is stored in memory. This slows down execution, since it requires multiple memory lookups to find the operand.

• Displacement Addressing Mode
In this mode the contents of an index register are added to the address part of the instruction to obtain the effective address of the operand:
EA = A + (R), where the instruction holds two values, A (the base value) and a reference to the register R (which holds the displacement), or vice versa.

• Relative Addressing Mode
It is a version of displacement addressing mode. In this mode the contents of the PC (Program Counter) are added to the address part of the instruction to obtain the effective address:
EA = A + (PC), where EA is the effective address and PC is the program counter.
The operand is A cells away from the current cell (the one pointed to by the PC).

• Base Register Addressing Mode
It is again a version of displacement addressing mode. It can be defined as EA = A + (R), where A is the displacement and R holds a pointer to the base address.

• Stack Addressing Mode
In this mode, the operand is at the top of the stack. For example: ADD - this instruction will POP the top two items from the stack, add them, and then PUSH the result back onto the top of the stack.
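
To tie the modes above together, the following Python sketch shows how the operand would be located under several of them. The memory contents, register values and address fields are assumed sample numbers chosen only for illustration.

# How the operand is located under different addressing modes (assumed values).
memory = {4000: 7, 5000: 4000, 6000: 25, 6010: 55}
registers = {"R1": 6000, "PC": 5990}

operand = 7                                 # Immediate: operand is in the instruction itself
operand = memory[4000]                      # Direct: EA = 4000, operand = 7
operand = memory[memory[5000]]              # Indirect: EA = M[5000] = 4000, operand = 7
operand = registers["R1"]                   # Register: operand is the register contents
operand = memory[registers["R1"]]           # Register indirect: EA = (R1) = 6000, operand = 25
operand = memory[10 + registers["R1"]]      # Displacement/indexed: EA = 10 + (R1) = 6010, operand = 55
operand = memory[10 + registers["PC"]]      # Relative: EA = 10 + (PC) = 6000, operand = 25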

2.3 Representation of Integers in Computer system.

Integer Representation
Integers are whole numbers or fixed-point numbers with the radix
point fixed after the least-significant bit. They stand in contrast to real numbers or floating-point numbers, where the position of the radix point
varies. It is important to take note that integers and floating-point numbers
are treated differently in computers. They have different representation and
are processed differently (e.g., floating-point numbers are processed in a so-
called floating-point processor). Floating-point numbers will be discussed
later.
Computers use a fixed number of bits to represent an integer. The
commonly-used bit-lengths for integers are 8-bit, 16-bit, 32-bit or 64-bit.
Besides bit-lengths, there are two representation schemes for integers:
1. Unsigned Integers: can represent zero and positive integers.
2. Signed Integers: can represent zero, positive and negative integers.
Three representation schemes have been proposed for signed integers:
a. Sign-Magnitude representation
b. 1's Complement representation
c. 2's Complement representation
You, as the programmer, need to decide on the bit-length and representation scheme for your integers, depending on your application's requirements. Suppose that you need a counter for counting a small quantity from 0 up to 200; you might choose the 8-bit unsigned integer scheme, as there are no negative numbers involved.

n-bit Unsigned Integers


Unsigned integers can represent zero and positive integers, but not negative
integers. The value of an unsigned integer is interpreted as "the magnitude of
its underlying binary pattern".
Example 1: Suppose that n=8 and the binary pattern is 0100 0001B, the
value of this unsigned integer is 1×2^0 + 1×2^6 = 65D.
Example 2: Suppose that n=16 and the binary pattern is 0001 0000 0000
1000B, the value of this unsigned integer is 1×2^3 + 1×2^12 = 4104D.
Example 3: Suppose that n=16 and the binary pattern is 0000 0000 0000
0000B, the value of this unsigned integer is 0.
An n-bit pattern can represent 2^n distinct integers. An n-bit unsigned
integer can represent integers from 0 to (2^n)-1, as tabulated below:
n Minimum Maximum
8 0 (2^8)-1 (=255)
16 0 (2^16)-1 (=65,535)
32 0 (2^32)-1 (=4,294,967,295) (9+ digits)
64 0 (2^64)-1 (=18,446,744,073,709,551,615) (19+ digits)
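
These interpretations can be checked with Python's built-in base-2 conversion; the snippet below simply re-computes Examples 1 to 3 and the tabulated ranges.

# Interpreting bit patterns as unsigned integers (checks Examples 1-3 above).
print(int("01000001", 2))            # 65
print(int("0001000000001000", 2))    # 4104
print(int("0000000000000000", 2))    # 0

# An n-bit unsigned integer ranges from 0 to 2**n - 1.
for n in (8, 16, 32, 64):
    print(n, 0, 2 ** n - 1)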

Signed Integers
Signed integers can represent zero, positive integers, as well as negative
integers. Three representation schemes are available for signed integers:
1. Sign-Magnitude representation
2. 1's Complement representation
3. 2's Complement representation
In all the above three schemes, the most-significant bit (msb) is called
the sign bit. The sign bit is used to represent the sign of the integer - with 0
for positive integers and 1 for negative integers. The magnitude of the
integer, however, is interpreted differently in different schemes.

n-bit Signed Integers in Sign-Magnitude Representation


In sign-magnitude representation:
• The most-significant bit (msb) is the sign bit, with a value of 0 representing a positive integer and 1 representing a negative integer.
• The remaining n-1 bits represent the magnitude (absolute value) of the integer. The absolute value of the integer is interpreted as "the magnitude of the (n-1)-bit binary pattern".
Example 1: Suppose that n=8 and the binary representation is 0 100
0001B.
Sign bit is 0 ⇒ positive
Absolute value is 100 0001B = 65D
Hence, the integer is +65D
Example 2: Suppose that n=8 and the binary representation is 1 000
0001B.
Sign bit is 1 ⇒ negative
Absolute value is 000 0001B = 1D
Hence, the integer is -1D
Example 3: Suppose that n=8 and the binary representation is 0 000
0000B.
Sign bit is 0 ⇒ positive
Absolute value is 000 0000B = 0D
Hence, the integer is +0D
Example 4: Suppose that n=8 and the binary representation is 1 000
0000B.
Sign bit is 1 ⇒ negative
Absolute value is 000 0000B = 0D
Hence, the integer is -0D
The drawbacks of sign-magnitude representation are:
1. There are two representations (0000 0000B and 1000 0000B) for the
number zero, which could lead to inefficiency and confusion.
2. Positive and negative integers need to be processed separately.

n-bit Signed Integers in 1's Complement Representation


In 1's complement representation:
• Again, the most significant bit (msb) is the sign bit, with a value of 0 representing positive integers and 1 representing negative integers.
• The remaining n-1 bits represent the magnitude of the integer, as follows:
  o For positive integers, the absolute value of the integer is equal to "the magnitude of the (n-1)-bit binary pattern".
  o For negative integers, the absolute value of the integer is equal to "the magnitude of the complement (inverse) of the (n-1)-bit binary pattern" (hence called 1's complement).
Example 1: Suppose that n=8 and the binary representation 0 100 0001B.
Sign bit is 0 ⇒ positive
Absolute value is 100 0001B = 65D
Hence, the integer is +65D
Example 2: Suppose that n=8 and the binary representation 1 000 0001B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 000 0001B, i.e., 111 1110B = 126D
Hence, the integer is -126D
Example 3: Suppose that n=8 and the binary representation 0 000 0000B.
Sign bit is 0 ⇒ positive
Absolute value is 000 0000B = 0D
Hence, the integer is +0D
Example 4: Suppose that n=8 and the binary representation 1 111 1111B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 111 1111B, i.e., 000 0000B = 0D
Hence, the integer is -0D

Again, the drawbacks are:


1. There are two representations (0000 0000B and 1111 1111B) for zero.
2. The positive integers and negative integers need to be processed
separately.

n-bit Signed Integers in 2's Complement Representation


In 2's complement representation:
• Again, the most significant bit (msb) is the sign bit, with a value of 0 representing positive integers and 1 representing negative integers.
• The remaining n-1 bits represent the magnitude of the integer, as follows:
  o For positive integers, the absolute value of the integer is equal to "the magnitude of the (n-1)-bit binary pattern".
  o For negative integers, the absolute value of the integer is equal to "the magnitude of the complement of the (n-1)-bit binary pattern plus one" (hence called 2's complement).
Example 1: Suppose that n=8 and the binary representation 0 100 0001B.
Sign bit is 0 ⇒ positive
Absolute value is 100 0001B = 65D
Hence, the integer is +65D
Example 2: Suppose that n=8 and the binary representation 1 000 0001B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 000 0001B plus 1, i.e., 111 1110B +
1B = 127D
Hence, the integer is -127D
Example 3: Suppose that n=8 and the binary representation 0 000 0000B.
Sign bit is 0 ⇒ positive
Absolute value is 000 0000B = 0D
Hence, the integer is +0D
Example 4: Suppose that n=8 and the binary representation 1 111 1111B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 111 1111B plus 1, i.e., 000 0000B +
1B = 1D
Hence, the integer is -1D
Computers use 2's Complement Representation for Signed
Integers
We have discussed three representations for signed integers: sign-magnitude, 1's complement and 2's complement. Computers use 2's complement in representing signed integers. This is because:
1. There is only one representation for the number zero in 2's
complement, instead of two representations in sign-magnitude and 1's
complement.
2. Positive and negative integers can be treated together in addition and
subtraction. Subtraction can be carried out using the "addition logic".
Example 1: Addition of Two Positive Integers: Suppose that n=8, 65D + 5D = 70D
   65D → 0100 0001B
+   5D → 0000 0101B
         0100 0110B → 70D (OK)

Example 2: Subtraction is treated as Addition of a Positive and a Negative Integer: Suppose that n=8, 65D - 5D = 65D + (-5D) = 60D
   65D → 0100 0001B
+  -5D → 1111 1011B
         0011 1100B → 60D (discard carry - OK)

Example 3: Addition of Two Negative Integers: Suppose that n=8, -65D - 5D = (-65D) + (-5D) = -70D
  -65D → 1011 1111B
+  -5D → 1111 1011B
         1011 1010B → -70D (discard carry - OK)

Because of the fixed precision (i.e., fixed number of bits), an n-bit 2's
complement signed integer has a certain range. For example, for n=8, the
range of 2's complement signed integers is -128 to +127. During addition
(and subtraction), it is important to check whether the result exceeds this
range, in other words, whether overflow or underflow has occurred.
Example 4: Overflow: Suppose that n=8, 127D + 2D = 129D (overflow - beyond the range)
  127D → 0111 1111B
+   2D → 0000 0010B
         1000 0001B → -127D (wrong)

Example 5: Underflow: Suppose that n=8, -125D - 5D = -130D (underflow - below the range)
 -125D → 1000 0011B
+  -5D → 1111 1011B
         0111 1110B → +126D (wrong)

The 2's complement scheme can be visualized by re-arranging the number line: values from -128 to +127 are represented contiguously, and the carry out of the most-significant bit is ignored.
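
The five addition examples can be reproduced with a short Python sketch that keeps only the low 8 bits of each sum (i.e., discards the carry) and then re-interprets the result. The helper names are hypothetical.

def to_unsigned8(x):
    return x & 0xFF                        # encode: keep only the low 8 bits

def from_unsigned8(u):
    return u - 256 if u & 0x80 else u      # decode: msb set means negative

def add8(a, b):
    result = from_unsigned8(to_unsigned8(a) + to_unsigned8(b))
    overflow = not (-128 <= a + b <= 127)  # true sum falls outside the 8-bit range
    return result, overflow

print(add8(65, 5))       # (70, False)   Example 1
print(add8(65, -5))      # (60, False)   Example 2
print(add8(-65, -5))     # (-70, False)  Example 3
print(add8(127, 2))      # (-127, True)  Example 4: overflow
print(add8(-125, -5))    # (126, True)   Example 5: underflow

The same mask-and-reinterpret idea extends to 16-, 32- and 64-bit integers by changing the mask and the sign-bit test.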

Range of n-bit 2's Complement Signed Integers


An n-bit 2's complement signed integer can represent integers from -2^(n-1) to +2^(n-1)-1, as tabulated below. Take note that the scheme can represent all the integers within the range, without any gap; in other words, there are no missing integers within the supported range.
n    Minimum                                  Maximum
8    -(2^7) (= -128)                          +(2^7)-1 (= +127)
16   -(2^15) (= -32,768)                      +(2^15)-1 (= +32,767)
32   -(2^31) (= -2,147,483,648)               +(2^31)-1 (= +2,147,483,647) (9+ digits)
64   -(2^63) (= -9,223,372,036,854,775,808)   +(2^63)-1 (= +9,223,372,036,854,775,807) (18+ digits)

Decoding 2's Complement Numbers


1. Check the sign bit (denoted as S).
2. If S=0, the number is positive, and its absolute value is the binary value of the remaining n-1 bits.
3. If S=1, the number is negative. You could "invert the n-1 bits and plus 1" to get the absolute value of the negative number. Alternatively, you could scan the remaining n-1 bits from the right (least-significant bit), look for the first occurrence of 1, and flip all the bits to the left of that first occurrence of 1. The flipped pattern gives the absolute value.
For example: n = 8, bit pattern = 1 100 0100B.
S = 1 → negative
Scanning from the right and flipping all the bits to the left of the first occurrence of 1 gives 011 1100B = 60D.
Hence, the value is -60D.
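
A small Python helper (the function name is hypothetical) confirms the decoding rule above.

def decode_twos_complement(bits):
    # Decode an n-bit 2's complement bit string.
    n = len(bits)
    value = int(bits, 2)
    if bits[0] == "1":           # sign bit set, so the number is negative
        value -= 2 ** n          # equivalent to "invert the bits and add 1"
    return value

print(decode_twos_complement("11000100"))   # -60
print(decode_twos_complement("11111111"))   # -1
print(decode_twos_complement("01000001"))   # 65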

2.4 Representation of Floating point numbers in computer system.
A floating-point number (or real number) can represent a very large value (1.23×10^88) or a very small value (1.23×10^-88). It can also represent a very large negative number (-1.23×10^88) and a very small negative number (-1.23×10^-88), as well as zero.

A floating-point number is typically expressed in the scientific notation, with


a fraction (F), and an exponent (E) of a certain radix (r), in the form of F×r^E.
Decimal numbers use radix of 10 (F×10^E); while binary numbers use radix
of 2 (F×2^E).
Representation of a floating-point number is not unique. For example, the number 55.66 can be represented as 5.566×10^1, 0.5566×10^2, 0.05566×10^3, and so on. The fractional part can be normalized: in the normalized form, there is only a single non-zero digit before the radix point.
For example, decimal number 123.4567 can be normalized
as 1.234567×10^2; binary number 1010.1011B can be normalized
as 1.0101011B×2^3.
It is important to note that floating-point numbers suffer from loss of precision when represented with a fixed number of bits (e.g., 32-bit or 64-bit). This is because there are infinitely many real numbers (even within a small range, say 0.0 to 0.1), whereas an n-bit binary pattern can represent only 2^n distinct values. Hence, not all real numbers can be represented; the nearest approximation is used instead, resulting in a loss of accuracy.
It is also important to note that floating-point arithmetic is much less efficient than integer arithmetic. It can be sped up with a dedicated floating-point co-processor. Hence, use integers if your application does not require floating-point numbers.
In computers, floating-point numbers are represented in scientific notation
of fraction (F) and exponent (E) with a radix of 2, in the form of F×2^E.
Both E and F can be positive as well as negative. Modern computers adopt
IEEE 754 standard for representing floating-point numbers. There are two
representation schemes: 32-bit single-precision and 64-bit double-precision.

2.5 Biased exponent, IEEE format for single and double precision numbers.
IEEE-754 32-bit Single-Precision Floating-Point Numbers
In 32-bit single-precision floating-point representation:
• The most significant bit is the sign bit (S), with 0 for positive numbers and 1 for negative numbers.
• The following 8 bits represent the exponent (E).
• The remaining 23 bits represent the fraction (F).

Normalized Form
Let's illustrate with an example. Suppose that the 32-bit pattern is 1 1000 0001 011 0000 0000 0000 0000 0000, with:
• S = 1
• E = 1000 0001
• F = 011 0000 0000 0000 0000 0000
In the normalized form, the actual fraction is normalized with an implicit
leading 1 in the form of 1.F. In this example, the actual fraction is 1.011
0000 0000 0000 0000 0000 = 1 + 1×2^-2 + 1×2^-3 = 1.375D.
The sign bit represents the sign of the number, with S=0 for positive
and S=1 for negative number. In this example with S=1, this is a negative
number, i.e., -1.375D.
In normalized form, the actual exponent is E-127 (so-called excess-127 or
bias-127). This is because we need to represent both positive and negative
exponent. With an 8-bit E, ranging from 0 to 255, the excess-127 scheme
could provide actual exponent of -127 to 128. In this example, E-127=129-
127=2D.
Hence, the number represented is -1.375×2^2=-5.5D.
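
This worked example can be cross-checked with Python's standard struct module, which packs and unpacks IEEE-754 values. The snippet is a verification sketch, not part of the original notes.

import struct

# The 32-bit pattern from the example: S=1, E=1000 0001, F=011 0000 ... 0
bits = "1" + "10000001" + "011" + "0" * 20
value = struct.unpack(">f", int(bits, 2).to_bytes(4, "big"))[0]
print(value)                                          # -5.5

# Going the other way: pack -5.5 and print its 32-bit pattern.
packed = struct.pack(">f", -5.5)
print(format(int.from_bytes(packed, "big"), "032b"))  # 11000000101100000000000000000000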

De-Normalized Form

Normalized form has a serious problem: with an implicit leading 1 for the fraction, it cannot represent the number zero. Convince yourself of this!
De-normalized form was devised to represent zero and other numbers.
For E=0, the numbers are in the de-normalized form. An implicit leading 0
(instead of 1) is used for the fraction; and the actual exponent is always -
126. Hence, the number zero can be represented
with E=0 and F=0 (because 0.0×2^-126=0).
We can also represent very small positive and negative numbers in de-
normalized form with E=0. For example, if S=1, E=0, and F=011 0000 0000
0000 0000 0000. The actual fraction is 0.011=1×2^-2+1×2^-3=0.375D.
Since S=1, it is a negative number. With E=0, the actual exponent is -126.
Hence the number is -0.375×2^-126 = -4.4×10^-39, which is an extremely
small negative number (close to zero).

Summary
In summary, the value (N) is calculated as follows:
• For 1 ≤ E ≤ 254, N = (-1)^S × 1.F × 2^(E-127). These numbers are in the so-called normalized form. The sign bit represents the sign of the number. The fractional part (1.F) is normalized with an implicit leading 1. The exponent is biased (or in excess) by 127, so as to represent both positive and negative exponents. The range of the actual exponent is -126 to +127.
• For E = 0, N = (-1)^S × 0.F × 2^(-126). These numbers are in the so-called denormalized form. The factor 2^(-126) evaluates to a very small number. The denormalized form is needed to represent zero (with F=0 and E=0). It can also represent very small positive and negative numbers close to zero.
• For E = 255, it represents special values, such as ±INF (positive and negative infinity) and NaN (not a number). This is beyond the scope of this unit.
Example 1: Suppose that IEEE-754 32-bit floating-point representation
pattern is 0 10000000 110 0000 0000 0000 0000 0000.
Sign bit S = 0 ⇒ positive number
E = 1000 0000B = 128D (in normalized form)
Fraction is 1.11B (with an implicit leading 1) = 1 + 1×2^-1 + 1×2^-2 = 1.75D
The number is +1.75 × 2^(128-127) = +3.5D

Example 2: Suppose that IEEE-754 32-bit floating-point representation


pattern is 1 01111110 100 0000 0000 0000 0000 0000.
Sign bit S = 1 ⇒ negative number
E = 0111 1110B = 126D (in normalized form)
Fraction is 1.1B (with an implicit leading 1) = 1 + 2^-1 = 1.5D
The number is -1.5 × 2^(126-127) = -0.75D

Example 3: Suppose that IEEE-754 32-bit floating-point representation


pattern is 1 01111110 000 0000 0000 0000 0000 0001.
Sign bit S = 1 ⇒ negative number
E = 0111 1110B = 126D (in normalized form)
Fraction is 1.000 0000 0000 0000 0000 0001B (with an implicit leading 1) =
1 + 2^-23
The number is -(1 + 2^-23) × 2^(126-127) = -
0.500000059604644775390625 (may not be exact in decimal!)

Example 4 (De-Normalized Form): Suppose that IEEE-754 32-bit


floating-point representation pattern is 1 00000000 000 0000 0000 0000
0000 0001.
Sign bit S = 1 ⇒ negative number
E = 0 (in de-normalized form)
Fraction is 0.000 0000 0000 0000 0000 0001B (with an implicit leading 0) =
1×2^-23
The number is -2^(-23) × 2^(-126) = -2^(-149) ≈ -1.4×10^-45
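
The summary rules (normalized, denormalized and special values) can be collected into one small Python decoder. The function name is hypothetical; the snippet simply re-derives Examples 1, 2 and 4 above.

def decode_ieee754_single(bits):
    # Decode a 32-bit IEEE-754 pattern given as a string of 0s and 1s.
    s = int(bits[0], 2)
    e = int(bits[1:9], 2)
    f = sum(int(b) * 2 ** -(i + 1) for i, b in enumerate(bits[9:]))
    if e == 255:                                   # special values
        return float("nan") if f else (-1) ** s * float("inf")
    if e == 0:                                     # denormalized: 0.F x 2^-126
        return (-1) ** s * f * 2 ** -126
    return (-1) ** s * (1 + f) * 2 ** (e - 127)    # normalized: 1.F x 2^(E-127)

print(decode_ieee754_single("0" + "10000000" + "110" + "0" * 20))   # 3.5   (Example 1)
print(decode_ieee754_single("1" + "01111110" + "100" + "0" * 20))   # -0.75 (Example 2)
print(decode_ieee754_single("1" + "00000000" + "0" * 22 + "1"))     # about -1.4e-45 (Example 4)

For 64-bit double precision the same structure applies, with an 11-bit exponent, a bias of 1023 and a 52-bit fraction.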

Exercises (Floating-point Numbers)

1. Compute the largest and smallest positive numbers that can be


represented in the 32-bit normalized form.
2. Compute the largest and smallest negative numbers can be represented
in the 32-bit normalized form.
3. Repeat (1) for the 32-bit denormalized form.
4. Repeat (2) for the 32-bit denormalized form.
Hints:
1. Largest positive number: S=0, E=1111 1110 (254), F=111 1111 1111
1111 1111 1111.
Smallest positive number: S=0, E=0000 0001 (1), F=000 0000 0000 0000 0000 0000.
2. Same as above, but S=1.
3. Largest positive number: S=0, E=0, F=111 1111 1111 1111 1111 1111.
Smallest positive number: S=0, E=0, F=000 0000 0000 0000 0000
0001.
4. Same as above, but S=1.
IEEE-754 64-bit Double-Precision Floating-Point Numbers

The representation scheme for 64-bit double-precision is similar to the 32-bit single-precision:
• The most significant bit is the sign bit (S), with 0 for positive numbers and 1 for negative numbers.
• The following 11 bits represent the exponent (E).
• The remaining 52 bits represent the fraction (F).

The value (N) is calculated as follows:
• Normalized form: for 1 ≤ E ≤ 2046, N = (-1)^S × 1.F × 2^(E-1023).
• Denormalized form: for E = 0, N = (-1)^S × 0.F × 2^(-1022).
• For E = 2047, N represents special values, such as ±INF (infinity) and NaN (not a number).

More on Floating-Point Representation

There are three parts in the floating-point representation:
• The sign bit (S) is self-explanatory (0 for positive numbers and 1 for negative numbers).
• For the exponent (E), a so-called bias (or excess) is applied so as to represent both positive and negative exponents. The bias is set at half of the range. For single precision with an 8-bit exponent, the bias is 127 (or excess-127). For double precision with an 11-bit exponent, the bias is 1023 (or excess-1023).
• The fraction (F) (also called the mantissa or significand) is composed of an implicit leading bit (before the radix point) and the fractional bits (after the radix point). The leading bit for normalized numbers is 1, while the leading bit for denormalized numbers is 0.
Normalized Floating-Point Numbers
In normalized form, the radix point is placed after the first non-zero digit, e.g., 9.8765D×10^-23, 1.001011B×2^11B. For a binary number, the leading bit is always 1 and need not be represented explicitly, which saves 1 bit of storage.
In IEEE 754's normalized form:
• For single precision, 1 ≤ E ≤ 254 with an excess of 127. Hence, the actual exponent ranges from -126 to +127. Negative exponents are used to represent small numbers (< 1.0), while positive exponents are used to represent large numbers (> 1.0).
N = (-1)^S × 1.F × 2^(E-127)
• For double precision, 1 ≤ E ≤ 2046 with an excess of 1023. The actual exponent ranges from -1022 to +1023, and
N = (-1)^S × 1.F × 2^(E-1023)
Take note that an n-bit pattern has a finite number of combinations (2^n), and so can represent only finitely many distinct numbers. It is not possible to represent the infinitely many numbers on the real axis (even a small range, say 0.0 to 1.0, contains infinitely many). That is, not all real numbers can be accurately represented; instead, the closest approximation is used, which leads to a loss of accuracy.
