Microprocessor and Assembly Language: Lecture-2-Integer Representation

The document discusses integer representation in microprocessors and assembly language. It covers terminology for integer data types and arithmetic operations. It also describes the encoding of integers in unsigned and two's complement forms, and the numeric ranges for different bit widths. Key points covered include the signed and unsigned interpretation of bit vectors, mapping between the two representations, and casting surprises that can occur between them in C programming.

Uploaded by

Ali Sarmad

MICROPROCESSOR AND

ASSEMBLY LANGUAGE
LECTURE-2-INTEGER REPRESENTATION

MUHAMMAD HAFEEZ
DEPARTMENT OF COMPUTER SCIENCE
GC UNIVERSITY LAHORE
TERMINOLOGY FOR INTEGER DATA AND ARITHMETIC OPERATIONS
C RANGES FOR 32-BIT PROGRAMS
C RANGES FOR 64-BIT PROGRAMS
GUARANTEED RANGES FOR C PROGRAMS
ENCODING INTEGERS

• We write a bit vector as either $\vec{x}$, to denote the entire vector, or as $[x_{w-1}, x_{w-2}, \ldots, x_0]$, to denote the individual bits within the vector. Treating $\vec{x}$ as a number written in binary notation, we obtain the unsigned interpretation of $\vec{x}$. In this encoding, each bit $x_i$ has value 0 or 1, with the latter case indicating that the value $2^i$ should be included as part of the numeric value.
ENCODING INTEGERS
Unsigned
\[ B2U(X) = \sum_{i=0}^{w-1} x_i \cdot 2^i \]

Two’s Complement
\[ B2T(X) = -x_{w-1} \cdot 2^{w-1} + \sum_{i=0}^{w-2} x_i \cdot 2^i \]

Sign bit: the term $-x_{w-1} \cdot 2^{w-1}$
UNSIGNED RANGE OF VALUES

C short (2 bytes long)

     Decimal   Hex     Binary
x     15213    3B 6D   00111011 01101101
y    -15213    C4 93   11000100 10010011

Sign Bit
For 2’s complement, the most significant bit indicates the sign
• 0 for nonnegative
• 1 for negative
ENCODING EXAMPLE
x =  15213: 00111011 01101101
y = -15213: 11000100 10010011

Weight    Bit of x   Contribution   Bit of y   Contribution
     1       1              1          1              1
     2       0              0          1              2
     4       1              4          0              0
     8       1              8          0              0
    16       0              0          1             16
    32       1             32          0              0
    64       1             64          0              0
   128       0              0          1            128
   256       1            256          0              0
   512       1            512          0              0
  1024       0              0          1           1024
  2048       1           2048          0              0
  4096       1           4096          0              0
  8192       1           8192          0              0
 16384       0              0          1          16384
-32768       0              0          1         -32768
Sum                     15213                    -15213
NUMERIC RANGES
Unsigned Values
• UMin = 0 (bit pattern 000…0)
• UMax = $2^w - 1$ (bit pattern 111…1)
Two’s Complement Values
• TMin = $-2^{w-1}$ (bit pattern 100…0)
• TMax = $2^{w-1} - 1$ (bit pattern 011…1)
Other Values
• Minus 1 (bit pattern 111…1)

Values for W = 16
        Decimal   Hex     Binary
UMax     65535    FF FF   11111111 11111111
TMax     32767    7F FF   01111111 11111111
TMin    -32768    80 00   10000000 00000000
-1          -1    FF FF   11111111 11111111
0            0    00 00   00000000 00000000
NUMERIC RANGES
W       8      16        32               64
UMax    255    65,535    4,294,967,295    18,446,744,073,709,551,615
TMax    127    32,767    2,147,483,647    9,223,372,036,854,775,807
TMin   -128   -32,768   -2,147,483,648   -9,223,372,036,854,775,808

Observations
• |TMin| = TMax + 1
  • Asymmetric range
• UMax = 2 * TMax + 1

C Programming
• #include <limits.h>
• Declares constants, e.g.,
  • ULONG_MAX
  • LONG_MAX
  • LONG_MIN
• Values platform-specific
SIGNED AND UNSIGNED NUMERIC VALUES
X       B2U(X)   B2T(X)
0000      0        0
0001      1        1
0010      2        2
0011      3        3
0100      4        4
0101      5        5
0110      6        6
0111      7        7
1000      8       -8
1001      9       -7
1010     10       -6
1011     11       -5
1100     12       -4
1101     13       -3
1110     14       -2
1111     15       -1

• Equivalence
  • Same encodings for nonnegative values
• Uniqueness
  • Every bit pattern represents a unique integer value
  • Each representable integer has a unique bit encoding
• Can Invert Mappings
  • $U2B(x) = B2U^{-1}(x)$
    • Bit pattern for unsigned integer
  • $T2B(x) = B2T^{-1}(x)$
    • Bit pattern for two’s comp integer
MAPPING BETWEEN SIGNED AND UNSIGNED NUMBERS
Two’s Complement → Unsigned (T2U)
  x → T2B → X → B2U → ux, that is, T2U(x) = B2U(T2B(x))
  Maintain same bit pattern

Unsigned → Two’s Complement (U2T)
  ux → U2B → X → B2T → x, that is, U2T(ux) = B2T(U2B(ux))
  Maintain same bit pattern

• Define mappings between unsigned and two’s complement numbers based on their bit-level representations
Mapping Signed ↔ Unsigned
T2U maps each signed value to the unsigned value in the same row; U2T maps back.

Bits    Signed   Unsigned
0000      0         0
0001      1         1
0010      2         2
0011      3         3
0100      4         4
0101      5         5
0110      6         6
0111      7         7
1000     -8         8
1001     -7         9
1010     -6        10
1011     -5        11
1100     -4        12
1101     -3        13
1110     -2        14
1111     -1        15
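Mapping Signed ↔ Unsigned
Bits    Signed   Unsigned   Relation
0000      0         0       unsigned = signed
0001      1         1       unsigned = signed
0010      2         2       unsigned = signed
0011      3         3       unsigned = signed
0100      4         4       unsigned = signed
0101      5         5       unsigned = signed
0110      6         6       unsigned = signed
0111      7         7       unsigned = signed
1000     -8         8       unsigned = signed + 16
1001     -7         9       unsigned = signed + 16
1010     -6        10       unsigned = signed + 16
1011     -5        11       unsigned = signed + 16
1100     -4        12       unsigned = signed + 16
1101     -3        13       unsigned = signed + 16
1110     -2        14       unsigned = signed + 16
1111     -1        15       unsigned = signed + 16

For negative signed values, the difference is $2^w$ (here w = 4, so 16).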
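Explanation of Casting Surprises
• 2’s Comp. → Unsigned
  • Ordering Inversion
  • Negative → Big Positive
[Figure: the 2’s complement range (TMin, …, –2, –1, 0, …, TMax) drawn beside the unsigned range (0, …, TMax, TMax + 1, …, UMax – 1, UMax). Values 0 through TMax map to themselves; –1 maps to UMax, –2 to UMax – 1, and TMin to TMax + 1.]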
Signed vs. Unsigned in C
• Constants
  • By default are considered to be signed integers
  • Unsigned if they have “U” as a suffix
      0U, 4294967259U
• Casting
  • Explicit casting between signed and unsigned is the same as U2T and T2U
      int tx, ty;
      unsigned ux, uy;
      tx = (int) ux;
      uy = (unsigned) ty;
  • Implicit casting also occurs via assignments and procedure calls
      tx = ux;
      uy = ty;
Signed vs. Unsigned in C
• $T2U_{32}(-1) = UMax_{32} = 2^{32} - 1$
• $U2T_{32}(2^{31}) = -2^{31} = TMin_{32}$
• Although the C standard has traditionally not required a particular representation of signed numbers, almost all machines use two’s complement. Generally, most numbers are signed by default. For example, when declaring a constant such as 12345 or 0x1A2B, the value is considered signed. Adding the character ‘U’ or ‘u’ as a suffix creates an unsigned constant, e.g., 12345U or 0x1A2Bu.
• Conversion between signed and unsigned can be implicit or explicit. The underlying bit pattern does not change.
Signed vs. Unsigned in C
• When printing numeric values with printf, the directives %d, %u, and %x are used to print a number as a signed decimal, an unsigned decimal, and in hexadecimal format, respectively.
Signed vs. Unsigned in C
• Some nonintuitive behavior arises from the way C handles expressions containing combinations of signed and unsigned quantities. When an operation is performed where one operand is signed and the other is unsigned, C implicitly casts the signed argument to unsigned and performs the operation assuming the numbers are nonnegative.
• This convention makes little difference for standard arithmetic operations.
• But it leads to nonintuitive results for relational operators such as < and >.
• Consider the comparison -1 < 0U. Since the second operand is unsigned, the first one is implicitly cast to unsigned, and hence the expression is equivalent to the comparison 4294967295U < 0U, which is false.
Signed vs. Unsigned in C
 In C header file limits.h
Casting Surprises
Why Should I Use Unsigned?
• Don’t use just because number is nonnegative
  • Easy to make mistakes
      unsigned i;
      for (i = cnt-2; i >= 0; i--)   /* mistake: i >= 0 is always true for unsigned i */
          a[i] += a[i+1];
  • Can be very subtle
      #define DELTA sizeof(int)
      int i;
      for (i = CNT; i-DELTA >= 0; i -= DELTA)   /* mistake: i-DELTA is unsigned, so never negative */
          . . .
• Do use when performing modular arithmetic
  • Multiprecision arithmetic
• Do use when using bits to represent sets
  • Logical right shift, no sign extension
Why Should I Use Unsigned?
Expanding the Bit Representation of a Number
• One common operation is to convert between integers having different word sizes while retaining the same numeric value.
• Of course, this may not be possible when the destination data type is too small to represent the desired value.
• Converting from a smaller to a larger data type, however, should always be possible.
• To convert an unsigned number to a larger data type, we can simply add leading zeros to the representation; this operation is known as zero extension.
• For converting a two’s complement number to a larger data type, the rule is to perform a sign extension, adding copies of the most significant bit to the representation.
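Sign Extension
• Task:
  • Given w-bit signed integer x
  • Convert it to (w+k)-bit integer with same value
• Rule:
  • Make k copies of sign bit:
  • $X' = x_{w-1}, \ldots, x_{w-1}, x_{w-1}, x_{w-2}, \ldots, x_0$, where the first k entries are copies of the MSB
[Figure: the w-bit pattern X extended to the (w+k)-bit pattern X' by prepending k copies of the MSB.]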
Sign Extension Example
    short int x = 15213;
    int ix = (int) x;
    short int y = -15213;
    int iy = (int) y;

      Decimal   Hex           Binary
x      15213    3B 6D         00111011 01101101
ix     15213    00 00 3B 6D   00000000 00000000 00111011 01101101
y     -15213    C4 93         11000100 10010011
iy    -15213    FF FF C4 93   11111111 11111111 11000100 10010011

• Converting from smaller to larger integer data type
• C automatically performs sign extension
Output on 32-bit machine:
Explanation
• Although the two’s-complement representation of −12,345 and the unsigned representation of 53,191 are identical for a 16-bit word size, they differ for a 32-bit word size.
• −12,345 has hexadecimal representation 0xFFFFCFC7, while 53,191 has hexadecimal representation 0x0000CFC7.
• The former has been sign extended: 16 copies of the most significant bit 1, having hexadecimal representation 0xFFFF, have been added as leading bits. The latter has been extended with 16 leading zeros, having hexadecimal representation 0x0000.
Explanation
• One point worth making is that the relative order of conversion from one data size to another and between unsigned and signed can affect the behavior of a program. Consider the following code:
• When run on a big-endian machine, this code causes the following output to be printed:
    uy = 4294954951: ff ff cf c7
Explanation
• This shows that when converting from short to unsigned, we first change the size and then convert from signed to unsigned. That is, (unsigned) sx is equivalent to (unsigned) (int) sx, evaluating to 4,294,954,951, not (unsigned) (unsigned short) sx, which evaluates to 53,191. Indeed, this convention is required by the C standards.
Truncating Numbers
• If we want to reduce the number of bits in a number, look at the code:
• On a typical 32-bit machine, when we cast x to be short, we truncate the 32-bit int to a 16-bit short int.
Truncating Numbers
• When truncating a w-bit number $\vec{x} = [x_{w-1}, x_{w-2}, \ldots, x_0]$ to a k-bit number, we drop the high-order w − k bits, giving a bit vector $\vec{x}' = [x_{k-1}, x_{k-2}, \ldots, x_0]$.
• For an unsigned number x, the result of truncating it to k bits is equivalent to computing $x \bmod 2^k$.
Complement & Increment
• Claim: Following holds for 2’s complement
      ~x + 1 == -x
• Complement
  • Observation: ~x + x == 1111…11 (binary) == -1

           x    10011101
      +   ~x    01100010
      =   -1    11111111

• Increment
  • ~x + x == -1
  • ~x + x + (-x + 1) == -1 + (-x + 1)
  • ~x + 1 == -x
• Warning: Be cautious treating ints as integers
  • OK here
Comp. & Incr. Examples
x = 15213
        Decimal   Hex     Binary
x        15213    3B 6D   00111011 01101101
~x      -15214    C4 92   11000100 10010010
~x+1    -15213    C4 93   11000100 10010011
y       -15213    C4 93   11000100 10010011

x = 0
        Decimal   Hex     Binary
0            0    00 00   00000000 00000000
~0          -1    FF FF   11111111 11111111
~0+1         0    00 00   00000000 00000000
Unsigned Addition
• Consider two nonnegative integers x and y, such that $0 \le x, y \le 2^w - 1$. Each of these numbers can be represented by a w-bit unsigned number. If we compute their sum, however, we have a possible range $0 \le x + y \le 2^{w+1} - 2$.
• Representing this sum could require w + 1 bits.

Unsigned Addition
• A plot of the function x + y when x and y have 4-bit representations. The arguments (shown on the horizontal axes) range from 0 to 15, but the sum ranges from 0 to 30. The shape of the function is a sloping plane (the function is linear in both dimensions).
Unsigned Addition
Unsigned Addition
• If we were to maintain the sum as a (w+1)-bit number and add it to another value, we may require w + 2 bits, and so on. This continued “word size inflation” means we cannot place any bound on the word size required to fully represent the results of arithmetic operations.
• Unsigned arithmetic can be viewed as a form of modular arithmetic. Unsigned addition is equivalent to computing the sum modulo $2^w$.
Unsigned Addition
• This value can be computed by simply discarding the high-order bit in the (w+1)-bit representation of x + y. For example, consider a 4-bit number representation with x = 9 and y = 12, having bit representations [1001] and [1100], respectively. Their sum is 21, having a 5-bit representation [10101]. But if we discard the high-order bit, we get [0101], that is, decimal value 5. This matches the value 21 mod 16 = 5.
• In general, we can see that if $x + y < 2^w$, the leading bit in the (w+1)-bit representation of the sum will equal 0, and hence discarding it will not change the numeric value.
Unsigned Addition
• On the other hand, if $2^w \le x + y < 2^{w+1}$, the leading bit in the (w+1)-bit representation of the sum will equal 1, and hence discarding it is equivalent to subtracting $2^w$ from the sum.
• This will give us a value in the range $0 \le x + y - 2^w < 2^{w+1} - 2^w = 2^w$, which is precisely the modulo $2^w$ sum of x and y.
Unsigned Addition
  Operands: w bits                   u
                                   + v
  True Sum: w+1 bits               u + v
  Discard Carry: w bits        $UAdd_w(u, v)$

• Standard Addition Function
  • Ignores carry output
• Implements Modular Arithmetic
      $s = UAdd_w(u, v) = (u + v) \bmod 2^w$

\[ UAdd_w(u, v) = \begin{cases} u + v, & u + v < 2^w \\ u + v - 2^w, & u + v \ge 2^w \end{cases} \]
Unsigned Addition
• An arithmetic operation is said to overflow when the full integer result cannot fit within the word size limits of the data type.
• Overflow occurs when the two operands sum to $2^w$ or more.
• The figure shows a plot of the unsigned addition function for word size w = 4. The sum is computed modulo $2^4 = 16$. When x + y < 16, there is no overflow, and the result is simply x + y. This is shown as the region forming a sloping plane labeled “Normal.”
• When x + y ≥ 16, the addition overflows, having the effect of decrementing the sum by 16. This is shown as the region forming a sloping plane labeled “Overflow.”
Unsigned Addition
Unsigned Addition
• The cliff in the plot is where the sum wraps around to zero; from there the sum builds up again.
• When executing C programs, overflows are not signaled as errors. At times, however, we might wish to determine whether overflow has occurred. Suppose we compute s = x + y with w-bit unsigned addition, and we wish to determine whether s equals the true sum x + y. We claim that overflow has occurred if and only if s < x (or equivalently, s < y).
• To see this, observe that x + y ≥ x, and hence if s did not overflow, we will surely have s ≥ x. On the other hand, if s did overflow, we have $s = x + y - 2^w$.
• Given that $y < 2^w$, we have $y - 2^w < 0$, and hence $s = x + (y - 2^w) < x$. In our earlier 4-bit example, we saw that 9 + 12 gives 5. We can see that overflow occurred, since 5 < 9.
Two’s Complement Addition
• With two’s-complement addition, we must decide what to do when the result is either too large (positive) or too small (negative) to represent.
• Given integer values x and y in the range $-2^{w-1} \le x, y \le 2^{w-1} - 1$, their sum is in the range $-2^w \le x + y \le 2^w - 2$, potentially requiring w + 1 bits to represent exactly. As before, we avoid ever-expanding data sizes by truncating the representation to w bits.
• The w-bit two’s-complement sum of two numbers has exactly the same bit-level representation as the unsigned sum. In fact, most computers use the same machine instruction to perform either unsigned or signed addition.
Two’s Complement Addition
  Operands: w bits                   u
                                   + v
  True Sum: w+1 bits               u + v
  Discard Carry: w bits        $TAdd_w(u, v)$

• TAdd and UAdd have identical bit-level behavior
  • Signed vs. unsigned addition in C:
      int s, t, u, v;
      s = (int) ((unsigned) u + (unsigned) v);
      t = u + v;
  • Will give s == t
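Characterizing TAdd
• Functionality
  • True sum requires w+1 bits
  • Drop off MSB
  • Treat remaining bits as 2’s comp. integer

\[ TAdd_w(u, v) = \begin{cases} u + v + 2^w, & u + v < TMin_w \ \text{(NegOver)} \\ u + v, & TMin_w \le u + v \le TMax_w \\ u + v - 2^w, & TMax_w < u + v \ \text{(PosOver)} \end{cases} \]

[Figure: the true sum, ranging from $-2^w$ (1 000…0) to $2^w - 1$ (0 111…1), mapped onto the TAdd result, ranging from 100…0 to 011…1. True sums of $2^{w-1}$ (0 100…0) or more are marked PosOver; true sums of $-2^{w-1} - 1$ (1 011…1) or less are marked NegOver. A second panel shows the sign combinations of u and v (> 0, < 0) for which NegOver and PosOver can occur.]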
Visualizing 2’s Comp. Addition
• Values
  • 4-bit two’s comp.
  • Range from -8 to +7
• Wraps Around
  • If sum ≥ $2^{w-1}$
    • Becomes negative
    • At most once
  • If sum < $-2^{w-1}$
    • Becomes positive
    • At most once
[Figure: 3D plot of TAdd4(u, v) over u and v, with the NegOver and PosOver regions labeled.]
Unsigned Multiplication in C
  Operands: w bits                   u
                                   * v
  True Product: 2*w bits           u · v
  Discard w bits: w bits       $UMult_w(u, v)$

• Standard Multiplication Function
  • Ignores high order w bits
• Implements Modular Arithmetic
      $UMult_w(u, v) = (u \cdot v) \bmod 2^w$
Signed Multiplication in C
  Operands: w bits                   u
                                   * v
  True Product: 2*w bits           u · v
  Discard w bits: w bits       $TMult_w(u, v)$

• Standard Multiplication Function
  • Ignores high order w bits
  • Some of which are different for signed vs. unsigned multiplication
  • Lower bits are the same
Multiplying By Constants
• On many machines, integer multiplication has historically been fairly slow, requiring 10 or more clock cycles.
• Other integer operations (such as addition, subtraction, bit-level operations, and shifting) require only 1 clock cycle.
• As a consequence, one important optimization used by compilers is to attempt to replace multiplications by constant factors with combinations of shift and addition operations.
• We will first consider the case of multiplying by a power of 2, and then generalize this to arbitrary constants.
Multiplying By Constants
• Let x be the unsigned integer represented by bit pattern $[x_{w-1}, x_{w-2}, \ldots, x_0]$. Then for any k ≥ 0, we claim the bit-level representation of $x \cdot 2^k$ is given by
• $[x_{w-1}, x_{w-2}, \ldots, x_0, 0, \ldots, 0]$, where k zeros have been added to the right.
Power-of-2 Multiply with Shift
• Operation
  • u << k gives u * $2^k$
  • Both signed and unsigned

  Operands: w bits                   u
                                   * $2^k$     (bit pattern 0…010…00, a single 1 followed by k zeros)
  True Product: w+k bits          u · $2^k$    (the bits of u followed by k zeros)
  Discard k bits: w bits       $UMult_w(u, 2^k)$, $TMult_w(u, 2^k)$

• Compiler generates this code automatically
Multiplying By Constants
• Given that integer multiplication is much more costly than shifting and adding, many C compilers try to replace multiplications of an integer by a constant with combinations of shifting, adding, and subtracting.
• For example, suppose a program contains the expression x*14. Recognizing that $14 = 2^3 + 2^2 + 2^1$, the compiler can rewrite the multiplication as (x<<3) + (x<<2) + (x<<1), replacing one multiplication with three shifts and two additions. The two computations will yield the same result, regardless of whether x is unsigned or two’s complement, and even if the multiplication would cause an overflow.
Multiplying By Constants
• Even better, the compiler can also use the property $14 = 2^4 - 2^1$ to rewrite the multiplication as (x<<4) - (x<<1), requiring only two shifts and a subtraction.
Dividing by Power of 2
• Integer division on most machines is even slower than integer multiplication, requiring 30 or more clock cycles.
• Dividing by a power of 2 can also be performed using shift operations, but we use a right shift rather than a left shift. The two different right shifts, logical and arithmetic, serve this purpose for unsigned and two’s-complement numbers, respectively.
• Integer division always rounds toward zero. For x ≥ 0 and y > 0, the result should be floor(x/y), where for any real number a, floor(a) is defined to be the unique integer a’ such that a’ ≤ a < a’ + 1. As examples, floor(3.14) = 3, floor(−3.14) = −4, and floor(3) = 3.
Dividing by Power of 2
• Consider the effect of applying a logical right shift by k to an unsigned number. We claim this gives the same result as dividing by $2^k$.
Unsigned Power-of-2 Divide with Shift
• Quotient of Unsigned by Power of 2
  • u >> k gives $\lfloor u / 2^k \rfloor$
  • Uses logical shift
[Figure: dividing u by $2^k$ moves the binary point k places to the left; the result $\lfloor u / 2^k \rfloor$ keeps only the bits to the left of the binary point.]

          Division      Computed   Hex     Binary
x         15213         15213      3B 6D   00111011 01101101
x >> 1    7606.5        7606       1D B6   00011101 10110110
x >> 4    950.8125      950        03 B6   00000011 10110110
x >> 8    59.4257813    59         00 3B   00000000 00111011
Dividing by Power of 2
• Now consider the effect of performing an arithmetic right shift on a two’s complement number.
• For a positive number, we have 0 as the most significant bit, and so the effect is the same as for a logical right shift. Thus, an arithmetic right shift by k is the same as division by $2^k$ for a nonnegative number.
Dividing by Power of 2
• Consider the effect of applying an arithmetic right shift to a 16-bit representation of −12,340 for different shift amounts. The result is almost the same as dividing by a power of 2. For the case when no rounding is required (k = 1), the result is correct. But when rounding is required, shifting causes the result to be rounded downward rather than toward zero, as should be the convention. For example, the expression -7/2 should yield -3 rather than -4.
Dividing by Power of 2
 We can correct for this improper rounding by “biasing”
the value before shifting. This technique exploits the
property that ceiling(x/y) = floor((x + y − 1)/y) for
integers x and y such that y > 0.
QUESTIONS

 ??????????????????????????
