
Add04 Numbers


Fundamentals Signed Numbers Fixed-Point Numbers Floating Point

Chapter 4
Number Representations
SKEE2263 Digital Systems

Mun’im/Ismahani/Izam


January 28, 2017



Table of Contents

1 Fundamentals

2 Signed Numbers

3 Fixed-Point Numbers

4 Floating Point

Taxonomy of Number Systems

Numbers
    Integers
        Unsigned
        Signed
            Signed-Magnitude
            Ones' Complement
            Two's Complement
    Reals
        Fixed Point
        Floating Point

Integers

Number of bits   Number of values   Machine
4                16                 Intel 4004
8                256                8080, 6800
16               65536              PDP11, 8086, 68000
32               4 × 10^9           68020, VAX11, IEEE single
48               2.8 × 10^14        Unisys
64               1.8 × 10^19        Cray, IEEE double

Integers

Value for integer bit pattern:

$V_{unsigned} = \sum_{i=0}^{N-1} b_i \times 2^i$

Example $10110_2$:

$10110_2 = 1 \times 2^4 + 0 \times 2^3 + 1 \times 2^2 + 1 \times 2^1 + 0 \times 2^0$
$= 22_{10}$
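
The weighted sum can be checked with a few lines of Java. This is only a minimal sketch: the class name `UnsignedValue` and the method `valueOf` are hypothetical names chosen for illustration, and the bit pattern is assumed to be given as a string with the most significant bit first.

```java
class UnsignedValue {
    // Evaluates the sum of b_i * 2^i for a string of '0'/'1' characters, MSB first.
    static long valueOf(String bits) {
        long value = 0;
        for (int k = 0; k < bits.length(); k++) {
            int i = bits.length() - 1 - k;   // weight index of this bit
            int b = bits.charAt(k) - '0';    // bit value, 0 or 1
            value += (long) b << i;          // adds b * 2^i
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(valueOf("10110")); // prints 22
    }
}
```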

Signed-Magnitude

Value for N-bit signed-magnitude pattern is:

$V_{SM} = (-1)^{b_{N-1}} \times \sum_{i=0}^{N-2} b_i \times 2^i$

Example $1010_{SM}$:

$V_{SM} = (-1)^{b_3} \times [b_2 \times 2^2 + b_1 \times 2^1 + b_0 \times 2^0]$
$= (-1)^1 \times [0(4) + 1(2) + 0(1)]$
$= -1 \times 2$
$= -2$
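
As an illustration, the signed-magnitude formula might be coded as below. The class `SignedMagnitude` and its `decode` method are hypothetical names, and the pattern is assumed to arrive in the low n bits of an `int`.

```java
class SignedMagnitude {
    // Decodes the low n bits of 'pattern' as a signed-magnitude number.
    static int decode(int pattern, int n) {
        int sign = (pattern >> (n - 1)) & 1;            // b_{N-1}
        int magnitude = pattern & ((1 << (n - 1)) - 1); // bits N-2 .. 0
        return (sign == 1) ? -magnitude : magnitude;
    }

    public static void main(String[] args) {
        System.out.println(decode(0b1010, 4)); // prints -2
    }
}
```

Both zero patterns (for example 0000 and 1000 with n = 4) decode to the integer 0 in this sketch, because a Java `int` has no negative zero.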

Signed-Magnitude

Signed Integer   Signed-Magnitude
+5               0000 0101
+4               0000 0100
+3               0000 0011
+2               0000 0010
+1               0000 0001
0                0000 0000
-0               1000 0000
-1               1000 0001
-2               1000 0010
-3               1000 0011
-4               1000 0100
-5               1000 0101

Ones’ Complement

Value for N-bit ones’ complement pattern is:

$V_{1C} = -b_{N-1} 2^{N-1} + \sum_{i=0}^{N-2} b_i \times 2^i + b_{N-1}$

Example $1010_{1C}$:

$V_{1C} = -b_3 \times 2^3 + b_2 \times 2^2 + b_1 \times 2^1 + b_0 \times 2^0 + b_3$
$= -1(8) + 0(4) + 1(2) + 0(1) + 1$
$= -8 + 2 + 1$
$= -5$
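
A direct transcription of the ones' complement formula into Java could look like this. It is a sketch under the assumption that the pattern sits in the low n bits of an `int`; `OnesComplement` and `decode` are illustrative names only.

```java
class OnesComplement {
    // Evaluates -b_{N-1} * 2^(N-1) + sum(b_i * 2^i) + b_{N-1} for an n-bit pattern.
    static int decode(int pattern, int n) {
        int sign = (pattern >> (n - 1)) & 1;        // b_{N-1}
        int low = pattern & ((1 << (n - 1)) - 1);   // weighted sum of bits N-2 .. 0
        return -sign * (1 << (n - 1)) + low + sign;
    }

    public static void main(String[] args) {
        System.out.println(decode(0b1010, 4)); // prints -5
    }
}
```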

Ones’ Complement

Signed Integer   Ones’ Complement
+127             0111 1111
+126             0111 1110
...              ...
+2               0000 0010
+1               0000 0001
0                0000 0000
-0               1111 1111
-1               1111 1110
-2               1111 1101
-3               1111 1100
...              ...
-126             1000 0001
-127             1000 0000

Two’s Complement

Value for N-bit two’s complement pattern is:

$V_{2C} = -b_{N-1} 2^{N-1} + \sum_{i=0}^{N-2} b_i \times 2^i$

Example $1010_{2C}$:

$V_{2C} = -b_3 \times 2^3 + b_2 \times 2^2 + b_1 \times 2^1 + b_0 \times 2^0$
$= -1(8) + 0(4) + 1(2) + 0(1)$
$= -8 + 2$
$= -6$
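
The same pattern 1010 can be evaluated in code with the two's complement weights. The sketch below uses hypothetical names (`TwosComplement`, `decode`) and assumes the pattern occupies the low n bits of an `int`.

```java
class TwosComplement {
    // Evaluates -b_{N-1} * 2^(N-1) + sum(b_i * 2^i) for an n-bit pattern.
    static int decode(int pattern, int n) {
        int sign = (pattern >> (n - 1)) & 1;        // b_{N-1}, weight -2^(N-1)
        int low = pattern & ((1 << (n - 1)) - 1);   // weighted sum of bits N-2 .. 0
        return -sign * (1 << (n - 1)) + low;
    }

    public static void main(String[] args) {
        System.out.println(decode(0b1010, 4));     // prints -6
        System.out.println(decode(0b10000000, 8)); // prints -128
    }
}
```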

Two’s Complement

Signed Integer   Two’s Complement


+127 0111 1111
+126 0111 1110
... ...
+2 0000 0010
+1 0000 0001
0 0000 0000
-1 1111 1111
-2 1111 1110
... ...
-126 1000 0010
-127 1000 0001
-128 1000 0000

Sign Extension

Sign extension converts a number to a larger format: copy the sign bit to fill the new “high order” bits.

Value   Format                           Bit pattern
+100    8-bit two’s-complement binary    0110 0100
+100    16-bit two’s-complement binary   0000 0000 0110 0100
-100    8-bit two’s-complement binary    1001 1100
-100    16-bit two’s-complement binary   1111 1111 1001 1100
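
The copy-the-sign-bit rule can be sketched in Java as follows. `SignExtension` and `extend8To16` are hypothetical names, and the 8-bit and 16-bit patterns are assumed to be held in the low bits of an `int`.

```java
class SignExtension {
    // Widens an 8-bit two's-complement pattern to 16 bits by copying bit 7 into bits 8..15.
    static int extend8To16(int pattern8) {
        int sign = (pattern8 >> 7) & 1;
        int high = (sign == 1) ? 0xFF00 : 0x0000;  // eight copies of the sign bit
        return high | (pattern8 & 0xFF);
    }

    public static void main(String[] args) {
        // -100 in 8 bits is 1001 1100
        System.out.println(Integer.toBinaryString(extend8To16(0b10011100))); // prints 1111111110011100
    }
}
```

Java applies the same rule itself when a `byte` is widened to a `short` or an `int`.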

Offset binary

Also known as biased-K representation.

Variation of two’s complement
Uses a value K as the bias: stored pattern = actual value + K
Applications:
    Exponent of floating-point number (biased-127 or biased-1023)
    Analog interfacing
    Excess-3 code (actual value = binary − 3)
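
A small sketch of biased-K encoding and decoding is given below. The names `OffsetBinary`, `encode` and `decode` are illustrative assumptions, and no range checking is done.

```java
class OffsetBinary {
    // Stored pattern = actual value + K.
    static int encode(int value, int k) {
        return value + k;
    }

    // Actual value = stored pattern - K.
    static int decode(int pattern, int k) {
        return pattern - k;
    }

    public static void main(String[] args) {
        System.out.println(encode(-8, 8));      // 4-bit, K = 8: prints 0  (pattern 0000)
        System.out.println(encode(7, 8));       // 4-bit, K = 8: prints 15 (pattern 1111)
        System.out.println(decode(0b0101, 3));  // excess-3 code 0101: prints 2
    }
}
```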

Comparing Number Systems

Decimal   Signed-Magnitude   Ones’ Complement   Two’s Complement   Offset Binary
7         0111               0111               0111               1111
6         0110               0110               0110               1110
5         0101               0101               0101               1101
4         0100               0100               0100               1100
3         0011               0011               0011               1011
2         0010               0010               0010               1010
1         0001               0001               0001               1001
0         0000               0000               0000               1000
-0        1000               1111               –                  –
-1        1001               1110               1111               0111
-2        1010               1101               1110               0110
-3        1011               1100               1101               0101
-4        1100               1011               1100               0100
-5        1101               1010               1011               0011
-6        1110               1001               1010               0010
-7        1111               1000               1001               0001
-8        –                  –                  1000               0000

Signed Systems Compared

           Unsigned   Signed-Magnitude   Ones’ Complement   Two’s Complement
Smallest   0          −(2^(n−1) − 1)     −(2^(n−1) − 1)     −2^(n−1)
Largest    2^n − 1    +(2^(n−1) − 1)     +(2^(n−1) − 1)     +(2^(n−1) − 1)
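
For a concrete word size, the limits in the table can be evaluated as in the sketch below. The class and method names are hypothetical, and n is assumed to be small enough (at most 62) for the results to fit in a Java `long`.

```java
class SignedRanges {
    static long unsignedMax(int n) { return (1L << n) - 1; }           // 2^n - 1
    // Signed-magnitude and ones' complement share the range -(2^(n-1) - 1) .. +(2^(n-1) - 1).
    static long symmetricLimit(int n) { return (1L << (n - 1)) - 1; }
    static long twosMin(int n) { return -(1L << (n - 1)); }            // -2^(n-1)
    static long twosMax(int n) { return (1L << (n - 1)) - 1; }         // 2^(n-1) - 1

    public static void main(String[] args) {
        int n = 8;
        System.out.println("unsigned: 0 .. " + unsignedMax(n));
        System.out.println("signed-magnitude, ones' complement: "
                + (-symmetricLimit(n)) + " .. " + symmetricLimit(n));
        System.out.println("two's complement: " + twosMin(n) + " .. " + twosMax(n));
    }
}
```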

Real Numbers

Number System    Format     Characteristics
Fixed-point      ±i.f       Low precision
Rational         ±p/q       Difficult to work with
Floating-point   ±m · b^e   Most common way to handle reals

Fixed-Point Numbers

The general expression for an N-bit fixed-point two’s complement number is:

$x = \frac{-b_{N-1} 2^{N-1} + \sum_{i=0}^{N-2} b_i \times 2^i}{2^f}$

where:
N = total number of bits
f = number of bits in the fraction ($0 \le f \le N-1$)

[Figure: word layout from bit N−1 down to bit 0 with sign bit S, integer field (int) and fraction field (frac); an imaginary binary point separates int and frac.]

Expression for Two’s Comp.

Value of N-bit two’s complement integer, f = 0:

$x = \frac{-b_{N-1} 2^{N-1} + \sum_{i=0}^{N-2} b_i \times 2^i}{2^0}$
$= -b_{N-1} 2^{N-1} + b_{N-2} 2^{N-2} + \cdots + b_1 2^1 + b_0 2^0$

[Figure: word layout from bit N−1 down to bit 0 with sign bit S and integer field (int); the imaginary binary point lies to the right of bit 0.]

Expression for Q-Format

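Value of N-bit fixed point, f = N − 1:

$x = \frac{-b_{N-1} 2^{N-1} + \sum_{i=0}^{N-2} b_i \times 2^i}{2^{N-1}}$
$= -b_0 + b_{-1} 2^{-1} + b_{-2} 2^{-2} + \cdots + b_{-(N-1)} 2^{-(N-1)}$

In the second line the bits are relabeled by their weights, so $b_0$ is the sign bit.

[Figure: word layout from bit N−1 down to bit 0 with sign bit S and fraction field (frac); the imaginary binary point lies between S and frac.]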

N = 8, f = 4

Weights     −2^3   2^2   2^1   2^0   ·   2^−1   2^−2   2^−3   2^−4
Bit value   0      1     0     1     ·   1      1      0      0

$0101.1100_2 = 2^2 + 2^0 + 2^{-1} + 2^{-2}$
$= 4 + 1 + 0.5 + 0.25$
$= 5.75_{10}$

OR

Weights     −2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0     (all ÷ 2^4)
Bit value   0      1     0     1     1     1     0     0

$x = (2^6 + 2^4 + 2^3 + 2^2) \div 2^4$
$= (64 + 16 + 8 + 4) \div 16$
$= 5.75_{10}$
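
The second method (read the word as a two's complement integer, then divide by 2^f) is easy to try in code. The following is a minimal sketch; `FixedPoint` and `decode` are hypothetical names, and the pattern is assumed to be in the low n bits of an `int`.

```java
class FixedPoint {
    // Interprets the low n bits as a two's-complement integer, then scales by 2^-f.
    static double decode(int pattern, int n, int f) {
        int sign = (pattern >> (n - 1)) & 1;
        int low = pattern & ((1 << (n - 1)) - 1);
        int integerValue = -sign * (1 << (n - 1)) + low;  // numerator of the expression
        return integerValue / (double) (1 << f);          // divide by 2^f
    }

    public static void main(String[] args) {
        System.out.println(decode(0b01011100, 8, 4)); // prints 5.75
    }
}
```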

N = 8, f = 7 → Q7 format

Weights     −2^0   2^−1   2^−2   2^−3   2^−4   2^−5   2^−6   2^−7
Bit value   0      1      1      1      1      1      1      1

$max = 2^{-1} + 2^{-2} + 2^{-3} + 2^{-4} + 2^{-5} + 2^{-6} + 2^{-7}$
$= (2^6 + 2^5 + 2^4 + 2^3 + 2^2 + 2^1 + 2^0) \div 2^7$
$= 127/128$
$= 0.9921875$

Weights     −2^0   2^−1   2^−2   2^−3   2^−4   2^−5   2^−6   2^−7
Bit value   1      0      0      0      0      0      0      0

$min = -2^0$
$= -1$

Multiplying Q15 Numbers

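[Figure: a 16-bit Q15 operand (bits 15 to 0, sign bit S) multiplied by a second Q15 operand gives a 32-bit Q30 product (bits 31 to 0) that carries two sign bits (S S). The product is rounded by adding a '1' at the marked position: + 1 00 0000 0000 0000.]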
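
One way to express this multiply-and-round step in Java is sketched below. It assumes that the rounding '1' is added at bit 14 of the Q30 product (the value 1 00 0000 0000 0000 in binary) and that bits 30 to 15 are then kept as the Q15 result. `Q15Multiply` and `multiply` are hypothetical names.

```java
class Q15Multiply {
    // Multiplies two Q15 numbers held in 16-bit shorts and returns a rounded Q15 result.
    static short multiply(short a, short b) {
        int product = a * b;              // 32-bit Q30 product with two sign bits
        product += 1 << 14;               // rounding: add a '1' at bit 14
        return (short) (product >> 15);   // drop the extra sign bit, keep 16 bits
    }

    public static void main(String[] args) {
        short half = 16384;                        // 0.5 in Q15
        System.out.println(multiply(half, half));  // prints 8192, which is 0.25 in Q15
    }
}
```

This sketch does not saturate: the single case −1 × −1 would need +1, which Q15 cannot represent, so the result wraps around to −1.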



What is Floating Point?

$5.25_{10} = 101.01_2 \times 2^0$
$= 10.101_2 \times 2^1$
$= 1.0101_2 \times 2^2$ ←
$= 0.10101_2 \times 2^3$

The binary point “floats” to a pre-defined position. This process is called normalization.
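
Java's standard library can show the normalized form of a value. The sketch below uses `Math.getExponent` and `Math.scalb`; the class `Normalize` and its two helper methods are hypothetical names, and the input is assumed to be a normal (nonzero, finite) double.

```java
class Normalize {
    // Exponent e such that x = m * 2^e with 1 <= |m| < 2.
    static int exponentOf(double x) {
        return Math.getExponent(x);
    }

    // Mantissa m in the form 1.xxxx, obtained by scaling x by 2^-e.
    static double mantissaOf(double x) {
        return Math.scalb(x, -Math.getExponent(x));
    }

    public static void main(String[] args) {
        double x = 5.25;
        // 1.3125 is 1.0101 in binary, so this prints 1.3125 x 2^2
        System.out.println(mantissaOf(x) + " x 2^" + exponentOf(x));
    }
}
```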

Floating Point Parts

[Figure: the number −5.4321 × 10^−2 with its parts labeled: sign of mantissa (−), mantissa (5.4321), radix (10), sign of exponent (−), exponent (2).]

$\pm X = m \times b^e$
where m = mantissa, b = number base and e = exponent.

IEEE Single-Precision Format

Bits    31   30..23      22..0
Field   S    8-bit exp   23-bit frac

$\pm X = (-1)^s \times 1.m \times 2^{e-127}$

Sign field: 0 for positive numbers ($(-1)^0 = +1$), 1 for negative numbers ($(-1)^1 = -1$).
Exponent field: Unsigned 8-bit, biased-127.
Mantissa field: Bits to the right of the binary point of the normalized binary number.

IEEE Single-Precision Format

1 1000 0001 110 0000 0000 0000 0000 0000

$X = (-1)^1 \times 1.11_2 \times 2^{129-127}$
$= -1 \times 1.75_{10} \times 2^2$
$= -7$
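
The field-by-field decoding of this example can be reproduced in Java and compared with the library call `Float.intBitsToFloat`. This is a sketch for normalized numbers only (it ignores the special exponent values 0 and 255); `IeeeSingleDecode` and `decode` are hypothetical names, and 0xC0E00000 is the bit pattern above written in hexadecimal.

```java
class IeeeSingleDecode {
    // Applies (-1)^s * 1.m * 2^(e-127) to a 32-bit pattern (normalized numbers only).
    static double decode(int bits) {
        int s = bits >>> 31;              // sign field, bit 31
        int e = (bits >>> 23) & 0xFF;     // exponent field, bits 30..23
        int m = bits & 0x7FFFFF;          // fraction field, bits 22..0
        double significand = 1.0 + m / (double) (1 << 23);  // 1.m
        double magnitude = significand * Math.pow(2, e - 127);
        return (s == 1) ? -magnitude : magnitude;
    }

    public static void main(String[] args) {
        int bits = 0xC0E00000;  // 1 1000 0001 110 0000 0000 0000 0000 0000
        System.out.println(decode(bits));                // prints -7.0
        System.out.println(Float.intBitsToFloat(bits));  // prints -7.0
    }
}
```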

IEEE Double-Precision Format

Double precision is more common.


Bits    63   62..52       51..0
Field   S    11-bit exp   52-bit frac

$\pm X = (-1)^s \times 1.m \times 2^{e-1023}$


public class TryFP {
    public static void main(String[] args) {
        // A literal such as 3. is double precision by default in Java.
        double d = 1 / 3.;
        // The f suffix forces single precision.
        float f = 1f / 3f;
        System.out.println("Value of d=" + d);
        System.out.println("Value of f=" + f);
    }
}

Output:
Value of d=0.3333333333333333
Value of f=0.33333334

FX vs FP

Fixed-Point Arithmetic                                             Floating-Point Arithmetic
Simple circuit                                                     Complex circuit (due to rounding and normalization)
Small area and faster                                              Large area and slower
Less accurate (the result is truncated if it exceeds the size)     More accurate (high precision)
Smaller range of values can be handled                             Wider range of values can be handled