CSC 252/452: Computer Organization

Fall 2024: Lecture 4

Instructor: Yanan Guo

Department of Computer Science


University of Rochester
Carnegie Mellon

Announcement
• Programming Assignment 1 is out
• Details:
https://www.cs.rochester.edu/courses/252/fall2024/labs/assignment1.html
• Due on Sep. 16th, 11:59 PM
• You have 3 slip days
• Office Hour Location Has Changed
• For office hours on Thursday and Friday
• Check course website

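Fractional Binary Numbers

• Bit string: $b_i\, b_{i-1} \cdots b_2\, b_1\, b_0 \,.\, b_{-1}\, b_{-2}\, b_{-3} \cdots b_{-j}$
• Weights left of the binary point: $2^i, 2^{i-1}, \ldots, 4, 2, 1$
• Weights right of the binary point: $1/2, 1/4, 1/8, \ldots, 2^{-j}$
• Value: $\sum_{k=-j}^{i} b_k \times 2^k$
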
Fixed-Point Representation
• Binary point stays fixed
• Fixed interval between representable numbers
• With the binary point after the last bit (b3b2b1b0.), 4 bits represent the integers 0 to 15 (0000. to 1111.), so the interval is 1
• With the binary point in the middle (b3b2.b1b0), the interval in this example is 0.25 (decimal) and the range is 0 to 3.75

Decimal   Binary
0         00.00
0.25      00.01
0.5       00.10
0.75      00.11
1         01.00
1.25      01.01
1.5       01.10
1.75      01.11
2         10.00
2.25      10.01
2.5       10.10
2.75      10.11
3         11.00
3.25      11.01
3.5       11.10
3.75      11.11

Limitations of Fixed-Point (#1)

• Can exactly represent numbers only of the form $x/2^k$
• Other rational numbers have repeating bit representations

Decimal Value   Binary Representation
1/3             0.0101010101[01]…
1/5             0.001100110011[0011]…
1/10            0.0001100110011[0011]…

[Figure: number line of the values representable with b3b2.b1b0: 0, 1/4, 1/2, 3/4, 1, 5/4, 3/2, 7/4, 2, …, 15/4]

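The repeating expansions in the table can be reproduced with a small sketch. The helper name `binary_fraction` is hypothetical (it is not from the lecture); it repeatedly doubles an exact fraction and peels off the integer part as the next bit.

```python
from fractions import Fraction


def binary_fraction(x: Fraction, bits: int) -> str:
    """Return the first `bits` binary digits of x, assuming 0 <= x < 1."""
    digits = []
    for _ in range(bits):
        x *= 2                  # shift the binary point one place right
        bit = int(x >= 1)       # the integer part is the next bit
        digits.append(str(bit))
        x -= bit                # keep only the remaining fraction
    return "0." + "".join(digits)


print(binary_fraction(Fraction(1, 3), 10))    # 0.0101010101
print(binary_fraction(Fraction(1, 5), 12))    # 0.001100110011
print(binary_fraction(Fraction(1, 10), 13))   # 0.0001100110011
print(binary_fraction(Fraction(3, 4), 4))     # 0.1100 (form x/2^k, terminates)
```

Only the last value, which has the form x/2^k, ends in zeros; the others never terminate, so any fixed number of bits stores an approximation.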
Limitations of Fixed-Point (#2)
• Can’t represent very small and very large numbers at the same time
  • To represent very large numbers, the (fixed) interval needs to be large, making it hard to represent small numbers
  • To represent very small numbers, the (fixed) interval needs to be small, making it hard to represent large numbers

[Figure: number line from 0 to +∞. A large interval reaches a large number but leaves the small numbers near 0 unrepresentable; a small interval reaches a small number but leaves the large numbers unrepresentable.]

Primer: Floating Point Representation
• In binary: $v = (-1)^s \times M \times 2^E$
  • s: sign
  • M: significand (its fraction bits follow the binary point)
  • 2: base
  • E: exponent
• Normalized form:
  • $1 \le M < 2$
  • $M = 1.b_0 b_1 b_2 b_3 \ldots$
• Encoding
  • MSB s is sign bit s
  • exp field encodes Exponent (but not exactly the same, more later)
  • frac field encodes Fraction (but not exactly the same, more later)

Field layout: s | exp | frac

6-bit Floating Point Example

$v = (-1)^s \times M \times 2^E$

Field layout: s (1 bit) | exp (3 bits) | frac (2 bits)

• exp has 3 bits, interpreted as an unsigned value
  • If exp were E, we could represent exponents from 0 to 7
  • How about negative exponents?
  • Subtract a bias term: E = exp - bias (i.e., exp = E + bias)
  • bias is always $2^{k-1} - 1$, where k is the number of exponent bits
• Example when we use 3 bits for exp (i.e., k = 3):
  • bias = 3
  • If E = -2, exp is 1 ($001_2$)
  • Reserve 000 and 111 for other purposes (more on this later)
  • We can now represent exponents from -2 (exp 001) to 3 (exp 110)

E     exp
-3    000
-2    001
-1    010
0     011
1     100
2     101
3     110
4     111

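The bias rule is easy to check numerically. The following is a minimal sketch; the function names `bias` and `encode_exponent` are illustrative choices, not part of the lecture.

```python
def bias(k: int) -> int:
    """Bias for a k-bit exponent field: 2^(k-1) - 1."""
    return 2 ** (k - 1) - 1


def encode_exponent(E: int, k: int) -> str:
    """Return the exp field (as a k-bit string) for a normalized exponent E."""
    exp = E + bias(k)
    # All-zeros and all-ones are reserved, so only 1 .. 2^k - 2 are valid here.
    if not 1 <= exp <= 2 ** k - 2:
        raise ValueError(f"E = {E} is out of range for a {k}-bit exp field")
    return format(exp, f"0{k}b")


print(bias(3), bias(8), bias(11))   # 3 127 1023
print(encode_exponent(-2, 3))       # 001
print(encode_exponent(3, 3))        # 110
```

The same formula gives 127 for an 8-bit exp field and 1023 for an 11-bit one, which are the biases of IEEE 754 single and double precision.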
6-bit Floating Point Example

$v = (-1)^s \times M \times 2^E$

Field layout: s (1 bit) | exp (3 bits) | frac (2 bits)

• frac has 2 bits, append them after “1.” to form M
  • frac = 10 implies M = 1.10
• Putting it Together: An Example: $-10.1_2 = (-1)^1 \times 1.01_2 \times 2^1$
  • s = 1
  • exp = E + bias = 1 + 3 = $100_2$
  • frac = 01
  • Encoding: 1 100 01

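As a cross-check of the worked example, here is a rough encoder for the 6-bit format. It is only a sketch under stated assumptions: it handles normalized values that fit exactly (no rounding, no subnormals or special values), and the name `encode_normalized` is hypothetical.

```python
from fractions import Fraction

BIAS = 3        # 2^(3-1) - 1 for the 3-bit exp field
FRAC_BITS = 2


def encode_normalized(value: Fraction) -> str:
    """Encode an exactly representable, normalized value as 's exp frac'."""
    s = 1 if value < 0 else 0
    m = abs(value)
    if m == 0:
        raise ValueError("zero is not a normalized value")
    E = 0
    while m >= 2:               # bring the significand down into [1, 2)
        m /= 2
        E += 1
    while m < 1:                # or up into [1, 2)
        m *= 2
        E -= 1
    frac = (m - 1) * 2 ** FRAC_BITS   # the bits after the leading "1."
    if frac.denominator != 1:
        raise ValueError("value needs rounding; not handled in this sketch")
    exp = E + BIAS
    if not 1 <= exp <= 6:
        raise ValueError("exponent out of the normalized range")
    return f"{s} {exp:03b} {int(frac):02b}"


print(encode_normalized(Fraction(-5, 2)))   # -10.1 in binary -> 1 100 01
print(encode_normalized(Fraction(14)))      # 0 110 11
```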
Representable Numbers (Positive Only)

$v = (-1)^s \times M \times 2^E$

Field layout: s (1 bit) | exp (3 bits) | frac (2 bits)

Positive values (s = 0) marked on the number line, by encoding:

exp   frac   Value
001   00     $1.00_2 \times 2^{-2} = 1/4$
010   00     $1.00_2 \times 2^{-1} = 1/2$
011   00     $1.00_2 \times 2^{0} = 1$
100   00     $1.00_2 \times 2^{1} = 2$
101   00     $1.00_2 \times 2^{2} = 4$
101   01     $1.01_2 \times 2^{2} = 5$
101   10     $1.10_2 \times 2^{2} = 6$
101   11     $1.11_2 \times 2^{2} = 7$
110   00     $1.00_2 \times 2^{3} = 8$
110   01     $1.01_2 \times 2^{3} = 10$
110   10     $1.10_2 \times 2^{3} = 12$
110   11     $1.11_2 \times 2^{3} = 14$

[Figure: number line from 0 to +∞ marking 1/4, 1/2, 1, 2, 4, 5, 6, 7, 8, 10, 12, 14]

• Uneven interval (cf. fixed interval in fixed-point)
  • Denser toward 0, sparser toward infinity
  • Allows encoding small and large numbers at the same time

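The uneven spacing can be seen by listing every positive normalized value of the 6-bit format. This is a minimal sketch assuming bias = 3 and the reserved exp patterns 000 and 111; the variable names are illustrative.

```python
BIAS = 3

# Every positive normalized value: M = 1.frac, E = exp - bias.
values = sorted(
    (1 + frac / 4) * 2.0 ** (exp - BIAS)
    for exp in range(1, 7)      # 001 .. 110; 000 and 111 are reserved
    for frac in range(4)        # 00 .. 11
)
gaps = [b - a for a, b in zip(values, values[1:])]

print(values[:5])               # [0.25, 0.3125, 0.375, 0.4375, 0.5]
print(values[-5:])              # [7.0, 8.0, 10.0, 12.0, 14.0]
print(min(gaps), max(gaps))     # 0.0625 2.0
```

The gap between neighbors doubles each time exp increases by one: 1/16 near 1/4, but 2 near 14.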
Representable Numbers (Positive Only)

[Figure: zoom on the number line between 0 and 1, marking 1/4, 5/16, 3/8, 7/16, 1/2, and 1; the gap between 0 and 1/4 is labeled "Unrepresented small numbers"]

• Always rounding to 0 is inelegant
• Using 000 for exp would only “delay” the problem rather than solving it (the smallest positive value would become 1/8, still leaving a gap above 0)

Subnormal (De-normalized) Numbers

• Idea: Evenly divide between 0 and 1/4 rather than exponentially decreasing when exp = 0 (subnormal/denormalized numbers)
  • E = (exp + 1) - bias (instead of exp - bias)
  • M = 0.frac (instead of 1.frac)
• Subnormal numbers allow graceful underflow
• Example: 0 000 01 = $(-1)^0 \times 0.01_2 \times 2^{(0+1-3)} = 1/16$

[Figure: number line from 0 to 1 marking 0, 1/16, 1/8, 3/16, 1/4, 5/16, 3/8, 7/16, 1/2, 1]

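The two decoding rules can be put side by side in a short sketch. The function name `decode` is hypothetical, and the sketch deliberately rejects exp = 111, which is reserved for special values.

```python
from fractions import Fraction

BIAS = 3


def decode(bits: str) -> Fraction:
    """Decode a 6-bit pattern 's eee ff' (written without spaces)."""
    s, exp, frac = int(bits[0]), int(bits[1:4], 2), int(bits[4:6], 2)
    if exp == 0b111:
        raise ValueError("exp = 111 is reserved for special values")
    if exp == 0:
        M = Fraction(frac, 4)        # subnormal: M = 0.frac
        E = (exp + 1) - BIAS         # subnormal: E = (exp + 1) - bias
    else:
        M = 1 + Fraction(frac, 4)    # normalized: M = 1.frac
        E = exp - BIAS               # normalized: E = exp - bias
    return (-1) ** s * M * Fraction(2) ** E


print(decode("000001"))                                     # 1/16
print([str(decode(format(i, "06b"))) for i in range(5)])    # ['0', '1/16', '1/8', '3/16', '1/4']
```

The step from the largest subnormal (3/16) to the smallest normalized value (1/4) is the same 1/16 as between subnormals, which is what makes the underflow graceful.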
Special Values

$v = (-1)^s \times M \times 2^E$

Field layout: s (1 bit) | exp (3 bits) | frac (2 bits)

E     exp       E     exp
-2    000       1     100
-2    001       2     101
-1    010       3     110
0     011       4     111

• There are many special values in scientific computing
  • +/- ∞, Not-a-Numbers (NaNs) (e.g., 0 / 0, ∞ / ∞, sqrt(–1), ∞ - ∞, ∞ × 0, etc.)
• exp = 111 is reserved to represent these numbers
• exp = 111, frac = 00
  • +/- ∞ (depending on the s bit). Overflow results.
  • Arithmetic on ∞ is exact: 1.0/0.0 = −1.0/−0.0 = +∞, 1.0/−0.0 = −∞
• exp = 111, frac != 00
  • Represent NaNs

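A short sketch of how these rules look in practice. The `classify` helper is a hypothetical name for the 6-bit format; the remaining lines use Python floats, which are IEEE 754 doubles on mainstream platforms. Note that Python raises `ZeroDivisionError` for `1.0 / 0.0` instead of returning ∞, so that case is not shown.

```python
import math


def classify(bits: str) -> str:
    """Classify a 6-bit pattern 's eee ff' (written without spaces)."""
    s, exp, frac = bits[0], bits[1:4], bits[4:6]
    if exp == "111":
        if frac == "00":
            return "-inf" if s == "1" else "+inf"
        return "NaN"
    return "zero/subnormal" if exp == "000" else "normalized"


print(classify("011100"), classify("111100"), classify("011101"))  # +inf -inf NaN

inf = math.inf
print(1e308 * 10)                        # inf (overflow result)
print(inf - inf, inf * 0.0, inf / inf)   # nan nan nan
print(math.nan == math.nan)              # False: a NaN compares unequal to everything
```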
Visualization: Floating Point Encodings

[Figure: the encodings laid out along the real number line, left to right: NaN, −∞, −Normalized, −Subnorm, −0, +0, +Subnorm, +Normalized, +∞, NaN]

• Infinite amount of real numbers, finite amount of floating point numbers
• Representable values are dense near 0 and sparse toward −∞ and +∞

Today: Floating Point


• Background: Fractional binary numbers and fixed-point
• Floating point representation
• IEEE 754 standard
• Rounding, addition, multiplication
• Floating point in C
• Summary

IEEE 754 Floating Point Standard

• Single precision: 32 bits
  s (1 bit) | exp (8 bits) | frac (23 bits)
• Double precision: 64 bits
  s (1 bit) | exp (11 bits) | frac (52 bits)

IEEE Floating Point


• IEEE Standard 754
• Established in 1985 as uniform standard for floating point arithmetic
• Before that, many idiosyncratic formats
• Supported by all major CPUs (and even GPUs and other processors)

• Driven by numerical concerns


• Nice standards for rounding, overflow, underflow
• Hard to make fast in hardware
• Numerical analysts predominated over hardware designers in
defining standard

Single Precision (32-bit) Example

$v = (-1)^s \times M \times 2^E$, bias = $2^{8-1} - 1 = 127$

Field layout: s (1 bit) | exp (8 bits) | frac (23 bits)

$15213_{10} = 11101101101101_2 = (-1)^0 \times 1.1101101101101_2 \times 2^{13}$

• s = 0
• exp = E + bias = 13 + 127 = $140_{10}$ = $10001100_2$
• frac = 11011011011010000000000 (the bits after “1.”, padded with zeros to 23 bits)

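The encoding of 15213 can be checked against a real single-precision value. This sketch uses Python's standard `struct` module to obtain the 32 bits; the helper name `float32_fields` is an illustrative choice.

```python
import struct


def float32_fields(x: float):
    """Split the single-precision encoding of x into (s, exp, frac)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))  # reinterpret as uint32
    s = bits >> 31
    exp = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    return s, exp, frac


s, exp, frac = float32_fields(15213.0)
print(s, format(exp, "08b"), format(frac, "023b"))
# 0 10001100 11011011011010000000000
print(exp - 127)   # 13, the exponent E
```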
Floating Point Computations


• The problem: Computing on floating point numbers might
produce a result that can’t be precisely represented
• Basic idea
• We perform the operation & produce the infinitely precise result
• Make it fit into desired precision
• Possibly overflow if exponent too large
• Possibly round to fit into frac

Rounding Modes (Decimal)

• Common ones:
  • Towards zero (chop)
  • Round down (-∞)
  • Round up (+∞)
  • Nearest Even: Round to nearest; if equally near, then to the one having an even least significant digit (bit)

Rounding Mode            1.40   1.60   1.50   2.50   -1.50
Towards zero             1      1      1      2      -1
Round down (-∞)          1      1      1      2      -2
Round up (+∞)            2      2      2      3      -1
Nearest even (default)   1      2      2      2      -2

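The table can be reproduced with Python's built-in functions, shown here as a small sketch: `math.trunc`, `math.floor`, and `math.ceil` correspond to the first three modes, and the built-in `round` rounds halfway cases to the even neighbor.

```python
import math

samples = [1.40, 1.60, 1.50, 2.50, -1.50]
modes = {
    "Towards zero": math.trunc,
    "Round down": math.floor,
    "Round up": math.ceil,
    "Nearest even": round,      # built-in round() sends ties to the even integer
}

for name, fn in modes.items():
    print(f"{name:13}", [fn(x) for x in samples])
```

Nearest even sends 1.50 up to 2 but 2.50 down to 2, so halfway cases are not biased in one direction.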
Rounding Modes (Binary Example)

• Nearest Even; if equally near, then to the one having an even least significant digit (bit)
• Assuming 3 bits for frac
• Neighboring representable values: 1.000 (even), 1.001 (odd), 1.010 (even)

Precise Value   Rounded Value   Notes
1.000011        1.000           1.000 is the nearest (down)
1.000110        1.001           1.001 is the nearest (up)
1.000100        1.000           1.000 is the nearest even (down)
1.001100        1.010           1.010 is the nearest even (up)

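The four rows can be checked with exact arithmetic. In this sketch the helper name `round_nearest_even` is hypothetical; it relies on the fact that Python's `round` applied to a `Fraction` rounds halfway cases to the even integer.

```python
from fractions import Fraction


def round_nearest_even(bits: str, frac_bits: int = 3) -> str:
    """Round a binary string such as '1.000110' to frac_bits fraction bits."""
    whole, frac = bits.split(".")
    value = Fraction(int(whole + frac, 2), 2 ** len(frac))   # exact value
    n = round(value * 2 ** frac_bits)   # ties go to the even integer
    mask = (1 << frac_bits) - 1
    return f"{n >> frac_bits:b}.{n & mask:0{frac_bits}b}"


for precise in ["1.000011", "1.000110", "1.000100", "1.001100"]:
    print(precise, "->", round_nearest_even(precise))
# 1.000011 -> 1.000
# 1.000110 -> 1.001
# 1.000100 -> 1.000
# 1.001100 -> 1.010
```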
Floating Point Addition
• $(-1)^{s_1} M_1 2^{E_1} + (-1)^{s_2} M_2 2^{E_2}$
  • Assume $E_1 > E_2$
• Exact Result: $(-1)^s M 2^E$
  • Sign s, significand M: result of signed align & add
  • Exponent E: $E_1$
• Fixing
  • If M ≥ 2, shift M right, increment E
  • If M < 1, shift M left k positions, decrement E by k
  • Overflow if E out of range
  • Round M to fit frac precision
• Example (3 bits for frac): $1.000_2 \times 2^{-1} + 1.101_2 \times 2^{-3}$
  • Align: $1.000_2 \times 2^{-1} + 0.01101_2 \times 2^{-1}$
  • Add: $1.01101_2 \times 2^{-1}$
  • Round: $1.011_2 \times 2^{-1}$

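The align, add, fix, and round steps can be sketched with exact fractions. This is a simplified illustration under stated assumptions: both operands are positive, overflow is not checked, and the names `normalize_and_round` and `fp_add` are hypothetical. For the operands $1.000_2 \times 2^{-1}$ and $1.101_2 \times 2^{-3}$, exact alignment shifts the second significand right by two places, giving $0.01101_2 \times 2^{-1}$.

```python
from fractions import Fraction


def normalize_and_round(value: Fraction, frac_bits: int):
    """Return (M, E) with 1 <= M < 2 and M rounded to frac_bits fraction bits."""
    E = 0
    while value >= 2:           # M >= 2: shift right, increment E
        value /= 2
        E += 1
    while value < 1:            # M < 1: shift left, decrement E
        value *= 2
        E -= 1
    scale = 2 ** frac_bits
    M = Fraction(round(value * scale), scale)   # round to nearest, ties to even
    if M >= 2:                  # rounding can carry out of the significand
        M /= 2
        E += 1
    return M, E


def fp_add(m1, e1, m2, e2, frac_bits=3):
    """Add two positive values m1 * 2^e1 and m2 * 2^e2."""
    exact = m1 * Fraction(2) ** e1 + m2 * Fraction(2) ** e2
    return normalize_and_round(exact, frac_bits)


# 1.000 x 2^-1 + 1.101 x 2^-3, where 1.101 in binary is 13/8
M, E = fp_add(Fraction(1), -1, Fraction(13, 8), -3)
print(M, E)   # 11/8 -1, i.e. 1.011 x 2^-1
```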
Mathematical Properties of FP Add
• Commutative? a+b = b+a: Yes
• Associative? (a+b)+c = a+(b+c): No
  • Overflow and inexactness of rounding
  • In single precision: (3.14+1e10)-1e10 = 0, 3.14+(1e10-1e10) = 3.14
  • A small operand can vanish during alignment. With 3 bits for frac:
    $1.000_2 \times 2^{5} + 1.101_2 \times 2^{-3}$
    $= 1.000_2 \times 2^{5} + 0.00000001101_2 \times 2^{5}$
    $= 1.00000001101_2 \times 2^{5}$
    which rounds to $1.000_2 \times 2^{5}$
• 0 is additive identity? Yes
• Every element has additive inverse (negation)? Almost
  • Except for infinities & NaNs
• Monotonicity: a ≥ b ⇒ a+c ≥ b+c? Almost
  • Except for infinities & NaNs

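The associativity counterexample depends on single precision, while Python floats are doubles. The sketch below therefore rounds every intermediate result to single precision with a hypothetical helper `f32`, built on the standard `struct` module.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


a, b = f32(3.14), f32(1e10)

left = f32(f32(a + b) - b)      # (3.14 + 1e10) - 1e10
right = f32(a + f32(b - b))     # 3.14 + (1e10 - 1e10)

print(left)    # 0.0: 3.14 is lost when added to 1e10
print(right)   # approximately 3.14 (3.14 rounded to single precision)
```

Near 1e10, neighboring single-precision values are 1024 apart, so adding 3.14 rounds straight back to 1e10.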
Floating Point Multiplication

• $(-1)^{s_1} M_1 2^{E_1} \times (-1)^{s_2} M_2 2^{E_2}$
• Exact Result: $(-1)^s M 2^E$
  • Sign s: s1 ^ s2
  • Significand M: $M_1 \times M_2$
  • Exponent E: $E_1 + E_2$
• Fixing
  • If M ≥ 2, shift M right, increment E
  • If E out of range, overflow
  • Round M to fit frac precision
• Implementation
  • Biggest chore is multiplying significands

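A minimal sketch of these steps with exact fractions follows. The function name `fp_mul` and the operands are assumptions chosen for illustration (they do not appear in the lecture), and overflow checking is omitted.

```python
from fractions import Fraction


def fp_mul(s1, m1, e1, s2, m2, e2, frac_bits=3):
    """Multiply (-1)^s1 * m1 * 2^e1 by (-1)^s2 * m2 * 2^e2; return (s, M, E)."""
    s = s1 ^ s2                 # sign: XOR of the sign bits
    M = m1 * m2                 # in [1, 4) for normalized inputs
    E = e1 + e2
    if M >= 2:                  # fixing: shift M right, increment E
        M /= 2
        E += 1
    scale = 2 ** frac_bits
    M = Fraction(round(M * scale), scale)   # round to nearest, ties to even
    if M >= 2:                  # rounding can carry out of the significand
        M /= 2
        E += 1
    return s, M, E


# (+1.101 x 2^1) * (-1.100 x 2^2): 3.25 * -6 = -19.5 exactly
print(fp_mul(0, Fraction(13, 8), 1, 1, Fraction(12, 8), 2))
# (1, Fraction(5, 4), 4), i.e. -1.010 x 2^4 = -20 after rounding
```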
Mathematical Properties of FP Mult
• Multiplication Commutative? Yes
• Multiplication is Associative? No
  • Possibility of overflow, inexactness of rounding
  • Ex (single precision): (1e20*1e20)*1e-20 = inf, 1e20*(1e20*1e-20) = 1e20
• 1 is multiplicative identity? Yes
• Multiplication distributes over addition? No
  • a*(b+c) = a*b+a*c?
  • Possibility of overflow, inexactness of rounding
  • 1e20*(1e20-1e20) = 0.0, 1e20*1e20 – 1e20*1e20 = NaN
• Monotonicity: a ≥ b & c ≥ 0 ⇒ a * c ≥ b * c?

48
Mathematical Properties of FP Mult
• Multiplication is commutative? Yes
• Multiplication is associative? No
  • Possibility of overflow, inexactness of rounding
  • Ex (single precision): (1e20*1e20)*1e-20 = inf, 1e20*(1e20*1e-20) = 1e20
• 1 is multiplicative identity? Yes
• Multiplication distributes over addition? No
  • a*(b+c) = a*b+a*c?
  • Possibility of overflow, inexactness of rounding
  • Ex (single precision): 1e20*(1e20-1e20) = 0.0, 1e20*1e20 - 1e20*1e20 = NaN
• Monotonicity: a ≥ b & c ≥ 0 ⇒ a*c ≥ b*c? Almost
  • Except for infinities & NaNs
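
Both counterexamples can be reproduced with a few lines of C. This is a sketch with made-up function names, assuming `float` is IEEE 754 single precision (where 1e40 is out of range); the `volatile` temporaries force every intermediate to be rounded to `float`. In `double` the same inputs would not overflow.

```c
/* Hypothetical helpers (not from the lecture). */

/* (a * b) * c */
float mul_left_first(float a, float b, float c) {
    volatile float ab = a * b;
    return ab * c;
}

/* a * (b * c) */
float mul_right_first(float a, float b, float c) {
    volatile float bc = b * c;
    return a * bc;
}

/* a * (b + c) */
float mul_then_sum_inside(float a, float b, float c) {
    volatile float sum = b + c;
    return a * sum;
}

/* a*b + a*c */
float mul_then_sum_outside(float a, float b, float c) {
    volatile float ab = a * b;
    volatile float ac = a * c;
    return ab + ac;
}
```

With a = b = 1e20f and c = 1e-20f, the left-first product overflows to infinity before the small factor can bring it back. With a = b = 1e20f and c = -1e20f, the distributed form computes inf + (-inf), which is NaN, while the factored form gives 0.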

Floating Point in C

Category                              C Data Type  Bits  Max Value                         Max Value (Decimal)
Fixed point (implicit binary point)   char         8     $2^{7} - 1$                       127
Fixed point (implicit binary point)   short        16    $2^{15} - 1$                      32767
Fixed point (implicit binary point)   int          32    $2^{31} - 1$                      2147483647
Fixed point (implicit binary point)   long         64    $2^{63} - 1$                      ~$9.2 \times 10^{18}$
SP floating point                     float        32    $(2 - 2^{-23}) \times 2^{127}$    ~$3.4 \times 10^{38}$
DP floating point                     double       64    $(2 - 2^{-52}) \times 2^{1023}$   ~$1.8 \times 10^{308}$

• To represent $2^{31}$ in fixed-point, you need at least 32 bits
  • Because fixed-point is a weighted positional representation
• In floating-point, we directly encode the exponent
  • Floating point is based on scientific notation
  • Encoding 31 only needs 6 bits in the exp field

Floating Point in C
• double/float → int
  • Truncates fractional part
  • Like rounding toward zero
  • Not defined when out of range or NaN
• int → float
  • Can't guarantee exact casting. Will round according to rounding mode
  • float layout: s (1 bit) | exp (8 bits) | frac (23 bits)
• int → double
  • Exact conversion, because a 32-bit int fits in the 53-bit significand
  • double layout: s (1 bit) | exp (11 bits) | frac (52 bits)
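
A short C sketch of these conversion rules follows. The function names are made up for this note, and it assumes a 32-bit `int`, IEEE 754 `float` and `double`, and the default round-to-nearest mode. The round-trip helpers must not be called with values close to `INT_MAX`, because the rounded `float` could then be out of range for `int`, which is undefined.

```c
/* Hypothetical helpers (not from the lecture). */

/* double -> int: drops the fractional part (rounds toward zero).
 * The caller must ensure x is in int range and not NaN. */
int truncate_to_int(double x) {
    return (int)x;
}

/* int -> float -> int: may change the value, because float has only
 * 24 significant bits. */
int round_trip_float(int x) {
    volatile float f = (float)x;
    return (int)f;
}

/* int -> double -> int: always returns x for a 32-bit int. */
int round_trip_double(int x) {
    volatile double d = (double)x;
    return (int)d;
}
```

For example, 16777217 is 2^24 + 1, which needs 25 significant bits, so the `float` round trip returns 16777216 while the `double` round trip returns the original value.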
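Floating Point Review

$v = (-1)^{s} \times M \times 2^{E}$

Fields: s | exp | frac (in this example: 1 sign bit, 3 exp bits, 2 frac bits, bias = 3)

• Denormalized (exp == 000)
  • E = (exp + 1) - bias
  • M = 0.frac
• Normalized (exp != 000 and exp != 111)
  • E = exp - bias
  • M = 1.frac

Category        s  exp  frac  Value (binary significand)  Value
Denormalized    0  000  00    $0.00 \times 2^{-2}$        0
Denormalized    0  000  11    $0.11 \times 2^{-2}$        3/16
Normalized      0  001  00    $1.00 \times 2^{-2}$        1/4
Normalized      0  001  11    $1.11 \times 2^{-2}$        7/16
Normalized      0  010  00    $1.00 \times 2^{-1}$        1/2
Normalized      0  010  11    $1.11 \times 2^{-1}$        7/8
Normalized      0  011  00    $1.00 \times 2^{0}$         1
Normalized      0  011  11    $1.11 \times 2^{0}$         1 3/4
Normalized      0  100  00    $1.00 \times 2^{1}$         2
Normalized      0  100  11    $1.11 \times 2^{1}$         3 1/2
Normalized      0  101  00    $1.00 \times 2^{2}$         4
Normalized      0  101  11    $1.11 \times 2^{2}$         7
Normalized      0  110  00    $1.00 \times 2^{3}$         8
Normalized      0  110  11    $1.11 \times 2^{3}$         14
Special Value   0  111  00    infinity                    infinity
Special Value   0  111  11    NaN                         NaN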
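
The decoding rules for this toy format can be written as a small C function. This is a sketch for illustration: `decode_tiny_fp` is a made-up name, and the layout (1 sign bit, 3 exp bits, 2 frac bits, bias 3) is inferred from the example values E = -2 for exp = 001 rather than stated as a standard format.

```c
#include <math.h>

/* Hypothetical decoder (not from the lecture) for a 6-bit pattern laid
 * out as s (1 bit) | exp (3 bits) | frac (2 bits), with bias 3. */
double decode_tiny_fp(unsigned bits) {
    unsigned s    = (bits >> 5) & 0x1;
    unsigned exp  = (bits >> 2) & 0x7;
    unsigned frac = bits & 0x3;
    const int bias = 3;
    double sign = s ? -1.0 : 1.0;

    if (exp == 0x7) {                 /* special values: exp is all ones */
        return frac == 0 ? sign * INFINITY : NAN;
    }

    double m;
    int e;
    if (exp == 0) {                   /* denormalized */
        m = frac / 4.0;               /* M = 0.frac */
        e = 1 - bias;                 /* E = (exp + 1) - bias */
    } else {                          /* normalized */
        m = 1.0 + frac / 4.0;         /* M = 1.frac */
        e = (int)exp - bias;          /* E = exp - bias */
    }
    return sign * ldexp(m, e);        /* (-1)^s x M x 2^E */
}
```

For example, the pattern 0 000 11 (0x03) decodes to 0.75 x 2^-2 = 3/16, and 0 011 00 (0x0C) decodes to 1.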

Floating Point Review
• Bit patterns representing non-negative numbers are ordered the same way as integers, so we could use regular integer comparison.
• You don't get this property if:
  • exp is interpreted as signed
  • exp and frac are swapped
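
The ordering property can be observed on real `float` values by comparing their raw bit patterns. The sketch below uses made-up function names and assumes `float` is a 32-bit IEEE 754 single-precision value; it is only meaningful for non-negative, non-NaN inputs.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers (not from the lecture). */

/* Copy the 32 bits of a float into an unsigned integer. */
uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* For non-negative, non-NaN floats, comparing the bit patterns as
 * unsigned integers gives the same answer as a < b. */
int less_by_bits(float a, float b) {
    return float_bits(a) < float_bits(b);
}
```

This works because exp sits in more significant bits than frac and is stored with a bias, so a larger exponent always produces a larger unsigned bit pattern.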

So far in 252…
• High-Level Language
  • C Program: int, float; if, else; +, -, >>
  • Compiler translates the C program into a semantically equivalent assembly program
• Instruction Set Architecture (ISA)
  • Assembly Program: ret, call; fadd, add; jmp, jne
  • Assembler translates the assembly program into semantically equivalent machine code
  • Machine Code: 00001111 01010101 11110000
• Microarchitecture
  • Processor: fixed-point adder (e.g., ripple carry), floating-point adder
• Circuit
  • Transistor: NAND gate, NOR gate
So far in 252…
• Layers, top to bottom: High-Level Language (C Program), Instruction Set Architecture (Assembly Program, Machine Code), Microarchitecture (Processor), Circuit (Transistor)
• ISA: Software programmers' view of a computer
  • Provides all the info needed by someone who wants to write assembly/machine code
  • "Contract" between assembly/machine code and processor
• Processors execute machine code (binary). An assembly program is merely a text representation of machine code
• Microarchitecture: Hardware implementation of the ISA (with the help of circuit technologies)
This Module (4-5 Lectures)
• Assembly Programming
  • Explain how various C constructs are implemented in assembly code
  • Effectively translating from C to assembly program manually
  • Helps us understand how compilers work
  • Helps us understand how assemblers work
• Microarchitecture is the topic of the next module
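
As a small preview of translating C to assembly by hand, here is a one-line C function with one plausible x86-64 translation in the comment. The function `add_longs` is a made-up example, not from the lecture, and the assembly shown is only roughly what a compiler such as gcc produces at a low optimization level on Linux; the exact output depends on the compiler, flags, and platform.

```c
/* Hypothetical example function (not from the lecture). */
long add_longs(long a, long b) {
    return a + b;
}

/*
 * One plausible x86-64 translation (AT&T syntax, System V calling
 * convention: first two integer arguments in %rdi and %rsi, return
 * value in %rax):
 *
 *     add_longs:
 *         leaq  (%rdi,%rsi), %rax   # rax = a + b
 *         ret                       # return to the caller
 *
 * The assembler then encodes each instruction as bytes: this leaq
 * becomes 48 8d 04 37 and ret becomes c3.
 */
```

The C source, the assembly text, and the five bytes of machine code are three representations of the same computation.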
