2.4 Floating Points
Floating Point
Instructor
Dr. Neha Agrawal
Carnegie Mellon
Fractional Binary Numbers
■ Representation
Bit pattern: bi bi-1 ••• b2 b1 b0 . b-1 b-2 b-3 ••• b-j
Bits to the left of the “binary point” represent nonnegative powers of 2 (…, 4, 2, 1)
Bits to the right of the “binary point” represent fractional powers of 2 (1/2, 1/4, 1/8, …, 2^–j)
Represents the rational number: Σ (k = –j to i) bk × 2^k
■ Observations
Divide by 2 by shifting right
Multiply by 2 by shifting left
Numbers of form 0.111111…2 are just below 1.0
1/2 + 1/4 + 1/8 + … + 1/2^i + … ➙ 1.0
Use notation 1.0 – ε
Representable Numbers
■ Limitation
Can only exactly represent numbers of the form x/2^k
Other rational numbers have repeating bit representations
■ Value → Representation
1/3 → 0.0101010101[01]…₂
1/5 → 0.001100110011[0011]…₂
1/10 → 0.0001100110011[0011]…₂
Motivation
■ On February 25, 1991, during the Gulf War, an American Patriot Missile
battery in Dhahran, Saudi Arabia, failed to track and intercept an
incoming Iraqi Scud missile. The Scud struck an American Army
barracks, killing 28 soldiers and injuring around 100 other people.
A report of the General Accounting Office, GAO/IMTEC-92-26, entitled
Patriot Missile Defense: Software Problem Led to System Failure at
Dhahran, Saudi Arabia, reported on the cause of the failure. It turns out
that the cause was an inaccurate calculation of the time since boot, due
to computer arithmetic errors.
■ Specifically, the time in tenths of a second, as measured by the system's internal clock, was
multiplied by 1/10 to produce the time in seconds. This calculation was performed using a
24-bit fixed-point register. In particular, the value 1/10, which has a non-terminating binary
expansion, was chopped at 24 bits after the radix point. The Patriot battery had been up
around 100 hours, and an easy calculation shows that the resulting time error due to the
magnified chopping error was about 0.34 seconds. (The number 1/10 equals
1/2^4 + 1/2^5 + 1/2^8 + 1/2^9 + 1/2^12 + 1/2^13 + …. In other words, the binary expansion of 1/10 is
0.0001100110011001100110011001100…. The 24-bit register in the Patriot instead stored
0.00011001100110011001100, introducing an error of
0.0000000000000000000000011001100… binary, or about 0.000000095 decimal.
Multiplying by the number of tenths of a second in 100 hours gives
0.000000095 × 100 × 60 × 60 × 10 = 0.34.)
■ A Scud travels at about 1,676 meters per second, and so travels more than half a
kilometer in this time. This was far enough that the incoming Scud was outside the "range
gate" that the Patriot tracked. Ironically, the fact that the bad time calculation had been
improved in some parts of the code, but not all, contributed to the problem, since it meant
that the inaccuracies did not cancel, as discussed in the GAO report.
IEEE Floating-Point Format
S: 1 bit | Exponent: 8 bits (single), 11 bits (double) | Fraction/Mantissa: 23 bits (single), 52 bits (double)
x = (–1)^S × (1 + Fraction) × 2^(Exponent – Bias)
Floating-Point Example
■ Represent –0.75
–0.75 = (–1)^1 × 1.1₂ × 2^–1
S = 1
Fraction = 1000…00₂
Exponent = –1 + Bias
  Single: –1 + 127 = 126 = 01111110₂
  Double: –1 + 1023 = 1022 = 01111111110₂
■ Single: 1 01111110 1000…00
■ Double: 1 01111111110 1000…00
Reason for bias

Single Precision
Single-Precision Range
■ Exponents 00000000 and 11111111 reserved
■ Smallest value
  Exponent: 00000001 → actual exponent = 1 – 127 = –126
  Fraction: 000…00 → significand = 1.0
  ±1.0 × 2^–126 ≈ ±1.2 × 10^–38
■ Largest value
  Exponent: 11111110 → actual exponent = 254 – 127 = +127
  Fraction: 111…11 → significand ≈ 2.0
  ±2.0 × 2^+127 ≈ ±3.4 × 10^+38
Double Precision
Double-Precision Range
■ Exponents 0000…00 and 1111…11 reserved
■ Smallest value
  Exponent: 00000000001 → actual exponent = 1 – 1023 = –1022
  Fraction: 000…00 → significand = 1.0
  ±1.0 × 2^–1022 ≈ ±2.2 × 10^–308
■ Largest value
  Exponent: 11111111110 → actual exponent = 2046 – 1023 = +1023
  Fraction: 111…11 → significand ≈ 2.0
  ±2.0 × 2^+1023 ≈ ±1.8 × 10^+308
Floating-Point Example
■ What number is represented by the single-precision float
  1 10000001 01000…00 ?
  S = 1
  Fraction = 01000…00₂
  Exponent = 10000001₂ = 129
■ x = (–1)^1 × (1 + 0.01₂) × 2^(129 – 127)
    = (–1) × 1.25 × 2^2
    = –5.0
Single Precision Examples
■ Denormalized Numbers
Precisions
■ Extended precision: 80 bits (Intel only)
s: 1 bit | exp: 15 bits | frac: 63 or 64 bits
Floating-Point Operations: Basic Idea
■ x +f y = Round(x + y)
■ x ×f y = Round(x × y)
■ Basic idea
  First compute the exact result
  Then make it fit into the desired precision
    Possibly overflow if the exponent is too large
    Possibly round to fit into frac
Rounding
■ Rounding Modes (illustrate with $ rounding)
■ Examples
Round to nearest 1/4 (2 bits right of binary point)

Value   Binary     Rounded   Action         Rounded Value
2 3/32  10.00011₂  10.00₂    (<1/2, down)   2
2 3/16  10.00110₂  10.01₂    (>1/2, up)     2 1/4
2 7/8   10.11100₂  11.00₂    (=1/2, up)     3
2 5/8   10.10100₂  10.10₂    (=1/2, down)   2 1/2
Floating-Point Addition
Consider a 4-digit decimal example:
  9.999 × 10^1 + 1.610 × 10^–1
1. Align decimal points
   Shift the number with the smaller exponent:
   9.999 × 10^1 + 0.016 × 10^1
2. Add significands:
   9.999 × 10^1 + 0.016 × 10^1 = 10.015 × 10^1
3. Normalize the result and check for over/underflow:
   1.0015 × 10^2
4. Round and renormalize if necessary. Assume only four
   digits are allowed for the significand and two digits for the exponent:
   1.002 × 10^2
Floating Point in C
■ C Guarantees Two Levels
float → single precision
double → double precision
■ Conversions/Casting
Casting between int, float, and double changes bit representation
double/float → int
Truncates fractional part
Like rounding toward zero
Not defined when out of range or NaN: Generally sets to TMin
int → double
Exact conversion, as long as int has ≤ 53 bit word size
int → float
Will round according to rounding mode
Summary
■ IEEE Floating Point has clear mathematical properties
■ Represents numbers of the form M × 2^E
■ One can reason about operations independent of
implementation
As if computed with perfect precision and then rounded
■ Not the same as real arithmetic
Violates associativity/distributivity
Makes life difficult for compilers and serious numerical-applications programmers