Floating Point Arithmetic

This document discusses 32-bit floating point arithmetic and the IEEE 754 floating point standard. It describes how floating point numbers are represented using a sign bit, exponent field, and fraction field, and how operations like addition, multiplication, and special values like infinity and NaN are handled. Normalized and denormalized number representations are also covered.

Uploaded by

shwetabhagat

32 BIT FLOATING POINT ARITHMETIC UNIT

SHWETA(2k21/VLS/18)
Outline
• Floating-Point Numbers
• IEEE 754 Floating-Point Standard
• Floating-Point Addition and Subtraction
• Floating-Point Multiplication
• Simulation Results
The World is Not Just Integers
• Programming languages support numbers with fractions
    ◦ Called floating-point numbers
    ◦ Examples:
        3.14159265… (π)
        2.71828… (e)
        0.000000001 or 1.0 × 10^{-9} (seconds in a nanosecond)
        86,400,000,000,000 or 8.64 × 10^{13} (nanoseconds in a day)
        The last number is an integer too large to fit in 32 bits
• We use scientific notation to represent
    ◦ Very small numbers (e.g. 1.0 × 10^{-9})
    ◦ Very large numbers (e.g. 8.64 × 10^{13})
    ◦ Scientific notation: ± d.f_1 f_2 f_3 f_4 … × 10^{± e_1 e_2 e_3}
Floating-Point Numbers
• Examples of floating-point numbers in base 10 …
    ◦ 5.341 × 10^{3}, 0.05341 × 10^{5}, -2.013 × 10^{-1}, -201.3 × 10^{-3} (the point is the decimal point)
• Examples of floating-point numbers in base 2 …
    ◦ 1.00101 × 2^{23}, 0.0100101 × 2^{25}, -1.101101 × 2^{-3}, 1101.101 × 2^{-6} (the point is the binary point)
    ◦ Exponents are kept in decimal for clarity
• Floating-point numbers should be normalized
    ◦ Exactly one non-zero digit should appear before the point
    ◦ In a decimal number, this digit can be from 1 to 9
    ◦ In a binary number, this digit should be 1
    ◦ Normalized FP Numbers: 5.341 × 10^{3} and -1.101101 × 2^{-3}
    ◦ NOT Normalized: 0.05341 × 10^{5} and -1101.101 × 2^{-6}
Floating-Point Representation
• A floating-point number is represented by the triple (S, E, F)
    ◦ S is the Sign bit (0 is positive and 1 is negative)
        ▪ Representation is called sign and magnitude
    ◦ E is the Exponent field (signed)
        ▪ Very large numbers have large positive exponents
        ▪ Very small close-to-zero numbers have negative exponents
        ▪ More bits in the exponent field increase the range of values
    ◦ F is the Fraction field (fraction after the binary point)
        ▪ More bits in the fraction field improve the precision of FP numbers

| S | Exponent | Fraction |

Value of a floating-point number = (-1)^S × val(F) × 2^{val(E)}

IEEE 754 Floating-Point Standard
• Found in virtually every computer designed since the 1980s (the standard was published in 1985)
    ◦ Simplified porting of floating-point numbers
    ◦ Unified the development of floating-point algorithms
    ◦ Increased the accuracy of floating-point numbers
• Single Precision Floating Point Numbers (32 bits)
    ◦ 1-bit sign + 8-bit exponent + 23-bit fraction
    | S (1 bit) | Exponent (8 bits) | Fraction (23 bits) |
• Double Precision Floating Point Numbers (64 bits)
    ◦ 1-bit sign + 11-bit exponent + 52-bit fraction
    | S (1 bit) | Exponent (11 bits) | Fraction (52 bits) |
(continued)
Normalized Floating Point Numbers
• For a normalized floating point number (S, E, F)

| S | E | F = f_1 f_2 f_3 f_4 … |

• Significand is equal to (1.F)_2 = (1.f_1 f_2 f_3 f_4 …)_2
    ◦ IEEE 754 assumes a hidden 1. (not stored) for normalized numbers
    ◦ Significand is 1 bit longer than the fraction
• Value of a Normalized Floating Point Number is
      (-1)^S × (1.F)_2 × 2^{val(E)}
    = (-1)^S × (1.f_1 f_2 f_3 f_4 …)_2 × 2^{val(E)}
    = (-1)^S × (1 + f_1 × 2^{-1} + f_2 × 2^{-2} + f_3 × 2^{-3} + f_4 × 2^{-4} + …) × 2^{val(E)}
    ◦ (-1)^S is 1 when S is 0 (positive), and -1 when S is 1 (negative)

Biased Exponent Representation
• How to represent a signed exponent? Choices are …
    ◦ Sign + magnitude representation for the exponent
    ◦ Two’s complement representation
    ◦ Biased representation
• IEEE 754 uses biased representation for the exponent
    ◦ Value of exponent = val(E) = E - Bias (Bias is a constant)
• Recall that exponent field is 8 bits for single precision
    ◦ E can be in the range 0 to 255
    ◦ E = 0 and E = 255 are reserved for special use (discussed later)
    ◦ E = 1 to 254 are used for normalized floating point numbers
    ◦ Bias = 127 (half of 254), val(E) = E - 127
    ◦ val(E=1) = -126, val(E=127) = 0, val(E=254) = 127
Biased Exponent – Cont’d
• For double precision, exponent field is 11 bits
    ◦ E can be in the range 0 to 2047
    ◦ E = 0 and E = 2047 are reserved for special use
    ◦ E = 1 to 2046 are used for normalized floating point numbers
    ◦ Bias = 1023 (half of 2046), val(E) = E - 1023
    ◦ val(E=1) = -1022, val(E=1023) = 0, val(E=2046) = 1023
• Value of a Normalized Floating Point Number is
      (-1)^S × (1.F)_2 × 2^{E - Bias}
    = (-1)^S × (1.f_1 f_2 f_3 f_4 …)_2 × 2^{E - Bias}
    = (-1)^S × (1 + f_1 × 2^{-1} + f_2 × 2^{-2} + f_3 × 2^{-3} + f_4 × 2^{-4} + …) × 2^{E - Bias}
Examples of Single Precision Float
• What is the decimal value of this Single Precision float?
    10111110001000000000000000000000
• Solution:
    ◦ Sign = 1 is negative
    ◦ Exponent = (01111100)_2 = 124, E - bias = 124 - 127 = -3
    ◦ Significand = (1.0100 … 0)_2 = 1 + 2^{-2} = 1.25 (1. is implicit)
    ◦ Value in decimal = -1.25 × 2^{-3} = -0.15625
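The decoding steps above can be sketched in a few lines of Python. This is only an illustration: the function name `decode_single` is made up for this example, and it handles normalized numbers only (E from 1 to 254).

```python
import struct

def decode_single(bits: int):
    """Split a 32-bit pattern into sign, biased exponent and fraction, and
    evaluate (-1)^S x (1.F)_2 x 2^(E - 127) for a normalized number."""
    sign = (bits >> 31) & 0x1
    exponent = (bits >> 23) & 0xFF       # 8-bit biased exponent E
    fraction = bits & 0x7FFFFF           # 23-bit fraction F
    if not 1 <= exponent <= 254:
        raise ValueError("not a normalized number")
    significand = 1 + fraction / 2**23   # restore the hidden 1.
    value = (-1) ** sign * significand * 2.0 ** (exponent - 127)
    return sign, exponent, fraction, value

bits = int("10111110001000000000000000000000", 2)
sign, exponent, fraction, value = decode_single(bits)
print(sign, exponent, exponent - 127, value)   # 1 124 -3 -0.15625

# Cross-check: let the machine interpret the same four bytes as a float.
machine_value = struct.unpack(">f", bits.to_bytes(4, "big"))[0]
print(machine_value == value)
```

The cross-check with `struct` relies on the host using IEEE 754 single precision for the `f` format, which is what the Python documentation specifies for standard-size packing.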
Examples of Double Precision Float
• What is the decimal value of this Double Precision float?
    01000000010100101010000000000000
    00000000000000000000000000000000
• Solution:
    ◦ Value of exponent = (10000000101)_2 - Bias = 1029 - 1023 = 6
    ◦ Value of double float = (1.00101010 … 0)_2 × 2^{6} (1. is implicit) = (1001010.10 … 0)_2 = 74.5
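As a quick check of this example, the following sketch extracts the three fields of the 64-bit pattern with shifts and masks and evaluates the normalized-number formula. The variable names are illustrative only.

```python
import struct

BIAS = 1023  # double-precision bias

pattern = ("01000000010100101010000000000000"
           "00000000000000000000000000000000")
bits = int(pattern, 2)

sign = bits >> 63                        # 1-bit sign
exponent = (bits >> 52) & 0x7FF          # 11-bit biased exponent E
fraction = bits & ((1 << 52) - 1)        # 52-bit fraction F

# (1.F)_2 written as the exact ratio (2^52 + F) / 2^52
value = (-1) ** sign * (2**52 + fraction) / 2**52 * 2.0 ** (exponent - BIAS)
print(exponent, exponent - BIAS, value)  # 1029 6 74.5

# Cross-check against the machine's interpretation of the same 8 bytes.
machine_value = struct.unpack(">d", bits.to_bytes(8, "big"))[0]
print(machine_value == value)
```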
Largest Normalized Float
• What is the Largest normalized float?
• Solution for Single Precision:
    01111111011111111111111111111111
    ◦ Exponent - bias = 254 - 127 = 127 (largest exponent for SP)
    ◦ Significand = (1.111 … 1)_2 = almost 2
    ◦ Value in decimal ≈ 2 × 2^{127} ≈ 2^{128} ≈ 3.4028 … × 10^{38}
• Solution for Double Precision:
    01111111111011111111111111111111
    11111111111111111111111111111111
    ◦ Value in decimal ≈ 2 × 2^{1023} ≈ 2^{1024} ≈ 1.79769 … × 10^{308}
• Overflow: exponent is too large to fit in the exponent field
Smallest Normalized Float
• What is the smallest (in absolute value) normalized float?
• Solution for Single Precision:
    00000000100000000000000000000000
    ◦ Exponent - bias = 1 - 127 = -126 (smallest exponent for SP)
    ◦ Significand = (1.000 … 0)_2 = 1
    ◦ Value in decimal = 1 × 2^{-126} = 1.17549 … × 10^{-38}
• Solution for Double Precision:
    00000000000100000000000000000000
    00000000000000000000000000000000
    ◦ Value in decimal = 1 × 2^{-1022} = 2.22507 … × 10^{-308}
• Underflow: exponent is too small to fit in exponent field
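The range limits can be checked numerically. The sketch below rebuilds the largest and smallest normalized single-precision values from their bit patterns and compares them with closed forms; the helper name `single_from_bits` is hypothetical. The exact largest value is (2 - 2^{-23}) × 2^{127}, which the slides round to 2 × 2^{127}. The double-precision check assumes that Python's `float` is an IEEE 754 double, which holds on common platforms.

```python
import struct
import sys

def single_from_bits(pattern: str) -> float:
    """Interpret a 32-character bit string as an IEEE 754 single."""
    return struct.unpack(">f", int(pattern, 2).to_bytes(4, "big"))[0]

largest = single_from_bits("01111111011111111111111111111111")
smallest = single_from_bits("00000000100000000000000000000000")

# Single precision: significand 1.11...1 is 2 - 2^-23, exponent 127 / -126
print(largest == (2 - 2.0**-23) * 2.0**127)
print(smallest == 2.0**-126)

# Double precision (assumes Python floats are IEEE 754 doubles)
print(sys.float_info.max == (2 - 2.0**-52) * 2.0**1023)
print(sys.float_info.min == 2.0**-1022)
```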
Zero, Infinity, and NaN
• Zero
    ◦ Exponent field E = 0 and fraction F = 0
    ◦ +0 and -0 are possible according to sign bit S
• Infinity
    ◦ Infinity is a special value represented with maximum E and F = 0
    ◦ For single precision with 8-bit exponent: maximum E = 255
    ◦ For double precision with 11-bit exponent: maximum E = 2047
    ◦ Infinity can result from overflow or division by zero
    ◦ +∞ and -∞ are possible according to sign bit S
• NaN (Not a Number)
    ◦ NaN is a special value represented with maximum E and F ≠ 0
    ◦ Results from exceptional situations, such as 0/0 or sqrt(negative)
    ◦ An arithmetic operation on a NaN results in NaN: Op(X, NaN) = NaN
Denormalized Numbers
• IEEE standard uses denormalized numbers to …
    ◦ Fill the gap between 0 and the smallest normalized float
    ◦ Provide gradual underflow to zero
• Denormalized: exponent field E is 0 and fraction F ≠ 0
    ◦ Implicit 1. before the fraction now becomes 0. (not normalized)
• Value of denormalized number (S, 0, F)
    Single precision: (-1)^S × (0.F)_2 × 2^{-126}
    Double precision: (-1)^S × (0.F)_2 × 2^{-1022}
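To make the gap-filling concrete, the sketch below evaluates the single-precision denormalized formula and compares it with the values obtained from the corresponding bit patterns. The helper names are made up for this example. It also shows that the step from the largest denormalized number to the smallest normalized number equals the spacing between denormalized numbers, which is the sense in which underflow is gradual.

```python
import struct

def single_from_bits(bits: int) -> float:
    """Interpret a 32-bit integer pattern as an IEEE 754 single."""
    return struct.unpack(">f", bits.to_bytes(4, "big"))[0]

def denormal_value(sign: int, fraction: int) -> float:
    """(-1)^S x (0.F)_2 x 2^-126 for a 23-bit fraction F with E = 0."""
    return (-1) ** sign * (fraction / 2**23) * 2.0**-126

smallest_denormal = single_from_bits(0x00000001)   # E = 0, F = 00...01
largest_denormal = single_from_bits(0x007FFFFF)    # E = 0, F = 11...11
smallest_normal = single_from_bits(0x00800000)     # E = 1, F = 0

print(smallest_denormal == denormal_value(0, 1) == 2.0**-149)
print(largest_denormal == denormal_value(0, 0x7FFFFF))
# The gap below the smallest normalized number is one denormal step wide.
print(smallest_normal - largest_denormal == smallest_denormal)
```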
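[Figure: single-precision number line, left to right: -∞ | negative overflow | -2^{128} | normalized (-ve) | -2^{-126} | denorm | 0 | denorm | 2^{-126} | normalized (+ve) | 2^{128} | positive overflow | +∞. Negative underflow and positive underflow are the regions next to 0.]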

Special Value Rules
Operation            Result
n / ±∞               ±0
±∞ × ±∞              ±∞
nonzero / 0          ±∞
∞ + ∞                ∞ (similar for -∞)
±0 / ±0              NaN
∞ - ∞                NaN (similar for -∞)
±∞ / ±∞              NaN
±∞ × ±0              NaN
NaN op anything      NaN
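Most of these rules can be observed directly with Python floats, as in the sketch below. One caveat: Python raises `ZeroDivisionError` for float division by zero instead of returning ±∞ or NaN, so the two divide-by-zero rows are left out. That is a language choice, not part of the encoding rules above.

```python
import math

inf = math.inf
nan = math.nan

results = {
    "n / inf":   1.0 / inf,    # 0
    "n / -inf":  1.0 / -inf,   # -0
    "inf * inf": inf * inf,    # inf
    "inf + inf": inf + inf,    # inf
    "inf - inf": inf - inf,    # NaN
    "inf / inf": inf / inf,    # NaN
    "inf * 0":   inf * 0.0,    # NaN
    "NaN + 1":   nan + 1.0,    # NaN
}

for rule, value in results.items():
    print(f"{rule:10s} -> {value}")
```

NaN never compares equal to anything, including itself, so `math.isnan` is the reliable way to test for it.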
Summary of IEEE 754 Encoding
Single-Precision       Exponent = 8    Fraction = 23    Value
Normalized Number      1 to 254        Anything         ± (1.F)_2 × 2^{E - 127}
Denormalized Number    0               nonzero          ± (0.F)_2 × 2^{-126}
Zero                   0               0                ± 0
Infinity               255             0                ± ∞
NaN                    255             nonzero          NaN

Double-Precision       Exponent = 11   Fraction = 52    Value
Normalized Number      1 to 2046       Anything         ± (1.F)_2 × 2^{E - 1023}
Denormalized Number    0               nonzero          ± (0.F)_2 × 2^{-1022}
Zero                   0               0                ± 0
Infinity               2047            0                ± ∞
NaN                    2047            nonzero          NaN
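The single-precision table maps directly onto a small classifier that looks only at the exponent and fraction fields. The function name `classify_single` is illustrative.

```python
def classify_single(bits: int) -> str:
    """Classify a 32-bit pattern according to the single-precision table."""
    exponent = (bits >> 23) & 0xFF   # 8-bit exponent field
    fraction = bits & 0x7FFFFF       # 23-bit fraction field
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 255:
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"              # exponent field 1 to 254

for pattern in (0x00000000, 0x80000000, 0x00000001,
                0x3F800000, 0x7F800000, 0x7FC00000):
    print(f"{pattern:#010x} -> {classify_single(pattern)}")
```

The sign bit plays no part in the classification, which is why both +0 and -0 come out as "zero".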
Floating Point Addition Example
• Consider adding: (1.111)_2 × 2^{-1} + (1.011)_2 × 2^{-3}
    ◦ For simplicity, we assume 4 bits of precision (or 3 bits of fraction)
• Cannot add significands … Why?
    ◦ Because exponents are not equal
• How to make exponents equal?
    ◦ Shift the significand of the lesser exponent right until its exponent matches the larger number
• (1.011)_2 × 2^{-3} = (0.1011)_2 × 2^{-2} = (0.01011)_2 × 2^{-1}
    ◦ Difference between the two exponents = -1 - (-3) = 2
    ◦ So, shift right by 2 bits
• Now, add the significands:
          1.111
      +   0.01011
      -----------
         10.00111   (carry)
Addition Example – cont’d
• So, (1.111)_2 × 2^{-1} + (1.011)_2 × 2^{-3} = (10.00111)_2 × 2^{-1}
• However, result (10.00111)_2 × 2^{-1} is NOT normalized
• Normalize result: (10.00111)_2 × 2^{-1} = (1.000111)_2 × 2^{0}
    ◦ In this example, we have a carry
    ◦ So, shift right by 1 bit and increment the exponent
• Round the significand to fit in appropriate number of bits
    ◦ We assumed 4 bits of precision or 3 bits of fraction
• Round to nearest: (1.000111)_2 ≈ (1.001)_2
          1.000 111
      +       1
      ---------
          1.001
    ◦ Renormalize if rounding generates a carry
• Detect overflow / underflow
    ◦ If exponent becomes too large (overflow) or too small (underflow)
Floating Point Subtraction Example
• Consider: (1.000)_2 × 2^{-3} - (1.000)_2 × 2^{2}
    ◦ We assume again: 4 bits of precision (or 3 bits of fraction)
• Shift significand of the lesser exponent right
    ◦ Difference between the two exponents = 2 - (-3) = 5
    ◦ Shift right by 5 bits: (1.000)_2 × 2^{-3} = (0.00001000)_2 × 2^{2}
• Convert subtraction into addition to 2's complement

                          Sign
      + 0.00001 × 2^{2}     0   0.00001 × 2^{2}
      - 1.00000 × 2^{2}     1   1.00000 × 2^{2}    (2's complement)
                          ---------------------
                            1   1.00001 × 2^{2}

• Since result is negative, convert result from 2's complement to sign-magnitude:
      1 1.00001 × 2^{2}  →  - 0.11111 × 2^{2}
Subtraction Example – cont’d
• So, (1.000)_2 × 2^{-3} - (1.000)_2 × 2^{2} = -(0.11111)_2 × 2^{2}
• Normalize result: -(0.11111)_2 × 2^{2} = -(1.1111)_2 × 2^{1}
    ◦ For subtraction, we can have leading zeros
    ◦ Count number z of leading zeros (in this case z = 1)
    ◦ Shift left and decrement exponent by z
• Round the significand to fit in appropriate number of bits
    ◦ We assumed 4 bits of precision or 3 bits of fraction
• Round to nearest: (1.1111)_2 ≈ (10.000)_2
          1.111 1
      +       1
      ---------
         10.000
• Renormalize: rounding generated a carry
      -(1.1111)_2 × 2^{1} ≈ -(10.000)_2 × 2^{1} = -(1.000)_2 × 2^{2}
    ◦ Result would have been accurate if more fraction bits were used
Floating Point Addition / Subtraction
Start
1. Compare the exponents of the two numbers. Shift the smaller number to the right until its exponent would match the larger exponent.
   Note: shift significand right by d = | E_X - E_Y |
2. Add / Subtract the significands according to the sign bits.
   Note: add significands when signs of X and Y are identical, subtract when different. X - Y becomes X + (-Y).
3. Normalize the sum, either shifting right and incrementing the exponent or shifting left and decrementing the exponent.
   Note: normalization shifts right by 1 if there is a carry, or shifts left by the number of leading zeros in the case of subtraction.
4. Round the significand to the appropriate number of bits, and renormalize if rounding generates a carry.
   Note: rounding either truncates fraction, or adds a 1 to least significant fraction bit.
Overflow or underflow? If yes: Exception. If no: Done.
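The four steps can be tried out on the slide examples with a toy model. This is a sketch under stated assumptions, not the hardware unit from this presentation: numbers are tuples `(sign, significand, exponent)` with an unbiased, unbounded exponent (so overflow and underflow are not detected), alignment keeps all shifted-out bits by widening instead of dropping them, and ties are rounded to even. The name `fp_add` and the tuple layout are made up for this example.

```python
def fp_add(x, y, frac_bits=3):
    """Add two toy floats (sign, significand, exponent).

    The significand is an integer with frac_bits fraction bits, so the value
    is (-1)^sign * significand / 2^frac_bits * 2^exponent.
    """
    (sx, mx, ex), (sy, my, ey) = x, y

    # Step 1: align at the larger exponent. Widening the larger operand by
    # d bits is equivalent to shifting the smaller one right by d bits
    # without losing any of its low-order bits.
    d = abs(ex - ey)
    exp = max(ex, ey)
    if ex >= ey:
        mx <<= d
    else:
        my <<= d
    width = frac_bits + d                 # fraction bits now carried

    # Step 2: add or subtract according to the sign bits.
    total = (-mx if sx else mx) + (-my if sy else my)
    sign = 1 if total < 0 else 0
    mag = abs(total)
    if mag == 0:
        return (0, 0, 0)

    # Step 3: normalize. Positive shift means a carry (shift right),
    # negative shift means leading zeros (shift left).
    exp += mag.bit_length() - 1 - width

    # Step 4: round to nearest (ties to even) to frac_bits fraction bits.
    drop = mag.bit_length() - 1 - frac_bits
    if drop > 0:
        half = 1 << (drop - 1)
        rem = mag & ((1 << drop) - 1)
        mag >>= drop
        if rem > half or (rem == half and mag & 1):
            mag += 1
            if mag.bit_length() - 1 > frac_bits:   # rounding carried out
                mag >>= 1
                exp += 1
    else:
        mag <<= -drop
    return (sign, mag, exp)

# (1.111)_2 x 2^-1 + (1.011)_2 x 2^-3  ->  (1.001)_2 x 2^0
print(fp_add((0, 0b1111, -1), (0, 0b1011, -3)))
# (1.000)_2 x 2^-3 - (1.000)_2 x 2^2   ->  -(1.000)_2 x 2^2
print(fp_add((0, 0b1000, -3), (1, 0b1000, 2)))
```

Subtraction is expressed as X + (-Y) by flipping the sign of the second operand, as in step 2.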
Simulation
Floating Point Multiplication Example
• Consider multiplying: (1.010)_2 × 2^{-1} by -(1.110)_2 × 2^{-2}
    ◦ As before, we assume 4 bits of precision (or 3 bits of fraction)
• Unlike addition, we add the exponents of the operands
    ◦ Result exponent value = (-1) + (-2) = -3
• Using the biased representation: E_Z = E_X + E_Y - Bias
    ◦ E_X = (-1) + 127 = 126 (Bias = 127 for SP)
    ◦ E_Y = (-2) + 127 = 125
    ◦ E_Z = 126 + 125 - 127 = 124 (value = -3)
• Now, multiply the significands:
    (1.010)_2 × (1.110)_2 = (10.001100)_2
    3-bit fraction × 3-bit fraction = 6-bit fraction

            1.010
        ×   1.110
        ---------
             0000
            1010
           1010
          1010
        ---------
         10001100
Multiplication Example – cont’d
• Since sign S_X ≠ S_Y, sign of product S_Z = 1 (negative)
• So, (1.010)_2 × 2^{-1} × -(1.110)_2 × 2^{-2} = -(10.001100)_2 × 2^{-3}
• However, result: -(10.001100)_2 × 2^{-3} is NOT normalized
• Normalize: (10.001100)_2 × 2^{-3} = (1.0001100)_2 × 2^{-2}
    ◦ Shift right by 1 bit and increment the exponent
    ◦ At most 1 bit can be shifted right … Why?
• Round the significand to nearest:
    (1.0001100)_2 ≈ (1.001)_2 (3-bit fraction)
          1.000 1100
      +       1
      ----------
          1.001
    Result ≈ -(1.001)_2 × 2^{-2} (normalized)
• Detect overflow / underflow
    ◦ No overflow / underflow because exponent is within range
Floating Point Multiplication
Start
1. Add the biased exponents of the two numbers, subtracting the bias from the sum to get the new biased exponent.
   Note: biased exponent addition, E_Z = E_X + E_Y - Bias
2. Multiply the significands. Set the result sign to positive if operands have same sign, and negative otherwise.
   Note: result sign S_Z = S_X xor S_Y can be computed independently.
3. Normalize the product if necessary, shifting its significand right and incrementing the exponent.
   Note: since the operand significands 1.F_X and 1.F_Y are ≥ 1 and < 2, their product is ≥ 1 and < 4. To normalize the product, we need to shift right by 1 bit only and increment the exponent.
4. Round the significand to the appropriate number of bits, and renormalize if rounding generates a carry.
   Note: rounding either truncates fraction, or adds a 1 to least significant fraction bit.
Overflow or underflow? If yes: Exception. If no: Done.
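The multiplication steps can be sketched the same way. Again this is a toy model, not the unit built in this presentation: operands are tuples `(sign, significand, biased_exponent)` with normalized significands, ties are rounded to even, and an out-of-range exponent raises an exception. The name `fp_mul` and the tuple layout are assumptions made for this example.

```python
def fp_mul(x, y, frac_bits=3, bias=127):
    """Multiply two normalized toy floats (sign, significand, biased exponent).

    The significand is an integer with frac_bits fraction bits and a leading 1.
    """
    (sx, mx, ex), (sy, my, ey) = x, y

    # Step 1: add the biased exponents and subtract the bias once.
    exp = ex + ey - bias

    # Step 2: multiply the significands; the sign is computed independently.
    sign = sx ^ sy
    prod = mx * my                    # 2 * frac_bits fraction bits, in [1, 4)
    width = 2 * frac_bits

    # Step 3: normalize. A product >= 2 needs one right shift, expressed
    # here by treating one more low-order bit as fraction.
    if prod >> (width + 1):
        exp += 1
        width += 1

    # Step 4: round to nearest (ties to even) to frac_bits fraction bits.
    drop = width - frac_bits
    half = 1 << (drop - 1)
    rem = prod & ((1 << drop) - 1)
    prod >>= drop
    if rem > half or (rem == half and prod & 1):
        prod += 1
        if prod >> (frac_bits + 1):   # rounding carried out: renormalize
            prod >>= 1
            exp += 1

    # Exponent fields 0 and 2*bias + 1 are reserved, so valid is 1 .. 2*bias.
    if not 1 <= exp <= 2 * bias:
        raise ArithmeticError("exponent overflow or underflow")
    return (sign, prod, exp)

# (1.010)_2 x 2^-1 times -(1.110)_2 x 2^-2  ->  -(1.001)_2 x 2^-2
sign, significand, exponent = fp_mul((0, 0b1010, 126), (1, 0b1110, 125))
print(sign, bin(significand), exponent, exponent - 127)
```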
Simulation
Advantages of IEEE 754 Standard
• Used predominantly by the industry
• Encoding of exponent and fraction simplifies comparison
    ◦ Integer comparator used to compare magnitude of FP numbers
• Includes special exceptional values: NaN and ±∞
    ◦ Special rules are used such as:
        ▪ 0/0 is NaN, sqrt(-1) is NaN, 1/0 is ∞, and 1/∞ is 0
    ◦ Computation may continue in the face of exceptional conditions
• Denormalized numbers to fill the gap
    ◦ Between smallest normalized number 1.0 × 2^{E_min} and zero
    ◦ Denormalized numbers, values 0.F × 2^{E_min}, are closer to zero
    ◦ Gradual underflow to zero
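The integer-comparator point can be illustrated as follows: with the sign bit masked off, the remaining 31 bits of a single-precision pattern, read as an unsigned integer, increase together with the magnitude of the value. The helper names are hypothetical, and NaN patterns are left out because NaNs are unordered.

```python
import struct

def bits_of(x: float) -> int:
    """32-bit single-precision pattern of x as an unsigned integer."""
    return int.from_bytes(struct.pack(">f", x), "big")

def magnitude_key(x: float) -> int:
    """Drop the sign bit; the remaining 31 bits order like |x|."""
    return bits_of(x) & 0x7FFFFFFF

# Zero, a denormal, the smallest normal, ordinary values, and infinity,
# listed in increasing order of magnitude.
values = [0.0, 2.0**-149, 2.0**-126, 0.15625, 1.0, 1.5, 74.5, 3.0e38,
          float("inf")]
keys = [magnitude_key(v) for v in values]

print(keys == sorted(keys))                        # integer order matches
print(magnitude_key(-1.5) == magnitude_key(1.5))   # sign does not matter
```

This works because the biased exponent sits above the fraction in the word, so a larger exponent always gives a larger integer, and equal exponents are ordered by fraction.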

Floating Point Complexities
• Operations are somewhat more complicated
• In addition to overflow we can have underflow
• Accuracy can be a big problem
• Implementing the standard can be tricky
• Not using the standard can be even worse
