Chapter 03 Arith 3 Float

This document provides an overview of floating-point arithmetic, focusing on the IEEE 754 standard for single and double precision, special numbers, and floating-point operations such as addition and multiplication. It details RISC-V floating-point instructions, fixed-point versus floating-point representations, and the structure of floating-point numbers. Additionally, it discusses the importance of rounding modes and the internal format used in arithmetic operations.

Computer Architecture
CH3 Computer Arithmetic (III)
Floating Point

Prof. Ren-Shuo Liu
NTHU EE

Outline
• Overview
• IEEE 754 standard
  • Single-precision
  • Double-precision
  • Special numbers
• Floating-point operations
  • Addition
  • Multiplication
  • Rounding

Outline
• Overview
  • RISC-V floating-point instructions
  • Fixed-point and floating-point representations
• IEEE 754 standard
• Floating-point operations

RISC-V Floating-Point Instructions
• Arithmetic
  • fadd.s, fsub.s, fmul.s, fdiv.s   # s means single-precision
  • fadd.d, fsub.d, fmul.d, fdiv.d   # d means double-precision

• Comparisons
  • feq.s, feq.d   # equal
  • flt.s, flt.d   # less than
  • fle.s, fle.d   # less than or equal

• Load / store
  • flw, fsw   # single-precision
  • fld, fsd   # double-precision

Floating Point Unit and Register Files
• Separate 32 registers for floating-point
  • Register pairs (e.g., $F0 and $F1) for double precision
  • $F0 is not always zero

• Floating-point instructions can be optional
  • Many embedded systems do not utilize them

[Figure: processor with an integer register file ($0 to $31) feeding the Add and Mult/Div units, and a floating-point register file ($F0 to $F31) feeding the fadd/fmult/fdiv units]

Fixed-Point
• Integers scaled by an implicit factor
• The scaling factor for each variable does not change (i.e., it is fixed) during the entire computation
• Examples
  • 3.14 is represented as
    • 314 (scaling factor = 1/100)
    • 3140 (scaling factor = 1/1000)
  • 5,000,000 is represented as
    • 5 (scaling factor = 1,000,000)
    • 50 (scaling factor = 100,000)
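A minimal sketch of fixed-point arithmetic with the 1/100 scaling factor from the 3.14 example. The names `SCALE`, `to_fixed`, `fixed_add`, and `fixed_mul` are hypothetical helpers chosen for illustration, not part of the slides.

```python
from decimal import Decimal

SCALE = 100  # assumed scaling factor of 1/100: stored integer = real value * 100


def to_fixed(text: str) -> int:
    """Convert a decimal string such as '3.14' to its scaled integer."""
    return int(Decimal(text) * SCALE)


def fixed_add(a: int, b: int) -> int:
    # Both operands share one scaling factor, so integer addition is enough.
    return a + b


def fixed_mul(a: int, b: int) -> int:
    # The raw product carries the scale twice; divide once to restore it.
    return (a * b) // SCALE


pi = to_fixed("3.14")   # 314
two = to_fixed("2.00")  # 200
print(pi, fixed_add(pi, two), fixed_mul(pi, two))  # 314 514 628
```

The scaling factor never appears in the stored data; it lives only in the code that interprets the integers, which is what "implicit" means here.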

Floating-Point ~= Scientific Notation
• Speed of light: +2.99792458 × 10^8 (m/s)
• Charge of an electron: -1.60217733 × 10^-19 (C)
• Mass of 0.5 mol of carbon atoms: +6.00000000 × 10^-3 (kg)
• Parts of each number, taking +2.99792458 × 10^8 as the example
  • Sign: +
  • Significand (also called fraction or mantissa): 2.99792458
  • Radix or base: 10
  • Exponent: 8

• Normalized form
  • Exactly one non-zero significant digit to the left of the point
  • 29.9 × 10^7 and 0.299 × 10^9 are not normalized forms

Floating-Point ~= Scientific Notation (Cont'd)
• Scaling factor (the exponent) is explicit

• "Floating"
  • Scaling factor can change during computation

Floating Point Number
• IEEE 754 standard
  • Single-precision (32 bits): S | Exp. | Significand
  • Double-precision (64 bits): S | Exp. | Significand

• Represented value: (-1)^S × 1.Significand_two × 2^(Exponent - Bias)

                                      Sign bit   Exp. bits   Significand bits   Bias
Single precision (float in C/C++)     1          8           23                 127
Double precision (double in C/C++)    1          11          52                 1023
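The field widths and biases in the table can be checked with a short sketch built on Python's standard `struct` module. The helper names `fields32`, `fields64`, and `value32` are assumptions made for this illustration, and `value32` only covers normalized numbers.

```python
import struct


def fields32(x: float):
    """Return (sign, exponent, significand) of x stored as a 32-bit float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def fields64(x: float):
    """Return (sign, exponent, significand) of x stored as a 64-bit double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)


def value32(sign: int, exponent: int, significand: int) -> float:
    """Apply (-1)^S x 1.Significand x 2^(Exponent - Bias) with Bias = 127."""
    return (-1) ** sign * (1 + significand / 2**23) * 2.0 ** (exponent - 127)


# -6.5 = -1.101_two x 2^2, so the stored exponents are 2 + 127 and 2 + 1023.
print(fields32(-6.5))            # (1, 129, 5242880)
print(fields64(-6.5)[:2])        # (1, 1025)
print(value32(*fields32(-6.5)))  # -6.5
```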

Special Floating Point Numbers (32-bit format shown)

                      S     Exp.    Significand
Zero                  S     0…00    0…00
Denormalized value    S     0…00    non-zero
+/- ∞                 S     1…11    0…00
Not a number (NaN)    any   1…11    non-zero

i.e., the maximal and minimal exponent values are reserved for special floating-point numbers
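The table above maps directly onto a small classifier. This is a sketch for the 32-bit format only; `classify32` and `bits_of` are hypothetical helper names.

```python
import struct


def classify32(bits: int) -> str:
    """Classify a 32-bit pattern using the two reserved exponent values."""
    exponent = (bits >> 23) & 0xFF
    significand = bits & 0x7FFFFF
    if exponent == 0:          # minimal exponent: zero or denormalized
        return "zero" if significand == 0 else "denormalized"
    if exponent == 0xFF:       # maximal exponent: infinity or NaN
        return "infinity" if significand == 0 else "NaN"
    return "normalized"


def bits_of(x: float) -> int:
    """Return the 32-bit pattern that stores x as a single-precision float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits


for pattern in (0x00000000, 0x80000000, 0x00000001, 0x7F800000, 0x7FC00000, 0x3F800000):
    print(f"{pattern:#010x}", classify32(pattern))
```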

Denormalized Value
• Patterns of the form S | 0…00 | non-zero Significand are denormalized values
  • Represented value: (-1)^S × 0.Significand_two × 2^(1 - Bias)
  • No leading one to the left of the point

• Objective
  • Represent very small values
  • Gradual underflow
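Gradual underflow can be observed numerically. The sketch below assumes single precision (Bias = 127); `from_bits32` and `denormal_value` are hypothetical helpers. It shows that the gap between the smallest normalized number and the largest denormalized number equals the smallest denormalized number, so the representable values approach zero in even steps.

```python
import struct


def from_bits32(bits: int) -> float:
    """Interpret a 32-bit pattern as a single-precision float."""
    (x,) = struct.unpack(">f", struct.pack(">I", bits))
    return x


def denormal_value(sign: int, significand: int) -> float:
    """(-1)^S x 0.Significand x 2^(1 - Bias), with Bias = 127."""
    return (-1) ** sign * (significand / 2**23) * 2.0 ** (1 - 127)


smallest_normal = from_bits32(0x00800000)    # Exp. = 0…01, Significand = 0…00
largest_denormal = from_bits32(0x007FFFFF)   # Exp. = 0…00, Significand = 1…11
smallest_denormal = from_bits32(0x00000001)  # Exp. = 0…00, Significand = 0…01

print(smallest_normal == 2.0 ** -126)                           # True
print(smallest_denormal == 2.0 ** -149)                         # True
print(largest_denormal == denormal_value(0, 0x7FFFFF))          # True
print(smallest_normal - largest_denormal == smallest_denormal)  # True
```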

Floating Point Examples
• 32-bit pattern: S = 0, Exp. = 01111000, Significand = 10100…000

  = 1.1010…000_two × 2^(120 - 127)
  = 1.1010_two × 2^-7
  = 1.625_ten × 2^-7
  = 0.0126953125_ten

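Floating Point Examples
• Convert -3.14 to 32-bit float
  • Integer part: 3 = 11_two
  • Fraction part: multiply by 2 repeatedly; the integer part of each product is the next bit
      0.14 → 0.28 → 0.56 → 1.12 → 0.24 → 0.48 → 0.96 → 1.92 → 1.84 → 1.68 → 1.36 → 0.72 → 1.44
      → 0.88 → 1.76 → 1.52 → 1.04 → 0.08 → 0.16 → 0.32 → 0.64 → 1.28 → 0.56 → 1.12 → 0.24
  • 3.14 = 11.0010_0011_1101_0111_0000_1010…_two
         = 1.1001_0001_1110_1011_1000_010_two × 2^1   (23-bit significand, assume not rounded)
  • Result (32 bits): S = 1, Exp. = 10000000, Significand = 1001_0001_1110_1011_1000_010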
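The repeated-doubling procedure can be sketched as follows. `fraction_bits` is a hypothetical helper, and exact rational arithmetic (`fractions.Fraction`) is used so that 0.14 is not itself distorted by binary floating point. The last lines compare the truncated encoding from the slide with the encoding produced by `struct`, which rounds to nearest instead of truncating.

```python
import struct
from fractions import Fraction


def fraction_bits(frac: Fraction, count: int) -> str:
    """Multiply by 2 repeatedly; each integer part is the next binary digit."""
    bits = []
    for _ in range(count):
        frac *= 2
        bit = int(frac >= 1)
        bits.append(str(bit))
        frac -= bit
    return "".join(bits)


frac = fraction_bits(Fraction(14, 100), 24)
print(frac)  # 001000111101011100001010

# Normalize 11.f_two to 1.1f_two x 2^1, then keep 23 significand bits (truncated).
significand = ("1" + frac)[:23]
sign, exponent = "1", format(1 + 127, "08b")
truncated = int(sign + exponent + significand, 2)
print(f"{truncated:#010x}")  # 0xc048f5c2

# struct rounds to nearest, so the last bit differs from the truncated result.
(rounded,) = struct.unpack(">I", struct.pack(">f", -3.14))
print(f"{rounded:#010x}")  # 0xc048f5c3
```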

IEEE 754 Online Converter

https://www.h-schmidt.net/FloatConverter/IEEE754.html

http://babbage.cs.qc.cuny.edu/IEEE-754.old/Decimal.html

Floating Point Operations
• Comparisons
• Addition
• Multiplication

Comparisons
• Similar to sign-magnitude integer comparison
  • The fields S | Exp. | Significand are viewed as S | Magnitude

• Rationales
  • Positive > negative
  • Between two positive floating point numbers
    • The one with the larger {exponent, significand} is greater
  • Between two negative floating point numbers
    • The one with the smaller {exponent, significand} is greater

Comparisons (Cont'd)
• Cases directly supported by sign-magnitude comparisons
  • +∞ == +∞
  • -∞ == -∞
  • -∞ < all finite numbers < +∞
  • 0 == -0

• Special cases that sign-magnitude comparisons do not directly support
  • != involving any NaN yields true
  • All other comparisons involving NaN yield false
    • NaN < 10? → false
    • NaN > NaN? → false
    • NaN == NaN? → false
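The comparison rules can be sketched on raw 32-bit patterns. All function names here are hypothetical; the idea is to treat {exponent, significand} as one magnitude, negate it when the sign bit is set, and handle NaN before anything else.

```python
import struct


def bits32(x: float) -> int:
    """Return the 32-bit pattern that stores x as a single-precision float."""
    (b,) = struct.unpack(">I", struct.pack(">f", x))
    return b


def is_nan(b: int) -> bool:
    return ((b >> 23) & 0xFF) == 0xFF and (b & 0x7FFFFF) != 0


def sign_magnitude_key(b: int) -> int:
    """Map a bit pattern to an integer that orders like the encoded value."""
    magnitude = b & 0x7FFFFFFF                   # {exponent, significand}
    return -magnitude if b >> 31 else magnitude  # +0 and -0 both map to 0


def less_than(a: int, b: int) -> bool:
    if is_nan(a) or is_nan(b):
        return False                             # NaN makes < false
    return sign_magnitude_key(a) < sign_magnitude_key(b)


def equal(a: int, b: int) -> bool:
    if is_nan(a) or is_nan(b):
        return False                             # even NaN == NaN is false
    return sign_magnitude_key(a) == sign_magnitude_key(b)


nan = bits32(float("nan"))
print(less_than(bits32(-2.0), bits32(-1.0)))  # True
print(equal(bits32(0.0), bits32(-0.0)))       # True
print(less_than(nan, bits32(10.0)), equal(nan, nan))  # False False
```

A "not equal" operation would simply be the negation of `equal`, which is why it yields true whenever a NaN is involved.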

Addition
• Steps
  1. Align (adjust the smaller number)
  2. Perform addition
  3. Normalize
  4. Round
  5. Re-normalize

• Examples
  • 9.999_ten × 10^1 + 1.610_ten × 10^-1
  • 1.101_two × 2^9 + 1.110_two × 2^12
  • Assume four-digit significands

Decimal Example
  9.999_ten × 10^1 + 1.610_ten × 10^-1
Align
  = 9.999_ten × 10^1 + 0.01610_ten × 10^1
Add
  = 10.01510_ten × 10^1
Normalize
  = 1.001510_ten × 10^2
Round
  = 1.002_ten × 10^2
Renormalize
  = 1.002_ten × 10^2 (no change)
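The final result of this decimal example can be cross-checked with Python's standard `decimal` module. This is only a check of the end result under the assumption of four significant digits and round-to-nearest-even; it does not model the individual align and normalize steps.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

ctx = Context(prec=4, rounding=ROUND_HALF_EVEN)  # four-digit significands

a = Decimal("9.999E+1")
b = Decimal("1.610E-1")

exact = a + b            # the default context keeps every digit here
rounded = ctx.add(a, b)  # exact sum, then rounded to four digits

print(exact)    # 100.1510
print(rounded)  # 100.2
```

The value 100.2 is the same number as 1.002 × 10^2 from the last step above.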

Binary Example
  1.101_two × 2^9 + 1.110_two × 2^12
Align
  = 0.001101_two × 2^12 + 1.110_two × 2^12
Add
  = 1.111101_two × 2^12
Normalize
  = 1.111101_two × 2^12 (no change)
Round
  = 10.000_two × 2^12
Renormalize
  = 1.000_two × 2^13
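The five addition steps can be sketched for positive, normalized operands with four-bit significands. The representation (an integer significand with `FRAC_BITS` fraction bits plus an unbiased exponent) and the names `fp_add` and `round_half_even` are assumptions of this sketch. It keeps every shifted-out bit for simplicity rather than modeling a fixed number of extra bits.

```python
FRAC_BITS = 3  # four-bit significand: 1 integer bit + 3 fraction bits


def round_half_even(value: int, drop: int) -> int:
    """Drop the low `drop` bits, rounding to nearest with ties to even."""
    if drop == 0:
        return value
    kept, rest = value >> drop, value & ((1 << drop) - 1)
    half = 1 << (drop - 1)
    if rest > half or (rest == half and kept & 1):
        kept += 1
    return kept


def fp_add(sig_a: int, exp_a: int, sig_b: int, exp_b: int):
    """Add two positive normalized values; returns (significand, exponent)."""
    # 1. Align: let a be the operand with the larger exponent.
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    extra = exp_a - exp_b  # extra low-order bits kept below the significand
    # 2. Add on a common scale (b is effectively shifted right by `extra`).
    total, exp = (sig_a << extra) + sig_b, exp_a
    # 3. Normalize: a carry-out moves the binary point one place left.
    if total >> (FRAC_BITS + extra + 1):
        extra, exp = extra + 1, exp + 1
    # 4. Round away the extra bits.
    sig = round_half_even(total, extra)
    # 5. Renormalize if rounding overflowed to 10.000_two.
    if sig >> (FRAC_BITS + 1):
        sig, exp = sig >> 1, exp + 1
    return sig, exp


sig, exp = fp_add(0b1101, 9, 0b1110, 12)
print(format(sig, "04b"), exp)  # 1000 13
```

The printed result corresponds to 1.000_two × 2^13, matching the last line of the worked example.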

[Figure: floating-point addition flowchart: Compare exponents → Shift smaller number right → Add → Normalize → Round]

Multiplication
• Steps
  1. Add exponents (considering the bias)
  2. Multiply the significands (with sign determined)
  3. Normalize (and check over/underflow)
  4. Round
  5. Re-normalize (and re-check over/underflow)

• Examples
  • (1.110_ten × 10^10) × (9.200_ten × 10^-5)
  • (1.000_two × 2^-1) × (-1.110_two × 2^-2)
  • Inputs and outputs have four-digit significands

Decimal Example
  (1.110_ten × 10^10) × (9.200_ten × 10^-5)
Exponent     10 + (-5) = 5
Multiply     1.110_ten × 9.200_ten = 10.212_ten
Normalize    = 1.0212_ten × 10^6
Round        = 1.021_ten × 10^6
Renormalize  = 1.021_ten × 10^6 (no change)

Binary Example
  (1.000_two × 2^-1) × (-1.110_two × 2^-2)
Exponent     (-1) + (-2) = -3
Multiply     1.000_two × (-1.110_two) = -1.110000_two
Normalize    = -1.110000_two × 2^-3 (no change)
Round        = -1.110_two × 2^-3 (no change)
Renormalize  = -1.110_two × 2^-3 (no change)
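The multiplication steps can be sketched the same way for four-bit significands. The exponents here are unbiased, so no bias correction appears in step 1, and overflow/underflow checks are omitted. The names `fp_mul` and `round_half_even` and the tuple layout are assumptions of this sketch.

```python
FRAC_BITS = 3  # four-bit significand: 1 integer bit + 3 fraction bits


def round_half_even(value: int, drop: int) -> int:
    """Drop the low `drop` bits, rounding to nearest with ties to even."""
    if drop == 0:
        return value
    kept, rest = value >> drop, value & ((1 << drop) - 1)
    half = 1 << (drop - 1)
    if rest > half or (rest == half and kept & 1):
        kept += 1
    return kept


def fp_mul(sign_a: int, sig_a: int, exp_a: int, sign_b: int, sig_b: int, exp_b: int):
    """Multiply two normalized values; returns (sign, significand, exponent)."""
    # 1. Add exponents (unbiased in this sketch).
    exp = exp_a + exp_b
    # 2. Multiply significands; the sign is the XOR of the input signs.
    sign = sign_a ^ sign_b
    product = sig_a * sig_b  # carries 2 * FRAC_BITS fraction bits
    extra = FRAC_BITS        # low-order bits to round away
    # 3. Normalize: the product lies in [1, 4), so at most one carry.
    if product >> (FRAC_BITS + extra + 1):
        extra, exp = extra + 1, exp + 1
    # 4. Round.
    sig = round_half_even(product, extra)
    # 5. Renormalize if rounding overflowed to 10.000_two.
    if sig >> (FRAC_BITS + 1):
        sig, exp = sig >> 1, exp + 1
    return sign, sig, exp


sign, sig, exp = fp_mul(0, 0b1000, -1, 1, 0b1110, -2)
print(sign, format(sig, "04b"), exp)  # 1 1110 -3
```

The printed result corresponds to -1.110_two × 2^-3.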

Internal Format with Extra Bits
• Extra bits are needed during arithmetic operations to increase the arithmetic accuracy
  • e.g., 1.101_two × 2^9 + 1.110_two × 2^12

    Without extra bits          With extra bits
      0.001_two × 2^12            0.001101_two × 2^12
    + 1.110_two × 2^12          + 1.110000_two × 2^12
    = 1.111_two × 2^12          = 1.111101_two × 2^12
                                = 10.00_two × 2^12
                                = 1.000_two × 2^13

IEEE 754 Internal Format
• Three extra bits
  • The 3rd one represents any remaining nonzero bits to the right
• Example: 1.0001_two × 2^7 is right-shifted by 5 bits during arithmetic
  • Exact shifted value: 0.000010001_two × 2^12
  • Internal format (S | Exp. | Significand | First | Second | Third): 0.0000101_two × 2^12

IEEE 754 Internal Format
• Roles/names of the three extra bits
  • First: Guard
  • Second: Round
  • Third: Sticky
• In 0.0000101_two × 2^12 (from the exact value 0.000010001_two × 2^12), the last three bits 1, 0, 1 are the guard, round, and sticky bits
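A minimal sketch of how the three extra bits could be derived when a significand is shifted right during alignment. `shift_right_grs` is a hypothetical helper, and the significand is passed as an integer (1.0001_two becomes 0b10001).

```python
def shift_right_grs(sig: int, shift: int):
    """Shift sig right by `shift` bits; return (kept, guard, round, sticky)."""
    wide = sig << 2                       # make room for guard and round bits
    kept = wide >> (shift + 2)            # bits that stay in the significand
    guard = (wide >> (shift + 1)) & 1     # first bit shifted out
    round_bit = (wide >> shift) & 1       # second bit shifted out
    sticky = int((wide & ((1 << shift) - 1)) != 0)  # OR of all remaining bits
    return kept, guard, round_bit, sticky


# 1.0001_two shifted right by 5 gives 0.0000 | 1 0 1, as in the example above.
print(shift_right_grs(0b10001, 5))  # (0, 1, 0, 1)
```

The sticky bit records only whether any nonzero bit was lost, which is enough to tell an exact tie from a value slightly above the halfway point.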

IEEE 754 Rounding Mode
• Four modes can be chosen by programmers
  • Toward 0 (also called truncation)
  • Toward +∞
  • Toward -∞
  • Toward nearest even (default mode)
    • Choose the even one if there are two equally near values
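The four modes can be compared side by side. The sketch below is a decimal analogy using Python's standard `decimal` module, not a change to the hardware's binary rounding mode; the `MODES` mapping and `round_all` are names introduced for this illustration.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR, ROUND_HALF_EVEN

MODES = {
    "toward 0": ROUND_DOWN,
    "toward +inf": ROUND_CEILING,
    "toward -inf": ROUND_FLOOR,
    "nearest even": ROUND_HALF_EVEN,
}


def round_all(text: str) -> dict:
    """Round a decimal string to an integer under each of the four modes."""
    return {
        name: int(Decimal(text).quantize(Decimal("1"), rounding=mode))
        for name, mode in MODES.items()
    }


for value in ("2.5", "-2.5", "3.5", "2.4"):
    print(value, round_all(value))
```

The ties 2.5 and 3.5 go to 2 and 4 under the nearest-even mode, while the two directed modes differ only in which way they move negative values.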

Round Toward Nearest Even
• Binary examples (rounding to two fraction bits)

    Values    10.00011   10.00101   10.10100   10.11100
    Results   10.00      10.01      10.10      11.00

• Reduces the statistical bias of rounding noise
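The four binary examples can be reproduced with a short sketch. `round_nearest_even` is a hypothetical helper that works on binary strings and assumes at least one fraction bit is kept and at least one is dropped.

```python
def round_nearest_even(bits: str, keep: int) -> str:
    """Round a binary string such as '10.10100' to `keep` fraction bits."""
    whole, frac = bits.split(".")
    value = int(whole + frac, 2)  # all digits as one integer
    drop = len(frac) - keep       # number of low-order bits to remove
    kept, rest = value >> drop, value & ((1 << drop) - 1)
    half = 1 << (drop - 1)
    if rest > half or (rest == half and kept & 1):
        kept += 1                 # above half, or a tie with an odd last bit
    digits = format(kept, "b").zfill(keep + 1)
    return digits[:-keep] + "." + digits[-keep:]


for value in ("10.00011", "10.00101", "10.10100", "10.11100"):
    print(value, "->", round_nearest_even(value, 2))
```

The last two inputs are exact ties: 10.10100 stays at 10.10 because its last kept bit is already even, while 10.11100 moves up to 11.00.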

Outline
• Overview
• IEEE 754 standard
  • Single-precision
  • Double-precision
  • Special numbers
• Floating-point operations
  • Addition
  • Multiplication
  • Rounding
