Computational Physics I
Luigi Scorzato
Lecture 2: Floating Point Arithmetic

This lecture discusses floating point arithmetic and its limitations for computational physics simulations:
- Floating point numbers represent real numbers with a finite number of bits, which introduces rounding errors. The commonly used formats are the 32-bit and 64-bit formats of the IEEE 754 standard.
- Basic arithmetic operations such as addition and multiplication are reasonably accurate, but errors accumulate over many operations; associativity and distributivity do not hold exactly.
- Even simple decimal fractions cannot be represented exactly, and sums and derivatives lose accuracy when terms of very different magnitude are combined, because the mantissa of the smaller term is shifted out.
- A simple model assumes that the error of each floating point operation is bounded in proportion to its result; it predicts the errors to expect, and a posteriori consistency checks are used to validate results.



Computer memories are finite. This raises two questions: 1. How can we represent real numbers on a computer? 2. To what extent can such representations be trusted?

Representation
Computers usually assign 32 or 64 bits to each single number, and there are two main strategies for doing so:

Fixed Point (used for Integers):

n = (−1)^{a_0} (a_1·2^0 + a_2·2^1 + … + a_{M−1}·2^{M−2})

M is the number of bits available for a single number (typical choices are M = 32 or 64) and a_i = 0, 1. The largest representable value is n_max = 2^{M−1}, i.e. about 9.2·10^18 with 64 bits. This means that 9·10^18 and 9·10^18 + 1 are both represented and distinguishable, but 10^19 does not fit and overflows.
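As a quick illustration (a minimal Python/NumPy sketch; the course examples themselves are Octave scripts, so this is only an aside), the limits of a 64-bit signed integer can be checked directly:

    import numpy as np

    # Largest signed 64-bit integer: 2**63 - 1, roughly 9.2e18.
    n_max = np.iinfo(np.int64).max
    print(n_max)                 # 9223372036854775807
    print(n_max - 1 < n_max)     # True: nearby integers stay distinguishable

    # 10**19 does not fit into 64 bits: the conversion fails.
    try:
        np.int64(10**19)
    except OverflowError as err:
        print("overflow:", err)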

Floating Point (used for Reals, ...)

x = (−1)^s · 1.f · 2^{e − bias}

(IEEE 754 standard). Here s is 1 bit for the sign; f is the 23-bit (single precision) or 52-bit (double precision) mantissa, written with an implicit leading 1; and e is the 8-bit (single) or 11-bit (double) exponent. The length of the mantissa roughly determines the relative precision, and the length of the exponent determines the range. Both are stored as integers.
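To make the bit layout concrete, here is a small sketch (Python; the helper name double_bits is ad hoc, not part of the course material) that extracts the three fields of a 64-bit double:

    import struct

    def double_bits(x):
        """Return the (sign, exponent, fraction) bit fields of a 64-bit IEEE 754 double."""
        (bits,) = struct.unpack(">Q", struct.pack(">d", x))
        sign     = bits >> 63
        exponent = (bits >> 52) & 0x7FF       # 11 bits, biased by 1023
        fraction = bits & ((1 << 52) - 1)     # the 52 explicitly stored bits of f
        return sign, exponent, fraction

    s, e, f = double_bits(-6.25)
    # -6.25 = (-1)^1 * 1.5625 * 2^2, so the stored exponent is 2 + 1023
    print(s, e - 1023, 1.0 + f / 2**52)       # prints: 1 2 1.5625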

Limitations of Floating Point Arithmetic

Commutativity and the additive inverse are guaranteed by IEEE 754: a+b = b+a, a*b = b*a, a−a = 0 (less trivial than you may think). But this is the last piece of good news:
Addition is not associative: (a+b)+c != a+(b+c).
The distributive law does not hold: (a+b)*c != a*c + b*c.
The multiplicative inverse may not exist: a*(1/a) != 1.
Most simple numbers in decimal notation are not represented exactly.
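A minimal Python sketch (not from the lecture material, which uses Octave scripts) reproducing some of these failures in double precision:

    # Double-precision (64-bit) illustrations of the points above.
    a, b, c = 1.0, 1e-16, 1e-16
    print((a + b) + c == a + (b + c))   # False: addition is not associative

    print(0.1 + 0.2 == 0.3)             # False: 0.1, 0.2 and 0.3 are not exact in base 2

    # Integers n below 100 for which n * (1/n) is not exactly 1.
    print([n for n in range(1, 100) if n * (1.0 / n) != 1.0])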

A typical mechanism that produces errors is the shift of the mantissa when summing numbers of very different magnitude, e.g.:
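In double precision (a small Python sketch; 2^53 ≈ 9·10^15, so near 10^16 the spacing between representable numbers is 2):

    big, small = 1.0e16, 1.0

    # Aligning the two mantissas shifts the small addend entirely out of the
    # 53 significant bits of the larger one, so it is simply lost.
    print(big + small == big)      # True
    print((big + small) - big)     # 0.0, not 1.0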

A Model

Instead of float(A op B) − (A op B) = 0, we can only assume that

|float(A op B) − (A op B)| ≤ u · |A op B|   (where op = +, −, *, / acts on single floating point numbers).

We can use this model to predict which errors to expect. For example, for the scalar product one finds (Golub and Van Loan):
| fl( Σ_{k=1}^{N} x_k y_k ) − Σ_{k=1}^{N} x_k y_k | ≤ N·u · Σ_{k=1}^{N} |x_k y_k| + O(u^2)
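A quick numerical check of this bound (a Python/NumPy sketch, assuming float32 as the working precision and float64 as the reference; not part of the original material):

    import numpy as np

    rng = np.random.default_rng(0)
    N = 10_000
    x = rng.standard_normal(N)
    y = rng.standard_normal(N)

    reference = np.dot(x, y)                                # double precision
    computed  = float(np.dot(x.astype(np.float32),          # single precision
                             y.astype(np.float32)))

    u = np.finfo(np.float32).eps                            # machine epsilon of float32, used as u
    bound = N * u * np.sum(np.abs(x * y))                   # leading-order bound above

    print(abs(computed - reference), "<=", bound)           # error typically far below the bound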

Simple exercises

Exponential function from its Taylor series [see my_exp.m]:

e^x = Σ_{n=0}^{∞} x^n / n!

Exponential function as the limit of a sequence [see my_exp_seq.m]:

e^x = lim_{n→∞} (1 + x/n)^n

Accumulating sums, e.g. the harmonic series:

Σ_{k=1}^{N} 1/k ≈ ln N + γ_Euler   (for large N)
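As a sketch of the first exercise, here is a Python version of what my_exp.m plausibly does (the actual file is not reproduced in these notes, so this is only an approximation of it):

    import math

    def my_exp(x, nmax=200):
        """Sum the Taylor series of exp(x) term by term."""
        term, total = 1.0, 1.0
        for n in range(1, nmax + 1):
            term *= x / n            # builds x**n / n! incrementally
            total += term
        return total

    # Accurate for moderate positive x, but for large negative x the huge
    # alternating terms cancel and rounding errors dominate the tiny result.
    for x in (1.0, 10.0, -10.0, -30.0):
        approx, exact = my_exp(x), math.exp(x)
        print(f"x={x:6.1f}  series={approx: .6e}  exp={exact: .6e}  "
              f"rel.err={abs(approx - exact) / exact:.1e}")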

Message:

Trying to understand precisely the origin of rounding errors is often frustrating and as hard as solving analytically the problem that we want to solve numerically. What we can do is check a posteriori:
check the correctness against known exact results that have the same numerical difficulties;
check consistency when changing conditions by negligible amounts, when you know they should not matter (sometimes high sensitivity is physical);
check consistency when changing the numerical precision of the operations.

Computing the derivative of a function


Notation: f_n = f(t_0 + n·h), and f^{(k)} denotes the k-th derivative of f at t_0.

Naive 1st derivative (order 1):

f^{(1),order=1} = (f_1 − f_0)/h = f^{(1)} + O(h)

One can do better (remember Taylor):


f_{−1} = f_0 − h f^{(1)} + (h^2/2) f^{(2)} − (h^3/3!) f^{(3)} + (h^4/4!) f^{(4)} + O(h^5)
f_{−2} = f_0 − 2h f^{(1)} + ((2h)^2/2) f^{(2)} − ((2h)^3/3!) f^{(3)} + ((2h)^4/4!) f^{(4)} + O((2h)^5)
f_1 − f_{−1} = 2h f^{(1)} + (1/3) h^3 f^{(3)} + O(h^5)
f_2 − f_{−2} = 4h f^{(1)} + (8/3) h^3 f^{(3)} + O(h^5)
f^{(1),o=2} = (f_1 − f_{−1}) / (2h) = f^{(1)} + O(h^2)
f^{(1),o=4} = [8(f_1 − f_{−1}) − (f_2 − f_{−2})] / (12h) = f^{(1)} + O(h^4)

However, smaller h and higher orders are not necessarily better, as the following example shows.

Exercise: write a program that computes the derivative of sin(ω x) for:
different orders of approximation;
different values of h;
different values of ω.


Compare with [numdiff.m]
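The course compares against numdiff.m; as a rough stand-in (a Python sketch with ω = 1, i.e. f(x) = sin(x), purely for illustration, not the course's own script), here are the two central-difference formulas and the trade-off between truncation and rounding error as h shrinks:

    import math

    def deriv(f, x0, h, order=2):
        """Central finite differences of order 2 or 4 (the formulas derived above)."""
        if order == 2:
            return (f(x0 + h) - f(x0 - h)) / (2 * h)
        if order == 4:
            return (8 * (f(x0 + h) - f(x0 - h)) - (f(x0 + 2 * h) - f(x0 - 2 * h))) / (12 * h)
        raise ValueError("order must be 2 or 4")

    # d/dx sin(x) at x0 = 1.0 is cos(1.0). Shrinking h first reduces the truncation
    # error, but below some threshold the rounding error of f_1 - f_{-1} dominates.
    x0, exact = 1.0, math.cos(1.0)
    for h in (1e-1, 1e-3, 1e-5, 1e-7, 1e-9, 1e-11):
        err2 = abs(deriv(math.sin, x0, h, order=2) - exact)
        err4 = abs(deriv(math.sin, x0, h, order=4) - exact)
        print(f"h={h:.0e}   err(order 2)={err2:.2e}   err(order 4)={err4:.2e}")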
