
Chapter 11

Machine Arithmetic

11.1 Decimal number system


In our decimal system, natural numbers are represented by a sequence of
digits. For example, 582 = 5 · 10^2 + 8 · 10 + 2. Generally,

    a_n ··· a_1 a_0 = 10^n a_n + ··· + 10 a_1 + a_0,

where a_i ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} are digits. Fractional numbers
require an additional fractional part:

    f = .d_1 d_2 … d_t … = 10^(−1) d_1 + 10^(−2) d_2 + ··· + 10^(−t) d_t + ···

which may be finite or infinite.

11.2 Floating point representation


Alternatively, any real number can be written as a product of a fractional
part with a sign and a power of ten:

    ±.d_1 d_2 … d_t … × 10^e

where the d_i are decimal digits and e ∈ Z is an integer. For example,

    58.2 = 0.582 × 10^2 = 0.0582 × 10^3,  etc.                        (11.1)

This is called the floating point representation of decimal numbers. The
part .d_1 d_2 … d_t … is called the mantissa and e is called the exponent.
By changing the exponent e with a fixed mantissa .d_1 d_2 … d_t … we can
move ("float") the decimal point, for example 0.582 × 10^2 = 58.2 and
0.582 × 10^1 = 5.82.

11.3 Normalized floating point representation
To avoid unnecessary multiple representations of the same number, such as
in (11.1), we can require that d_1 ≠ 0. We say the floating point
representation is normalized if d_1 ≠ 0. Then 0.582 × 10^2 is the only
normalized representation of the number 58.2.
For every positive real r > 0 there is a unique integer e ∈ Z such that
f = 10^(−e) r ∈ [0.1, 1). Then r = f × 10^e is the normalized representation
of r. For most real numbers the normalized representation is unique;
however, there are exceptions, such as

    .9999… × 10^0 = .1000… × 10^1,

both representing the number r = 1. In such cases one of the two
representations has a finite fractional part and the other has an infinite
fractional part with trailing nines.
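The normalization step is easy to mimic in code. Below is a minimal Python
sketch (the helper normalize is ours, purely for illustration) that finds
the unique exponent e with 10^(−e) r ∈ [0.1, 1):

```python
import math

def normalize(r):
    """Return (f, e) with r = f * 10**e and f in [0.1, 1); assumes r > 0.

    Caution: math.log10 is itself rounded, so the computed e can be off
    by one when r is extremely close to an exact power of ten.
    """
    e = math.floor(math.log10(r)) + 1   # unique e with 10**-e * r in [0.1, 1)
    return r / 10**e, e

print(normalize(58.2))    # (0.582, 2)   since 58.2   = 0.582 x 10^2
print(normalize(0.0075))  # (0.75, -2)   since 0.0075 = 0.75  x 10^(-2)
```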

11.4 Binary number system


In the binary number system, the base is 2 (instead of 10), and there
are only two digits: 0 and 1. Any natural number N can be written, in the
binary system, as a sequence of binary digits:

    N = (a_n ··· a_1 a_0)_2 = 2^n a_n + ··· + 2 a_1 + a_0

where a_i ∈ {0, 1}. For example, 5 = 101_2, 11 = 1011_2, 64 = 1000000_2,
etc. The binary system, due to its simplicity, is used by all computers. In
the modern computer world, the word bit means binary digit.
Fractional numbers require an additional fractional part:

    f = (.d_1 d_2 … d_t …)_2 = 2^(−1) d_1 + 2^(−2) d_2 + ··· + 2^(−t) d_t + ···

which may be finite or infinite. For example, 0.5 = 0.1_2, 0.625 = 0.101_2, and

    0.6 = (0.10011001100110011…)_2

(the blocks 00 and 11 alternate indefinitely). The floating point
representation of real numbers in the binary system is given by

    r = ±(.d_1 d_2 … d_t …)_2 × 2^e

where .d_1 d_2 … d_t … is called the mantissa and e ∈ Z is called the
exponent. Again, we say that the above representation is normalized if
d_1 ≠ 0; this ensures uniqueness for almost all real numbers. Note that
d_1 ≠ 0 implies d_1 = 1, i.e., every normalized binary representation begins
with a one.
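The binary digits d_1, d_2, … of a fraction can be generated by repeated
doubling: at each step, the integer part of 2f is the next digit. A short
Python sketch (our own helper, for illustration only) that reproduces the
repeating expansion of 0.6:

```python
def binary_fraction_digits(f, n):
    """First n binary digits d_1 d_2 ... of a fraction 0 <= f < 1.

    Repeated doubling: the integer part of 2*f is the next digit. Since
    f is stored as an IEEE double, the digits are only reliable for
    roughly the first 52 positions.
    """
    digits = []
    for _ in range(n):
        f *= 2
        d = int(f)        # next binary digit, 0 or 1
        digits.append(d)
        f -= d
    return digits

print(binary_fraction_digits(0.625, 4))  # [1, 0, 1, 0]: 0.625 = (0.101)_2
print(binary_fraction_digits(0.6, 13))   # [1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
```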

11.5 Other number systems
We can use a number system with any fixed base β ≥ 2. By analogy
with Sections 11.1 and 11.4, any natural number N can be written, in that
system, as a sequence of digits:

    N = (a_n ··· a_1 a_0)_β = β^n a_n + ··· + β a_1 + a_0

where a_i ∈ {0, 1, …, β − 1} are digits. Fractional numbers require an
additional fractional part:

    f = (.d_1 d_2 … d_t …)_β = β^(−1) d_1 + β^(−2) d_2 + ··· + β^(−t) d_t + ···

which may be finite or infinite. The floating point representation of real
numbers in the system with base β is given by

    r = ±.d_1 d_2 … d_t … × β^e

where .d_1 d_2 … d_t … is called the mantissa and e ∈ Z is called the
exponent. Again, we say that the above representation is normalized if
d_1 ≠ 0; this ensures uniqueness for almost all real numbers.
In the real world, computers can only handle a certain fixed number of
digits in their electronic memory. For the same reason, possible values of
the exponent e are always limited to a certain fixed interval. This
motivates our next definition.

11.6 Machine number systems (an abstract version)


A machine number system is specified by four integers, (β, t, L, U), where

    β ≥ 2 is the base,
    t ≥ 1 is the length of the mantissa,
    L ∈ Z is the minimal value for the exponent e,
    U ∈ Z is the maximal value for the exponent e (of course, L ≤ U).

Real numbers in a machine system are represented by

    r = ±.d_1 d_2 … d_t × β^e,    L ≤ e ≤ U                          (11.2)

The representation must be normalized, i.e., d_1 ≠ 0.

88
11.7 Basic properties of machine systems
Suppose we use a machine system with parameters (β, t, L, U). There are
finitely many real numbers that can be represented by (11.2). More
precisely, there are

    2 (β − 1) β^(t−1) (U − L + 1)

such numbers. The largest machine number is

    M = .(β−1)(β−1) … (β−1) × β^U = (1 − β^(−t)) β^U.

The smallest positive machine number is

    m = .10 … 0 × β^L = β^(L−1).

Note that zero cannot be represented in the above format, since we require
d_1 ≠ 0. For this reason, every real machine system includes a few special
numbers, like zero, that have to be represented differently. Other "special
numbers" are +∞ and −∞.
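These formulas are easy to check numerically. A small Python sketch, using
a toy system whose parameters (β, t, L, U) = (10, 2, −1, 2) are chosen here
purely for illustration:

```python
# Toy machine system (beta, t, L, U) = (10, 2, -1, 2), for illustration only.
beta, t, L, U = 10, 2, -1, 2

count = 2 * (beta - 1) * beta**(t - 1) * (U - L + 1)  # 2(beta-1) beta^(t-1) (U-L+1)
M = (1 - beta**(-t)) * beta**U                        # largest machine number
m = beta**(L - 1)                                     # smallest positive machine number

print(count, M, m)   # 720 99.0 0.01
```

Indeed, the largest number in this toy system is .99 × 10^2 = 99 and the
smallest positive one is .10 × 10^(−1) = 0.01.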

11.8 Two standard machine systems


Most modern computers conform to the IEEE floating-point standard (ANSI/IEEE
Standard 754-1985), which specifies two machine systems:
I. Single precision is defined by β = 2, t = 24, L = −125, and U = 128.
II. Double precision is defined by β = 2, t = 53, L = −1021, and U = 1024.
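In Python, whose floats are IEEE 754 double precision on essentially all
platforms, the built-in sys module reports exactly these parameters:

```python
import sys

# Python floats are IEEE 754 double precision, so these values should
# match (beta, t, L, U) = (2, 53, -1021, 1024) from system II above.
print(sys.float_info.radix)     # 2      -- the base beta
print(sys.float_info.mant_dig)  # 53     -- mantissa length t
print(sys.float_info.min_exp)   # -1021  -- minimal exponent L
print(sys.float_info.max_exp)   # 1024   -- maximal exponent U
print(sys.float_info.max)       # (1 - 2^-53) * 2^1024, the number M of Sect. 11.7
print(sys.float_info.min)       # 2^-1022, the number m of Sect. 11.7
```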

11.9 Rounding rules


A machine system with parameters (β, t, L, U) provides exact
representation for finitely many real numbers. Other real numbers have to
be approximated by machine numbers. Suppose x ≠ 0 is a real number with
normalized floating point representation

    x = ±.d_1 d_2 … × β^e

where the number of digits may be finite or infinite.
If e > U or e < L, then x cannot be properly represented in the machine
system (it is either "too large" or "too small"). If e < L, then x is
usually converted to the special number zero. If e > U, then x is usually
converted to the special number +∞ or −∞.
If e ∈ [L, U] is within the proper range, then the mantissa of x has to
be reduced to t digits (if it is longer than that or infinite). There are
two standard versions of such reduction:
(a) keep the first t digits and chop off the rest;
(b) round off to the nearest available value, i.e. use the rules

    .d_1 … d_t               if d_(t+1) < β/2
    .d_1 … d_t + .0 … 01     if d_(t+1) ≥ β/2
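Both reduction rules can be sketched in Python for the base β = 10 case
using the standard decimal module. The helper below is our own illustration
(not a standard API), with ROUND_DOWN playing the role of chopping (a) and
ROUND_HALF_UP matching rule (b) for β = 10:

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

def reduce_mantissa(x, t, rounding=ROUND_DOWN):
    """Reduce the decimal mantissa of x to t digits (base beta = 10).

    ROUND_DOWN implements rule (a), chopping; ROUND_HALF_UP matches
    rule (b) when beta = 10.
    """
    d = Decimal(str(x))
    # d.adjusted() is the position of the leading significant digit,
    # so digit d_t sits at decimal position d.adjusted() - t + 1.
    q = Decimal(1).scaleb(d.adjusted() - t + 1)
    return d.quantize(q, rounding=rounding)

print(reduce_mantissa(1.96, 2))                           # 1.9 (chopped)
print(reduce_mantissa(1.96, 2, rounding=ROUND_HALF_UP))   # 2.0 (rounded)
print(reduce_mantissa(-197, 2))                           # -1.9E+2, i.e. -190
```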

11.10 Relative errors
Let x be a real number and x_c its computer representation in a machine
system with parameters (β, t, L, U), as described above. We will always
assume that x is neither too big nor too small, i.e., its exponent e is
within the proper range L ≤ e ≤ U. How accurately does x_c represent x?
(How close is x_c to x?)
The absolute error of the computer representation, i.e., x_c − x, may be
quite large when the exponent e is big. It is more customary to describe the
accuracy in terms of the relative error (x_c − x)/x, as follows:

    (x_c − x)/x = ε    or    x_c = x(1 + ε).

It is easy to see that the maximal possible value of |ε| is

    |ε| ≤ u = β^(1−t)          for chopped arithmetic (a)
    |ε| ≤ u = (1/2) β^(1−t)    for rounded arithmetic (b)

The number u is called the unit roundoff or machine epsilon.

11.11 Machine epsilon


Note that the machine epsilon u is not the smallest positive number m
represented by the given machine system (cf. Sect. 11.7).
One can describe u as the smallest positive number ε > 0 such that
(1 + ε)_c ≠ 1. In other words, u is the smallest positive value that, when
added to one, yields a result different from one.
In more practical terms, u tells us how many accurate digits machine
numbers can carry. If u ~ 10^(−p), then any machine number x_c carries at
most p accurate decimal digits.
For example, suppose for a given machine system u ~ 10^(−7) and a machine
number x_c representing some real number x has value 35.41879236 (when
printed on paper or displayed on a computer screen). Then we can say that
x ≈ 35.41879, and the digits of x beyond 9 cannot be determined. In
particular, the digits 236 in the printed value of x_c are meaningless (they
are "trash" to be discarded).
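The characterization of u as the smallest positive ε with (1 + ε)_c ≠ 1
suggests a direct experiment. In Python (IEEE double precision) a simple
halving loop recovers u = 2^(−52):

```python
# Halve eps until 1 + eps is no longer distinguishable from 1;
# the last eps that still makes a difference is the machine epsilon.
eps = 1.0
while 1.0 + eps / 2 != 1.0:
    eps /= 2
print(eps)                     # 2.220446049250313e-16 = 2^-52

import sys
print(sys.float_info.epsilon)  # the same value, as reported by the system
```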

11.12 Machine epsilon for the two standard machine systems


I. For the IEEE floating-point single precision standard with chopped
arithmetic u = 2^(−23) ≈ 1.2 × 10^(−7). In other words, approximately 7
decimal digits are accurate.
II. For the IEEE floating-point double precision standard with chopped
arithmetic u = 2^(−52) ≈ 2.2 × 10^(−16). In other words, approximately 16
decimal digits are accurate.

In the next two examples, we will solve systems of linear equations by
using the rules of a machine system. In other words, we will pretend that we
are computers. This will help us understand how real computers work.

11.13 Example
Let us solve the system of equations

    | 0.01  2 | |x|   | 2 |
    |  1    3 | |y| = | 4 |

The exact solution was found in Section 7.13:

    x = 200/197 ≈ 1.015    and    y = 196/197 ≈ 0.995.

Now let us solve this system by using chopped arithmetic with base β = 10
and t = 2 (i.e., our mantissa will always be limited to two decimal digits).
First we use Gaussian elimination (without pivoting). Multiplying the first
equation by 100 and subtracting it from the second gives −197y = −196. Both
numbers −197 and −196 are three digits long, so we must chop the third digit
off. This gives −190y = −190, hence y = 1. Substituting y = 1 into the first
equation gives 0.01x = 2 − 2 = 0, hence x = 0. Thus our computed solution is

    x_c = 0    and    y_c = 1.

The relative error in x is (x_c − x)/x = −1, i.e., the computed x_c is 100%
off the mark!
Let us increase the length of the mantissa to t = 3 and repeat the
calculations. This gives

    x_c = 2    and    y_c = 0.994.

The relative error in x is now (x_c − x)/x = 0.97, i.e., the computed x_c is
97% off the mark. Not much of an improvement... We postpone the explanation
until Chapter 13.
Let us now apply partial pivoting (Section 7.15). First we interchange the rows:

    |  1    3 | |x|   | 4 |
    | 0.01  2 | |y| = | 2 |

Now we multiply the first equation by 0.01 and subtract it from the second
to get 1.97y = 1.96. With t = 2, we must chop off the third digit in both
numbers: 1.9y = 1.9, hence y = 1. Substituting y = 1 into the first equation
gives x + 3 = 4, hence x = 1, i.e.,

    x_c = 1    and    y_c = 1.

This is a great improvement over the first two solutions (without pivoting).
The relative error in x is now (x_c − x)/x = −0.015, so the computed x_c is
only 1.5% off. The relative error in y is even smaller (about 0.005).

Machine arithmetic           Exact arithmetic:              Machine arithmetic
with t = 2:                                                 with t = 3:

                             0.01x + 2y = 2
                                 x + 3y = 4
                                    ↓
0.01x + 2y = 2   ← chop      0.01x + 2y = 2                 0.01x + 2y = 2
 −190y = −190      off        −197y = −196                   −197y = −196
      ↓                             ↓                              ↓
0.01x + 2y = 2               0.01x + 2y = 2      chop →     0.01x + 2y = 2
     y = 1                   y = 196/197 ≈ 0.9949  off          y = 0.994
      ↓                             ↓                              ↓
0.01x = 2 − 2 = 0  (!!)      0.01x = 2 − 392/197 = 2/197    0.01x = 2 − 1.98 = 0.02  (!!)
     y = 1                   y = 196/197                        y = 0.994
      ↓                             ↓                              ↓
     x = 0                   x = 200/197 ≈ 1.0152               x = 2
     y = 1                   y = 196/197 ≈ 0.9949               y = 0.994

                     Example 11.13 without pivoting.

Machine arithmetic           Exact arithmetic:              Machine arithmetic
with t = 2:                                                 with t = 3:

                                 x + 3y = 4
                             0.01x + 2y = 2
                                    ↓
x + 3y = 4       ← chop          x + 3y = 4                     x + 3y = 4
 1.9y = 1.9        off           1.97y = 1.96                   1.97y = 1.96
      ↓                             ↓                              ↓
x + 3y = 4                       x + 3y = 4      chop →         x + 3y = 4
     y = 1                   y = 196/197 ≈ 0.9949  off          y = 0.994
      ↓                             ↓                              ↓
x = 4 − 3 = 1                x = 4 − 588/197 = 200/197      x = 4 − 2.98 = 1.02
     y = 1                   y = 196/197                        y = 0.994

                  Example 11.13 with partial pivoting.

Now let us continue the partial pivoting with t = 3. We can keep three
digits, so 1.97y = 1.96 gives y = 1.96/1.97 ≈ 0.9949, which we have to
reduce to y = 0.994. Substituting this value of y into the first equation
gives x + 2.982 = 4, which we have to reduce to x + 2.98 = 4, hence
x = 1.02. So now

    x_c = 1.02    and    y_c = 0.994.

The relative error in x is (x_c − x)/x ≈ 0.0047, less than 0.5%.

The table below shows the relative error of the numerical solution x_c by
Gaussian elimination with pivoting and different lengths of mantissa. We see
that the relative error is roughly proportional to the "typical" round-off
error 10^(−t), with a factor of about 2 to 5. We can hardly expect better
accuracy.

             relative error      typical error      factor

    t = 2     1.5 × 10^(−2)         10^(−2)           1.5
    t = 3     4.7 × 10^(−3)         10^(−3)           4.7
    t = 4     2.2 × 10^(−4)         10^(−4)           2.2

Conclusions: Gaussian elimination without pivoting may lead to catastrophic
errors, which will remain unexplained until Chapter 13. Pivoting is more
reliable – it seems to provide nearly the maximum possible accuracy here.
But see the next example...
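The hand computations of Example 11.13 can also be replayed mechanically.
Here is a short Python sketch (the helpers chop and solve_2x2 are ours, not
a standard API) that performs 2 × 2 Gaussian elimination while chopping
every intermediate result to a t-digit decimal mantissa; it reproduces the
computed solutions obtained above:

```python
from decimal import Decimal, ROUND_DOWN

def chop(x, t):
    """Chop the decimal mantissa of x to t digits (rule (a) of Sect. 11.9)."""
    d = Decimal(str(x))
    q = Decimal(1).scaleb(d.adjusted() - t + 1)
    return d.quantize(q, rounding=ROUND_DOWN)

def solve_2x2(a, b, c, d, p, q, t):
    """Solve ax + by = p, cx + dy = q by Gaussian elimination,
    chopping every intermediate result to a t-digit mantissa."""
    a, b, c, d, p, q = (Decimal(str(v)) for v in (a, b, c, d, p, q))
    m = chop(c / a, t)                       # elimination multiplier
    d2 = chop(d - chop(m * b, t), t)         # second equation after elimination
    q2 = chop(q - chop(m * p, t), t)
    y = chop(q2 / d2, t)                     # back substitution, chopped
    x = chop((p - chop(b * y, t)) / a, t)
    return x, y

print(solve_2x2(0.01, 2, 1, 3, 2, 4, t=2))   # without pivoting: x = 0,    y = 1
print(solve_2x2(0.01, 2, 1, 3, 2, 4, t=3))   # without pivoting: x = 2,    y = 0.994
print(solve_2x2(1, 3, 0.01, 2, 4, 2, t=2))   # with pivoting:    x = 1,    y = 1
print(solve_2x2(1, 3, 0.01, 2, 4, 2, t=3))   # with pivoting:    x = 1.02, y = 0.994
```

Swapping the two rows in the argument list is exactly the partial pivoting
step, which is why the last two calls give the accurate answers.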

11.14 Example
Let us solve another system of equations:

    | 3   1    | |x|   |  5  |
    | 1   0.35 | |y| = | 1.7 |

The exact solution here is

    x = 1    and    y = 2.

The largest coefficient, 3, is at the top left corner already, so pivoting
(partial or complete) would not change anything.
Solving this system in chopped arithmetic with β = 10 and t = 2 gives
x_c = 0 and y_c = 5, which is 150% off. Increasing the length of the
mantissa to t = 3 gives x_c = 0.883 and y_c = 2.35, so the relative error is
17%. With t = 4, we obtain x_c = 0.987 and y_c = 2.039; now the relative
error is 2%. The table below shows that the relative error of the numerical
solutions is roughly proportional to the typical round-off error 10^(−t),
but with a big factor fluctuating around 150 or 200.

             relative error      typical error      factor

    t = 2     1.5 × 10^0            10^(−2)           150
    t = 3     1.7 × 10^(−1)         10^(−3)           170
    t = 4     2.0 × 10^(−2)         10^(−4)           200

We postpone a complete analysis of the above two examples until Chapter 13.

11.15 Computational errors
Let x and y be two real numbers represented in a machine system by x_c
and y_c, respectively. An arithmetic operation x ∗ y (where ∗ stands for one
of the four basic operations: +, −, ×, ÷) is performed by a computer in the
following way. The computer first finds x_c ∗ y_c exactly and then
represents that number in its machine system. The result is
z = (x_c ∗ y_c)_c.
Note that, generally, z is different from (x ∗ y)_c, which is the machine
representation of the exact result x ∗ y. Hence, z is not necessarily the
best representation for x ∗ y. In other words, the computer makes additional
round-off errors at each arithmetic operation.
Assuming that x_c = x(1 + ε_1) and y_c = y(1 + ε_2) we have

    (x_c ∗ y_c)_c = (x_c ∗ y_c)(1 + ε_3) = ([x(1 + ε_1)] ∗ [y(1 + ε_2)])(1 + ε_3)

where |ε_1|, |ε_2|, |ε_3| ≤ u.

11.16 Multiplication and division


For multiplication, we have

    z = xy(1 + ε_1)(1 + ε_2)(1 + ε_3) ≈ xy(1 + ε_1 + ε_2 + ε_3)

(here we ignore higher order terms like ε_1 ε_2), so the relative error is
(approximately) bounded by 3u. A similar estimate can be made for division:

    z = x(1 + ε_1)(1 + ε_3) / [y(1 + ε_2)] ≈ (x/y)(1 + ε_1 − ε_2 + ε_3).

Note: we used the Taylor expansion

    1/(1 + ε_2) = 1 − ε_2 + ε_2^2 − ε_2^3 + ···

and again ignored higher order terms.
Thus again the relative error is (approximately) bounded by 3u.
Conclusion: machine multiplication and machine division magnify relative
errors by a factor of three, at most.
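For one isolated multiplication (exact inputs, so ε_1 = ε_2 = 0) the bound
|ε_3| ≤ u can be checked directly. The sketch below assumes IEEE double
precision with rounded arithmetic, so u = 2^(−53), and uses Python's exact
rational arithmetic as the reference:

```python
from fractions import Fraction

# Fraction(float) is exact, so `exact` is the true product of the two
# stored machine numbers, and the comparison isolates the error eps_3
# introduced by the multiplication itself.
x, y = 1.1, 2.2
exact = Fraction(x) * Fraction(y)
rel_err = abs((Fraction(x * y) - exact) / exact)
print(float(rel_err) <= 2.0**-53)   # True: one multiplication costs at most u
```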

11.17 Addition and subtraction
For addition, we have

    z = (x + y + xε_1 + yε_2)(1 + ε_3) = (x + y) (1 + (xε_1 + yε_2)/(x + y)) (1 + ε_3).

Again ignoring higher order terms, we can bound the relative error of z by

    [(|x| + |y|) / |x + y|] u + u.

Thus, the operation of addition magnifies relative errors by a factor of

    (|x| + |y|) / |x + y| + 1.

Similar estimates can be made for subtraction x − y: it magnifies relative
errors by a factor of

    (|x| + |y|) / |x − y| + 1.

Hence addition and subtraction magnify relative errors by a variable factor
which depends on x and y. This factor may be arbitrarily large if x + y ≈ 0
for addition or x − y ≈ 0 for subtraction. This phenomenon is known as
catastrophic cancellation. It occurred in our Example 11.13 when we solved
it without pivoting (see the lines marked with double exclamation signs in
the first table following Example 11.13).
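Catastrophic cancellation is easy to provoke in IEEE double precision. In
the sketch below the two operands agree in their first 15 digits, so the
magnification factor (|x| + |y|)/|x − y| is about 2 × 10^15:

```python
# Subtracting two nearly equal numbers wipes out the leading accurate
# digits: a relative input error of order u ~ 1.1e-16 is magnified by
# (|a| + |b|) / |a - b| ~ 2e15, giving a result off by order 0.1.
a = 1.0 + 1e-15
b = 1.0
print(a - b)   # 1.1102230246251565e-15, not 1e-15: about 11% relative error
```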

Exercise 11.1. (JPE, September 1993). Solve the system

    | 0.001  1.00 | |x|   | 1.00 |
    | 1.00   2.00 | |y| = | 3.00 |

using the LU decomposition with and without partial pivoting and chopped
arithmetic with base β = 10 and t = 3 (i.e., work with a three-digit
mantissa). Obtain computed solutions (x_c, y_c) in both cases. Find the
exact solution, compare, make comments.

Exercise 11.2. (JPE, May 2003). Consider the system

    | ε  1 | |x|   | 1 |
    | 2  1 | |y| = | 0 |

Assume that |ε| ≪ 1. Solve the system by using the LU decomposition with
and without partial pivoting and adopting the following rounding-off models
(at all stages of the computation!):

    a + bε = a        (for a ≠ 0),
    a + b/ε = b/ε     (for b ≠ 0).

Find the exact solution, compare, make comments.
