Chapter One Handout
Base conversion
Conversions are possible between any two of the four bases: decimal, octal, binary, and hexadecimal.
Octal to binary
Convert each octal digit to its 3-bit binary equivalent ($2^3 = 8$).
Example: $705_8 = (\quad)_2$
7 = 111, 0 = 000, 5 = 101
$705_8 = (111000101)_2$
Exercise: $254_8 = (\quad)_2$
Hexadecimal to binary
Convert each hexadecimal digit to its 4-bit binary equivalent ($2^4 = 16$).
Example: $10AF_{16} = (\quad)_2$
1 = 0001, 0 = 0000, A = 1010, F = 1111
$10AF_{16} = (0001000010101111)_2$
Exercise: $ADE3_{16} = (\quad)_2$
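The digit-by-digit expansion used in the two sections above can be sketched in a few lines of Python. The function name `digits_to_binary` is illustrative only, and the sketch assumes the input contains only valid digits for the chosen base.

```python
def digits_to_binary(digits: str, base: int) -> str:
    """Expand each octal (base 8) or hexadecimal (base 16) digit
    into a fixed-width group of binary digits."""
    width = {8: 3, 16: 4}[base]  # 2**3 = 8 and 2**4 = 16
    # int(d, base) gives the value of one digit; format() pads it to `width` bits
    return "".join(format(int(d, base), f"0{width}b") for d in digits)


print(digits_to_binary("705", 8))    # 111000101
print(digits_to_binary("10AF", 16))  # 0001000010101111
```

The two calls reproduce the worked examples; the exercises can be checked the same way.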
Decimal to octal
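Divide the number by 8 and keep track of the remainder.
Divide each successive quotient by 8 in the same way, and stop when the quotient is 0.
The first remainder is the least significant digit; each later remainder is the next more significant digit.
Example: $1234_{10} = (\quad)_8$
Number   Divisor   Quotient   Remainder
1234     8         154        2
154      8         19         2
19       8         2          3
2        8         0          2
In the last row the number (2) is smaller than the divisor, so the quotient is 0 and the number itself is the final remainder.
Read the remainders starting from the last one:
$1234_{10} = (2322)_8$
Exercise: $567_{10} = (\quad)_8$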
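The repeated-division procedure can be sketched as follows. This is a minimal illustration, assuming a non-negative integer and a base between 2 and 16; the names `to_base` and `DIGITS` are hypothetical.

```python
DIGITS = "0123456789ABCDEF"


def to_base(n: int, base: int) -> str:
    """Convert a non-negative decimal integer to the given base (2..16)
    by repeated division, collecting the remainders."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, base)        # quotient and remainder in one step
        remainders.append(DIGITS[r])  # first remainder = least significant digit
    # The digits are read from the last remainder back to the first
    return "".join(reversed(remainders))


print(to_base(1234, 8))  # 2322
```

The loop stops when the quotient reaches 0, exactly as in the table above.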
Decimal to hexadecimal
Repeatedly divide the number, and then each succeeding quotient, by the base b (here b = 16) until a quotient of zero is obtained. The remainders, read from the last to the first and each written as a base-b digit, form the required number. Leading zeros are prefixed where necessary to obtain the required number of bits.
Example: Convert $5876_{10}$ into a 16-bit hexadecimal number.
Solution:
Number   Divisor   Quotient   Remainder   Hex digit
5876     16        367        4           4
367      16        22         15          F
22       16        1          6           6
1        16        0          1           1
Reading the hex digits from the last to the first, the answer is 16F4H. A 16-bit number has four hexadecimal digits, so no leading zeros are needed here.
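As a quick check, the result can be expanded back into decimal by positional weights:

```latex
16F4_{16} = 1 \cdot 16^3 + 6 \cdot 16^2 + 15 \cdot 16^1 + 4 \cdot 16^0
          = 4096 + 1536 + 240 + 4
          = 5876_{10}
```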
Binary to octal
To convert a binary number into octal we follow the given steps
1. Divide the binary digits into groups of 3 digits, starting from the right.
2. Convert each group of 3 binary digits into 1 octal digit.
Example: Convert the binary number $100101_2$ into octal form.
Step 1. Make groups of 3 digits, starting from the right
$100101_2$
Groups: $100_2$ $101_2$
Step 2. Convert each 3-digit group into 1 octal digit
$101_2 = 5_8$
$100_2 = 4_8$
so, $100101_2 = 45_8$
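The two steps can be sketched in Python as below. The function name `binary_to_octal` is illustrative; the sketch pads the input with leading zeros when its length is not a multiple of 3, which the handout's example does not need.

```python
def binary_to_octal(bits: str) -> str:
    """Group binary digits in threes from the right and map each group
    to one octal digit."""
    # Pad on the left so that the length is a multiple of 3
    padded = bits.zfill(-(-len(bits) // 3) * 3)
    groups = [padded[i:i + 3] for i in range(0, len(padded), 3)]
    # Each 3-bit group has a value 0..7, i.e. exactly one octal digit
    return "".join(str(int(group, 2)) for group in groups)


print(binary_to_octal("100101"))  # 45
```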
Hexadecimal to Octal conversion
To convert a hexadecimal number into octal we follow the given steps
1. Convert each hexadecimal digit into groups of 4 digits binary
2. Combine the groups from step 1
3. Divide the binary digits from step 2 into groups of 3 digits, starting from the right
4. Convert each group of 3 binary digits into 1 octal digit
Example: Convert the hexadecimal number $15_{16}$ into octal form.
Step 1. Convert each hexadecimal digit into a group of 4 binary digits
$15_{16}$
Groups: $1_{16}$ $5_{16}$
$5_{16} = 0101_2$
$1_{16} = 0001_2$
Step 2. Combine the groups
so, $15_{16} = 00010101_2$
Step 3. Divide the binary digits from step 2 into groups of 3 digits, starting from the right (pad the leftmost group with zeros if it has fewer than 3 digits)
Groups: $000_2$ $010_2$ $101_2$
Step 4. Convert each group of 3 binary digits into 1 octal digit
$101_2 = 5_8$
$010_2 = 2_8$
$000_2 = 0_8$
so, $15_{16} = 025_8 = 25_8$
Octal to Hexadecimal conversion
To convert an octal number into hexadecimal we follow the given steps
1. Convert each octal digit into groups of 3 digits binary
2. Combine the groups from step 1
3. Divide the binary digits from step 2 into groups of 4 digits, starting from the right
4. Convert each group of 4 binary digits into 1 hexadecimal digit
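Both the hexadecimal-to-octal and the octal-to-hexadecimal procedures pass through binary, so one helper can sketch them both. The name `convert_via_binary` and its parameters are hypothetical; `src_bits` and `dst_bits` are the bits per digit of the source and target base (3 for octal, 4 for hexadecimal), and the input digits are not validated.

```python
def convert_via_binary(digits: str, src_bits: int, dst_bits: int) -> str:
    """Convert between octal (3 bits per digit) and hexadecimal
    (4 bits per digit) by regrouping the binary digits."""
    # Steps 1 and 2: expand every digit to src_bits binary digits and combine
    binary = "".join(format(int(d, 16), f"0{src_bits}b") for d in digits)
    # Step 3: pad on the left so groups of dst_bits line up from the right
    binary = binary.zfill(-(-len(binary) // dst_bits) * dst_bits)
    groups = [binary[i:i + dst_bits] for i in range(0, len(binary), dst_bits)]
    # Step 4: convert each group into one digit of the target base
    result = "".join(format(int(group, 2), "X") for group in groups)
    return result.lstrip("0") or "0"  # drop leading zeros such as 025 -> 25


print(convert_via_binary("15", 4, 3))  # hexadecimal 15 -> octal 25
print(convert_via_binary("25", 3, 4))  # octal 25 -> hexadecimal 15
```

The second call simply reverses the handout's hexadecimal-to-octal example, which gives a convenient round-trip check.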
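Double precision floating-point numbers
A 64-bit double precision machine number consists of a sign bit s, an 11-bit characteristic (exponent) c, and a 52-bit mantissa (fraction) f, and it represents the number
$(-1)^s 2^{c-1023} (1 + f)$.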
Example: Consider the machine number
0 10000000011 1011100100010000000000000000000000000000000000000000.
The leftmost bit is s = 0, which indicates that the number is positive. The next 11 bits,
10000000011, give the characteristic and are equivalent to the decimal number
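$c = 1 \cdot 2^{10} + 0 \cdot 2^9 + \cdots + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 = 1024 + 2 + 1 = 1027.$
The exponential part of the number is, therefore, $2^{1027-1023} = 2^4$. The final 52 bits specify that the mantissa is
$f = 1 \cdot (1/2)^1 + 1 \cdot (1/2)^3 + 1 \cdot (1/2)^4 + 1 \cdot (1/2)^5 + 1 \cdot (1/2)^8 + 1 \cdot (1/2)^{12} = 0.722900390625.$
Consequently, this machine number represents the decimal number
$(-1)^s 2^{c-1023} (1 + f) = (-1)^0 \cdot 2^4 \cdot (1 + 0.722900390625) = 27.56640625.$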
However, the next smallest machine number is
0 10000000011 1011100100001111111111111111111111111111111111111111,
and the next largest machine number is
0 10000000011 1011100100010000000000000000000000000000000000000001.
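The decoding of a double precision word can be sketched as below. This is a minimal illustration that assumes a normalized number (characteristic neither all zeros nor all ones); `decode_double` is a hypothetical name, and the cross-check relies on Python's standard `struct` module interpreting the same 64 bits as a big-endian double.

```python
import struct


def decode_double(bits: str) -> float:
    """Evaluate (-1)^s * 2^(c-1023) * (1 + f) for a 64-bit word given as a
    string of 0s and 1s (spaces allowed). Normalized numbers only."""
    bits = bits.replace(" ", "")
    if len(bits) != 64:
        raise ValueError("expected 64 bits")
    s = int(bits[0])        # sign bit
    c = int(bits[1:12], 2)  # 11-bit characteristic
    # Mantissa: bit k (k = 1..52) contributes (1/2)^k
    f = sum(int(b) * 0.5 ** k for k, b in enumerate(bits[12:], start=1))
    return (-1) ** s * 2.0 ** (c - 1023) * (1 + f)


# The machine number from the example and its two neighbours
word = "0" + "10000000011" + "101110010001" + "0" * 40
smaller = "0" + "10000000011" + "101110010000" + "1" * 40
larger = "0" + "10000000011" + "101110010001" + "0" * 39 + "1"

value = decode_double(word)
print(value)  # 27.56640625

# Cross-check against the hardware interpretation of the same 64 bits
assert struct.unpack(">d", int(word, 2).to_bytes(8, "big"))[0] == value

# Neighbouring machine numbers differ by 2^4 * 2^-52 = 2^-48
print(decode_double(larger) - value == 2.0 ** -48)   # True
print(value - decode_double(smaller) == 2.0 ** -48)  # True
```

The gap of $2^{-48}$ between neighbours shows that this one machine number stands for a whole interval of real numbers around 27.56640625.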
Exercise: What number has the representation $(45DE4000)_{16}$?
Exercise: What number is represented by
0100000001111110100000000000000000000000000000000000000000000000,
considered as a double precision word?
Numerical Error Analysis
Numerical calculations always involve approximations, for several reasons. These
errors are not the result of poor thinking or carelessness (like programming errors) but
they inevitably arise in all numerical calculations. We can divide the sources of errors
roughly into four categories: model, method, initial values (data) and round-off.
A. Modeling errors
When a practical problem is formulated into mathematical language, it is almost
always necessary to make simplifications. Examples of modeling errors include
leaving out less influential factors (e.g., no air resistance in falling) or using a
simplified description of a more complex system (e.g., classical description of a
quantum-mechanical system). Modeling errors are not discussed here in more detail
but left as a subject of courses in the various application fields.
B. Methodological errors
The conversion of a mathematical problem into a numerical one is also a source of
errors. Care should be taken to control these errors and to estimate their magnitude
and thus the quality of the numerical solution. Note that by methodological errors we
mean errors that would persist even if a hypothetical "perfect" computer had an
infinitely accurate representation and no round-off error. As a general rule, there is
not much a programmer can do about the computer’s round-off error.
Methodological errors, on the other hand, are entirely under the programmer’s
control. In fact, an enormous amount of work in the field of numerical analysis has
been devoted to minimizing methodological errors. An example of a
methodological error is the truncation error (or chopping error), which is
encountered when, for example, an infinite series is cut off after a finite number of terms:
$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots$
Example: The Taylor series for $e^x$ is given by
$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$
Approximate $e^1$ by three Taylor series terms and calculate the truncation error.
$e^x \approx 1 + x + x^2/2!$
$e^1 \approx 1 + 1 + 1^2/2! = 1 + 1 + 1/2$
$e^1 \approx 2.5$ (approximate)
$e^1 = 2.718$ (true, to four significant figures)
error $= |2.718 - 2.5|$
truncation error $= 0.218$
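The truncation error of the three-term approximation can be checked numerically. The sketch below is illustrative only; `exp_taylor` is a hypothetical helper, and `math.e` is used as the true value.

```python
import math


def exp_taylor(x: float, n_terms: int) -> float:
    """Sum the first n_terms terms of the Taylor series of e^x about 0."""
    return sum(x ** k / math.factorial(k) for k in range(n_terms))


approx = exp_taylor(1.0, 3)              # 1 + 1 + 1/2
truncation_error = abs(math.e - approx)

print(approx)                      # 2.5
print(round(truncation_error, 3))  # 0.218
```

Increasing `n_terms` shrinks the truncation error, since each added term recovers part of the discarded tail of the series.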
Example 1. Determine the five-digit (a) chopping and (b) rounding values of the irrational number π.
Solution. The number π has an infinite decimal expansion of the form π = 3.14159265.... Written in normalized decimal form, we have
$\pi = 0.314159265\ldots \times 10^1$.
(a) The floating-point form of π using five-digit chopping is
$fl(\pi) = 0.31415 \times 10^1 = 3.1415$.
(b) The sixth digit of the decimal expansion of π is a 9, so the floating-point form of π using five-digit rounding is
$fl(\pi) = (0.31415 + 0.00001) \times 10^1 = 3.1416$.
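Five-digit chopping and rounding can be sketched with Python's standard `decimal` module, whose contexts keep a fixed number of significant digits. The helper name `fl` mirrors the notation above but is otherwise an assumption of this sketch, as is passing the number as a decimal string.

```python
from decimal import Context, Decimal, ROUND_DOWN, ROUND_HALF_UP


def fl(x: str, k: int, rounding: str) -> Decimal:
    """Return the k-digit floating-point form of the decimal string x,
    using the given rounding mode."""
    # create_decimal applies the context's precision and rounding mode
    return Context(prec=k, rounding=rounding).create_decimal(x)


pi_digits = "3.14159265"
print(fl(pi_digits, 5, ROUND_DOWN))     # chopping: 3.1415
print(fl(pi_digits, 5, ROUND_HALF_UP))  # rounding: 3.1416
```

Chopping simply discards the digits after the fifth, while rounding adds one unit in the fifth digit when the sixth digit is 5 or more.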
Significant Figures
• The number of significant figures indicates precision. The significant digits of a
number are those that can be used with confidence, that is, the certain
digits plus one estimated digit.
How many significant figures does 53,800 have? Written this way it is ambiguous; scientific notation makes the number explicit:
$5.38 \times 10^4$      3
$5.380 \times 10^4$     4
$5.3800 \times 10^4$    5
Zeros that are used only to locate the decimal point are not significant figures:
0.00001753    4
0.0001753     4
0.001753      4
Types of error
Numerical errors arise from the use of approximations to represent exact
mathematical operations and quantities. These include truncation errors, which result
when approximations are used to represent exact mathematical procedures, and
round-off errors, which result when numbers having limited significant figures are
used to represent exact numbers. For both types, the relationship between the exact, or
true result and the approximation can be formulated as
True value = approximation + error
Rearranging this equation gives
True error ($E_t$) = true value - approximation
where $E_t$ is used to designate the exact value of the error. The subscript t is included
to designate that this is the “true” error.
The relative true error is denoted by $\varepsilon_t$ and is defined as the ratio between the true error
and the true value:
Relative true error $= \frac{\text{true error}}{\text{true value}}$
Expressed as a percentage:
True percent relative error, $\varepsilon_t = \frac{\text{true error}}{\text{true value}} \times 100\%$
Problem Statement. Suppose that you have the task of measuring the lengths of a
bridge and a rivet and come up with 9999 and 9 cm, respectively. If the true values
are 10,000 and 10 cm, respectively, compute (a) the true error and (b) the true percent
relative error for each case.
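Solution. (a) The error for measuring the bridge is
$E_t = 10{,}000 - 9999 = 1$ cm
and for the rivet it is
$E_t = 10 - 9 = 1$ cm
(b) The true percent relative error for the bridge is
$\varepsilon_t = \frac{1}{10{,}000} \times 100\% = 0.01\%$
and for the rivet it is
$\varepsilon_t = \frac{1}{10} \times 100\% = 10\%$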
Thus, although both measurements have an error of 1 cm, the relative error for the
rivet is much greater. We would conclude that we have done an adequate job of
measuring the bridge, whereas our estimate for the rivet leaves something to be
desired.
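The bridge and rivet comparison can be reproduced with two small functions. The function names are illustrative, not part of the handout.

```python
def true_error(true_value: float, approximation: float) -> float:
    """True error E_t = true value - approximation."""
    return true_value - approximation


def true_percent_relative_error(true_value: float, approximation: float) -> float:
    """True percent relative error = (true error / true value) * 100."""
    return true_error(true_value, approximation) / true_value * 100


# (name, true length in cm, measured length in cm)
measurements = [("bridge", 10_000, 9_999), ("rivet", 10, 9)]
for name, true_value, measured in measurements:
    e_t = true_error(true_value, measured)
    eps_t = true_percent_relative_error(true_value, measured)
    print(f"{name}: E_t = {e_t} cm, eps_t = {eps_t:.2f}%")
```

Both measurements give the same true error of 1 cm, but the percent relative errors differ by a factor of 1000.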
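For numerical methods, the true value is known only when we deal with
functions that can be solved analytically (simple systems). In real-world applications,
we usually do not know the answer a priori. The error is then normalized to the best available estimate of the true value, that is, to the approximation itself:
$\varepsilon_a = \frac{\text{approximate error}}{\text{approximation}} \times 100\%$
where the subscript a signifies that the error is normalized to an approximate value.
Many numerical methods are iterative: a present approximation is computed from a previous one, and the
process is performed repeatedly to successively compute (we
hope) better and better approximations. For such cases, the error is often estimated as
the difference between the previous and present approximations. Thus, the percent relative
error is determined according to
$\varepsilon_a = \frac{\text{present approximation} - \text{previous approximation}}{\text{present approximation}} \times 100\%$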
Example 1
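The derivative of a function $f(x)$ at a particular value of $x$ can be approximately calculated
by
$f'(x) \approx \frac{f(x + h) - f(x)}{h}$
For $f(x) = 7e^{0.5x}$ and $h = 0.3$, find the approximate value of $f'(2)$, the true value of $f'(2)$, and the relative true error.
Solution
The approximate value is
$f'(2) \approx \frac{f(2.3) - f(2)}{0.3} = \frac{7e^{1.15} - 7e^{1}}{0.3} = 10.265$
The exact derivative is
$f'(x) = 3.5e^{0.5x}$
So the true value of $f'(2)$ is
$f'(2) = 3.5e^{1} = 9.5140$
The true error is $E_t = 9.5140 - 10.265 = -0.75061$ (computed from unrounded values), and the absolute relative true error is
$|\varepsilon_t| = \left| \frac{-0.75061}{9.5140} \right| = 0.078895$
$= 7.8895\%$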
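The numbers in this example can be verified with a short script. The helper names `forward_difference`, `f`, and `f_prime` are assumptions of this sketch; the formula is the forward-difference approximation given above.

```python
import math


def forward_difference(f, x: float, h: float) -> float:
    """Approximate f'(x) by (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


def f(x: float) -> float:
    return 7 * math.exp(0.5 * x)


def f_prime(x: float) -> float:
    # Exact derivative of 7 e^(0.5 x)
    return 3.5 * math.exp(0.5 * x)


approx = forward_difference(f, 2, 0.3)
true_value = f_prime(2)
true_err = true_value - approx
rel_true_err = true_err / true_value

print(f"{approx:.3f}")             # 10.265
print(f"{true_value:.4f}")         # 9.5140
print(f"{abs(rel_true_err):.6f}")  # 0.078895
```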
Example 3
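From Example 1, the approximate value of $f'(2)$ is 10.265 using $h = 0.3$. Repeating the calculation with $h = 0.15$ gives
$f'(2) \approx 9.8799$. The approximate error is
$E_a$ = present approximation - previous approximation
$= 9.8799 - 10.265$
$= -0.38474$ (computed from unrounded values)
The relative approximate error is calculated as
$\varepsilon_a = \frac{\text{approximate error}}{\text{present approximation}}$
$= \frac{-0.38474}{9.8799}$
$= -0.038942$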
Relative approximate errors are also presented as percentages. For this example,
$\varepsilon_a = -0.038942 \times 100\%$
$= -3.8942\%$
Absolute relative approximate errors may also need to be calculated. In this example
$|\varepsilon_a| = |-0.038942|$
$= 0.038942$ or $3.8942\%$
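The approximate error between the two step sizes can be computed without knowing the exact derivative. The sketch below uses hypothetical helper names and takes the result for h = 0.3 as the previous approximation and the result for h = 0.15 as the present one.

```python
import math


def forward_difference(f, x: float, h: float) -> float:
    """Approximate f'(x) by (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


def f(x: float) -> float:
    return 7 * math.exp(0.5 * x)


previous = forward_difference(f, 2, 0.3)   # about 10.265
present = forward_difference(f, 2, 0.15)   # about 9.8799

approx_error = present - previous          # E_a
rel_approx_error = approx_error / present  # relative approximate error

print(f"{approx_error:.5f}")               # -0.38474
print(f"{rel_approx_error:.6f}")           # -0.038942
print(f"{abs(rel_approx_error) * 100:.4f}%")
```

The script works with unrounded values, which is why the difference comes out as -0.38474 rather than the -0.3851 obtained by subtracting the rounded figures.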
Example 5
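If one chooses 6 terms of the Maclaurin series for $e^x$ to calculate $e^{0.7}$, how many
significant digits can you trust in the solution? Find your answer without knowing or
using the exact answer.
Solution
$e^x = 1 + x + \frac{x^2}{2!} + \cdots$
Using 6 terms, we get the current approximation as
$e^{0.7} \approx 1 + 0.7 + \frac{0.7^2}{2!} + \frac{0.7^3}{3!} + \frac{0.7^4}{4!} + \frac{0.7^5}{5!}$
$= 2.0136$
Using 5 terms, we get the previous approximation as
$e^{0.7} \approx 1 + 0.7 + \frac{0.7^2}{2!} + \frac{0.7^3}{3!} + \frac{0.7^4}{4!}$
$= 2.0122$
The percentage absolute relative approximate error is
$|\varepsilon_a| = \left| \frac{2.0136 - 2.0122}{2.0136} \right| \times 100$
$= 0.069527\%$
If $|\varepsilon_a| \le 0.5 \times 10^{2-m}\%$, then at least m significant digits are correct.
Since $|\varepsilon_a| \le 0.5 \times 10^{2-2}\% = 0.5\%$ but $|\varepsilon_a| > 0.5 \times 10^{2-3}\% = 0.05\%$, at least 2 significant digits are correct in the answer
$e^{0.7} \approx 2.0136$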
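The same reasoning can be sketched in code: compute the 5-term and 6-term partial sums, form the percentage absolute relative approximate error, and find the largest m for which the error does not exceed $0.5 \times 10^{2-m}\%$. The helper name `maclaurin_exp` and the cap of 15 digits on the search are assumptions of this sketch.

```python
import math


def maclaurin_exp(x: float, n_terms: int) -> float:
    """Sum the first n_terms terms of the Maclaurin series of e^x."""
    return sum(x ** k / math.factorial(k) for k in range(n_terms))


previous = maclaurin_exp(0.7, 5)
present = maclaurin_exp(0.7, 6)

# Percentage absolute relative approximate error
eps_a = abs((present - previous) / present) * 100

# Largest m with eps_a <= 0.5 * 10**(2 - m); capped to avoid looping forever
m = 0
while m < 15 and eps_a <= 0.5 * 10 ** (2 - (m + 1)):
    m += 1

print(f"{present:.4f}")   # 2.0136
print(f"{previous:.4f}")  # 2.0122
print(m)                  # 2
```

Because the script uses unrounded partial sums, `eps_a` comes out near 0.0696% rather than 0.069527%, which the handout obtains from the rounded values 2.0136 and 2.0122; the conclusion of at least 2 correct significant digits is unchanged.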