Unit 1 - Number Systems and Errors

Uploaded by

upscloverxcd

O02CA501: Computational Mathematics

MASTER OF COMPUTER APPLICATIONS
SEMESTER 1

O02CA501
COMPUTATIONAL MATHEMATICS
Unit: 1 – Number Systems and Errors 1

Unit 1
Number Systems and Errors
TABLE OF CONTENTS
1 Introduction
   1.1 Learning Objectives
2 The Representation of Number Systems
   2.1 Integers
   2.2 Fractions
   2.3 Floating Point Arithmetic
   2.4 Conversion of Number Systems
3 Significant Figures/Digits
   3.1 Definition
   3.2 Note
   3.3 Example
4 Errors
   4.1 Inherent Errors
   4.2 Numerical Errors
   4.3 Round Off Errors
   4.4 Definition
   4.5 Numbers Rounded-off to n Significant Digits
   4.6 Example
   4.7 Truncation Errors
5 Absolute, Relative and Percentage Errors
   5.1 Definition
   5.2 Definition
   5.3 Definition
6 Mathematical Preliminaries
   6.1 Theorem
   6.2 Rolle’s Theorem
   6.3 Generalized Rolle’s Theorem
   6.4 Example
   6.5 Intermediate Value Theorem
   6.6 Lagrange’s Mean Value Theorem
   6.7 Geometrical Interpretation of Lagrange’s Mean Value Theorem
   6.8 Taylor’s Series for a function of one variable
   6.9 Maclaurin’s Expansion
   6.10 Complex Numbers
   6.11 Conjugate of a Complex Number
   6.12 Modulus of a Complex Number
7 Summary
8 Self-Assessment Questions (SAQ/Activity: 1)
9 Terminal Questions
10 Answers
   10.1 Self-Assessment Question Answers
   10.2 Terminal Question Answers

1. INTRODUCTION

Computational Mathematics forms the bedrock of modern computing and is pivotal in an MCA
(Master of Computer Applications) program. This discipline intertwines mathematical theory,
computational techniques, and algorithm development, enabling students to solve complex
scientific, engineering, and business problems efficiently. As we delve deeper into the era of data,
understanding the nuances of computational methods becomes indispensable. These methods
not only enhance analytical skills but also equip students with the ability to model real-world
scenarios, optimize processes, and innovate within the rapidly evolving tech landscape. The study
of computational mathematics fosters a robust analytical framework, critical for algorithm design,
data analysis, software development, and beyond. It cultivates a mindset that is adept at
approaching problems methodically, ensuring solutions are not just effective but also scalable and
adaptable to future technological advancements.
Unit 1, titled "Number Systems and Errors," serves as the gateway to understanding the
foundational elements of computational mathematics within the MCA curriculum. This unit
embarks on a journey through the fundamental concept of number systems, including Decimal,
Binary, Octal, and Hexadecimal, which are crucial for data representation in computing. It
elucidates how integers, forming the core of mathematical and computer science concepts, are
represented and manipulated across these systems to achieve efficient storage and computation.
Furthermore, the unit explores the representation of fractions and the intricacies of floating-point
arithmetic, which are essential for handling real numbers and performing accurate calculations.
The segment on errors delves into the unavoidable aspect of computational mathematics - the
approximation and rounding errors inherent in numerical computations. Understanding these
errors is vital for developing algorithms that minimize inaccuracies and enhance the reliability of
computational results.
Studying "Number Systems and Errors" demands a multifaceted approach, blending theoretical
understanding with practical application. Begin by solidifying your grasp of the various number
systems through hands-on exercises that involve conversions and operations within and across
these systems. Dive into the conceptual underpinnings of floating-point arithmetic to appreciate
its significance in computing and numerical analysis. Engage in practical tasks that simulate real-
world scenarios, applying different number systems and observing the impact of errors on
computational outcomes. Leverage software tools and programming languages to implement and
experiment with numerical algorithms, enhancing your understanding of precision and accuracy
in computational contexts. Collaborate with peers to tackle complex problems, fostering a
collective learning environment that encourages the exchange of ideas and solutions. As you
progress, continually reflect on the application of these fundamental concepts in advanced
computational mathematics and computer science topics, ensuring a well-rounded and deep
comprehension of the subject matter.

1.1. Learning Objectives


In this Unit, you will –
Recall the definitions and key characteristics of Decimal, Binary, Octal, and Hexadecimal
number systems.
Interpret the significance of floating-point arithmetic in computational mathematics.
Execute conversions between Decimal, Binary, Octal, and Hexadecimal systems.
Differentiate between various types of errors in numerical computations, such as rounding
and truncation errors.
Assess the impact of floating-point arithmetic errors on the accuracy of computational results.


2. THE REPRESENTATION OF NUMBER SYSTEMS

2.1 Integers
Integers, the set of whole numbers including positive, negative, and zero, form a fundamental part
of mathematical and computer science concepts. In computer systems, integers are represented
in various formats to facilitate efficient storage, computation, and manipulation. Understanding
these representations is crucial for both theoretical mathematics and practical computing
applications.
Representing integers in different number systems is a fundamental concept in mathematics and
computer science. Let’s explore how integers can be represented in various commonly used
number systems: Decimal, Binary, Octal, and Hexadecimal.
Decimal System (Base-10)
The decimal system is the most widely used number system, employing base-10. It consists of
ten symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. Each position in a decimal number represents a power
of 10, based on its position from the right, starting with $10^0$.
Example: The decimal number 345 can be broken down as:
$$345 = 3 \times 10^2 + 4 \times 10^1 + 5 \times 10^0$$

Binary System (Base-2)


The binary system is fundamental in computing and digital electronics, using only two symbols: 0
and 1. Each position in a binary number represents a power of 2.
Example: The binary number 1011 can be interpreted as:
$$1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0 = 8 + 0 + 2 + 1 = 11_{10}$$
Therefore,
$$1011_2 = 11_{10}$$

Octal System (Base-8)


The octal system uses eight symbols: 0, 1, 2, 3, 4, 5, 6, and 7. Each position in an octal number
represents a power of 8.
Example: The octal number 157 can be expanded as:
$$1 \times 8^2 + 5 \times 8^1 + 7 \times 8^0 = 64 + 40 + 7 = 111_{10}$$
Therefore,
$$157_8 = 111_{10}$$

Hexadecimal System (Base-16)


The hexadecimal system is used in computing for a compact representation of binary data, using
sixteen symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F (where A=10, B=11, C=12, D=13,
E=14, F=15). Each position represents a power of 16.
Example: The hexadecimal number 1A3 can be broken down as:
$$1 \times 16^2 + 10 \times 16^1 + 3 \times 16^0 = 256 + 160 + 3 = 419_{10}$$
Therefore,
$$1A3_{16} = 419_{10}$$
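
The positional expansions used in the examples above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `to_decimal` and the digit table `DIGITS` are assumptions introduced here, not part of the unit.

```python
DIGITS = "0123456789ABCDEF"  # hypothetical digit table covering bases 2 to 16


def to_decimal(numeral: str, base: int) -> int:
    """Evaluate an integer numeral in the given base by positional expansion."""
    if not 2 <= base <= 16:
        raise ValueError("base must be between 2 and 16")
    value = 0
    for char in numeral.upper():
        digit = DIGITS.index(char)  # raises ValueError for an unknown symbol
        if digit >= base:
            raise ValueError(f"digit {char!r} is not valid in base {base}")
        # Horner form of the sum of digit * base**position
        value = value * base + digit
    return value


# The three worked examples from this section
print(to_decimal("1011", 2), to_decimal("157", 8), to_decimal("1A3", 16))
```

Python's built-in `int("1A3", 16)` performs the same conversion; the explicit loop is shown only to mirror the expansion by powers of the base.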

2.2 Fractions
Fractions represent a part of a whole or, more generally, any number of equal parts. In
mathematics, they are crucial for precise expressions of non-integer values. Fractions can be
represented in various forms, each serving different purposes in mathematical computation and
practical applications.
Simple Fraction
A simple fraction consists of two integers, one on top of the other, separated by a line. The number
above the line is called the numerator, and the number below is the denominator. For example,
3/4 represents three parts of a whole that is divided into four equal parts.
Decimal Representation
Fractions can be converted into decimal form by dividing the numerator by the denominator. This
form is widely used in everyday life and in most scientific calculations due to its simplicity. For
example, the fraction 1/2 is equivalent to the decimal 0.5.
Percentages
Percentages are a way to express fractions with a denominator of 100. Converting a fraction to a
percentage involves finding an equivalent fraction with 100 in the denominator or converting the
fraction to a decimal and then multiplying by 100. For instance, 3/4 as a percentage is 75%, as
3/4 = 0.75 and 0.75 × 100 = 75%.
Continued Fractions
Continued fractions are an expression of a number as the sum of its integer part and the reciprocal
of another number, which can also be expressed as a continued fraction. This representation is
useful for precise expressions of irrational numbers and for certain types of mathematical analysis.
For example, the Golden Ratio can be written as
$$\varphi = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}}$$

Binary Fractions
In computing, fractions can also be represented in binary, similar to binary integers. The place
values to the right of the binary point represent 1/2, 1/4, 1/8, … and so on. For example,
$0.11_2$ represents the decimal fraction $0.75_{10}$ because it is equivalent to 1/2 + 1/4.
Examples:
1. Decimal to Fraction: To convert the decimal 0.25 to a fraction, recognize that 0.25 is 25/100,
which can be simplified to 1/4.
2. Fraction to Decimal: To convert the fraction 5/8 to decimal, divide 5 by 8 to get 0.625.
3. Percentage to Fraction: To convert 20% to a fraction, write it as 20/100 and simplify to 1/5.
4. Binary Fraction to Decimal: The binary fraction $0.101_2$ is equivalent to 1/2 + 0/4 + 1/8
in decimal, which sums to $0.625_{10}$.
Understanding the various representations of fractions and their conversions is vital in many fields,
from academic studies to everyday financial calculations, enhancing the versatility and depth of
mathematical comprehension.
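
As a small illustration of the binary place values 1/2, 1/4, 1/8, …, the following sketch sums the bits written after the binary point. The helper name `binary_fraction_to_decimal` is an assumption made for this example; exact rational arithmetic from Python's `fractions` module is used so that no rounding enters the result.

```python
from fractions import Fraction


def binary_fraction_to_decimal(bits: str) -> Fraction:
    """Sum bit_k / 2**k over the digits written after the binary point."""
    total = Fraction(0)
    for position, bit in enumerate(bits, start=1):
        if bit not in "01":
            raise ValueError("expected only the digits 0 and 1")
        total += Fraction(int(bit), 2 ** position)
    return total


# Digits after the point of 0.11 and 0.101 in base 2
print(float(binary_fraction_to_decimal("11")))
print(float(binary_fraction_to_decimal("101")))
```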

2.3 Floating Point Arithmetic


Floating point arithmetic is a method used in computing and various scientific calculations to
represent real numbers that cannot be accurately depicted as simple fractions or integers. This
system allows for the representation of very large or very small numbers in a compact form,
facilitating calculations that involve a wide range of values.
Basics of Floating-Point Representation
A floating-point number is typically represented in three parts: the sign, the exponent, and the
mantissa (or significand). The sign bit determines whether the number is positive or negative. The
exponent expresses the power to which the base (usually 2 in binary systems) is raised, adjusting
the number's scale. The mantissa represents the significant digits of the number. The general form
of a floating-point number is:
$$(-1)^{\text{sign}} \times (1.\text{mantissa}) \times 2^{(\text{exponent} - \text{bias})}$$
The "bias" is a predetermined offset used to represent both positive and negative exponents.
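
To make the three fields concrete, the sketch below pulls them out of a Python float. It assumes the float is stored in the IEEE 754 double-precision layout (1 sign bit, 11 exponent bits with bias 1023, 52 mantissa bits), which the unit itself does not specify; the function names `decompose` and `rebuild` are hypothetical.

```python
import struct


def decompose(x: float):
    """Split a double into (sign, biased exponent, 52-bit mantissa field)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    return sign, exponent, mantissa


def rebuild(sign: int, exponent: int, mantissa: int, bias: int = 1023) -> float:
    """Apply (-1)^sign * (1.mantissa) * 2^(exponent - bias).

    Valid for normal numbers only (not zero, subnormals, infinity or NaN).
    """
    return (-1) ** sign * (1 + mantissa / 2 ** 52) * 2.0 ** (exponent - bias)


fields = decompose(-2.5)  # -2.5 = -(1.25) * 2**1
print(fields, rebuild(*fields))
```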
Examples of Floating-Point Arithmetic
1. Addition: To add two floating point numbers, their exponents must be equalized. For
instance, to add $1.5 \times 10^2$ and $2.5 \times 10^1$, we first adjust the second number to
$0.25 \times 10^2$, and then add the mantissas to get $1.75 \times 10^2$.
2. Subtraction: Similar to addition, subtraction requires exponent alignment. Subtracting
$2.0 \times 10^3$ from $5.0 \times 10^3$ involves direct mantissa subtraction since the exponents
are already equal, resulting in $3.0 \times 10^3$.
3. Multiplication: Multiply the mantissas and add the exponents. For example, multiplying
$1.2 \times 10^3$ by $3.0 \times 10^2$ gives $3.6 \times 10^5$.
4. Division: Divide the mantissas and subtract the exponents. Dividing $6.0 \times 10^4$ by
$2.0 \times 10^2$ yields $3.0 \times 10^2$.
Precision and Rounding
Due to the finite number of bits available to store mantissas and exponents, floating point
arithmetic can introduce rounding errors. For example, the decimal number 0.1 cannot be
precisely represented in binary floating point, leading to a small error when repeatedly adding
0.1 in a binary computer system.
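
The effect of repeatedly adding 0.1 can be observed directly. The sketch below assumes Python floats are binary double-precision values; the function name `repeated_sum` is introduced here purely for illustration.

```python
import math


def repeated_sum(step: float, count: int) -> float:
    """Add `step` to a running total `count` times, rounding after each addition."""
    total = 0.0
    for _ in range(count):
        total += step
    return total


total = repeated_sum(0.1, 10)
print(total == 1.0)              # exact comparison is unreliable after rounding
print(math.isclose(total, 1.0))  # tolerance-based comparison
print(math.fsum([0.1] * 10))     # fsum compensates for the lost low-order bits
```

Comparing floating point results with a tolerance, rather than with `==`, is the usual way to live with this kind of rounding error.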
Special Values
Floating point systems include representations for special values such as "infinity" (for results of
operations that exceed the maximum representable value) and "NaN" (Not a Number, for results
of undefined operations like 0/0).
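
A brief sketch of how these special values appear in practice, assuming Python's double-precision floats. One caveat that is specific to Python and not stated in the unit: the expression `0.0 / 0.0` raises `ZeroDivisionError` instead of returning NaN, so the example produces NaN from infinity minus infinity.

```python
import math

overflow = 1e308 * 10            # exceeds the largest representable double
undefined = overflow - overflow  # infinity minus infinity has no defined value

print(overflow, math.isinf(overflow))
print(undefined, math.isnan(undefined))
print(undefined == undefined)    # NaN compares unequal to everything, itself included
```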
Normalization
Normalization in floating point numbers ensures that the mantissa is within a certain range
(usually between 1 and 2 for binary systems, and between 1 and 10 in decimal scientific notation).
For example, the number 0.0125 would be normalized to $1.25 \times 10^{-2}$ in scientific notation.
Importance in Computing
Floating point arithmetic is crucial in fields that require extensive numerical computations, such as
physics simulations, financial modelling, and machine learning. It allows for efficient and
reasonably accurate calculations over a vast range of magnitudes, from subatomic scales to
astronomical distances.
Understanding floating point arithmetic, including its limitations and best practices, is essential for
software developers, engineers, and scientists who rely on precise and efficient numerical
computations in their work.

2.4 Conversions of Number System


Decimal to other number system
In the earlier examples (in Section 2.1) we saw how to convert from other number systems to decimal.
From here onwards, we will see how to convert a number from decimal to the other number
systems.
To convert the integer part of a decimal number to another number system, divide it repeatedly by
the base of the target system until the quotient becomes zero. For example, to convert a decimal
number to binary, divide by 2 repeatedly, recording the remainder at each step, as shown in the
following example. The remainders, read from the bottom to the top, form the binary number. The
procedure is the same for all other number systems.
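
The repeated-division procedure can be sketched as follows. The names `from_decimal` and `DIGITS` are assumptions made for this illustration.

```python
DIGITS = "0123456789ABCDEF"  # hypothetical digit table covering bases 2 to 16


def from_decimal(n: int, base: int) -> str:
    """Convert a non-negative decimal integer by repeated division by the base."""
    if n < 0 or not 2 <= base <= 16:
        raise ValueError("expected n >= 0 and a base between 2 and 16")
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, base)  # quotient feeds the next step
        remainders.append(DIGITS[remainder])
    # Remainders are produced lowest digit first, so read them in reverse
    return "".join(reversed(remainders))


print(from_decimal(100, 2), from_decimal(100, 8), from_decimal(100, 16))
```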
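Example:
Convert $100_{10}$ into binary, octal and hexadecimal.
Solution:
Binary (divide by 2): 100 ÷ 2 = 50 remainder 0; 50 ÷ 2 = 25 remainder 0; 25 ÷ 2 = 12 remainder 1;
12 ÷ 2 = 6 remainder 0; 6 ÷ 2 = 3 remainder 0; 3 ÷ 2 = 1 remainder 1; 1 ÷ 2 = 0 remainder 1.
Reading the remainders from bottom to top, $(100)_{10} = (1100100)_2$.
Octal (divide by 8): 100 ÷ 8 = 12 remainder 4; 12 ÷ 8 = 1 remainder 4; 1 ÷ 8 = 0 remainder 1.
Therefore $(100)_{10} = (144)_8$.
Hexadecimal (divide by 16): 100 ÷ 16 = 6 remainder 4; 6 ÷ 16 = 0 remainder 6.
Therefore $(100)_{10} = (64)_{16}$.
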
Let’s concentrate on the fractional conversion.


To convert the fractional part of a decimal number to another number system, multiply it repeatedly
by the base of the target system, as shown in the following example. For example, to convert to
binary, multiply the fractional part by 2 repeatedly; the integer part of each product is the next
digit, and the process continues until the fractional part becomes zero. If the fractional part never
becomes zero, we can terminate after the required number of digits.
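Example:
Convert $100.265_{10}$ into binary, octal, and hexadecimal.
Solution:
The integer part is converted by repeated division: $(100)_{10} = (1100100)_2 = (144)_8 = (64)_{16}$.
For the fractional part, follow the multiplication process as shown below; the integer part of each
product is the next digit, and only the fractional part is carried to the next step.

Binary (× 2)            Octal (× 8)             Hexadecimal (× 16)
0.265 × 2 = 0.530       0.265 × 8 = 2.120       0.265 × 16 = 4.240
0.530 × 2 = 1.060       0.120 × 8 = 0.960       0.240 × 16 = 3.840
0.060 × 2 = 0.120       0.960 × 8 = 7.680       0.840 × 16 = 13.440 (D)
0.120 × 2 = 0.240       0.680 × 8 = 5.440       0.440 × 16 = 7.040
0.240 × 2 = 0.480       0.440 × 8 = 3.520       0.040 × 16 = 0.640
Digits: 0 1 0 0 0       Digits: 2 0 7 5 3       Digits: 4 3 D 7 0

None of the three expansions terminates, so each is stopped after five digits:
$(0.265)_{10} \approx (0.01000)_2$, $(0.265)_{10} \approx (0.20753)_8$, $(0.265)_{10} \approx (0.43D70)_{16}$.
Therefore,
$(100.265)_{10} \approx (1100100.01000)_2$
$(100.265)_{10} \approx (144.20753)_8$
$(100.265)_{10} \approx (64.43D70)_{16}$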
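
The repeated-multiplication procedure for the fractional part can be sketched as below. The function name `fraction_to_base` and the `places` cut-off parameter are assumptions made for this illustration; the input is passed as a string and held as an exact `Fraction` so that the digits are not disturbed by binary rounding of the decimal input.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"  # hypothetical digit table covering bases 2 to 16


def fraction_to_base(fraction: str, base: int, places: int) -> str:
    """Convert a decimal fraction in [0, 1) by repeated multiplication by the base."""
    frac = Fraction(fraction)
    if not 0 <= frac < 1:
        raise ValueError("expected a value in [0, 1)")
    digits = []
    for _ in range(places):
        frac *= base
        whole = int(frac)            # integer part of the product is the next digit
        digits.append(DIGITS[whole])
        frac -= whole                # carry only the fractional part forward
        if frac == 0:                # expansion terminated exactly
            break
    return "0." + "".join(digits)


for base in (2, 8, 16):
    print(base, fraction_to_base("0.265", base, 5))
```

The result is truncated, not rounded, after `places` digits, which matches the "terminate in between" rule described in the text.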

Octal and Hexadecimal to Binary


To convert Octal to Binary or vice versa, we should know the equivalent numbers for both
systems.
Octal   Binary
0       000
1       001
2       010
3       011
4       100
5       101
6       110
7       111
Similarly, to convert Hexadecimal to Binary or vice versa, we should know the equivalent
numbers for both systems.
Decimal Binary Octal Hexadecimal
0 0000 0 0
1 0001 1 1
2 0010 2 2
3 0011 3 3
4 0100 4 4
5 0101 5 5
6 0110 6 6
7 0111 7 7
8 1000 10 8
9 1001 11 9
10 1010 12 A
11 1011 13 B
12 1100 14 C
13 1101 15 D
14 1110 16 E
15 1111 17 F
Example:
Convert $6135_8$ to binary.
Solution:
Octal digit:   6    1    3    5
Binary group:  110  001  011  101

$6135_8 = 110001011101_2$
Example:
Convert $1A2C_{16}$ to binary.
Solution:
Hex digit:     1     A     2     C
Binary group:  0001  1010  0010  1100

$1A2C_{16} = 1101000101100_2$ (leading zeros dropped)
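
The digit-by-digit substitution used in these two examples can be sketched as follows. The function name `digits_to_binary` is an assumption introduced for illustration; it relies only on the built-in `int` and `format` functions.

```python
def digits_to_binary(numeral: str, base: int) -> str:
    """Replace each octal or hexadecimal digit by its 3-bit or 4-bit binary group."""
    width = {8: 3, 16: 4}[base]  # bits per digit; other bases are not supported
    return " ".join(format(int(char, base), f"0{width}b") for char in numeral)


print(digits_to_binary("6135", 8))
print(digits_to_binary("1A2C", 16))
```

Joining the groups and dropping leading zeros gives the final binary numeral.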


3. SIGNIFICANT DIGITS/FIGURES

There are two kinds of numbers, exact and approximate numbers. Numbers like 1, 2, 3, …,
1/2 (= 0.5), 3/2 (= 1.5), … are treated as exact numbers. But numbers such as 2/7 (= 0.285714…),
π (= 3.14159…), √2 (= 1.4142…) and e (= 2.71828…) cannot be expressed by a finite
number of digits. These may be approximated by the numbers 0.2857, 3.1416, 1.4142 and 2.7183
respectively by omitting some digits; such numbers are called approximate numbers. Thus
numbers that represent the given numbers to a certain degree of accuracy are called approximate
numbers. For example, the approximate value of π is 3.1416 or, if we desire a better approximation,
3.141592653589793, but we cannot write the exact value of π.

3.1 Definition:
The digits that are used to express a number are called significant digits or significant figures. The
significant figures of a number are defined as follows:

Rule 1: (Numbers without decimal point): If the number does not have any decimal point, the
significant figures of the number are the digits counted from the first non-zero digit on the left to
the last non-zero digit on the right. Therefore, the number 12040 has four significant figures.

Rule 2: (Numbers with decimal point): If the number has a decimal point, the significant figures of
the number are the digits counted from the first non-zero digit on the left to the last digit on the
right side (irrespective of whether it is zero or non-zero). Therefore, the number, 2100.4, has five
significant figures, and the number 0.015, has two significant figures.
Thus each of the numbers 3.1416, 0.60125 and 4.0002 contain five significant digits while the
numbers 0.00386, 0.000587 and 0.00205 contain only three significant digits, since zeros only
help to fix the position of the decimal point.

3.2 Note:
The following statements describe the notion of significant digits,
1. All non-zero digits are significant
2. All zeros occurring between non-zero digits are significant digits.
3. Trailing zeros following a decimal point are significant. For example 3.500, 65.00 and 0.3210
have four significant digits each.


4. Zeros between the decimal point and preceding a non-zero digit are not significant. For example,
the following numbers have four significant digits each: 0.0001234, 0.002001, 0.01321.
5. When the decimal point is not written, trailing zeros are not considered to be significant. For
example, 4500 may be written as $45 \times 10^2$ and contains only two significant digits, but 4500.0
contains five significant digits.

Integer numbers with trailing zeros may be written in scientific notation to specify the significant
digits. The concepts of accuracy and precision are closely related to significant digits, as the
following examples show.

3.3 Example:
i) 7.560 has four significant digits
25000 has two significant digits
2.00004 has six significant digits
0.04500 has four significant digits
0.0201 has three significant digits
0.00001 has one significant digit
100.00001 has eight significant digits
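
Rules 1 and 2 of the definition can be turned into a short counting routine. This is a sketch under the assumption that the number is supplied as a plain decimal string without an exponent; the function name `significant_digits` is hypothetical.

```python
def significant_digits(numeral: str) -> int:
    """Count significant digits of a decimal numeral written as a string."""
    digits = numeral.strip().lstrip("+-")
    if "." in digits:
        # Rule 2: count from the first non-zero digit to the last digit written
        return len(digits.replace(".", "").lstrip("0"))
    # Rule 1: count from the first non-zero digit to the last non-zero digit
    return len(digits.strip("0"))


for text in ("7.560", "25000", "2.00004", "0.04500", "0.0201", "0.00001", "100.00001"):
    print(text, significant_digits(text))
```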

ii) Accuracy refers to the number of significant digits in a value. For example, the number 57.396
is accurate to five significant digits.

Example: The accuracy of the numbers
(a) 95.763 (b) 0.008472 (c) 0.0456000 (d) 36 (e) 3600.00 is given below.

Solution:
(a) This has five significant digits.
(b) This has four significant digits. The leading or higher order zeros are only place holders.
(c) This has six significant digits.
(d) This has two significant digits.
(e) This has six significant digits. The zeros were made significant by writing .00 after 3600.

iii) Precision refers to the number of decimal positions, i.e. the order of magnitude of the last digit
in a value. The number 57.396 has a precision of 0.001 or $10^{-3}$.


Example: Which of the following numbers has the greatest precision?
(a) 4.2301 (b) 4.23 (c) 4.230106

Solution:
(a) 4.2301 has a precision of $10^{-4}$.
(b) 4.23 has a precision of $10^{-2}$.
(c) 4.230106 has a precision of $10^{-6}$.

Therefore the number 4.230106 has the greatest precision.


4. ERRORS

4.1 Inherent errors


Inherent error is the error that is present in the statement of the problem itself, before its
solution is found. It arises due to the simplifying assumptions made in the mathematical modeling of a
problem. It can also arise when the data is obtained from certain physical measurements of the
parameters of the problem.
There are two components, namely, data errors and conversion errors.
1. Data error (also known as empirical error) arises when data for a problem are obtained by
some experimental means and are, therefore, of limited accuracy and precision. This may be
due to some limitations in instrumentation and reading, and therefore may be unavoidable. A
physical measurement, such as a distance, a voltage, or a time period, cannot be exact.
Remember that there is no use in performing arithmetic operations to, say, four decimals when the
original data themselves are only correct to two decimal places.
For instance, the scale reading in a weighing machine may be accurate to only one decimal
place.
2. Conversion errors (also known as representation errors) arise due to the limitations of the
computer to store the data exactly. We know that the floating point representation retains only
a specified number of digits. The digits that are not retained constitute the round off error.

4.2 Numerical errors


Numerical errors are introduced during the process of implementation of a numerical method.
They come in two forms, round off errors and truncation errors. The total numerical error is the
summation of these two errors. The total error can be reduced by devising suitable techniques for
implementing the solution. We shall see in this section the magnitude of these errors.

4.3 Round off Errors


Every computer has a finite word length and therefore it is possible to store only a fixed number
of digits of a given input number. Since computers store information in binary form, storing a
decimal number in binary form in the computer memory generally introduces an error. This error is
computer dependent. Also, at the end of the computation of a particular problem, the final results,
which are held in binary form, must be converted into decimal form (a form understandable to the
user) before they are printed. Therefore, an additional error is committed at this stage too. This
error is called local round-off error.
For example, $(0.7625)_{10} = (0.110000110011\ldots)_2$, a binary expansion that does not terminate. If a
particular computer system has a word length of 12 bits only, then the decimal number 0.7625 is
stored in the computer memory in binary form as 0.110000110011. However, this is equivalent to
approximately 0.76245. Thus, in storing the number 0.7625, we have committed an error of about
0.00005, which is the round-off error inherent in the computer system considered.
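
The 12-bit storage example can be checked with exact rational arithmetic. The sketch below assumes that the extra bits are simply chopped off (not rounded), which is how the example keeps the first 12 binary digits; the function name `store_fraction` is hypothetical.

```python
from fractions import Fraction


def store_fraction(value: str, bits: int):
    """Keep only the first `bits` binary digits of a fraction in [0, 1)."""
    exact = Fraction(value)
    scaled = int(exact * 2 ** bits)        # chop everything beyond `bits` digits
    stored = Fraction(scaled, 2 ** bits)   # the value the machine actually holds
    pattern = format(scaled, f"0{bits}b")  # the stored bit pattern
    return pattern, stored, exact - stored


pattern, stored, error = store_fraction("0.7625", 12)
print(pattern, float(stored), float(error))
```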

4.4 Definition:
We define the error as
Error = True value – Computed value

4.5 Numbers rounded-off to n significant digits


To round off a number to n significant digits, discard all digits to the right of the nth digit, and if
this discarded number is
i) less than half a unit in the nth place, leave the nth digit unchanged.
ii) greater than half a unit in the nth place, increase the nth digit by unity.
iii) exactly half a unit in the nth place, increase the nth digit by unity if it is odd, otherwise leave it
unchanged.
The number thus rounded off is said to be correct to n significant digits.

4.6 Example:
The following numbers are rounded off to four significant digits:
7.8926 to 7.893
128.614 to 128.6
3.14159 to 3.142
0.859321 to 0.8593
8476.7 to 8477
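
The three rounding rules of Section 4.5 correspond to what Python's `decimal` module calls `ROUND_HALF_EVEN`. The sketch below applies them to the numbers listed above; the function name `round_significant` is an assumption made for this illustration, and the input is passed as a string so that the decimal digits are taken exactly as written.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_significant(numeral: str, n: int) -> Decimal:
    """Round a decimal numeral to n significant digits (ties go to the even digit)."""
    value = Decimal(numeral)
    if value == 0:
        return value
    # adjusted() is the exponent of the leading digit, so this is one unit in the nth place
    quantum = Decimal(1).scaleb(value.adjusted() - n + 1)
    return value.quantize(quantum, rounding=ROUND_HALF_EVEN)


for text in ("7.8926", "128.614", "3.14159", "0.859321", "8476.7"):
    print(text, "->", round_significant(text, 4))
```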
In any Numerical computation, we come across the following types of errors.

4.7 Truncation errors


Truncation errors arise from using an approximation in place of an exact mathematical procedure.
Typically, it is the error resulting from the truncation of the numerical process. We often use some
finite number of terms to estimate the sum of an infinite series. For example,
$$S = \sum_{i=0}^{\infty} a_i x^i \quad \text{is replaced by the finite sum} \quad \sum_{i=0}^{n} a_i x^i.$$
Consider the following infinite series:
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots$$
When we calculate the sine of an angle using this series, we cannot use all the terms in the series
for computation. We usually terminate the process after a certain term is calculated. The terms
“truncated” introduce an error which is called truncation error.
Many of the iterative procedures used in numerical computing are infinite and, therefore, a
knowledge of this error is important. Truncation error can be reduced by using a better numerical
model which usually increases the number of arithmetic operations.
In numerical integration, the truncation error can be reduced by increasing the number of points at
which the function is integrated. But care should be exercised to see that the roundoff error which is
bound to increase due to increase in arithmetic operations does not off-set the reduction in truncation
error.
Example:
Find the truncation error in the result of the following function for $x = \frac{1}{5}$ when we use
i) first three terms
ii) first four terms
iii) first five terms
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \frac{x^6}{6!}$$
Solution:
i) The truncation error when the first three terms are added
$$= e^x - \left(1 + x + \frac{x^2}{2!}\right) = \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \frac{x^6}{6!}$$
$$= \frac{(0.2)^3}{3!} + \frac{(0.2)^4}{4!} + \frac{(0.2)^5}{5!} + \frac{(0.2)^6}{6!} = 0.1402755 \times 10^{-2}.$$
ii) The truncation error when the first four terms are added
$$= e^x - \left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}\right) = \frac{x^4}{4!} + \frac{x^5}{5!} + \frac{x^6}{6!} = 0.694222 \times 10^{-4}.$$
iii) The truncation error when the first five terms are added
$$= \frac{x^5}{5!} + \frac{x^6}{6!} = 0.275555 \times 10^{-5}.$$
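
The truncation errors in this example can be reproduced numerically. The sketch follows the example in treating the seven-term sum as the reference value; the function name `truncation_error` and its `total_terms` parameter are assumptions made for illustration.

```python
from math import factorial


def truncation_error(x: float, terms: int, total_terms: int = 7) -> float:
    """Sum of the series terms x**k / k! that are dropped when only `terms` are kept.

    The reference value is the partial sum with `total_terms` terms (up to x**6 / 6!).
    """
    return sum(x ** k / factorial(k) for k in range(terms, total_terms))


for kept in (3, 4, 5):
    print(kept, truncation_error(0.2, kept))
```

Summing the discarded terms directly avoids subtracting two nearly equal partial sums, which would itself introduce round-off error.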


5. ABSOLUTE, RELATIVE AND PERCENTAGE ERROR

5.1 Definition:
Absolute error is the numerical difference between the true value of a quantity and its approximate
value. If $X$ is the true value and $X_a$ is its approximate value, then the absolute error $E_a$ is given
by
$$E_a = \text{True value} - \text{Approximate value} = X - X_a.$$
Let $\Delta X$ be a number such that $|X - X_a| \le \Delta X$; then $\Delta X$ is an upper limit on the magnitude of
the absolute error and is said to measure absolute accuracy.

5.2 Definition:
The relative error is the absolute error divided by the true value of the quantity, and it is denoted
by $E_r$:
$$E_r = \frac{\text{Absolute error}}{\text{True value}} = \frac{E_a}{X}.$$
Similarly, the quantity $\frac{\Delta X}{|X|} \approx \frac{\Delta X}{|X_a|}$ measures the relative accuracy.

5.3 Definition:
The percentage error $E_p$ is given by $E_p = \frac{E_a}{X} \times 100 = E_r \times 100$.
Observations:
1. The relative and percentage errors are independent of the units used while absolute error is
expressed in terms of these units.

2. If the number X is rounded to N decimal places, then $\Delta X = \frac{1}{2} \times 10^{-N}$, where $\Delta X$ is the
absolute accuracy.

Example:
If the number X = 0.51 and is correct to two decimal places, then
$$\Delta X = \frac{1}{2} \times 10^{-2} = 0.005$$
and the relative accuracy is
$$\frac{\Delta X}{X} = \frac{0.005}{0.51} \approx 0.0098,$$
that is, about 0.98%.

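Example:
If $\frac{2}{3}$ is approximated by 0.667, find the absolute and relative errors.

Solution:
Absolute error $E_a$ = True value – Approximate value
$$E_a = \frac{2}{3} - 0.667 = \frac{2 - 2.001}{3} = -\frac{1}{3} \times 10^{-3},$$
so the magnitude of the absolute error is $\frac{1}{3} \times 10^{-3}$.
$$E_r = \frac{\text{Absolute error}}{\text{True value}} = \frac{-\frac{1}{3} \times 10^{-3}}{2/3} = -\frac{1}{2} \times 10^{-3},$$
so the magnitude of the relative error is $\frac{1}{2} \times 10^{-3}$.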
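
The three definitions of Section 5 can be collected into one small routine. This is a sketch only; the function name `errors` is an assumption, and exact fractions are used so that the worked example can be verified without round-off.

```python
from fractions import Fraction


def errors(true_value: Fraction, approx: Fraction):
    """Return (absolute, relative, percentage) error using Error = True - Approximate."""
    absolute = true_value - approx
    relative = absolute / true_value
    return absolute, relative, relative * 100


absolute, relative, percentage = errors(Fraction(2, 3), Fraction("0.667"))
print(absolute, relative, percentage)
```

The negative sign indicates that the approximation 0.667 is slightly larger than the true value 2/3; take absolute values when only the magnitude is required.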


6. MATHEMATICAL PRELIMINARIES
In this section we state without proof, certain mathematical results which would be useful in the
sequel.

6.1 Theorem:
If f(x) is continuous in a ≤ x ≤ b and if f(a) and f(b) are of opposite signs, then f(c) = 0 for at least
one number c such that a < c < b.

6.2 Rolle’s Theorem:


If f(x) is (i) continuous in [a, b], (ii) differentiable in (a, b) and (iii) f(a) = f(b), then there exists
at least one $c \in (a, b)$ such that $f'(c) = 0$.

6.3 Generalized Rolle’s Theorem:


Let f(x) be a function which is continuous on [a, b] and n times differentiable on (a, b). If f(x) vanishes
at the (n+1) distinct points $x_0 < x_1 < \cdots < x_n$ in (a, b), then there exists a number c in (a, b) such that
$f^{(n)}(c) = 0$.

6.4 Example:
Verify Rolle’s theorem for the function f(x) = |x| in (–1, 1).
Solution: Here
f(x) = –x for –1 < x < 0,
f(x) = 0 for x = 0,
f(x) = x for 0 < x < 1.
f(–1) = 1 = f(1), hence f(–1) = f(1). Also
$f'(x) = -1$ for –1 < x < 0,
$f'(x) = 1$ for 0 < x < 1.

Therefore $f'(x)$ does not exist at x = 0, and hence f(x) is not differentiable in (–1, 1). Rolle’s theorem
is therefore not applicable to the function f(x) = |x| in (–1, 1).

[Figure: graph of y = f(x) = |x|, with the x-axis, the y-axis and the origin O]

6.5 Intermediate Value Theorem:


Let f(x) be continuous in [a, b] and k be any number between f(a) and f(b). Then there exists a
number c in (a, b) such that f(c) = k.

Example: $f(x) = x^2 + x - 1$
$f(0) = 0 + 0 - 1 = -1 < 0$
$f(1) = 1 + 1 - 1 = 1 > 0$

Here $f(x) = x^2 + x - 1$ is a continuous function and f(0) and f(1) are of different signs, therefore at
least one real root lies between 0 and 1.
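
The sign change guaranteed by the theorem can be used to locate the root numerically. The sketch below uses repeated interval halving (the bisection method), which is not part of this section and is shown only as an illustration of the theorem; the function name `bisect_root` and the tolerance `tol` are assumptions.

```python
def bisect_root(f, a: float, b: float, tol: float = 1e-10) -> float:
    """Halve [a, b] repeatedly, always keeping the half on which f changes sign."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        mid = (a + b) / 2
        fm = f(mid)
        if fm == 0:
            return mid
        if fa * fm < 0:
            b = mid           # sign change lies in the left half
        else:
            a, fa = mid, fm   # sign change lies in the right half
    return (a + b) / 2


root = bisect_root(lambda x: x * x + x - 1, 0.0, 1.0)
print(root)
```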

6.6 Lagrange’s Mean value Theorem:


If f(x) is (i) continuous in [a, b] and (ii) differentiable in (a, b), then there exists at least one value c
in (a, b) such that
$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$

6.7 Geometrical Interpretation of Lagrange’s Mean Value theorem:


P and Q are two points on the continuous curve y = f(x) corresponding to
x = a and x = b respectively. Therefore P[a, f(a)] and Q[b, f(b)] are two points on the curve. The slope of
the line joining the points P and Q is $\frac{f(b) - f(a)}{b - a}$. R is a point on the curve between P and Q
corresponding to x = c, so that $f'(c)$ is the slope of the tangent line at R[c, f(c)]. The condition
$$f'(c) = \frac{f(b) - f(a)}{b - a}$$
means that the tangent at R is parallel to the chord PQ.

[Figure: curve y = f(x) through P (x = a), R (x = c) and Q (x = b), with ordinates f(a), f(c) and f(b); the tangent at R is parallel to the chord PQ]

Hence this theorem tells us that there is at least one point R on the curve PQ where the tangent to
the curve is parallel to the chord PQ.

6.8 Taylor’s Series for a function of one variable:

If f(x) is continuous and possesses continuous derivatives of order n in an interval that includes x
= a, then in that interval
$$f(x) = f(a) + (x - a) f'(a) + \frac{(x - a)^2}{2!} f''(a) + \cdots + \frac{(x - a)^{n-1}}{(n-1)!} f^{(n-1)}(a) + R_n(x),$$
where $R_n(x)$, the remainder term, can be expressed as
$$R_n(x) = \frac{(x - a)^n}{n!} f^{(n)}(\xi), \quad a < \xi < x.$$

6.9 Maclaurin’s Expansion:


Taylor’s series at the origin, i.e., at a = 0:
$$f(x) = f(0) + x f'(0) + \frac{x^2}{2!} f''(0) + \cdots + \frac{x^n}{n!} f^{(n)}(0) + \cdots$$

6.10 Complex Numbers
Let C denote the set of all ordered pairs of real numbers.
That is, C = {(x, y) : x, y ∈ R}.
On this set C define addition “+” and multiplication “.” by,
(x1, y1) + (x2,y2) = (x1 + x2, y1 + y2) … (1)
(x1, y1) . (x2, y2) = (x1x2 – y1y2, x1y2 + x2y1) … (2)

Then the elements of C which satisfy the above rules of addition and multiplication are called
complex numbers. If z = (x, y) is a complex number then x is called the real part and y is called
the imaginary part of the complex number z and they are denoted by x = Re z and y = Im z. If (x1,
y1) and (x2, y2) are two complex numbers then (x1, y1) = (x2, y2) if and only if
x1 = x2 and y1 = y2.
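
Rules (1) and (2) translate directly into operations on tuples. The sketch below is illustrative only: the function names `add` and `mul` and the sample pairs are assumptions.

```python
def add(z1, z2):
    """Rule (1): (x1, y1) + (x2, y2) = (x1 + x2, y1 + y2)."""
    (x1, y1), (x2, y2) = z1, z2
    return (x1 + x2, y1 + y2)


def mul(z1, z2):
    """Rule (2): (x1, y1) . (x2, y2) = (x1x2 - y1y2, x1y2 + x2y1)."""
    (x1, y1), (x2, y2) = z1, z2
    return (x1 * x2 - y1 * y2, x1 * y2 + x2 * y1)


z1, z2 = (3, 4), (1, -2)
s = add(z1, z2)
p = mul(z1, z2)

print(s)  # (4, 2)
print(p)  # (11, -2)
```

The same product results from Python's built-in type: `complex(3, 4) * complex(1, -2)` evaluates to `(11-2j)`.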

(a) Properties of addition


1. Closure law: If z1 = (x1, y1), z2 = (x2, y2) then from (1)
z1 + z2 = (x1, y1) + (x2, y2) = (x1 + x2, y1 + y2), which is also an ordered
pair of real numbers. Hence z1 + z2 ∈ C. Therefore for every z1, z2 ∈ C,
z1 + z2 ∈ C.
2. Commutative law: z1 + z2 = z2 + z1 for every z1, z2 ∈ C.
Consider z1 + z2 = (x1, y1) + (x2, y2) = (x1 + x2, y1 + y2)
= (x2 + x1, y2 + y1) = (x2, y2) + (x1, y1) = z2 + z1.


3. Associative law: z1 + (z2 + z3) = (z1 + z2) + z3 for every z1, z2, z3 ∈ C. The proof is similar to the proof above.
4. Existence of identity element: There exists an element (0, 0) ∈ C such that
(x, y) + (0, 0) = (x + 0, y + 0) = (x, y) for every (x, y) ∈ C. Here (0, 0) is called the additive identity element of C.
5. Existence of inverse: For every (x, y) ∈ C there exists (–x, –y) ∈ C such that
(x, y) + (–x, –y) = (x – x, y – y) = (0, 0).
Hence (–x, –y) is the additive inverse of (x, y).
Thus we have shown that the set C is an abelian group w.r.t. the addition of complex numbers defined by (1).
(b) Properties of multiplication
1. Closure law: If z1 = (x1, y1), z2 = (x2, y2) ∈ C then from (2)
z1z2 = (x1, y1) (x2, y2) = (x1x2 – y1y2, x1y2 + x2y1), which is also an ordered pair of real numbers.
Hence z1z2 is also a complex number.
Thus, for every z1, z2 ∈ C, z1z2 ∈ C.
2. Commutative law: z1z2 = z2z1 for every z1, z2 ∈ C.
Now z1z2 = (x1, y1) (x2, y2) = (x1x2 – y1y2, x1y2 + x2y1) ….. (i)
and z2z1 = (x2, y2) (x1, y1) = (x2x1 – y2y1, x2y1 + x1y2)
= (x1x2 – y1y2, x1y2 + x2y1) ….. (ii)
From (i) and (ii) z1z2 = z2z1.
3. Associative law: z1(z2z3) = (z1z2) z3, for every z1, z2, z3 ∈ C.
The proof is similar to the proof above.
4. Existence of identity element: There exists (1, 0) ∈ C such that
(x, y) (1, 0) = (x . 1 – y . 0, x . 0 + 1 . y) = (x, y) for every (x, y) ∈ C.
Here (1, 0) is called the multiplicative identity element.
5. Existence of inverse: Let z = (x, y) ≠ (0, 0) be a complex number. Let (u, v) be the inverse of (x, y).
Then (u, v) . (x, y) = (1, 0), the identity element,
i.e. (ux – vy, uy + vx) = (1, 0).
Hence ux – vy = 1 and uy + vx = 0.


Solving for u and v, we get
$$u = \frac{x}{x^2 + y^2}, \qquad v = \frac{-y}{x^2 + y^2}.$$
Hence $\left(\frac{x}{x^2 + y^2}, \frac{-y}{x^2 + y^2}\right) \in C$ is the multiplicative inverse of (x, y).

Thus we have shown that the set of non-zero complex numbers forms an abelian group w.r.t. the
multiplication defined by (2).
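
The inverse formula can be checked with exact rational arithmetic, which avoids any round-off in the check. This is a sketch only: the names `mul` and `inverse` and the sample pair (3, 4) are assumptions made for illustration.

```python
from fractions import Fraction


def mul(z1, z2):
    """Rule (2): (x1, y1) . (x2, y2) = (x1x2 - y1y2, x1y2 + x2y1)."""
    (x1, y1), (x2, y2) = z1, z2
    return (x1 * x2 - y1 * y2, x1 * y2 + x2 * y1)


def inverse(z):
    """Multiplicative inverse (x / (x^2 + y^2), -y / (x^2 + y^2)) of a non-zero pair."""
    x, y = z
    d = x * x + y * y
    if d == 0:
        raise ZeroDivisionError("(0, 0) has no multiplicative inverse")
    return (x / d, -y / d)


z = (Fraction(3), Fraction(4))
z_inv = inverse(z)
product = mul(z_inv, z)

print(z_inv)    # (Fraction(3, 25), Fraction(-4, 25))
print(product)  # (Fraction(1, 1), Fraction(0, 1))
```

The product is exactly the identity element (1, 0), in agreement with the equations ux – vy = 1 and uy + vx = 0.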
Also we can prove that the multiplication is distributive over addition.

(c) Distributive law: For all z1, z2, z3 ∈ C


i) z1 (z2 + z3) = z1z2 + z1z3 (left distributive law)
ii) (z2 + z3) z1 = z2z1 + z3z1 (right distributive law)

The complex numbers whose imaginary parts are equal to zero possess the following properties.

(x1, 0) + (x2, 0) = (x1 + x2, 0)

and (x1, 0) . (x2, 0) = (x1x2, 0),
which are essentially the rules for addition and multiplication of real numbers. We identify the complex number (x, 0) with the real number x. Denote the complex number (0, 1) by i.
Now i^2 = (0, 1) (0, 1) = (0 . 0 – 1 . 1, 0 . 1 + 1 . 0)
= (–1, 0) = –1.
Hence i^2 = –1.
With this convention we shall show that the ordered pair (x, y) is equal to x + iy.
For, (x, y) = (x, 0) + (0, y)
= (x, 0) + (0, 1) (y, 0)
= x + iy,
since (x, 0) = x, (y, 0) = y and (0, 1) = i.
Because of its extreme manipulative convenience, we shall continue to use the notation x + iy for the complex number (x, y).
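
Python's built-in `complex` type follows the same convention, which gives a quick check of i^2 = –1 and of the identity (x, y) = x + iy. The sample values below are assumptions chosen for illustration.

```python
# The pair (0, 1), which the text denotes by i
i = complex(0, 1)
i_squared = i * i

# The pair (x, y) and the expression x + iy describe the same number
x, y = 3.0, 4.0
z_pair = complex(x, y)
z_sum = x + i * y

print(i_squared)        # (-1+0j)
print(z_pair == z_sum)  # True
```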

6.11 Conjugate of a Complex Number


Let z = x + iy be a complex number. Then the complex number x – iy is called the complex conjugate, or simply the conjugate, of z and is denoted by $\bar{z}$.

Thus, if z = x + iy then $\bar{z} = x - iy$.


For example, if z = 3 + 4i then $\bar{z} = 3 - 4i$.

Clearly $\overline{(\bar{z})} = z$,
$z + \bar{z} = (x + iy) + (x - iy) = 2x = 2\,\mathrm{Re}\,z$,

and $z - \bar{z} = (x + iy) - (x - iy) = 2iy = 2i\,\mathrm{Im}\,z$.

Also, $z \cdot \bar{z} = (x + iy)(x - iy) = x^2 - i^2 y^2 = x^2 + y^2$, which is a real number.
Thus the product of a complex number and its conjugate is a real number.

Theorem: For all z1, z2 ∈ C

1. $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$,
i.e., the conjugate of a sum is equal to the sum of the conjugates.
2. $\overline{z_1 \cdot z_2} = \bar{z}_1 \cdot \bar{z}_2$,
i.e., the conjugate of a product is equal to the product of the conjugates.
3. $\overline{\left(\dfrac{z_1}{z_2}\right)} = \dfrac{\bar{z}_1}{\bar{z}_2}$, $z_2 \neq 0$,
i.e., the conjugate of a quotient is equal to the quotient of the conjugates.
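
The three conjugate rules can be spot-checked on sample values with the `conjugate()` method of Python's built-in `complex` type. A check on one pair of numbers is an illustration, not a proof; the values 3 + 4i and 1 – 2i are assumptions.

```python
import cmath

z1 = complex(3, 4)
z2 = complex(1, -2)

# Conjugate of a sum and of a product (exact for these small integer values)
sum_ok = (z1 + z2).conjugate() == z1.conjugate() + z2.conjugate()
prod_ok = (z1 * z2).conjugate() == z1.conjugate() * z2.conjugate()

# Conjugate of a quotient; compared with a tolerance because division may round
quot_ok = cmath.isclose((z1 / z2).conjugate(), z1.conjugate() / z2.conjugate())

# z times its conjugate is the real number x^2 + y^2
norm_sq = (z1 * z1.conjugate()).real

print(sum_ok, prod_ok, quot_ok, norm_sq)  # True True True 25.0
```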

6.12 Modulus of a Complex Number


If z = x + iy is a complex number then $\sqrt{x^2 + y^2}$ is called the modulus or absolute value of z and is denoted by | z |.

Thus $|z| = \sqrt{x^2 + y^2}$.

Clearly | z | is a non-negative real number, i.e., $|z| \geq 0$.

Let z = 3 – 4i; then $|z| = \sqrt{3^2 + (-4)^2} = \sqrt{9 + 16} = 5$.

We can easily verify the following:

1. $z \cdot \bar{z} = |z|^2$
2. $|\bar{z}| = |z|$
3. $-|z| \leq \mathrm{Re}\,z \leq |z|$

Theorem: For all z1, z2 ∈ C

1. $|z_1 \cdot z_2| = |z_1| \cdot |z_2|$,
i.e., the modulus of a product is equal to the product of the moduli.
2. $\left|\dfrac{z_1}{z_2}\right| = \dfrac{|z_1|}{|z_2|}$, $z_2 \neq 0$,
i.e., the modulus of a quotient is equal to the quotient of the moduli.
3. $|z_1 + z_2| \leq |z_1| + |z_2|$
4. $|z_1 - z_2| \geq |z_1| - |z_2|$
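
The modulus properties can be spot-checked with the built-in `abs()`, which returns the modulus of a `complex` value. The sample values 3 – 4i and 5 + 12i are assumptions chosen so that both moduli are whole numbers.

```python
import math

z1 = complex(3, -4)
z2 = complex(5, 12)

mod1 = abs(z1)  # sqrt(3^2 + (-4)^2) = 5.0
mod2 = abs(z2)  # sqrt(5^2 + 12^2) = 13.0

# Modulus of a product and of a quotient
product_rule = math.isclose(abs(z1 * z2), mod1 * mod2)
quotient_rule = math.isclose(abs(z1 / z2), mod1 / mod2)

# Inequalities for the sum and the difference
sum_inequality = abs(z1 + z2) <= mod1 + mod2
difference_inequality = abs(z1 - z2) >= abs(mod1 - mod2)

print(mod1, mod2)
print(product_rule, quotient_rule, sum_inequality, difference_inequality)
```

The last check compares against the absolute value of |z1| – |z2|, which is at least as strong as comparing against |z1| – |z2| itself.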


7. SUMMARY

➢ Foundational Concepts: Unit 1 introduces the foundational elements of number systems,
crucial for understanding data representation in computing within an MCA program.
➢ Number Systems: Explores Decimal (Base-10), Binary (Base-2), Octal (Base-8), and
Hexadecimal (Base-16) systems, emphasizing their importance in computational
mathematics.
➢ Representation of Integers: Discusses how integers, including positive, negative, and zero,
are represented and manipulated across different number systems for efficient computation.
➢ Fractions and Decimals: Covers the representation of fractions and their conversion into
decimal form, highlighting their role in precise expressions of non-integer values.
➢ Floating Point Arithmetic: Introduces floating-point arithmetic as a method to represent real
numbers, facilitating calculations over a wide range of values.
➢ Errors in Computations: Delves into the concept of errors inherent in computational
mathematics, focusing on approximation, rounding errors, and their implications on accuracy.
➢ Practical Applications: Emphasizes the practical application of number systems in various
computational tasks, including data storage, algorithm design, and error minimization.
➢ Conversion Techniques: Provides insights into the conversion techniques between different
number systems, enhancing the understanding of data representation.


8. SELF-ASSESSMENT QUESTIONS

1. The decimal system is based on _______ unique digits.
2. In the binary system, each position represents a power of _______.
3. The hexadecimal number system uses digits and letters up to _______.
4. Floating-point arithmetic is used to represent _______ numbers in computing.
5. _______ errors occur due to the finite representation of numbers in computers.
6. The octal number system is based on base _______.
7. A negative number is represented in binary using the _______ complement method.
8. The representation of real numbers that cannot be accurately depicted as simple fractions
or integers uses _______ point numbers.
9. Rounding errors in floating-point arithmetic can lead to _______ in computational results.
10. Conversion from decimal to binary involves dividing by _______ and noting the remainder.
11. Which of the following is not a base used in number systems?
A) 2
B) 8
C) 10
D) 12
12. What does the MSB in a binary number represent?
A) Most Significant Bit
B) Medium Size Bit
C) Minimum Significant Bit
D) Most Small Bit
13. Which of the following is true for the hexadecimal system?
A) It uses only numbers.
B) It is based on base-8.
C) It includes letters A to F.
D) It is rarely used in computing.
14. What type of error does not occur in digital computing?
A) Rounding error
B) Syntax error
C) Truncation error
D) Overflow error
15. In floating-point arithmetic, the part of the number that represents the significant digits is
called the:
A) Exponent
B) Mantissa
C) Base
D) Coefficient


9. TERMINAL QUESTIONS

1. Convert the decimal number 156 to binary.
2. Convert the binary number 101101 to decimal.
3. Convert the hexadecimal number A3F to decimal.
4. Convert the decimal number 345 to octal.
5. Convert the binary number 1101101 to hexadecimal.
6. If the binary representation of a number is 100111, what is its decimal equivalent?
7. Convert the octal number 764 to binary.
8. Convert the hexadecimal number 1C2 to binary.
9. What is the sum of the binary numbers 1010 and 1101?
10. Subtract the binary number 1001 from 11011.
11. Multiply the binary numbers 101 and 11.
12. Divide the binary number 110010 by 101.
13. Convert the decimal fraction 0.375 to binary.
14. Convert the binary fraction 0.101 to decimal.
15. If a floating-point number is represented as 1.01 × 2^3, what is its decimal equivalent?
16. Convert the octal number 357 to decimal.
17. What is the result of adding the hexadecimal numbers A4 and 9B?
18. Subtract the hexadecimal number 3F from 8A.
19. Multiply the octal numbers 17 and 21.
20. Divide the hexadecimal number F4 by 2.


10. ANSWERS

10.1. Self-Assessment Questions


1. 10
2. 2
3. F
4. Real
5. Rounding
6. 8
7. Two's
8. Floating
9. Inaccuracies
10. 2
11. D) 12
12. A) Most Significant Bit
13. C) It includes letters A to F.
14. B) Syntax error
15. B) Mantissa

10.2. Terminal Questions


1. $(156)_{10} = (10011100)_2$
2. $(101101)_2 = (45)_{10}$
3. $(A3F)_{16} = (2623)_{10}$
4. $(345)_{10} = (531)_8$
5. $(1101101)_2 = (6D)_{16}$
6. $(100111)_2 = (39)_{10}$
7. $(764)_8 = (111110100)_2$
8. $(1C2)_{16} = (111000010)_2$
9. $(1010)_2 + (1101)_2 = (10111)_2$
10. $(11011)_2 - (1001)_2 = (10010)_2$
11. $(101)_2 \times (11)_2 = (1111)_2$


12. $(110010)_2 \div (101)_2 = (1010)_2$
13. $(0.375)_{10} = (0.011)_2$
14. $(0.101)_2 = (0.625)_{10}$
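15. Reading the mantissa 1.01 as binary, $(1.01)_2 \times 2^3 = (1010)_2 = (10)_{10}$; reading it as decimal, $1.01 \times 2^3 = (8.08)_{10}$.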
16. $(357)_8 = (239)_{10}$
17. $(A4)_{16} + (9B)_{16} = (13F)_{16}$
18. $(8A)_{16} - (3F)_{16} = (4B)_{16}$
19. $(17)_8 \times (21)_8 = (377)_8$
20. $(F4)_{16} \div 2 = (7A)_{16}$
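
Conversions of this kind can be checked mechanically. The sketch below implements the repeated-division and repeated-multiplication methods; the helper names `to_base` and `frac_to_binary` are assumptions made for illustration, while `int(text, base)` is Python's built-in parser for digit strings.

```python
def to_base(n, base):
    """Convert a non-negative integer to a digit string in the given base (2 to 16)
    by repeated division, collecting the remainders."""
    digits = "0123456789ABCDEF"
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, r = divmod(n, base)
        out.append(digits[r])
    # Remainders come out least significant digit first
    return "".join(reversed(out))


def frac_to_binary(frac, places):
    """Convert a fraction 0 <= frac < 1 to binary by repeated multiplication by 2,
    keeping the given number of binary places."""
    bits = []
    for _ in range(places):
        frac *= 2
        bit = int(frac)
        bits.append(str(bit))
        frac -= bit
    return "0." + "".join(bits)


print(to_base(156, 2))                             # 10011100
print(int("A3F", 16))                              # 2623
print(to_base(345, 8))                             # 531
print(frac_to_binary(0.375, 3))                    # 0.011
print(to_base(int("A4", 16) + int("9B", 16), 16))  # 13F
```

Arithmetic in another base can be checked the same way: parse both operands with `int`, carry out the operation on the integers, and convert the result back with `to_base`.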
