
CHAPTER
ONE
NUMBER SYSTEMS AND ERRORS

In this chapter we consider methods for representing numbers on computers and the errors introduced by these representations. In addition, we examine the sources of various types of computational errors and their subsequent propagation. We also discuss some mathematical preliminaries.

1.1 THE REPRESENTATION OF INTEGERS

In everyday life we use numbers based on the decimal system. Thus the
number 257, for example, is expressible as
257 = 2·100 + 5·10 + 7·1
    = 2·10^2 + 5·10^1 + 7·10^0
We call 10 the base of this system. Any integer is expressible as a
polynomial in the base 10 with integral coefficients between 0 and 9. We
use the notation
N = (a_n a_{n-1} ··· a_0)_{10}
  = a_n 10^n + a_{n-1} 10^{n-1} + ··· + a_0 10^0    (1.1)
to denote any positive integer in the base 10. There is no intrinsic reason to
use 10 as a base. Other civilizations have used other bases such as 12, 20,
or 60. Modern computers read pulses sent by electrical components. The
state of an electrical impulse is either on or off. It is therefore convenient to
represent numbers in computers in the binary system. Here the base is 2,
and the integer coefficients may take the values 0 or 1.


A nonnegative integer N will be represented in the binary system as

N = (a_n a_{n-1} ··· a_0)_2 = a_n 2^n + a_{n-1} 2^{n-1} + ··· + a_0 2^0    (1.2)

where the coefficients a_k are either 0 or 1. Note that N is again represented
as a polynomial, but now in the base 2. Many computers used in scientific
work operate internally in the binary system. Users of computers, however,
prefer to work in the more familiar decimal system. It is therefore neces-
sary to have some means of converting from decimal to binary when
information is submitted to the computer, and from binary to decimal for
output purposes.
Conversion of a binary number to decimal form may be accomplished directly from the definition (1.2). As examples we have

(1101)_2 = 1·2^3 + 1·2^2 + 0·2^1 + 1·2^0 = 13
(10000)_2 = 1·2^4 + 0·2^3 + 0·2^2 + 0·2^1 + 0·2^0 = 16
The conversion of integers from a base β to the base 10 can also be accomplished by the following algorithm, which is derived in Chap. 2.

Algorithm 1.1 Given the coefficients a_n, . . . , a_0 of the polynomial

p(x) = a_n x^n + a_{n-1} x^{n-1} + ··· + a_1 x + a_0    (1.3)

and a number z. Compute recursively the numbers b_n, b_{n-1}, . . . , b_0:

b_n = a_n
b_k = a_k + b_{k+1} z    k = n - 1, n - 2, . . . , 0

Then

b_0 = p(z)
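To make the algorithm concrete, here is a short sketch in Python (the book itself poses its programming exercises in FORTRAN; the function name and calling convention below are our own):

    def nested_multiplication(coeffs, z):
        # coeffs lists a_n, a_(n-1), ..., a_0, as in Algorithm 1.1.
        b = coeffs[0]                # b_n = a_n
        for a_k in coeffs[1:]:
            b = a_k + b * z          # b_k = a_k + b_(k+1) * z
        return b                     # b_0 = p(z)

    print(nested_multiplication([1, 1, 0, 1], 2))    # (1101)_2 -> 13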

Since, by the definition (1.2), the binary integer

N = (a_n a_{n-1} ··· a_0)_2

represents the value of the polynomial (1.3) at x = 2, we can use Algorithm 1.1, with z = 2, to find the decimal equivalents of binary integers. Thus the decimal equivalent of (1101)_2 computed using Algorithm 1.1 is

b_3 = 1
b_2 = 1 + 1·2 = 3
b_1 = 0 + 3·2 = 6
b_0 = 1 + 6·2 = 13

and the decimal equivalent of (10000)_2 is

b_4 = 1,  b_3 = 2,  b_2 = 4,  b_1 = 8,  b_0 = 16
Converting a decimal integer N into its binary equivalent can also be accomplished by Algorithm 1.1 if one is willing to use binary arithmetic. For if N = (a_n a_{n-1} ··· a_0)_{10}, then by the definition (1.1), N = p(10), where p(x) is the polynomial (1.3). Hence we can calculate the binary representation for N by translating the coefficients a_n, . . . , a_0 into binary integers and then using Algorithm 1.1 to evaluate p(x) at x = 10 = (1010)_2 in binary arithmetic. If, for example, N = 187, then

p(x) = 1·x^2 + 8·x + 7    with 1 = (1)_2, 8 = (1000)_2, 7 = (111)_2

and using Algorithm 1.1 and binary arithmetic,

b_2 = (1)_2
b_1 = (1000)_2 + (1)_2·(1010)_2 = (10010)_2
b_0 = (111)_2 + (10010)_2·(1010)_2 = (10111011)_2

Therefore 187 = (10111011)_2.


Binary numbers and binary arithmetic, though ideally suited for
today’s computers, are somewhat tiresome for people because of the
number of digits necessary to represent even moderately sized numbers.
Thus eight binary digits are necessary to represent the three-decimal-digit
number 187. The octal number system, using the base 8, presents a kind of
compromise between the computer-preferred binary and the people-pre-
ferred decimal system. It is easy to convert from octal to binary and back
since three binary digits make one octal digit. To convert from octal to
binary, one merely replaces all octal digits by their binary equivalents; thus

(273)_8 = (010 111 011)_2 = (10111011)_2

Conversely, to convert from binary to octal, one partitions the binary digits in groups of three (starting from the right) and then replaces each three-group by its octal digit; thus

(10111011)_2 = (10 111 011)_2 = (273)_8
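Since the correspondence is digit by digit, the conversion is easily mechanized. A minimal Python sketch (ours, for illustration only):

    def octal_to_binary(octal_digits):
        # Replace each octal digit by its three binary digits.
        bits = ''.join(format(int(d), '03b') for d in octal_digits)
        return bits.lstrip('0') or '0'

    def binary_to_octal(bits):
        # Pad on the left to a multiple of three, then group by threes.
        bits = bits.zfill(-(-len(bits) // 3) * 3)
        return ''.join(str(int(bits[i:i+3], 2)) for i in range(0, len(bits), 3))

    print(octal_to_binary('273'))       # 10111011
    print(binary_to_octal('10111011'))  # 273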

If a decimal integer has to be converted to binary by hand, it is usually fastest to convert it first to octal using Algorithm 1.1, and then from octal to binary. To take an earlier example, if N = 187, then

p(x) = 1·x^2 + 8·x + 7    with 1 = (1)_8, 8 = (10)_8, 7 = (7)_8

Hence, using Algorithm 1.1 [with 2 replaced by 10 = (12)_8, and with octal arithmetic],

b_2 = (1)_8
b_1 = (10)_8 + (1)_8·(12)_8 = (22)_8
b_0 = (7)_8 + (22)_8·(12)_8 = (273)_8

Therefore, finally,

187 = (273)_8 = (010 111 011)_2 = (10111011)_2

EXERCISES

1.1-1 Convert the following binary numbers to decimal form:

1.1-2 Convert the following decimal numbers to binary form:


82, 109, 3433
1.1-3 Carry out the conversions in Exercises 1.1-1 and 1.1-2 by converting first to octal form.
1.1-4 Write a FORTRAN subroutine which accepts a number to the base BETIN with the
NIN digits contained in the one-dimensional array NUMIN, and returns the NOUT digits of
the equivalent in base BETOUT in the one-dimensional array NUMOUT. For simplicity,
restrict both BETIN and BETOUT to 2, 4, 8, and 10.

1.2 THE REPRESENTATION OF FRACTIONS


If x is a positive real number, then its integral part x_I is the largest integer less than or equal to x, while

x_F = x - x_I

is its fractional part. The fractional part can always be written as a decimal
fraction:

x_F = (.b_1 b_2 b_3 ···)_{10} = b_1 10^{-1} + b_2 10^{-2} + b_3 10^{-3} + ···    (1.4)

where each b_k is a nonnegative integer less than 10. If b_k = 0 for all k greater than a certain integer, then the fraction is said to terminate. Thus

1/4 = 0.25000 ···

is a terminating decimal fraction, while

1/3 = 0.33333 ···

is not.
If the integral part of x is given as a decimal integer by

x_I = (a_n a_{n-1} ··· a_0)_{10}

while the fractional part is given by (1.4), it is customary to write the two representations one after the other, separated by a point, the “decimal point”:

x = (a_n a_{n-1} ··· a_0 . b_1 b_2 b_3 ···)_{10}
Completely analogously, one can write the fractional part of x as a


binary fraction:

x_F = (.b_1 b_2 b_3 ···)_2 = b_1 2^{-1} + b_2 2^{-2} + b_3 2^{-3} + ···

where each b_k is a nonnegative integer less than 2, i.e., either zero or one. If the integral part of x is given by the binary integer

x_I = (a_n a_{n-1} ··· a_0)_2

then we write

x = (a_n ··· a_0 . b_1 b_2 ···)_2

using a “binary point.”


The binary fraction (.b_1 b_2 b_3 ···)_2 for a given number x_F between zero and one can be calculated as follows: If

x_F = (.b_1 b_2 b_3 ···)_2

then

2x_F = (b_1 . b_2 b_3 ···)_2

Hence b_1 is the integral part of 2x_F, while

(2x_F)_F = (.b_2 b_3 b_4 ···)_2

is its fractional part. Therefore, repeating this procedure, we find that b_2 is the integral part of 2(2x_F)_F, b_3 is the integral part of 2(2(2x_F)_F)_F, etc. If, for example, x = 0.625 = x_F, then

2·(0.625) = 1.25    hence b_1 = 1
2·(0.25) = 0.5      hence b_2 = 0
2·(0.5) = 1.0       hence b_3 = 1

and all further b_k’s are zero. Hence

0.625 = (.101)_2

This example was rigged to give a terminating binary fraction. Unhappily, not every terminating decimal fraction gives rise to a terminating binary fraction. This is due to the fact that the binary fraction for

0.1 = 1/10

is not terminating. We have

2·(0.1) = 0.2    hence b_1 = 0
2·(0.2) = 0.4    hence b_2 = 0
2·(0.4) = 0.8    hence b_3 = 0
2·(0.8) = 1.6    hence b_4 = 1
2·(0.6) = 1.2    hence b_5 = 1

and now we are back to a fractional part of 0.2, so that the digits cycle. It follows that

0.1 = (.000110011001100 ···)_2
The procedure just outlined is formalized in the following algorithm.

Algorithm 1.2 Given x between 0 and 1 and an integer β greater than 1. Generate recursively b_1, b_2, b_3, . . . by

x_0 = x
b_k = integral part of βx_{k-1};  x_k = fractional part of βx_{k-1},   k = 1, 2, 3, . . .

Then

x = (.b_1 b_2 b_3 ···)_β
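A Python sketch of Algorithm 1.2 (again ours, not the book's; exact rationals are used in the second call so that the digits of 1/10 can be generated without floating-point error):

    from fractions import Fraction

    def fraction_digits(x, beta, how_many):
        # Digits b_1, b_2, ... of x in base beta, for 0 < x < 1.
        digits = []
        for _ in range(how_many):
            x = x * beta
            b = int(x)          # b_k = integral part of beta * x_(k-1)
            digits.append(b)
            x = x - b           # x_k = fractional part of beta * x_(k-1)
        return digits

    print(fraction_digits(Fraction(5, 8), 2, 5))    # [1, 0, 1, 0, 0]: (.101)_2
    print(fraction_digits(Fraction(1, 10), 2, 12))  # [0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1]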

We have stated this algorithm for a general base β rather than for the specific binary base β = 2 for two reasons. If this conversion to binary is carried out with pencil and paper, it is usually faster to convert first to octal, i.e., use β = 8, and then to convert from octal to binary. Also, the algorithm can be used to convert a binary (or octal) fraction to decimal, by choosing β = 10 and using binary (or octal) arithmetic.
To give an example, if x = (.101)_2, then, with β = 10 = (1010)_2 and binary arithmetic, we get from Algorithm 1.2

(1010)_2·(.101)_2 = (110.01)_2    hence b_1 = (110)_2 = 6
(1010)_2·(.01)_2 = (10.1)_2       hence b_2 = (10)_2 = 2
(1010)_2·(.1)_2 = (101.)_2        hence b_3 = (101)_2 = 5

Hence subsequent b_k’s are zero. This shows that

(.101)_2 = .625

confirming our earlier calculation. Note that if x_F is a terminating binary fraction with n digits, then it is also a terminating decimal fraction with n digits, since

(.b_1 b_2 ··· b_n)_2 = B/2^n = B·5^n/10^n    with B = (b_1 b_2 ··· b_n)_2 < 2^n

EXERCISES
1.2-1 Convert the following binary fractions to decimal fractions:
(.1100011)_2    (.11111111)_2
1.2-2 Find the first 5 digits of .1 written as an octal fraction, then compute from it the first 15
digits of .1 as a binary fraction.
1.2-3 Convert the following octal fractions to decimal:
(.614)_8    (.776)_8
Compare with your answer in Exercise 1.2-1.
1.2-4 Find a binary number which approximates to within 10^{-3}.
1.2-5 If we want to convert a decimal integer N to binary using Algorithm 1.1, we have to use
binary arithmetic. Show how to carry out this conversion using Algorithm 1.2 and decimal
arithmetic. (Hint: Divide N by the appropriate power of 2, convert the result to binary, then
shift the “binary point” appropriately.)
1.2-6 If we want to convert a terminating binary fraction x to a decimal fraction using
Algorithm 1.2, we have to use binary arithmetic. Show how to carry out this conversion using
Algorithm 1.1 and decimal arithmetic.

1.3 FLOATING-POINT ARITHMETIC

Scientific calculations are usually carried out in floating-point arithmetic.


An n-digit floating-point number in base β has the form

x = ±(.d_1 d_2 ··· d_n)_β · β^e    (1.5)

where (.d_1 d_2 ··· d_n)_β is a β-fraction called the mantissa, and e is an
integer called the exponent. Such a floating-point number is said to be
normalized in case d_1 ≠ 0, or else d_1 = d_2 = ··· = d_n = 0.
For most computers β = 2, although on some β = 16, and in hand
calculations and on most desk and pocket calculators β = 10.
The precision or length n of floating-point numbers on any particular
computer is usually determined by the word length of the computer and
may therefore vary widely (see Fig. 1.1). Computing systems which accept
FORTRAN programs are expected to provide floating-point numbers of
two different lengths, one roughly double the other. The shorter one, called
single precision, is ordinarily used unless the other, called double precision,
is specifically asked for. Calculation in double precision usually doubles
the storage requirements and more than doubles running time as compared
with single precision.
8 NUMBER SYSTEMS AND ERRORS

[Figure 1.1, a table of floating-point characteristics for various computers, is omitted here.]

The exponent e is limited to a range

m ≤ e ≤ M    (1.6)

for certain integers m and M. Usually, m = -M, but the limits may vary
widely; see Fig. 1.1.
There are two commonly used ways of translating a given real number
x into an n-digit floating-point number fl(x), rounding and chopping. In
rounding, fl(x) is chosen as the normalized floating-point number nearest
x; some special rule, such as symmetric rounding (rounding to an even
digit), is used in case of a tie. In chopping, fl(x) is chosen as the nearest
normalized floating-point number between x and 0. If, for example, two-
decimal-digit floating-point numbers are used, then

fl(2/3) = .67 · 10^0    when rounding

and

fl(2/3) = .66 · 10^0    when chopping
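The following Python sketch (our illustration; Exercise 1.3-5 below asks for a FORTRAN version) mimics fl(x) for n-decimal-digit arithmetic. It is itself carried out in double precision, so it only models the behavior described here:

    import math

    def fl(x, n=2, mode='round'):
        # n-decimal-digit floating-point value of x, by rounding or chopping.
        if x == 0.0:
            return 0.0
        e = math.floor(math.log10(abs(x))) + 1   # 10**(e-1) <= |x| < 10**e
        scale = 10.0 ** (n - e)                  # shift the n mantissa digits left of the point
        m = x * scale
        m = round(m) if mode == 'round' else math.trunc(m)
        return m / scale

    print(fl(2/3, 2, 'round'))   # 0.67
    print(fl(2/3, 2, 'chop'))    # 0.66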

On some computers, this definition of fl(x) is modified in case
|x| ≥ β^M (overflow) or 0 < |x| < β^{m-1} (underflow), where m and M are the
bounds on the exponents; either fl(x) is not defined in this case, causing a
stop, or else fl(x) is represented by a special number which is not subject to
the usual rules of arithmetic when combined with ordinary floating-point
numbers.
The difference between x and fl(x) is called the round-off error. The
round-off error depends on the size of x and is therefore best measured
relative to x. For if we write

fl(x) = x(1 + δ)    (1.7)

where δ = δ(x) is some number depending on x, then it is possible to
bound δ independently of x, at least as long as x causes no overflow or
underflow. For such an x, it is not difficult to show that

|δ| ≤ (1/2)β^{1-n}    in rounding    (1.8)

while

|δ| ≤ β^{1-n}    in chopping    (1.9)



See Exercise 1.3-3. The maximum possible value for |δ| is often called the
unit roundoff and is denoted by u.
When an arithmetic operation is applied to two floating-point num-
bers, the result usually fails to be a floating-point number of the same
length. If, for example, we deal with two two-decimal-digit numbers, their
exact sum or product will in general carry more than two digits (the
product of two two-digit mantissas can have four digits) and so will fail to
be a two-decimal-digit floating-point number.
Hence, if ∘ denotes one of the arithmetic operations (addition, subtraction,
multiplication, or division) and ⊛ denotes the floating-point operation of
the same name provided by the computer, then, however the computer
may arrive at the result x ⊛ y for two given floating-point numbers x and
y, we can be sure that usually

x ⊛ y ≠ x ∘ y
Although the floating-point operation ⊛ corresponding to ∘ may vary in
some details from machine to machine, ⊛ is usually constructed so that

x ⊛ y = fl(x ∘ y)    (1.10)
In words, the floating-point sum (difference, product, or quotient) of two
floating-point numbers usually equals the floating-point number which
represents the exact sum (difference, product, or quotient) of the two
numbers. Hence (unless overflow or underflow occurs) we have
x ⊛ y = (x ∘ y)(1 + δ)    |δ| ≤ u    (1.11a)

where u is the unit roundoff. In certain situations, it is more convenient to
use the equivalent formula

x ⊛ y = (x ∘ y)/(1 + δ)    |δ| ≤ u    (1.11b)
Equation (1.11) expresses the basic idea of backward error analysis (see J.
H. Wilkinson [24]†). Explicitly, Eq. (1.11) allows one to interpret a float-
ing-point result as the result of the corresponding ordinary arithmetic, but
performed on slightly perturbed data. In this way, the analysis of the effect
of floating-point arithmetic can be carried out in terms of ordinary
arithmetic.
For example, the value of the function f(x) = x^{2^n} at a point x_0 can be
calculated by n squarings, i.e., by carrying out the sequence of steps

x_{i+1} = x_i · x_i    i = 0, . . . , n - 1

with x_0 the given point, so that x_n = f(x_0). In floating-point arithmetic, we compute instead, according to Eq. (1.11a), the sequence of numbers

y_{i+1} = fl(y_i · y_i) = y_i^2 (1 + ε_{i+1})    y_0 = x_0
†Numbers in brackets refer to items in the references at the end of the book.

with |ε_i| ≤ u, all i. The computed answer is, therefore,

y_n = x_0^{2^n} (1 + ε_1)^{2^{n-1}} (1 + ε_2)^{2^{n-2}} ··· (1 + ε_n)

To simplify this expression, we observe that, if |ε_i| ≤ u, all i, then

(1 + ε_1)(1 + ε_2) ··· (1 + ε_m) = (1 + ε)^m

for some |ε| ≤ u (see Exercise 1.3-6). Also then

(1 + ε)^m = 1 + mε'

for some ε' of roughly the size of ε. Consequently,

y_n = x_0^{2^n} (1 + ε)^{2^n - 1}

for some |ε| ≤ u. In words, the computed value y_n is the
exact value of f(x) at the perturbed argument

x_0 (1 + ε)^{(2^n - 1)/2^n}
We can now gauge the effect which the use of floating-point arithmetic
has had on the accuracy of the computed value for f(x0) by studying how
the value of the (exactly computed) function f(x) changes when the
argument x is perturbed, as is done in the next section. Further, we note
that this error is, in our example, comparable to the error due to the fact
that we had to convert the initial datum x0 to a floating-point number to
begin with.
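The effect is easy to observe numerically. The sketch below (ours) simulates six-decimal-digit chopped arithmetic and checks that the computed power is the exact power of a slightly perturbed argument, as just derived; the particular values of n and x_0 are arbitrary:

    import math

    def chop6(x):
        # A crude model of six-decimal-digit chopped arithmetic.
        if x == 0.0:
            return 0.0
        e = math.floor(math.log10(abs(x))) + 1
        s = 10.0 ** (6 - e)
        return math.trunc(x * s) / s

    n, x0 = 5, 0.9876
    z = chop6(x0)
    for _ in range(n):               # n squarings, each chopped to 6 digits
        z = chop6(z * z)

    perturbed = z ** (1.0 / 2 ** n)  # argument whose exact 2^n-th power is z
    print((perturbed - x0) / x0)     # relative perturbation, roughly the size of u = 1e-5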
As a second example, of particular interest in Chap. 4, consider
calculation of the number s from the equation

a_1 x_1 + a_2 x_2 + ··· + a_r x_r + a_{r+1} s = b    (1.12)

by the formula

s = (b - a_1 x_1 - a_2 x_2 - ··· - a_r x_r)/a_{r+1}

If we obtain s through the steps

s_0 = b
s_i = s_{i-1} - a_i x_i    i = 1, . . . , r
s = s_r / a_{r+1}

then the corresponding numbers computed in floating-point arithmetic satisfy

s*_0 = b
s*_i = (s*_{i-1} - a_i x_i (1 + ε))(1 + ε)    i = 1, . . . , r
s* = (s*_r / a_{r+1})(1 + ε)

Here, we have used Eqs. (1.11a) and (1.11b), and have not bothered to
distinguish the various ε by subscripts. Consequently,

s*_r = b(1 + ε)^r - a_1 x_1 (1 + ε)^{r+1} - a_2 x_2 (1 + ε)^r - ··· - a_r x_r (1 + ε)^2

This shows that the computed value s* for s satisfies the perturbed equation

a_1 x_1 (1 + ε)^{r+2} + a_2 x_2 (1 + ε)^{r+1} + ··· + a_r x_r (1 + ε)^3 + a_{r+1} s* = b(1 + ε)^{r+1}    (1.13)

Note that we can reduce all exponents by 1 in case a_{r+1} = 1, that is, in
case the last division need not be carried out.
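A quick numerical illustration (ours; the data are arbitrary): compute s by the steps above in ordinary double precision and then evaluate the residual of (1.12) carefully. The residual is tiny relative to the terms of the equation, i.e., the computed s solves a nearby equation, which is the content of (1.13):

    import math

    a = [3.1, -2.4, 5.9, 7.3]        # a_1, ..., a_(r+1), hypothetical data
    x = [1.7, 0.3, -4.4]             # x_1, ..., x_r
    b = 2.6

    s = b
    for ai, xi in zip(a[:-1], x):
        s -= ai * xi                 # s_i = s_(i-1) - a_i * x_i
    s /= a[-1]                       # s = s_r / a_(r+1)

    terms = [ai * xi for ai, xi in zip(a[:-1], x)] + [a[-1] * s, -b]
    print(math.fsum(terms))          # residual of (1.12): comparable to roundoff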

EXERCISES

1.3-1 The following numbers are given in a decimal computer with a four-digit normalized
mantissa:

Perform the following operations, and indicate the error in the result, assuming symmetric
rounding:

1.3-2 Let fl(x) be given by chopping. Show that fl(-x) = -fl(x) and that fl(10^k x) = 10^k fl(x)
(unless overflow or underflow occurs).
1.3-3 Let fl(x) be given by chopping and let δ be such that fl(x) = x(1 + δ). (If x = 0, take δ = 0.)
Show that δ is then bounded as in (1.9).
1.3-4 Give examples to show that most of the laws of arithmetic fail to hold for floating-point
arithmetic. (Hint: Try laws involving three operands.)
1.3-5 Write a FORTRAN FUNCTION FL(X) which returns the value of the n-decimal-digit
floating-point number derived from X by rounding. Take n to be 4 and check your
calculations in Exercise 1.3-1. [Use ALOG10(ABS(X)) to determine e such that
10^{e-1} ≤ |X| < 10^e.]
1.3-6 Let |δ_i| ≤ u for i = 1, . . . , m. Show that there exists δ with |δ| ≤ u so
that (1 + δ_1)(1 + δ_2) ··· (1 + δ_m) = (1 + δ)^m. Show also that the product equals 1 + mδ'
for some δ', and estimate how big |δ'| can be, provided all the δ_i have the same sign.
1.3-7 Carry out a backward error analysis for the calculation of the scalar product
x_1y_1 + x_2y_2 + ··· + x_ny_n. Redo the analysis under the assumption that double-precision ac-
cumulation is used. This means that the double-precision results of each multiplication are
retained and added to the sum in double precision, with the resulting sum rounded only at the
end to single precision.

1.4 LOSS OF SIGNIFICANCE AND ERROR PROPAGATION;
CONDITION AND INSTABILITY

If the number x* is an approximation to the exact answer x, then we call
the difference x - x* the error in x*; thus

Exact = approximation + error    (1.14)

The relative error in x*, as an approximation to x, is defined to be the
number (x - x*)/x. Note that this number is close to the number
(x - x*)/x* if it is at all small. [Precisely, if (x - x*)/x = δ, then
(x - x*)/x* = δ/(1 - δ).]
Every floating-point operation in a computational process may give
rise to an error which, once generated, may then be amplified or reduced
in subsequent operations.
One of the most common (and often avoidable) ways of increasing the
importance of an error is commonly called loss of significant digits. If x* is
an approximation to x, then we say that x* approximates x to r significant
digits provided the absolute error |x - x*| is at most one-half unit in the rth
significant digit of x. This can be expressed in a formula as

|x - x*| ≤ (1/2)·10^{s-r+1}    (1.15)

with s the largest integer such that 10^s ≤ |x|. For instance, x* = 3 agrees
with x = π = 3.14159 ··· to one significant (decimal) digit, while x* = 3.14
is correct to three significant digits (as an approximation to π). Suppose
now that we are to calculate the number

z = x - y
and that we have approximations x* and y* for x and y, respectively,


available, each of which is good to r digits. Then

z* = x* - y*

is an approximation for z, which is also good to r digits unless x* and y*
agree to one or more digits. In this latter case, there will be cancellation of
digits during the subtraction, and consequently z* will be accurate to fewer
than r digits.
Consider, for example, two numbers x* and y* which agree in their first
four significant digits, and assume each to be an approximation to x and y,
respectively, correct to seven significant digits. Then, in eight-digit floating-point
arithmetic,

z* = x* - y*

is the exact difference between x* and y*. But as an approximation to
z = x - y, z* is good only to three digits, since the fourth significant digit
of z* is derived from the eighth digits of x* and y*, both possibly in error.

Hence, while the error in z* (as an approximation to z = x - y) is at most
the sum of the errors in x* and y*, the relative error in z* is possibly 10,000
times the relative error in x* or y*. Loss of significant digits is therefore
dangerous only if we wish to keep the relative error small.
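The loss is easy to reproduce. In the Python sketch below (ours; the two eight-digit values are hypothetical), perturbing x* in its seventh significant digit, which is consistent with its being correct to only seven digits, changes the difference in its third or fourth significant digit:

    x_star = 0.76545421
    y_star = 0.76541234                 # agrees with x_star in four leading digits
    z_star = x_star - y_star            # 4.187e-05, computed exactly

    x_pert = x_star + 5.0e-8            # perturb the seventh significant digit
    z_pert = x_pert - y_star
    print((z_pert - z_star) / z_star)   # relative change of about 1.2e-3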
Such loss can often be avoided by anticipating its occurrence. Consider,
for example, the evaluation of the function

f(x) = 1 - cos x

in six-decimal-digit arithmetic. Since cos x ≈ 1 for x near zero, there will
be loss of significant digits for x near zero if we calculate f(x) by first
finding cos x and then subtracting the calculated value from 1. For we
cannot calculate cos x to more than six digits, so that the error in the
calculated value may be as large as 5·10^{-7}, hence as large as, or larger
than, f(x) for x near zero. If one wishes to compute the value of f(x) near
zero to about six significant digits using six-digit arithmetic, one would
have to use an alternative formula for f(x), such as

f(x) = sin^2 x / (1 + cos x)

which can be evaluated quite accurately for small x; else, one could make
use of the Taylor expansion (see Sec. 1.7) for f(x),

f(x) = x^2/2 - x^4/24 + x^6/720 - ···

which shows, for example, that for |x| ≤ 10^{-3}, x^2/2 agrees with f(x) to at
least six significant digits.
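The two formulas are easy to compare in Python (our illustration; double precision plays the role of the six-digit arithmetic, with the cancellation correspondingly further out):

    import math

    def f_naive(x):
        return 1.0 - math.cos(x)             # cancellation near zero

    def f_stable(x):
        s = math.sin(x)
        return s * s / (1.0 + math.cos(x))   # algebraically the same, no cancellation

    x = 1.0e-9
    print(f_naive(x))    # 0.0: all significance lost
    print(f_stable(x))   # about 5e-19, essentially x*x/2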
Another example is provided by the problem of finding the roots of
the quadratic equation

ax^2 + bx + c = 0    (1.16)

We know from algebra that the roots are given by the quadratic formula

x = (-b ± √(b^2 - 4ac)) / (2a)    (1.17)

Let us assume that b^2 - 4ac > 0, that b > 0, and that we wish to find the
root of smaller absolute value using (1.17); i.e.,

x = (-b + √(b^2 - 4ac)) / (2a)    (1.18)

If 4ac is small compared with b^2, then √(b^2 - 4ac) will agree with b to
several places. Hence, given that √(b^2 - 4ac) will be calculated correctly
only to as many places as are used in the calculations, it follows that the
numerator of (1.18), and therefore the calculated root, will be accurate to
fewer places than were used during the calculation. To be specific, take the
equation

x^2 + 111.11x + 1.2121 = 0    (1.19)
Using (1.18) and five-decimal-digit floating-point chopped arithmetic, we
calculate

x* = -0.015000

while in fact,

x = -0.010910 ···

is the correct root to the number of digits shown. Here too, the loss of
significant digits can be avoided by using an alternative formula for the
calculation of the absolutely smaller root, viz.,

x = -2c / (b + √(b^2 - 4ac))    (1.20)

Using this formula, and five-decimal-digit arithmetic, we calculate

x* = -0.010910

which is accurate to five digits.
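In Python (our illustration, with a made-up coefficient set in place of (1.19); double precision here plays the role of the five-digit arithmetic):

    import math

    def small_root_unstable(a, b, c):
        # Formula (1.18): cancellation when 4ac is small compared with b*b.
        return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

    def small_root_stable(a, b, c):
        # Formula (1.20): algebraically equivalent, no cancellation.
        return -2.0 * c / (b + math.sqrt(b * b - 4.0 * a * c))

    a, b, c = 1.0, 1.0e8, 1.0            # true small root is very nearly -1.0e-8
    print(small_root_unstable(a, b, c))  # only the order of magnitude survives
    print(small_root_stable(a, b, c))    # -1e-08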


Once an error is committed, it contaminates subsequent results. This
error propagation through subsequent calculations is conveniently studied
in terms of the two related concepts of condition and instability.
The word condition is used to describe the sensitivity of the function
value f(x) to changes in the argument x. The condition is usually measured
by the maximum relative change in the function value f(x) caused by a
unit relative change in the argument. In a somewhat informal formula,
condition of f at x = max { |(f(x) - f(x*))/f(x)| ÷ |(x - x*)/x| : x* near x }
                    ≈ |x f'(x)/f(x)|    (1.21)

The larger the condition, the more ill-conditioned the function is said to
be. Here we have made use of the fact (see Sec. 1.7) that

f(x) - f(x*) ≈ f'(x)(x - x*)

i.e., the change in argument from x to x* changes the function value by
approximately f'(x)(x* - x).
If, for example, f(x) = √x, then f'(x) = 1/(2√x); hence the condition of f is, approximately,

|x f'(x)/f(x)| = |x (1/(2√x)) / √x| = 1/2
This says that taking square roots is a well-conditioned process since it


actually reduces the relative error. By contrast, if

f(x) = 1/(1 - x^2)

then f'(x) = 2x/(1 - x^2)^2, so that

|x f'(x)/f(x)| = |2x^2/(1 - x^2)|

and this number can be quite large for |x| near 1. Thus, for x near 1 or
- 1, this function is quite ill-conditioned. It very much magnifies relative
errors in the argument there.
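Formula (1.21) is easily evaluated numerically; a sketch (ours), applied to the two examples of this section:

    import math

    def condition(f, fprime, x):
        # Condition of f at x, per (1.21): |x f'(x) / f(x)|.
        return abs(x * fprime(x) / f(x))

    print(condition(math.sqrt, lambda x: 0.5 / math.sqrt(x), 2.0))   # 0.5

    g = lambda x: 1.0 / (1.0 - x * x)
    gp = lambda x: 2.0 * x / (1.0 - x * x) ** 2
    print(condition(g, gp, 0.999))   # roughly 1000: ill-conditioned near 1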
The related notion of instability describes the sensitivity of a numerical
process for the calculation of f(x) from x to the inevitable rounding errors
committed during its execution in finite precision arithmetic. The precise
effect of these errors on the accuracy of the computed value for f(x) is
hard to determine except by actually carrying out the computations for
particular finite precision arithmetics and comparing the computed answer
with the exact answer. But it is possible to estimate these effects roughly by
considering the rounding errors one at a time. This means we look at the
individual computational steps which make up the process. Suppose there
are n such steps. Denote by x_i the output from the ith such step, and take
x_0 = x. Such an x_i then serves as input to one or more of the later steps
and, in this way, influences the final answer x_n = f(x). Denote by f_i the
function which describes the dependence of the final answer on the
intermediate result x_i. In particular, f_0 is just f. Then the total process is
unstable to the extent that one or more of these functions f_i is ill-condi-
tioned. More precisely, the process is unstable to the extent that one or
more of the f_i’s has a much larger condition than f = f_0 has. For it is the
condition of f_i which gauges the relative effect of the inevitable rounding
error incurred at the ith step on the final answer.
To give a simple example, consider the function

f(x) = √(x + 1) - √x

for “large” x, say for x = 12345. Its condition there is

|x f'(x)/f(x)| = x / (2√(x + 1) √x) ≈ 1/2

which is quite good. But, if we calculate f(12345) in six-decimal arithmetic,
