This document contains solutions to 7 problems about floating-point numbers and round-off errors in the Marc-32 format. Problem 1 finds the round-off error and relative error when 3/5 is rounded to a binary number. Problem 2 shows that (4/5)(1 - 2^-24) is a machine number. Problem 3 converts a decimal number to Marc-32 format. Problems 4 and 5 give the bit representations of numbers. Problem 6 proposes a way to compute sin(x) + cos(x) - 1 near zero. Problem 7 analyzes rounding near a particular number.

Uploaded by

Rabia Saeed
Copyright
© Attribution Non-Commercial (BY-NC)

CIS541 Floating-Point Numbers and Roundoff Errors.

Examples of Problems Solution

Slide 1

Problem 1.
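Specified: If $3/5$ is correctly rounded to the normalized binary number $(1.a_1 a_2 \ldots a_{23})_2 \times 2^m$, what is the roundoff error? What is the relative roundoff error?
Solution: $x = 3/5 = (0.6)_{10}$. Doubling the fractional part repeatedly gives the binary digits $c_k$ (the bracketed integer parts):
$$
\begin{aligned}
0.6000 \times 2 &= [1].2000, & c_1 &= 1\\
0.2000 \times 2 &= [0].4000, & c_2 &= 0\\
0.4000 \times 2 &= [0].8000, & c_3 &= 0\\
0.8000 \times 2 &= [1].6000, & c_4 &= 1\\
0.6000 \times 2 &= [1].2000, & c_5 &= 1\\
0.2000 \times 2 &= [0].4000, & c_6 &= 0\\
0.4000 \times 2 &= [0].8000, & c_7 &= 0\\
0.8000 \times 2 &= [1].6000, & c_8 &= 1\\
0.6000 \times 2 &= [1].2000, & c_9 &= 1
\end{aligned}
$$
and so on.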
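The doubling procedure above can be sketched in a few lines of Python. This is only an illustration: the function name `binary_digits` is hypothetical, and exact rational arithmetic (`fractions.Fraction`) is assumed so that no rounding disturbs the digits.

```python
from fractions import Fraction

def binary_digits(x, n):
    """Return the first n binary digits of a fraction x with 0 <= x < 1."""
    digits = []
    for _ in range(n):
        x *= 2
        bit = int(x)        # the integer part of the product is the next digit c_k
        digits.append(bit)
        x -= bit            # keep only the fractional part for the next step
    return digits

print(binary_digits(Fraction(3, 5), 9))
```

The digits repeat with period 4 (1001), which matches the expansion used in the rest of the solution.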

$x = 3/5 = (.1001\,1001\ldots)_2 \times 2^0 = (1.001\,1001\,1001\,1001\,1001\,1001\ldots)_2 \times 2^{-1}$, where the 23rd bit of the mantissa is 1.
Chopping gives $x_- = (1.001\,1001\,1001\,1001\,1001\,1001)_2 \times 2^{-1}$ and rounding up gives
$x_+ = (1.001\,1001\,1001\,1001\,1001\,1001)_2 \times 2^{-1} + 2^{-1} \times 2^{-23} = (1.001\,1001\,1001\,1001\,1001\,1010)_2 \times 2^{-1}$.
Now we have to determine which of $x_-$ and $x_+$ is closer to $x$.
$x - x_- = (0.1001\,1001\ldots)_2 \times 2^{-24} = (3/5) \times 2^{-24}$ and
$x_+ - x = (x_+ - x_-) - (x - x_-) = 2^{-24} - (3/5) \times 2^{-24} = (2/5) \times 2^{-24}$.
So $x_+$ is closer to $x$, and $fl(x) = x_+$.
The absolute roundoff error is
$$|fl(x) - x| = (2/5) \times 2^{-24}.$$
The relative error is then
$$\frac{|fl(x) - x|}{|x|} = \frac{(2/5) \times 2^{-24}}{3/5} = \frac{2}{3} \times 2^{-24}.$$
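As a check on the arithmetic, the two neighbouring machine numbers and both errors can be computed exactly. The sketch below assumes the Marc-32 significand behaves like IEEE 754 single precision (1 hidden bit plus 23 fraction bits); the variable names are illustrative, and the `struct` round trip is only a cross-check that goes through a Python double first.

```python
from fractions import Fraction
import struct

x = Fraction(3, 5)
ulp = Fraction(1, 2**24)          # spacing of machine numbers in [1/2, 1): 2^-1 * 2^-23

x_minus = (x // ulp) * ulp        # chopped value, just below x
x_plus = x_minus + ulp            # next machine number, just above x
# Round to nearest (no tie occurs for this x).
fl_x = x_plus if x_plus - x < x - x_minus else x_minus

abs_err = abs(fl_x - x)
rel_err = abs_err / x

# Cross-check: 0.6 as a double, then rounded to 32-bit storage.
as_float32 = struct.unpack(">f", struct.pack(">f", 0.6))[0]

print(abs_err * 2**24, rel_err * 2**24, Fraction(as_float32) == fl_x)
```

Scaling both errors by $2^{24}$ makes them easy to compare with the fractions $2/5$ and $2/3$ in the solution.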

Slide 2

Problem 2.
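Specified: Is $\frac{4}{5}(1 - 2^{-24})$ a machine number in the Marc-32? Explain.
Solution: We omit here the conversion of $4/5$ to a binary fraction.
$$\frac{4}{5} = (0.1100\,1100\,1100\,1100\,1100\,1100\ldots)_2$$
After chopping $x = 4/5$ to 24 digits we get $x_-$:
$$
\begin{aligned}
x_- &= (0.1100\,1100\,1100\,1100\,1100\,1100)_2\\
&= \left(\tfrac12\right) + \left(\tfrac12\right)^2 + \left(\tfrac12\right)^5 + \left(\tfrac12\right)^6 + \ldots + \left(\tfrac12\right)^{21} + \left(\tfrac12\right)^{22}\\
&= \left[\left(\tfrac12\right)^1 + \left(\tfrac12\right)^5 + \ldots + \left(\tfrac12\right)^{21}\right] + \left[\left(\tfrac12\right)^2 + \left(\tfrac12\right)^6 + \ldots + \left(\tfrac12\right)^{22}\right]\\
&= \left(\tfrac12\right)\frac{1 - \left(\left(\tfrac12\right)^4\right)^6}{1 - \left(\tfrac12\right)^4} + \left(\tfrac12\right)^2\frac{1 - \left(\left(\tfrac12\right)^4\right)^6}{1 - \left(\tfrac12\right)^4}\\
&= \left(\left(\tfrac12\right) + \left(\tfrac12\right)^2\right)\frac{1 - \left(\left(\tfrac12\right)^4\right)^6}{1 - \left(\tfrac12\right)^4}\\
&= \frac{3}{4} \cdot \frac{16}{15}\left(1 - 2^{-24}\right) = \frac{4}{5}\left(1 - 2^{-24}\right).
\end{aligned}
$$
Each bracket is a geometric series with 6 terms and ratio $(1/2)^4$. Hence $\frac{4}{5}(1 - 2^{-24})$ equals the 24-digit binary fraction $x_- = (1.100\,1100\,1100\,1100\,1100\,1100)_2 \times 2^{-1}$, which fits in the 23-bit mantissa, so it is a machine number.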
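The claim can be checked numerically. The sketch below is an illustration only and assumes that Marc-32 machine numbers coincide with IEEE 754 single-precision values in this range: the value, scaled by $2^{24}$, should be an integer below $2^{24}$ with the bit pattern 1100 repeated, and it should survive a round trip through 32-bit storage unchanged.

```python
from fractions import Fraction
import struct

x = Fraction(4, 5) * (1 - Fraction(1, 2**24))

scaled = x * 2**24                       # an integer here means x = (0.b1 ... b24)_2
bits = format(int(scaled), "024b")       # the 24 binary digits after the point

# Round trip through single-precision storage; a machine number is unchanged.
roundtrip = struct.unpack(">f", struct.pack(">f", float(x)))[0]

print(scaled.denominator == 1, bits, Fraction(roundtrip) == x)
```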


Slide 3

Problem 3.
Specified: Determine the representation in the Marc-32 for the decimal number 64.015625. Also, determine the
machine precision in single, double, and extended precision (10 points).

Solution:
$(64)_{10} = (100)_8 = (1\,000\,000)_2$
$(0.015625)_{10} = (0.01)_8 = (0.000\,001)_2$
$(64.015625)_{10} = (1\,000\,000.000\,001)_2 = (1.000\,000\,000\,001)_2 \times 2^6$
$e - 127 = 6$
$e = 127 + 6 = (133)_{10} = (205)_8 = (10\,000\,101)_2$
$[0\ 10\,000\,101\ 000\,0000\,0000\,1000\,0000\,0000]_2 = [42800800]_{16}$
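The encoding can be reproduced with Python's `struct` module, assuming the Marc-32 layout matches IEEE 754 single precision (1 sign bit, 8 exponent bits biased by 127, 23 fraction bits). The variable names below are illustrative.

```python
import struct

raw = struct.pack(">f", 64.015625)       # big-endian single-precision bytes
word = int.from_bytes(raw, "big")

sign = word >> 31                        # 1 bit
exponent = (word >> 23) & 0xFF           # 8 bits, biased by 127
fraction = word & 0x7FFFFF               # 23 bits

print(raw.hex(), sign, exponent, format(fraction, "023b"))
```

The single 1 in the fraction field sits in position 12, which corresponds to the $2^{-12}$ term of $(1.000\,000\,000\,001)_2$.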


Slide 4

Problem 4.
Specified: Identify the floating-point numbers corresponding to the following bit strings in the Marc-32:

[0 01111011 1001 1001 1001 1001 1001 100]

Solution:
$(0111\,1011)_2 = (173)_8 = (123)_{10}$; $m = e - 127 = 123 - 127 = -4$

$(1.1001\,1001\,1001\,1001\,1001\,100)_2 \times 2^{-4} = (0.0001\,1001\,1001\,1001\,1001\,1001\,100)_2$
$= (0.063146314)_8 = 8^{-9}\,[((((((((0) \cdot 8 + 6) \cdot 8 + 3) \cdot 8 + 1) \cdot 8 + 4) \cdot 8 + 6) \cdot 8 + 3) \cdot 8 + 1) \cdot 8 + 4]$
$= 13421772 / 134217728 = 0.09999999404$
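The decoding can be replayed exactly with rational arithmetic. This sketch assumes the normalized formula $(-1)^s (1.f)_2 \times 2^{e-127}$ for the bit string; the names `bits`, `horner` and `value` are illustrative. It also repeats the Horner evaluation of the octal digits.

```python
from fractions import Fraction

bits = "0" + "01111011" + "10011001100110011001100"   # sign, exponent, fraction

sign = int(bits[0])
e = int(bits[1:9], 2)
f = int(bits[9:], 2)

# Normalized number: (-1)^s * (1.f)_2 * 2^(e - 127)
value = (-1) ** sign * (1 + Fraction(f, 2**23)) * Fraction(2) ** (e - 127)

# Horner evaluation of the octal digits of (0.063146314)_8
horner = 0
for digit in [0, 6, 3, 1, 4, 6, 3, 1, 4]:
    horner = horner * 8 + digit

print(e - 127, value, float(value))
```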


Slide 5

Problem 5.
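Specified: In the Marc-32, what are the bit-string representations for the subnormal number $2^{-127} + 2^{-128}$?
Solution:
$(1 + 2^{-1}) \times 2^{-127} = (1.1)_2 \times 2^{-127} \rightarrow (0\ 00000000\ 100\,0000\,0000\,0000\,0000\,0000)_2$
This string reads the zero exponent field as $2^{-127}$ with an implicit leading 1. Under the IEEE 754 single-precision convention for subnormals, $(0.f)_2 \times 2^{-126}$, the same value is $(0.11)_2 \times 2^{-126}$, so the fraction field begins $110\,0000\ldots$ instead.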
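For comparison, the sketch below shows how IEEE 754 single precision stores this value, using Python's `struct` module. It assumes the IEEE convention that a zero exponent field encodes $(0.f)_2 \times 2^{-126}$ with no implicit leading 1. Under that assumption the fraction field begins with 11, which differs from the string on the slide, where the zero exponent field is read as $2^{-127}$ with an implicit leading 1.

```python
import struct

x = 2.0**-127 + 2.0**-128                # exactly representable as a double

raw = struct.pack(">f", x)               # stored as an IEEE 754 single subnormal
bits = format(int.from_bytes(raw, "big"), "032b")

sign, exponent, fraction = bits[0], bits[1:9], bits[9:]
print(sign, exponent, fraction)

# Decode with the subnormal rule (0.f)_2 * 2^-126 to confirm the value.
decoded = int(fraction, 2) / 2**23 * 2.0**-126
print(decoded == x)
```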


Slide 6

Problem 6.
Specified: Find a good way to compute $\sin x + \cos x - 1$ for $x$ near zero.

Solution:
$\sin x = x - (x^3/3!) + (x^5/5!) - \cdots$
$\cos x = 1 - (x^2/2!) + (x^4/4!) - \cdots$
$\sin x + \cos x - 1 = x - (x^2/2!) - (x^3/3!) + (x^4/4!) + (x^5/5!) - \cdots$
$\approx x(1 - (x/2)(1 + (x/3)(1 - (x/4)(1 + (x/5)))))$
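To see why the nested form helps, the sketch below compares it with direct evaluation in double precision. The function names `naive` and `nested` are illustrative, and the nested form is the series truncated after the $x^5$ term, so it is only meant for small $|x|$.

```python
import math

def naive(x):
    """Direct evaluation; cos(x) - 1 cancels badly when x is near zero."""
    return math.sin(x) + math.cos(x) - 1.0

def nested(x):
    """Taylor series through x^5 in nested (Horner-like) form."""
    return x * (1 - (x / 2) * (1 + (x / 3) * (1 - (x / 4) * (1 + x / 5))))

for x in (1e-1, 1e-6, 1e-12):
    print(x, naive(x), nested(x))
```

For very small $x$ the direct formula first forms a sum close to 1 and then subtracts 1, so most of the significant digits of the result are lost; the nested form never forms that intermediate sum.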


Slide 7

Problem 7.
Specified: Let $x = 2^5 + 2^{-17} + 2^{-21}$. Find the machine numbers on the Marc-32 that are just to the right and just to the left of $x$. Determine $fl(x)$, the absolute error $|x - fl(x)|$, and the relative error $|x - fl(x)| / |x|$. Verify that the relative error in this case does not exceed $2^{-24}$.
Solution:
$x = 2^5 + 2^{-17} + 2^{-21} = (100000.0000\,0000\,0000\,0000\,1000\,1)_2$
$= (.1000\,0000\,0000\,0000\,0000\,0010\,001)_2 \times 2^6$,
$x_- = (.1000\,0000\,0000\,0000\,0000\,0010)_2 \times 2^6 = 2^5 + 2^{-17}$, and
$x_+ = (.1000\,0000\,0000\,0000\,0000\,0011)_2 \times 2^6 = 2^5 + 2^{-17} + 2^{-18}$.
$x - x_- = 2^{-21}$, $x_+ - x = 2^{-18} - 2^{-21}$.
$x_-$ is closer to $x$. So $fl(x) = x_- = 2^5 + 2^{-17}$. The absolute error is $|x - fl(x)| = 2^{-21}$, and the relative error is
$$\frac{|x - fl(x)|}{|x|} = \frac{2^{-21}}{2^5 + 2^{-17} + 2^{-21}} < 2^{-26} < 2^{-24}.$$
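The neighbours and the error bounds can be confirmed with exact rational arithmetic. The sketch below assumes a 24-bit significand as in IEEE 754 single precision, so the spacing of machine numbers in $[2^5, 2^6)$ is $2^{-18}$; the variable names are illustrative.

```python
from fractions import Fraction
import struct

x = Fraction(2) ** 5 + Fraction(1, 2**17) + Fraction(1, 2**21)
ulp = Fraction(1, 2**18)            # spacing in [2^5, 2^6): 2^5 * 2^-23

x_minus = (x // ulp) * ulp          # machine number just to the left of x
x_plus = x_minus + ulp              # machine number just to the right of x
# Round to nearest (no tie occurs for this x).
fl_x = x_minus if x - x_minus < x_plus - x else x_plus

abs_err = abs(x - fl_x)
rel_err = abs_err / x

# Cross-check: x is exact as a double, so this round trip rounds it once.
roundtrip = struct.unpack(">f", struct.pack(">f", float(x)))[0]

print(fl_x == x_minus, abs_err == Fraction(1, 2**21), rel_err < Fraction(1, 2**26))
```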
