MIPS floating-point arithmetic

Floating-point computations are vital for many applications, but correct implementation of floating-point hardware and software is very tricky. Today we'll study the IEEE 754 standard for floating-point arithmetic. Floating-point number representations are complex, but limited. Addition and multiplication operations require several steps. The MIPS architecture includes support for floating-point arithmetic. Machine Problem 2 will include some floating-point programming in MIPS. Sections this week will review the last three lectures on arithmetic.

February 26, 2003

2001-2003 Howard Huang

Floating-point representation
IEEE numbers are stored using a kind of scientific notation.

    mantissa × 2^exponent

We can represent floating-point numbers with three binary fields: a sign bit s, an exponent field e, and a fraction field f.

    | s | e | f |

The IEEE 754 standard defines several different precisions. Single precision numbers include an 8-bit exponent field and a 23-bit fraction, for a total of 32 bits. Double precision numbers have an 11-bit exponent field and a 52-bit fraction, for a total of 64 bits. There are also various extended precision formats. For example, Intel processors use an 80-bit format internally.
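
The three fields can be pulled out of a C float with shifts and masks. This is only a sketch: it assumes that float is stored in the 32-bit single-precision format described above, and the names ieee_fields and split_float are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* The three fields of a single-precision value (illustrative names). */
struct ieee_fields {
    uint32_t s;  /* 1-bit sign */
    uint32_t e;  /* 8-bit exponent field */
    uint32_t f;  /* 23-bit fraction field */
};

/* Copy the raw 32 bits out of a float and slice them into s, e and f.
   Assumes float uses the IEEE 754 single-precision layout. */
struct ieee_fields split_float(float x)
{
    uint32_t bits;
    struct ieee_fields r;

    memcpy(&bits, &x, sizeof bits);   /* reinterpret the bits, no conversion */
    r.s = bits >> 31;
    r.e = (bits >> 23) & 0xFF;
    r.f = bits & 0x7FFFFF;
    return r;
}
```

For example, split_float(1.0f) yields s = 0, e = 127 and f = 0.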

Sign

The sign bit is 0 for positive numbers and 1 for negative numbers. But unlike integers, IEEE values are stored in signed magnitude format.

Mantissa

The field f contains a binary fraction. The actual mantissa of the floating-point value is (1 + f). In other words, there is an implicit 1 to the left of the binary point. For example, if f is 01101..., the mantissa would be 1.01101...

There are many ways to write a number in scientific notation, but there is always a unique normalized representation, with exactly one non-zero digit to the left of the point.

    0.232 × 10^3 = 23.2 × 10^1 = 2.32 × 10^2 = ...

In binary, that one non-zero digit can only be 1, which is why it does not need to be stored. A side effect is that we get a little more precision: there are 24 bits in the mantissa, but we only need to store 23 of them.

Exponent

The e field represents the exponent as a biased number. It contains the actual exponent plus 127 for single precision, or the actual exponent plus 1023 in double precision. This converts all single-precision exponents from -127 to +128 into unsigned numbers from 0 to 255, and all double-precision exponents from -1023 to +1024 into unsigned numbers from 0 to 2047.

Two examples with single-precision numbers are shown below. If the exponent is 4, the e field will be 4 + 127 = 131 (10000011₂). If e contains 01011101 (93₁₀), the actual exponent is 93 - 127 = -34.

Storing a biased exponent before a normalized mantissa means we can compare IEEE values as if they were signed magnitude integers.
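
A small C sketch of both ideas: removing the bias to recover the actual exponent, and comparing two values through their bit patterns. It assumes float is IEEE 754 single precision, and the comparison helper is restricted to non-negative inputs, where the raw patterns sort the same way as the values. All function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Raw 32-bit pattern of a float (assumes IEEE 754 single precision). */
static uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Actual exponent of a normalized value: the e field minus the bias. */
int unbiased_exponent(float x)
{
    return (int)((float_bits(x) >> 23) & 0xFF) - 127;
}

/* For non-negative a and b, comparing the bit patterns as unsigned
   integers gives the same answer as comparing the values themselves. */
int less_than_by_bits(float a, float b)
{
    return float_bits(a) < float_bits(b);
}
```

Negative values would need an extra step, because with signed magnitude a larger bit pattern means a more negative number.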

Converting an IEEE 754 number to decimal

The decimal value of an IEEE number is given by the formula:

    (1 - 2s) × (1 + f) × 2^(e - bias)

Here, the s, f and e fields are assumed to be in decimal. (1 - 2s) is 1 or -1, depending on whether the sign bit is 0 or 1. We add an implicit 1 to the fraction field f, as mentioned earlier. Again, the bias is either 127 or 1023, for single or double precision.
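
The formula translates almost directly into C. The sketch below handles only normalized single-precision patterns (e from 1 to 254) and scales by the power of two with a loop so that it needs nothing from math.h. The name decode_single is an assumption made for this example.

```c
#include <stdint.h>

/* Evaluate (1 - 2s) * (1 + f) * 2^(e - 127) for a single-precision
   bit pattern. Only normalized numbers (e from 1 to 254) are handled. */
double decode_single(uint32_t bits)
{
    int s = (int)(bits >> 31);
    int e = (int)((bits >> 23) & 0xFF);
    double f = (double)(bits & 0x7FFFFF) / 8388608.0;   /* fraction / 2^23 */
    double value = (1 - 2 * s) * (1.0 + f);
    int exponent = e - 127;

    /* Scale by 2^exponent one step at a time. */
    for (; exponent > 0; exponent--) value *= 2.0;
    for (; exponent < 0; exponent++) value /= 2.0;
    return value;
}
```

For instance, decode_single(0x3F800000) gives 1.0, since s = 0, e = 127 and f = 0.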

Example IEEE-decimal conversion


Let's find the decimal value of the following IEEE number.

    1 01111100 11000000000000000000000

First convert each individual field to decimal. The sign bit s is 1. The e field contains 01111100₂ = 124₁₀. The fraction field f is 0.11000...₂ = 0.75₁₀.

Then just plug these decimal values of s, e and f into our formula.

    (1 - 2s) × (1 + f) × 2^(e - bias)

This gives us (1 - 2) × (1 + 0.75) × 2^(124 - 127) = -1.75 × 2^-3 = -0.21875.

Converting a decimal number to IEEE 754


What is the single-precision representation of 347.625?

1. First convert the number to binary: 347.625 = 101011011.101₂.
2. Normalize the number by shifting the binary point until there is a single 1 to the left: 101011011.101 × 2^0 = 1.01011011101 × 2^8.
3. The bits to the right of the binary point, 01011011101₂, comprise the fraction field f.
4. The number of times you shifted gives the exponent. In this case, the field e should contain 8 + 127 = 135 = 10000111₂.
5. The number is positive, so the sign bit is 0.

The final result is:

    0 10000111 01011011101000000000000
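
The same steps can be sketched in C. This version assumes the input is finite and falls in the normalized single-precision range, and it simply truncates fraction bits beyond the 23rd instead of rounding them. The name encode_single is hypothetical.

```c
#include <stdint.h>

/* Build a single-precision bit pattern by normalizing, then filling in
   the fields. Assumes x is finite and in the normalized range; extra
   fraction bits are truncated, not rounded. */
uint32_t encode_single(double x)
{
    uint32_t s = 0, e, f;
    int exponent = 0;

    if (x == 0.0) return 0;                       /* +0.0 is all zero bits */
    if (x < 0.0) { s = 1; x = -x; }               /* step 5: sign bit */
    while (x >= 2.0) { x /= 2.0; exponent++; }    /* step 2: normalize */
    while (x <  1.0) { x *= 2.0; exponent--; }
    f = (uint32_t)((x - 1.0) * 8388608.0);        /* step 3: keep 23 fraction bits */
    e = (uint32_t)(exponent + 127);               /* step 4: add the bias */
    return (s << 31) | (e << 23) | f;
}
```

Calling encode_single(347.625) returns 0x43ADD000, which is the bit string 0 10000111 01011011101000000000000.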

Special values
The smallest and largest possible exponents e=00000000 and e=11111111 (and their double precision counterparts) are reserved for special values.

If the mantissa is always (1 + f), then how is 0 represented? The fraction field f should be 0000...0000. The exponent field e contains the value 00000000. With signed magnitude, there are two zeroes: +0.0 and -0.0.

There are representations of positive and negative infinity, which might sometimes help with instances of overflow. The fraction f is 0000...0000. The exponent field e is set to 11111111.

Finally, there is a special "not a number" (NaN) value, which can handle some cases of errors or invalid operations such as 0.0 / 0.0. The fraction field f is set to any non-zero value. The exponent e will contain 11111111.
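
These rules amount to a few comparisons on the e and f fields. The sketch below classifies a single-precision bit pattern. The enum and function names are invented for this example, and patterns with e = 00000000 and a non-zero f, which the rules above do not cover, are lumped in with ordinary numbers.

```c
#include <stdint.h>

enum value_kind { KIND_ZERO, KIND_INFINITY, KIND_NAN, KIND_OTHER };

/* Classify a single-precision bit pattern using only its e and f fields. */
enum value_kind classify_single(uint32_t bits)
{
    uint32_t e = (bits >> 23) & 0xFF;
    uint32_t f = bits & 0x7FFFFF;

    if (e == 0x00 && f == 0) return KIND_ZERO;       /* +0.0 or -0.0 */
    if (e == 0xFF && f == 0) return KIND_INFINITY;   /* +infinity or -infinity */
    if (e == 0xFF)           return KIND_NAN;        /* any non-zero f */
    return KIND_OTHER;                               /* everything else */
}
```

The sign bit is ignored on purpose, so 0x00000000 and 0x80000000 both classify as zero.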

Range of single-precision numbers


    (1 - 2s) × (1 + f) × 2^(e - 127)

The largest possible normal number is (2 - 2^-23) × 2^127 = 2^128 - 2^104. The largest possible e is 11111110 (254). The largest possible f is 11111111111111111111111 (1 - 2^-23).

And the smallest positive normal number is 1 × 2^-126 = 2^-126. The smallest e is 00000001 (1). The smallest f is 00000000000000000000000 (0).

In comparison, the smallest and largest possible 32-bit integers in two's complement are only -2^31 and 2^31 - 1.

How can we represent so many more values in the IEEE 754 format, even though we use the same number of bits as regular integers?
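
On a machine where the C type float is IEEE 754 single precision, these two limits should match the FLT_MAX and FLT_MIN constants from float.h. The sketch below rebuilds both limits from the field values and compares them. The helper names are hypothetical.

```c
#include <float.h>

/* 2^n as a double, computed by repeated doubling or halving. */
static double two_to_the(int n)
{
    double p = 1.0;
    for (; n > 0; n--) p *= 2.0;
    for (; n < 0; n++) p /= 2.0;
    return p;
}

/* Largest normal value: f = 1 - 2^-23 and e - 127 = 127. */
double largest_single(void)
{
    return (2.0 - two_to_the(-23)) * two_to_the(127);
}

/* Smallest positive normal value: f = 0 and e - 127 = -126. */
double smallest_single(void)
{
    return two_to_the(-126);
}

/* Nonzero if the limits above agree with the constants in <float.h>. */
int agrees_with_float_h(void)
{
    return largest_single() == FLT_MAX && smallest_single() == FLT_MIN;
}
```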

Finiteness
There aren't more IEEE numbers. With 32 bits, there are 2^32, or about 4 billion, different bit patterns. These can represent 4 billion integers or 4 billion reals. But there are an infinite number of reals, and the IEEE format can only represent some of the ones from about -2^128 to +2^128.

This causes enormous headaches in doing floating-point arithmetic. Not all values between -2^128 and +2^128 can be represented. Small roundoff errors can quickly accumulate with multiplications or exponentiations, resulting in big errors. Rounding errors can invalidate many basic arithmetic principles such as the associative law, (x + y) + z = x + (y + z).

The IEEE 754 standard guarantees that all machines will produce the same results, but those results may not be mathematically correct!

Limits of the IEEE representation


Even some integers cannot be represented in the IEEE format.
    int x = 33554431;
    float y = 33554431;
    printf( "%d\n", x );    // prints 33554431
    printf( "%f\n", y );    // prints 33554432.000000

Some simple decimal numbers cannot be represented exactly in binary to begin with.

    0.10₁₀ = 0.0001100110011...₂
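
Both limits can be checked with a round trip. The sketch below assumes float is IEEE 754 single precision. fits_in_float and tenth_gap are names made up for this example.

```c
#include <stdint.h>

/* Does n survive a round trip through a 32-bit float unchanged? */
int fits_in_float(int32_t n)
{
    float y = (float)n;            /* rounds to the nearest representable value */
    return (int64_t)y == (int64_t)n;
}

/* Difference between 0.1 stored as a float and 0.1 stored as a double.
   Neither is exactly one tenth; they are two different approximations. */
double tenth_gap(void)
{
    float tenth = 0.1f;
    return (double)tenth - 0.1;
}
```

With only 24 mantissa bits, fits_in_float(33554431) is false because 33554431 needs 25 significant bits, while fits_in_float(33554432) is true because 2^25 needs just one.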

0.10
During the Gulf War in 1991, a U.S. Patriot missile failed to intercept an Iraqi Scud missile, and 28 Americans were killed. A later study determined that the problem was caused by the inaccuracy of the binary representation of 0.10.

The Patriot incremented a counter once every 0.10 seconds. It multiplied the counter value by 0.10 to compute the actual time. However, the (24-bit) binary representation of 0.10 actually corresponds to 0.099999904632568359375, which is off by 0.000000095367431640625. This doesn't seem like much, but after 100 hours the time ends up being off by 0.34 seconds, enough time for a Scud to travel 500 meters!

Professor Skeel wrote a short article about this: "Roundoff Error and the Patriot Missile." SIAM News, 25(4):11, July 1992.
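
The arithmetic behind the 0.34-second figure can be reproduced in a few lines of C. This is only a back-of-the-envelope sketch, not the Patriot's actual code: it models the stored constant as 0.1 truncated to 23 bits after the binary point, which reproduces the value quoted above, and the function names are hypothetical.

```c
/* 0.1 truncated to 23 bits after the binary point. */
double chopped_tenth(void)
{
    long scaled = (long)(0.1 * 8388608.0);   /* floor(0.1 * 2^23) = 838860 */
    return (double)scaled / 8388608.0;
}

/* Accumulated clock error in seconds after the given number of hours,
   when every 0.1-second tick is counted with the chopped constant. */
double clock_drift_seconds(double hours)
{
    double ticks = hours * 3600.0 * 10.0;    /* ten ticks per second */
    return ticks * (0.1 - chopped_tenth());
}
```

After 100 hours there have been 3,600,000 ticks, each short by about 0.000000095 seconds, so clock_drift_seconds(100.0) comes out near 0.34.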

Floating-point addition example


To get a feel for floating-point operations, we'll do an addition example. To keep it simple, we'll use base 10 scientific notation. Assume the mantissa has four digits, and the exponent has one digit.

The text shows an example for the addition:

    99.99 + 0.161 = 100.151

As normalized numbers, the operands would be written as:

    9.999 × 10^1
    1.610 × 10^-1

Steps 1-2: the actual addition


1. Equalize the exponents. The operand with the smaller exponent should be rewritten by increasing its exponent and shifting the point leftwards.

    1.610 × 10^-1 = 0.0161 × 10^1

   With four significant digits, this gets rounded to 0.016 × 10^1. This can result in a loss of least significant digits (the rightmost 1 in this case). But rewriting the number with the larger exponent could result in loss of the most significant digits, which is much worse.

2. Add the mantissas.

      9.999 × 10^1
    + 0.016 × 10^1
    --------------
     10.015 × 10^1

Steps 3-5: representing the result


3. Normalize the result if necessary.

    10.015 × 10^1 = 1.0015 × 10^2

   This step may cause the point to shift either left or right, and the exponent to either increase or decrease.

4. Round the number if needed.

    1.0015 × 10^2 gets rounded to 1.002 × 10^2.

5. Repeat Step 3 if the result is no longer normalized. We don't need this in our example, but it's possible for rounding to add digits. For example, rounding 9.9995 yields 10.000.

Our result is 1.002 × 10^2, or 100.2. The correct answer is 100.151, so we have the right answer to four significant digits, but there's a small error already.
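
The five steps can be sketched in C with a toy four-digit decimal format. This is only an illustration of the procedure, not how real hardware is built: the struct dec4 and its functions are invented here, only positive operands are handled, and halves are rounded up.

```c
/* A toy decimal floating-point number: value = digits/1000 * 10^exponent,
   with digits in 1000..9999 so the mantissa reads d.ddd (four digits). */
struct dec4 {
    long digits;
    int  exponent;
};

/* Divide by a power of ten, rounding halves up. */
static long round_div(long n, long divisor)
{
    return (n + divisor / 2) / divisor;
}

/* Add two positive dec4 values by following steps 1 to 5. */
struct dec4 dec4_add(struct dec4 a, struct dec4 b)
{
    struct dec4 sum;
    long shift = 1;
    int i, gap;

    /* Step 1: make a the operand with the larger exponent, then shift b. */
    if (a.exponent < b.exponent) { struct dec4 t = a; a = b; b = t; }
    gap = a.exponent - b.exponent;
    if (gap > 4) {
        b.digits = 0;                       /* b is rounded out of existence */
    } else {
        for (i = 0; i < gap; i++) shift *= 10;
        b.digits = round_div(b.digits, shift);
    }

    /* Step 2: add the mantissas. */
    sum.digits = a.digits + b.digits;
    sum.exponent = a.exponent;

    /* Steps 3-5: normalize and round, repeating while a fifth digit remains. */
    while (sum.digits >= 10000) {
        sum.digits = round_div(sum.digits, 10);
        sum.exponent++;
    }
    return sum;
}
```

Adding {9999, 1} and {1610, -1} gives {1002, 2}, which is the 1.002 × 10^2 result of the worked example.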

Extreme errors
As we saw, rounding errors in addition can occur if one argument is much smaller than the other, since we need to match the exponents. An extreme example with 32-bit IEEE values is the following.

    (1.5 × 10^38) + (1.0 × 10^0) = 1.5 × 10^38

The number 1.0 × 10^0 is much smaller than 1.5 × 10^38, and it basically gets rounded out of existence. This has some nasty implications. The order in which you do additions can affect the result, so (x + y) + z is not always the same as x + (y + z)!

    float x = -1.5e38;
    float y = 1.5e38;
    printf( "%f\n", (x + y) + 1.0 );    // prints 1.000000
    printf( "%f\n", x + (y + 1.0) );    // prints 0.000000

Multiplication
To multiply two floating-point values, first multiply their magnitudes and add their exponents.

      9.999 × 10^1
    × 1.610 × 10^-1
    ---------------
     16.098 × 10^0

You can then round and normalize the result, yielding 1.610 × 10^1.

The sign of the product is the exclusive-or of the signs of the operands. If two numbers have the same sign, their product is positive. If two numbers have different signs, the product is negative.

    0 ⊕ 0 = 0    0 ⊕ 1 = 1    1 ⊕ 0 = 1    1 ⊕ 1 = 0

This is one of the main advantages of using signed magnitude.
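
The sign rule is easy to check against real hardware. The sketch below assumes float is IEEE 754 single precision and that neither operand is NaN. The function names are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Sign bit (0 or 1) of a float, read from its raw bit pattern. */
static uint32_t sign_bit(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits >> 31;
}

/* The product's sign, predicted from the operands' sign bits alone. */
uint32_t predicted_product_sign(float a, float b)
{
    return sign_bit(a) ^ sign_bit(b);
}

/* Nonzero if the prediction matches the sign of the computed product. */
int sign_rule_holds(float a, float b)
{
    return predicted_product_sign(a, b) == sign_bit(a * b);
}
```

Because the sign is a separate field, it never has to take part in the multiplication of the magnitudes.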

The history of floating-point computation


In the past, each machine had its own implementation of floating-point arithmetic hardware and/or software. It was impossible to write portable programs that would produce the same results on different systems. Many strange tricks were needed to get correct answers out of some machines, such as Crays or the IBM System/370.

It wasn't until 1985 that the IEEE 754 standard was adopted. The standard is very complex and difficult to implement efficiently. But having a standard at least ensures that all compliant machines will produce the same outputs for the same program.

Floating-point hardware
Intel introduced the 8087 coprocessor around 1981. The main CPU would call the 8087 for floating-point operations. The 8087 had eight separate 80-bit floating-point registers that could be accessed in a stack-like fashion. Some of the IEEE standard is based on the 8087.

Intel's 80486, introduced in 1989, included floating-point support in the main processor itself. The MIPS floating-point architecture and instruction set still reflect the old coprocessor days, with separate floating-point registers and special instructions for accessing those registers.

MIPS floating-point architecture


MIPS includes a separate set of 32 floating-point registers, $f0-$f31. Each register is 32 bits long and can hold a single-precision value. Two registers can be combined to store a double-precision number. You can have up to 16 double-precision values in registers $f0-$f1, $f2-$f3, ..., $f30-$f31. $f0 is not hardwired to the value 0.0!

There are also separate instructions for floating-point arithmetic. The operands must be floating-point registers, and not immediate values.

    add.s  $f1, $f2, $f3    # Single-precision $f1 = $f2 + $f3
    add.d  $f2, $f4, $f6    # Double-precision $f2 = $f4 + $f6

There are other basic operations as you would expect.
    sub.s and sub.d for subtraction
    mul.s and mul.d for multiplication
    div.s and div.d for division

Floating-point register transfers


mov.s and mov.d copy data between floating-point registers. Use mtc1 and mfc1 to transfer data between the integer registers $0-$31 and the floating-point registers $f0-$f31. These are raw data transfers that do not convert between integer and floating-point representations. Be careful with the order of the operands in these instructions.
    mtc1  $t0, $f0    # $f0 = $t0
    mfc1  $t0, $f0    # $t0 = $f0

There are also special loads and stores for transferring data between the floating-point registers and memory. (The base address is still given in an integer register.)
    lwc1  $f2, 0($a0)    # $f2 = M[$a0]
    swc1  $f4, 4($sp)    # M[$sp+4] = $f4

The "c1" in the instruction names stands for "coprocessor 1."

Floating-point comparisons
We also need special instructions for comparing floating-point values, since slt and sltu only apply to signed and unsigned integers.
    c.le.s  $f2, $f4    # true if $f2 <= $f4
    c.eq.s  $f2, $f4    # true if $f2 == $f4
    c.lt.s  $f2, $f4    # true if $f2 < $f4

The comparison result is stored in a special coprocessor register. You can then branch based on whether this register contains 1 or 0.
    bc1t  Label    # branch if true
    bc1f  Label    # branch if false

Here is how you can branch to the label Exit if $f2 = $f4.
    c.eq.s  $f2, $f4
    bc1t    Exit

Floating-point functions
There are conventions for passing data to and from functions. Floating-point arguments are placed in $f12-$f15. Floating-point return values go into $f0-$f1.

We also split the register-saving chores, just like earlier. $f0-$f19 are caller-saved. $f20-$f31 are callee-saved.

These are the same basic ideas as before because we still have the same problems to solve; now it's just with different registers.

Floating-point constants
MIPS does not support immediate floating-point arithmetic instructions, so you must load constant values into a floating-point register first. One solution is to store floating-point constants in the data segment, and to load them with a l.s or l.d pseudo-instruction.
            .data
    alpha:  .float  0.55555       # 5.0 / 9.0
            .text
            l.s     $f6, alpha    # $f6 = 0.55555

Newer versions of SPIM also support the li.s and li.d pseudo-instructions, which make life much easier.
    li.s  $f6, 0.55555    # $f6 = 0.55555

Type conversions
You can also cast integers to floating-point values using the MIPS type conversion instructions.

    cvt.s.w  $f4, $f2

In the instruction name, s is the type to convert to and w is the type to convert from. $f4 is the floating-point destination register, and $f2 is the floating-point source register.

Possible types for conversions are integers (w), single-precision (s) and double-precision (d) floating-point.

    li       $t0, 32     # $t0 = 32
    mtc1     $t0, $f2    # $f2 = 32
    cvt.s.w  $f4, $f2    # $f4 = 32.0
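
The difference between a raw transfer and a type conversion can be mimicked in C. This is an analogy rather than a description of what a compiler emits: it assumes float is IEEE 754 single precision, and both function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Like mtc1 alone: copy the 32 bits unchanged and treat them as a float. */
float raw_transfer(int32_t n)
{
    float x;
    memcpy(&x, &n, sizeof x);
    return x;
}

/* Like mtc1 followed by cvt.s.w: produce the float with the same value. */
float convert_word_to_single(int32_t n)
{
    return (float)n;
}
```

Here convert_word_to_single(32) is 32.0, but raw_transfer(32) is a tiny number far below 1.0, because the integer bit pattern for 32 has an all-zero exponent field.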

A complete example
Here is a slightly different version of the textbook example of converting single-precision temperatures from Fahrenheit to Celsius.

    celsius = (fahrenheit - 32.0) × 5.0 / 9.0

    celsius:
        li       $t0, 32
        mtc1     $t0, $f4
        cvt.s.w  $f4, $f4          # $f4 = 32.0
        li.s     $f6, 0.55555      # $f6 = 5.0 / 9.0
        sub.s    $f0, $f12, $f4    # $f0 = $f12 - 32.0
        mul.s    $f0, $f0, $f6     # $f0 = $f0 * 5.0/9.0
        jr       $ra

This example demonstrates a couple of things. The argument is passed in $f12, and the return value is placed in $f0. We use two different ways of loading floating-point constants. We used only caller-saved floating-point registers.
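
For comparison, here is a C sketch of the same computation. The first function mirrors the MIPS routine, including its rounded constant 0.55555. The second evaluates 5.0 / 9.0 in single precision instead. Both function names are chosen for this example only.

```c
/* Same computation as the MIPS routine, using its rounded constant. */
float celsius_approx(float fahrenheit)
{
    return (fahrenheit - 32.0f) * 0.55555f;
}

/* The same formula with the quotient 5.0 / 9.0 computed in single precision. */
float celsius(float fahrenheit)
{
    return (fahrenheit - 32.0f) * (5.0f / 9.0f);
}
```

The truncated constant costs a little accuracy: for 212 degrees Fahrenheit, celsius_approx returns about 99.999 rather than 100.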

Summary
The IEEE 754 standard defines number representations and operations for floating-point arithmetic. Having a finite number of bits means we can't represent all possible real numbers, and errors will occur from approximations.

MIPS processors implement the IEEE 754 standard. There is a separate set of floating-point registers, $f0-$f31. New instructions handle basic floating-point operations, comparisons and branches. There is also support for transferring data between the floating-point registers, main memory and the integer registers.

We still have to deal with issues of argument and result passing, and register saving and restoring in function calls.
