Binary and Floating Point Arithmetic: What Is Covered
This document provides an overview of binary and floating point numbers. It discusses binary number systems, conversion between number bases, negative binary numbers, binary arithmetic, and floating point representations including issues like overflow, underflow, and the IEEE 754 standard. It also briefly mentions error correcting codes.
Binary and Floating Point Arithmetic
What is covered
• Appendix A: Binary Numbers (pp. 631-643)
• Appendix B: Floating-Point Numbers (pp. 643-653)
Lecture II: Binary Numbers 1
Contents
• Finite Precision Numbers
• Radix Number System
• Conversion from One Radix to Another
• Negative Binary Numbers
• Binary Arithmetic
• Principles of Floating Point Numbers
• IEEE Floating Point Standard 754

Finite Precision Numbers
• In ordinary arithmetic, no attention is paid to the amount of memory needed to store a number.
• Computers have finite memory, so the size of a number matters and a fixed representation is needed.
• The algebra of finite precision numbers differs from ordinary algebra:
  – Closure is violated by overflow and underflow
  – Density is lost for real and rational numbers
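
The loss of closure can be illustrated with a small sketch. It assumes a hypothetical machine that stores only three-digit non-negative decimal integers (0 to 999); the names `LIMIT`, `checked`, `add` and `sub` are made up for this illustration and do not come from the slides.

```python
class FinitePrecisionError(ArithmeticError):
    """Raised when a result falls outside the representable range."""

LIMIT = 999  # largest value on the assumed 3-digit unsigned decimal machine

def checked(value):
    """Return value if it fits in 0..LIMIT, otherwise signal the violation."""
    if not 0 <= value <= LIMIT:
        raise FinitePrecisionError(f"{value} is not representable")
    return value

def add(a, b):
    return checked(a + b)

def sub(a, b):
    return checked(a - b)

# Both groupings equal 800 in ordinary arithmetic, but one of them
# overflows on an intermediate result, so associativity is lost.
a, b, c = 700, 400, 300
right_grouping = add(a, sub(b, c))       # 700 + (400 - 300) fits
try:
    left_grouping = sub(add(a, b), c)    # 700 + 400 = 1100 overflows first
except FinitePrecisionError:
    left_grouping = None
```

The result depends on the order of evaluation even though every input and the final answer are representable.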
Radix Number System
• Base 10 example
• A radix-k number system requires k different symbols to represent the digits 0 through k-1.
• Decimal numbers (base 10)
  – Built from 0,1,2,3,4,5,6,7,8,9
• Binary numbers (base 2)
  – Built from 0,1
• Octal numbers (base 8)
  – Built from 0,1,2,3,4,5,6,7
• Hexadecimal numbers (base 16)
  – Built from 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F

Examples
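
As one possible illustration, the sketch below evaluates the same number written in four radices by summing digit × k**position. The function name `value_of` and the sample value 2001 are chosen for this example only.

```python
DIGITS = "0123456789ABCDEF"  # symbols for radices up to 16

def value_of(numeral, k):
    """Return the value of a digit string written in radix k (2 <= k <= 16)."""
    total = 0
    # The rightmost digit has position 0, the next one position 1, and so on.
    for position, symbol in enumerate(reversed(numeral)):
        digit = DIGITS.index(symbol)
        if digit >= k:
            raise ValueError(f"{symbol!r} is not a valid radix-{k} digit")
        total += digit * k ** position
    return total

# The same number in binary, octal, decimal and hexadecimal.
same_number = [
    value_of("11111010001", 2),
    value_of("3721", 8),
    value_of("2001", 10),
    value_of("7D1", 16),
]
```

All four entries of `same_number` evaluate to decimal 2001.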
Conversion from One Radix to Another
• Octal <-> Hexadecimal <-> Binary: easy
  – Binary to octal:
    • Divide the bits into groups of three, starting immediately to the left (and right) of the binary point
    • Convert each group into an octal digit from 0 to 7
    • Example:
      » Binary 1010111.011
      » Octal 127.3
    • It may be necessary to add leading or trailing zeros to form complete groups of three
  – Hexadecimal <-> octal, binary: same procedure (binary to hexadecimal uses groups of four bits)
More Examples
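
The grouping procedure can be sketched as follows. This is a minimal illustration, not code from the lecture; `regroup` and `round_up` are hypothetical helper names, and the input is assumed to be a string of 0s and 1s with at most one binary point.

```python
DIGITS = "0123456789ABCDEF"

def round_up(n, step):
    """Smallest multiple of step that is >= n."""
    return -(-n // step) * step

def regroup(binary, group):
    """Convert a binary string to radix 2**group (group=3: octal, group=4: hex)."""
    whole, _, frac = binary.partition(".")
    # Leading zeros pad the integer part, trailing zeros pad the fraction.
    whole = whole.zfill(round_up(len(whole), group))
    frac = frac.ljust(round_up(len(frac), group), "0")

    def convert(bits):
        return "".join(
            DIGITS[int(bits[i:i + group], 2)] for i in range(0, len(bits), group)
        )

    result = convert(whole) or "0"
    return result + ("." + convert(frac) if frac else "")

octal = regroup("1010111.011", 3)  # groups 001 010 111 . 011
hexa = regroup("1010111.011", 4)   # groups 0101 0111 . 0110
```

Here `octal` is "127.3", matching the slide's example, and `hexa` is "57.6".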
Decimal to Binary: First Method
• Take the largest power of 2 that does not exceed the number and subtract it
• Repeat the process on what is left until nothing remains
• Example: decimal 22
  – The largest power of 2 not exceeding 22 is 2**4 = 16
  – 22 - 16 = 6
  – 6 = 4 + 2
  – Hence the binary number is 10110
  – I.e. 2**4 + 2**2 + 2**1
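
A minimal sketch of this first method, assuming a positive integer input; the function name is made up for the example.

```python
def to_binary_by_subtraction(n):
    """Binary string of a positive integer, found by subtracting powers of 2."""
    if n <= 0:
        raise ValueError("expected a positive integer")
    # Find the largest power of 2 that does not exceed n.
    power = 1
    while power * 2 <= n:
        power *= 2
    bits = []
    # Walk down through the powers, subtracting each one that still fits.
    while power >= 1:
        if power <= n:
            bits.append("1")
            n -= power
        else:
            bits.append("0")
        power //= 2
    return "".join(bits)

result = to_binary_by_subtraction(22)  # 16 fits, 8 does not, 4 and 2 fit, 1 does not
```

For 22 this yields "10110", as in the worked example.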
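Decimal to Binary: Second Method (Integers Only)
• Divide the number by 2
  – The quotient is written directly under the original number
  – The remainder (0 or 1) is written next to the quotient
• Repeat the process on each quotient until 0 is reached
• The remainders, read from the bottom up, are the binary digits
• This is repeated Euclidean division (division with remainder) by 2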
Example
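
The repeated-division method might be sketched like this, assuming a non-negative integer; the function name and the sample value 1492 are illustrative only.

```python
def to_binary_by_division(n):
    """Binary string of a non-negative integer via repeated division by 2."""
    if n < 0:
        raise ValueError("expected a non-negative integer")
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)  # quotient goes on, remainder is one bit
        remainders.append(str(remainder))
    # The first remainder is the rightmost bit, so read them in reverse.
    return "".join(reversed(remainders))

bits_22 = to_binary_by_division(22)
bits_1492 = to_binary_by_division(1492)
```

The last remainder produced is the leftmost bit, which is why the column of remainders is read from the bottom up.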
Binary to Decimal
• Method 1: Sum up the powers of 2
  – Example: 10110 is 2**4 + 2**2 + 2**1 = 22
• Method 2: Successive doubling
  – Write the binary number vertically, one bit per line, with the leftmost bit on the bottom
  – Lines are numbered from bottom to top
  – Put 1 on line 1 (bottom)
  – Entry on line n = 2*(entry on line n-1) + bit on line n (0 or 1)
  – The entry on the top line is the decimal value
Example
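
Successive doubling can be sketched in a few lines. The function name is hypothetical; the loop visits the bits from leftmost to rightmost, which corresponds to going from the bottom line to the top line of the vertical layout.

```python
def binary_to_decimal(bits):
    """Value of a binary string, computed by successive doubling."""
    entry = 0
    for bit in bits:                  # leftmost bit first
        entry = 2 * entry + int(bit)  # entry on line n = 2*(previous entry) + bit
    return entry

value = binary_to_decimal("10110")
```

For "10110" the running entries are 1, 2, 5, 11, 22, so `value` is 22.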
Negative Binary Numbers
• Signed magnitude: the leftmost bit holds the sign (0 is +, 1 is -)
• One’s complement: to negate a number, switch all 0’s and 1’s (including the sign bit). Obsolete now.
  – Example: 00000110 (+6)
  – 11111001 (-6 in one’s complement)
Negative Binary Numbers
• Two’s complement: has a sign bit (0 for +, 1 for -)
• Negating a number:
  – Switch 0’s and 1’s (as in one’s complement)
  – Add 1 to the result
  – Any carry out of the leftmost bit is thrown away
• Example: 00000110 (+6)
  – 11111001 (-6 in one’s complement)
  – 11111010 (-6 in two’s complement)
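
The two negation rules can be sketched on 8-bit patterns. The word width `BITS` and the helper names are assumptions made for this illustration.

```python
BITS = 8                   # assumed word width
MASK = (1 << BITS) - 1     # 0xFF keeps results within 8 bits

def ones_complement_negate(pattern):
    """Flip every bit, including the sign bit."""
    return ~pattern & MASK

def twos_complement_negate(pattern):
    """Flip every bit, add 1, and discard any carry out of the leftmost bit."""
    return (ones_complement_negate(pattern) + 1) & MASK

def show(pattern):
    return format(pattern, f"0{BITS}b")

plus_six = 0b00000110
minus_six_ones = show(ones_complement_negate(plus_six))
minus_six_twos = show(twos_complement_negate(plus_six))
```

Negating zero in two's complement gives 11111111 + 1, whose carry is discarded, leaving 00000000 again.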
Negative Binary Numbers
• Excess 2**(m-1): an m-bit number is stored as (number) + 2**(m-1)
• Example: for 8-bit numbers (m = 8, excess 128), -3 is stored as -3 + 128 = 125, which in binary is 01111101
• Numbers from -128 to +127 are mapped to 0 to 255, expressible as 8-bit unsigned integers
• Identical to two’s complement with the sign bit reversed

Some Properties of Negative Numbers
• Signed magnitude and one’s complement have two representations of zero (+0 and -0)
• Two’s complement has only one zero
• Problem with two’s complement: the negative of 100...0 (for example 10000000 in 8 bits) is the same bit pattern, so there is one more negative number than positive
• What is desired:
  – Only one representation for zero
  – Exactly the same number of positive and negative numbers
  – This would need an odd count of bit patterns, but m bits always give an even count (2**m), so both goals cannot be met together
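
The sketch below checks two of these properties for 8-bit words: the relationship between excess-128 and two's complement, and the pattern that is its own negative. The helper names and the 8-bit width are assumptions for illustration.

```python
BITS = 8
MASK = (1 << BITS) - 1
BIAS = 1 << (BITS - 1)     # 2**(m-1) = 128 for m = 8

def to_excess(value):
    """Excess-128 pattern of an integer in -128..127."""
    if not -BIAS <= value < BIAS:
        raise ValueError("value does not fit in 8 bits")
    return value + BIAS

def to_twos_complement(value):
    """Two's complement pattern of an integer in -128..127."""
    if not -BIAS <= value < BIAS:
        raise ValueError("value does not fit in 8 bits")
    return value & MASK

def twos_complement_negate(pattern):
    return (~pattern + 1) & MASK

minus_three = format(to_excess(-3), "08b")

# Flipping the sign bit converts between the two systems for every value.
sign_bit_flipped = all(
    to_excess(v) == to_twos_complement(v) ^ BIAS for v in range(-BIAS, BIAS)
)

# 10000000 (-128) negates to itself: there is no +128 in 8 bits.
most_negative = 0b10000000
self_negative = twos_complement_negate(most_negative) == most_negative
```

Here `minus_three` is "01111101" (decimal 125).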
Binary Arithmetic
• Two binary (octal, hexadecimal) numbers can be added just like decimal numbers
• The carry needs to be taken to the next position (to the left)
• In one’s complement, a carry generated by the leftmost bit is added to the rightmost bit (end-around carry)
• In two’s complement, a carry generated by the leftmost bit is thrown away
Examples of Binary Arithmetic
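
As a sketch of the two carry rules, the functions below add 8-bit patterns in one's and two's complement. The function names, the 8-bit width, and the sample operands (10 and -3) are assumptions made for this example.

```python
BITS = 8
MASK = (1 << BITS) - 1

def add_twos_complement(a, b):
    """Add two 8-bit patterns; a carry out of the leftmost bit is discarded."""
    return (a + b) & MASK

def add_ones_complement(a, b):
    """Add two 8-bit patterns; a carry out of the leftmost bit is added back
    into the rightmost bit (end-around carry)."""
    total = a + b
    if total > MASK:
        total = (total & MASK) + 1
    return total & MASK

# 10 + (-3) in each system; both should give +7 = 00000111.
ones_result = format(add_ones_complement(0b00001010, 0b11111100), "08b")
twos_result = format(add_twos_complement(0b00001010, 0b11111101), "08b")
```

In one's complement the raw sum is 1 00000110, and adding the carry back gives 00000111. In two's complement the raw sum is 1 00000111, and dropping the carry gives the same result.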
Overflow
• If the addend and augend are of opposite signs, no overflow can occur
• If both are of the same sign and the result is of the opposite sign, overflow has occurred
• In both one’s and two’s complement, overflow occurs iff the carry into the sign bit differs from the carry out of the sign bit
• Most machines have an overflow bit

Floating Point Arithmetic
• Two important issues in representing real (floating point) numbers:
  – Range: the length of the interval covered
  – Precision: how small a difference can be represented
• N = f * (10**e)
  – Fraction = f: the number of digits here determines the precision
  – Exponent = e: determines the range
• Example: 3.14 = 0.314 * (10**1)

Two-Digit Exponent and Signed Three-Digit Fraction
• Range: +0.100 * (10**-99) to +0.999 * (10**+99)
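
A rough sketch of this decimal format, assuming the fraction is truncated (not rounded) to three digits and that out-of-range magnitudes raise an error. The function `encode` and the `UnderflowError` class are hypothetical names introduced for the illustration.

```python
from decimal import Decimal, ROUND_DOWN

class UnderflowError(ArithmeticError):
    """Hypothetical error type: magnitude too small to be represented."""

def encode(text):
    """Split a decimal numeral into (f, e) with N = f * 10**e,
    0.1 <= |f| < 1, three fraction digits and -99 <= e <= 99."""
    x = Decimal(text)
    if x == 0:
        return Decimal("0.000"), 0
    # adjusted() is the power of ten of the leading digit; the fraction
    # form 0.ddd needs an exponent one larger.
    exponent = x.adjusted() + 1
    if exponent > 99:
        raise OverflowError(f"{text} is too large in magnitude")
    if exponent < -99:
        raise UnderflowError(f"{text} is too small in magnitude")
    fraction = x.scaleb(-exponent).quantize(Decimal("0.001"), rounding=ROUND_DOWN)
    return fraction, exponent

pi_like = encode("3.14")
small = encode("-0.00271828")
```

`pi_like` is (0.314, 1), matching the slide's example. `small` is (-0.271, -2), which shows precision being lost when a number has more digits than the fraction can hold.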
Overflow, Underflow
• The real number line splits into seven regions: 1 and 7 are too large in magnitude, 2 and 6 are the expressible negative and positive numbers, 3 and 5 are too small in magnitude, and 4 is zero
• Regions 1 and 7 represent overflow: the answer is incorrect
• Regions 3 and 5 represent underflow errors, which are less serious than overflow errors
• Problems with floating point numbers:
  – No density
  – They do not form a continuum
  – The spacing between two consecutive numbers is not constant throughout the regions. I.e. the separation between +0.998 * (10**98) and +0.999 * (10**98) is vastly different from that between +0.998 * (10**0) and +0.999 * (10**0)

Normalized Digits
• Shifting digits between the exponent and the fraction shifts the boundaries of regions 2 and 6
• Increasing the number of digits in the fraction increases the density of points, and thus improves accuracy
• Increasing the size of the exponent enlarges regions 2 and 6 by shrinking the others
• In computers:
  – Base 2, 4, 8 or 16
  – If the leftmost digit is zero, shift the fraction one place left and decrease the exponent by 1
  – A fraction with a nonzero leftmost digit is said to be normalized

Normalized Digits
• There is only one normalized expression for a number, whereas there can be more than one non-normalized floating point expression
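
The shift-and-decrement rule for normalization can be sketched on a fraction held as a digit string. The function name and the sample values are made up; the same code works for any radix because the exponent counts digit positions in that radix.

```python
def normalize(fraction_digits, exponent):
    """Shift the fraction left until its leftmost digit is nonzero,
    decreasing the exponent by 1 for every shift."""
    digits = fraction_digits
    if set(digits) <= {"0"}:
        return digits, exponent      # zero has no normalized form
    while digits[0] == "0":
        digits = digits[1:] + "0"    # shift one place left, fill with 0
        exponent -= 1
    return digits, exponent

# Base 2: 0.00011011 * 2**5 becomes 0.11011000 * 2**2
binary_form = normalize("00011011", 5)
# Base 16: 0.001B * 16**5 becomes 0.1B00 * 16**3
hex_form = normalize("001B", 5)
```

Different unnormalized inputs for the same value, such as ("00011011", 5) and ("00110110", 4), normalize to the same pair, which is the uniqueness property stated above.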
IEEE Standard 754
• Designed by William Kahan of UCB
• Three representations:
  – Single precision: 32 bits
  – Double precision: 64 bits
  – Extended precision: 80 bits
• Both single and double precision:
  – Use radix 2 for fractions and excess notation for exponents
  – Start with a sign bit (0 for +, 1 for -)
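
To inspect the single precision layout, one option is Python's standard `struct` module, as sketched below. The helper names are hypothetical, and the field widths used (1 sign bit, 8 exponent bits in excess 127, 23 fraction bits) are the standard single precision widths rather than something stated on this slide.

```python
import struct

def single_word(x):
    """32-bit pattern of x when stored as an IEEE 754 single precision number."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return word

def single_fields(x):
    """Return (sign, stored exponent, fraction bits) of x in single precision."""
    word = single_word(x)
    sign = word >> 31                 # 1 bit
    exponent = (word >> 23) & 0xFF    # 8 bits, excess 127
    fraction = word & 0x7FFFFF        # 23 bits, leading 1 not stored
    return sign, exponent, fraction

one_hex = format(single_word(1.0), "08X")
fields = single_fields(-1.5)
```

For -1.5 the fields are sign 1, stored exponent 127 (true exponent 0), and a fraction whose top bit is set, representing the .5 after the implied leading 1.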
Normal Fractions, Examples
• Normalized fraction:
  – Consists of an implied 1 bit, an implied binary point, and then the stored fraction bits
  – The leading 1 is omitted (implied)
  – The binary point is omitted (implied)
  – Either 23 or 52 fraction bits: if all are 0, the fraction is 1.0
  – If all are 1, the fraction is slightly less than 2.0
• Examples (single precision, in hexadecimal):
  – 0.5 = 3F000000
  – 1 = 3F800000
  – 1.5 = 3FC00000

Dealing with Underflow I
• If a calculation results in a number smaller than the smallest representable normalized number, use denormalized numbers
• These have exponent zero, and a fraction given by the 23 or 52 bits
• The bit to the left of the binary point is 0 rather than the implied 1
• The smallest normalized single precision number has 1 as exponent and 0 as fraction = 1.0 * (2**-126)
• The largest denormalized single precision number has 0 as exponent and all 1’s as fraction, which is about 0.9999999 * (2**-126)

Dealing with Underflow II
• As numbers go further down, the first few bits of the fraction become zero
• For the smallest denormalized single precision number:
  – the exponent represents 2**(-126), and
  – the fraction represents 2**(-23)
  – so the number represents 2**(-149)
• Gives graceful underflow without jumping to zero
• Two zeros are present (+0 and -0), with fraction 0 and exponent 0

Dealing with Overflow
• There are no bit patterns left over to represent overflow
• Infinity is represented with:
  – An exponent of all 1’s
  – A fraction of 0
  – It is not a normalized number
• Behaves like mathematical infinity
• Infinity/Infinity cannot be determined, and is represented by NaN (Not a Number)

Error Correcting Codes
• Occasional errors, for example from voltage fluctuations, can cause bit errors
• To guard against these errors, some memories and networks use error correcting and error detecting codes
• A memory word of n bits has m data bits and r check bits for error checking/correcting (n = m + r)
Hamming Distance
• To find how many bits differ between two code words, take their exclusive OR and count the 1 bits
• Example: 10001001 and 10110001; the EXOR is 00111000, which shows that three bits differ
• Hamming distance: the number of bit positions in which two code words differ
• If two code words are a Hamming distance d apart, it takes d single-bit errors to convert one into the other

Correcting vs. Detecting
• To detect d single-bit errors, code words must be a distance d+1 apart
• To correct d single-bit errors, code words must be a distance 2d+1 apart
• Parity bit: gives a code with distance 2, since it takes two single-bit errors to go from one valid code word to another, so single errors can be detected
• To correct all single-bit errors with m data bits and r check bits (n = m + r): there are 2**m legal memory words, and each has n illegal words at distance 1, so each legal word needs n+1 bit patterns dedicated to it. Since the total number of bit patterns is 2**n, we need (n+1) * 2**m <= 2**n, i.e. m + r + 1 <= 2**r. Given m, this gives a lower bound for r.
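
Both ideas can be sketched briefly: the Hamming distance via exclusive OR, and the smallest r satisfying m + r + 1 <= 2**r for single-error correction. The function names are chosen for this illustration only.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two code words differ."""
    return bin(a ^ b).count("1")   # XOR marks the differing positions

def min_check_bits(m):
    """Smallest r with m + r + 1 <= 2**r (lower bound for correcting
    any single-bit error in a word with m data bits)."""
    r = 0
    while m + r + 1 > 2 ** r:
        r += 1
    return r

distance = hamming_distance(0b10001001, 0b10110001)
check_bits = {m: min_check_bits(m) for m in (8, 16, 32, 64)}
```

`distance` is 3, matching the slide's example. The table in `check_bits` shows that the overhead grows slowly: 4 check bits for 8 data bits, but only 7 for 64.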