Module01-02 Intro Bits-Bytes
Introduction to
Computer Systems
University of Pennsylvania
School of Engineering and Applied Science
Computer and Information Systems Department
Dr. Thomas Farmer
• Section 1:
• Motivation for taking this course
• What is an abstraction in Computer Science?
• Section 2:
• Course Structure
• Logistics
• Keys to Success
• Section 3:
• Bits & Bytes…The world of binary #s!
SECTION 1
• No magic
• Computers should not be magic to computer scientists!
• Bottom up approach
• Computing systems from transistors on up
• Many of you have already seen them from the middle (Java)
Abstractions
• Abstractions are one of the principal tools that computer scientists use to handle
complexity
• Non-CS example: Taking a taxi to the airport!
SECTION 2
COURSE STRUCTURE, LOGISTICS, KEYS TO SUCCESS!
Course Structure
Hardware → Assembly → C Language
Property of Penn Engineering
Introduction to Computer Systems
Why C?
C Language → ?
SECTION 3
BITS, BYTES, THE BINARY WORLD!
What Is a Bit?
• A bit is a binary digit: the smallest unit of information, either a 0 or a 1
• The fact that instructions are also “just 0s and 1s” makes a computer universal: programs are just another kind of data!
SECTION 3: Part 2
WEIGHTED POSITIONAL REPRESENTATION
Representation of an Integer in Binary
• To represent an integer in a binary system:
1. Divide the number by two – keep the remainder; it’s the least significant bit!
2. Keep dividing by two until the answer is zero, recording remainders from right to left
I’ve added a leading 0 to make this a full “byte”
SECTION 3: Part 3
BINARY ARITHMETIC
Binary # Table
Using Weighted Positional Representation
• This table shows all of the “unsigned” numbers we can make with 2 bits & 3 bits
Binary Arithmetic
• Addition works column by column, carrying into the next column just as in decimal (in binary, 1 + 1 = 10: write 0, carry the 1)
Binary Multiplication
      13₁₀            1101₂
    × 12₁₀          × 1100₂
    ------          -------
      26               0000
   + 130              00000
    ------           110100
    156₁₀         + 1101000
                  ---------
                  10011100₂
SECTION 3: Part 3
    1111 (15)
  + 0001 ( 1)
  -----------
   10000 (16) ← the carry-out past 4 bits is overflow
• These limitations also show up in programming languages; different basic types have different sizes
• Some basic types in C
• char – typically 8 bits
• short int – typically 16 bits
• int – typically 32 bits
• long int – typically 64 bits
• Note these sizes are not guaranteed and can change on different architectures
Terminology so Far…
• 1 bit = a bit
• 4 bits = a nibble
• 8 bits = a byte
• Overflow
• For unsigned numbers… when the result of an operation needs more bits than the machine provides
• A 4-bit machine cannot hold a 5-bit number!
• In binary #’s
• Positive #s and zero are referred to as “unsigned #s”
• Because they don’t require a +/- sign, we are safe to assume they are positive!
SECTION 3: Part 4
REPRESENTATION OF NEGATIVE INTEGERS IN BINARY (2C)
One’s Complement (1C)
• Technique
• Take the positive representation of a binary # and “flip” AKA “complement” each bit
• Example
• Assuming a 4-bit wide computer
• +5 in 1’s complement would be: 0101 (you must pad w/ leading 0)
• -5 in 1’s complement would be: 1010 (you simply flip each digit)
• Good features
• Easy to do, and addition nearly works: 0101 + 1010 = 1111 (which is a representation of 0)
• The leading 1 indicates a negative number (like sign-magnitude)
• Problem with 1C
• Still two representations of 0: 0000 … AND … 1111
Two’s Complement (2C)
• Example
• Assuming a 4-bit wide computer
• +5 in 2’s complement would be: 0101 (you must pad w/ leading 0)
• -5 in 1’s complement would be: 1010 (recall we flipped each bit)
• -5 in 2’s complement would be: 1011 (we add “1” to get the “2s” complement)
• If there had been a carry-out…we would only have stored only the lower 4-bits
• This means we “throw away” any resulting carry-out past the width of the machine
• Notice: this means overflow is going to get a new definition for 2C numbers
4-Bit 2C as an Example
• The table shows all the 2C numbers we can make with 4-bits (16 possibilities)
• All of the positive numbers have 0 in the MSB, all of the negative numbers have 1 in the MSB
• Range of an n-bit number: –2^(n–1) through 2^(n–1) – 1
• Most negative number (–2^(n–1)) has no positive counterpart
Binary  Value    Binary  Value
0000      0      1000     –8
0001      1      1001     –7
0010      2      1010     –6
0011      3      1011     –5
0100      4      1100     –4
0101      5      1101     –3
0110      6      1110     –2
0111      7      1111     –1
Note: most CPU architectures today use 2C representation
SECTION 3: Part 5
Addition in 2C
• The 2C representation is convenient because it makes ‘regular’ addition work for both positive and
negative numbers:
• We know how to negate and we know how to add, so subtraction comes for free!
SECTION 3: Part 6
• As a programmer you need to be aware of this and guard against arithmetic operations that could
lead to overflow
• How? In a programming language you must be aware of a type’s width and the largest positive & negative #s it can store!!
SECTION 3: Part 7
Negation shortcut: starting from the right, copy bits up through the first 1, then flip the remaining bits
Original:  0110 (6)    0101 (5)    1011 (–5)    1100 (–4)    0000 (0)
Copy:        10           1           1          100          0000
Flip:      1010 (–6)   1011 (–5)   0101 (5)     0100 (4)     0000 (0)
Terminology So Far…
• 1 bit = a bit
• 4 bits = a nibble
• 8 bits = a byte
• In binary #’s
• Positive #s and zero are referred to as “unsigned #s”
• Because they don’t require a +/- sign, we are safe to assume they are positive!
• Positive #s (with a sign), 0, and negative #s are called “signed #s”
SECTION 3: Part 8
SIGN EXTENSION
Sign Extension
Forgot to sign extend (wrong):
   5₁₀ = 0000 0101₂ (8-bit 2C)
+ –2₁₀ =      1110₂ (4-bit 2C)
  --------------------
  0001 0011₂ = 19₁₀   ← wrong answer: 5 – 2 is not 19!
Sign extended the 4-bit 2C (right):
   5₁₀ = 0000 0101₂ (8-bit 2C)
+ –2₁₀ = 1111 1110₂ (4-bit 2C, sign-extended to 8 bits)
  --------------------
  0000 0011₂ = +3₁₀   ← right answer!
Sign Extension
• When performing arithmetic operations on numbers of different lengths we must sign extend
numbers before we perform operations so that both numbers are the same length
• Rules:
• If a number is positive (leading 0), pad with 0s to meet length
• If a number is negative (leading 1), pad with 1s to meet length
SECTION 3: Part 9
• “Encoding” data simply means an agreed upon “mapping” of data from one representation to another
• At some point, it is the choice of an engineer to define the encoding of data between two forms
UNICODE
• Today we use UNICODE
• Multi-byte encoding
• 2³² possibilities!
• ASCII is an 8-bit subset of UNICODE (UTF-8)
Excerpt of the ASCII table (character, decimal code, binary code):
n  110  01101110        N  078  01001110
o  111  01101111        O  079  01001111
p  112  01110000        P  080  01010000
q  113  01110001        Q  081  01010001
r  114  01110010        R  082  01010010
s  115  01110011        S  083  01010011
t  116  01110100        T  084  01010100
u  117  01110101        U  085  01010101
v  118  01110110        V  086  01010110
w  119  01110111        W  087  01010111
x  120  01111000        X  088  01011000
y  121  01111001        Y  089  01011001
z  122  01111010        Z  090  01011010
Representing Images/Video, Graphics in Binary
• Pixel representation
• Break the image down into discrete points
• Apply a coordinate system (here running from x=0, y=0 to x=256, y=256)
• At each point determine the intensity of the color
• RGB
SECTION 3: Part 10
REPRESENTING NUMBERS IN HEXADECIMAL
• Any binary number (unsigned or signed) can be converted to HEX by translating the nibbles:
• 01101101 = 0110 1101 = x6D
• 0011011110101110 = 0011 0111 1010 1110 = x37AE
• 0xDEADC0DE
• 0xBA5EBA11
• 0xB01DFACE
• 0xBADA55
SECTION 3: Part 11
Fixed-Point
2⁰ = 1    2⁻¹ = 0.5
2¹ = 2    2⁻² = 0.25
2² = 4    2⁻³ = 0.125

  00101000.101  (40.625)
+ 11111110.110  (–1.25)
---------------
  00100111.011  (39.375)

One problem… how can you represent a binary point when you only have 0 and 1 in a computer???
Floating-Point
+1.01000101 × 2⁵
sign: +   fraction: 1.01000101   exponent: 5
Floating-Point Standard
IEEE 754
• The IEEE defines the representation of a 32-bit signed floating point number N as follows:
N = | S (1 bit) | exponent (8 bits) | fraction (23 bits) |   → 32 bits total
Floating-Point Standard
Example
N = | S (1 bit) | exponent (8 bits) | fraction (23 bits) |
• Example: N = 1 10000010 11010000000000000000000
• S = 1 (negative); exponent = 10000010₂ = 130, so the true exponent is 130 – 127 = 3; the fraction gives 1.1101₂
• N = –1.1101₂ × 2³ = –1110.1₂ = –14.5
Floating-Point Standard
Information & Approximation
• The IEEE 754 Floating Point Standard:
• Single precision: 32-bits – variable type: float
• Double precision: 64-bits – variable type: double
Number line
IEEE 754 Floating point numbers
Floating point numbers can only represent a small handful of the infinite set of real numbers
Because numbers are stored with a finite number of bits, floating point operations are inherently approximate: almost all operations introduce roundoff error
As a programmer you must be aware of these limitations!