
CS 322M Digital Logic & Computer Architecture

Basics of Digital Logic Design

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Digital vs Analog

[Waveform figure: voltage V vs. Time, levels +5 V / –5 V, bit sequence 1 0 1]

Digital: only assumes discrete values.  Analog: values vary continuously over a broad range.
Digital Information

Computers deal with digital information, i.e. information that is represented by binary digits.

Binary Values
 Typically consider only two discrete values:
1’s and 0’s
1, TRUE, HIGH
0, FALSE, LOW
 1 and 0 can be represented by specific voltage levels,
rotating gears, fluid levels, etc.
 Digital circuits usually depend on specific voltage levels to
represent 1 and 0
 Bit: Binary digit
Digital Logic Basics

 Hardware consists of simple building blocks - logic gates

AND, OR, NOT, …

NAND, NOR, XOR, …

 Logic gates are built using transistors

NOT gate can be implemented by a single transistor

AND/OR gate requires 3 transistors

NAND/NOR gate requires 2 transistors


Logic Gates
 Simple gates
AND
OR
NOT
 Functionality can be
expressed by a truth table
 A truth table lists output for
each possible input
combination
 Precedence
NOT > AND > OR
Logic Gates

 Additional useful gates


NAND
NOR
XOR (EXOR)
 NAND = AND + NOT
 NOR = OR + NOT
 XOR implements exclusive-
OR function
 XNOR

 Total no. of distinct Boolean functions of n inputs: 2^(2^n) (e.g. 16 functions of two inputs)

Logic Gates
AND, OR, NOT

AND:  A B | C      OR:  A B | C      NOT:  A | C
      0 0 | 0           0 0 | 0            0 | 1
      0 1 | 0           0 1 | 1            1 | 0
      1 0 | 0           1 0 | 1
      1 1 | 1           1 1 | 1

A · 1 = A,  A · 0 = 0  →  0 dominates in AND
A + 0 = A,  A + 1 = 1  →  1 dominates in OR


Logic Gates
 Complete sets
 A set of gates is complete if we can implement any logical
function using only the type of gates in the set
 {AND, OR, NOT}
 {AND, NOT}
 {OR, NOT}
 {NAND}
 {NOR}
 Minimal complete set is a complete set with no redundant
elements.
 Universal Gates
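A quick way to see why {NAND} by itself is a complete (universal) set: NOT, AND and OR can each be rebuilt from NAND gates alone. The Python sketch below (illustrative, not from the slides) checks the three constructions over every input combination.

def nand(a, b):
    return 1 - (a & b)

def not_(a):                     # NOT a  = a NAND a
    return nand(a, a)

def and_(a, b):                  # a AND b = NOT(a NAND b)
    return nand(nand(a, b), nand(a, b))

def or_(a, b):                   # a OR b = (NOT a) NAND (NOT b)
    return nand(nand(a, a), nand(b, b))

for a in (0, 1):
    for b in (0, 1):
        assert not_(a) == 1 - a
        assert and_(a, b) == (a & b)
        assert or_(a, b) == (a | b)
print("NOT, AND, OR built from NAND only")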
Combinational Logic vs Boolean Algebra

Example circuit: e·(a·b + c·d)

Schematic Diagram:            Boolean Algebra:
  5 primary inputs              5 literals
  4 components (gates)          4 operators
  9 signal nets
  12 pins
Boolean Algebra
 Boolean algebra is an algebraic structure defined by a set
of elements, B, together with two binary operators, + and .,
provided that the following postulates are satisfied:
1. (a) Closure with respect to the operator +
(b) Closure with respect to the operator .
2. (a) An identity element with respect to +, designated by 0: X+0 = 0+X = X
(b) An identity element with respect to ., designated by 1: X.1 = 1.X = X
3. (a) Commutative with respect to +: X + Y = Y +X
(b) Commutative with respect to ., X.Y = Y.X
4. (a) . is distributive over +: X.(Y+Z) = X.Y + X.Z
(b) + is distributive over .: X+(Y.Z) = (X+Y).(X+Z)
5. For every element x ∈ B, there exists an element x’ ∈ B (called the complement of x)
such that (a) x + x’ = 1 and (b) x.x’ = 0
6. There exist at least two elements x, y ∈ B such that x ≠ y
Boolean Algebra Definitions
 Complement: a variable with a bar over it (written here with a prime)
A', B', C'
 Literal: a variable or its complement
A, A', B, B', C, C'
 Implicant: a product of literals
A·B·C, A·C, B'·C
 Minterm: a product that includes all input variables
A·B·C, A·B'·C, A'·B·C'
 Maxterm: a sum that includes all input variables
(A+B+C), (A+B'+C), (A'+B+C')
T1: Identity Element (Postulate 2)

B · 1 = B
B + 0 = B
[Duality Principle: interchange the + and · operations and replace 1's by 0's and 0's by 1's]
T2: Null Element Theorem

B · 0 = 0
B + 1 = 1   [ B + 1 = 1·(B+1)       …. 2(b) Identity
            = (B+B')·(B+1)          …. 5(a) Complement
            = B + B'·1              …. 4(b) Distributive
            = B + B'                …. 2(b) Identity
            = 1                     …. 5(a) Complement ]
T3: Idempotency Theorem

B · B = B
B + B = B
T4: Involution Theorem

(B')' = B
T5: Complement Theorem (Postulate 5)

B · B' = 0
B + B' = 1
Basic Boolean Theorems
Boolean Theorems of Several Variables
Reference

Text Book:

Digital Design by M. Morris Mano


Third Edition: Chapter 2: Boolean Algebra and Logic Gates
Page no. 33 to 42 and Page no. 51 - 57

Fifth Edition: Section 1.9: Binary Logic


Page No. 30 – 33
Section 2.1 to 2.5 & Section 2.7 & 2.8
Page No. 38 – 48 & Page No. 58 - 63
CS 322M Digital Logic & Computer Architecture

Minimization of Boolean Expression

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Algebraic Method

Boolean algebra(algebraic method)

1. x.(x’+y) = x.y

x.(x’+y) = x.x’ + x.y


= 0 + x.y
= x.y
Algebraic Method

Boolean algebra(algebraic method)

2. x.y + x’.z + y.z = x.y + x’.z + y.z.(x+x’)


= x.y + x’.z + y.z.x + y.z.x’
= x.y.(1+z) + x’.z.(1+y)
= x.y + x’.z

3. (x+y).(x’+z).(y+z) = (x+y).(x’+z)
Algebraic Method

Boolean algebra(algebraic method)

4. x’.y.z + x.y’.z + x.y.z’ + x.y.z = x’.y.z + x.y’.z + x.y.z’ + x.y.z + x.y.z


= x.y.(z’+z) + y.z.(x’+x) + x.y’.z
= x.y + y.z + x.y’.z

(may be = x’.y.z + x.z + x.y)

Is it the minimized form?


Algebraic Method

Boolean algebra(algebraic method)

5. x’.y’.z’ + x’.y’.z + x’.y.z + x.y.z’ + x.y.z + wxy’z


= x’.y’.(z’+z) + x’.y.z + x.y. (z+z’) + wxy’z
= x’.y’ + x’.y.z + x.y + wxy’z

= x’.y’+x.y+ x’.y.z+x’.y’.z +wxy’z | = x’.y’+x.y+ x’.y.z+x.y.z +wxy’z


= x’.y’ + x.y + x’.z + wxy’z | = x’.y’ + x.y + y.z + wxy’z

Minterm: w.x.y’.z
Implicant: x’.y.z, x’.y’.z, …….
Prime Implicant: y.z, x’.z, x’.y’ (x’.y’.z’ + x’.y’.z), ……
Essential Prime Implicant: x’.y’, x.y

Minimized form may not be unique


Algebraic Method

Boolean algebra(algebraic method)

5. x’.y’.z’ + x’.y’.z + x’.y.z + x.y.z’ + x.y.z + wxy’z


= x’.y’.(z’+z) + x’.y.z + x.y. (z+z’) + wxy’z
= x’.y’ + x’.y.z + x.y + wxy’z

= x’.y’+x.y+x’.y.z+x’.y’.z+wxy’z | = x’.y’+x.y+x’.y.z+x.y.z+wxy’z
= x’.y’ + x.y + x’.z + wxy’z | = x’.y’ + x.y + y.z + wxy’z

Standard Form: Sum-of-Product form

Minimized form may not be unique

Canonical Form: Sum of Minterms (Unique Representation)

Similarly: Product-of-Sum form and Product-of-Maxterms
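The algebraic minimizations above can also be checked mechanically. A minimal sketch using SymPy (an assumption of this note, not a tool used in the slides) for example 5:

from sympy import symbols
from sympy.logic import simplify_logic
from sympy.logic.boolalg import And, Or, Not

w, x, y, z = symbols('w x y z')

# x'.y'.z' + x'.y'.z + x'.y.z + x.y.z' + x.y.z + w.x.y'.z
F = Or(And(Not(x), Not(y), Not(z)), And(Not(x), Not(y), z), And(Not(x), y, z),
       And(x, y, Not(z)), And(x, y, z), And(w, x, Not(y), z))

print(simplify_logic(F, form='dnf'))
# Expect a minimal sum-of-products equivalent to x'y' + xy + x'z + wxy'z
# (or the yz variant) - minimized forms need not be unique.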


Algebraic Method
Karnaugh Map (Map Method)

Two-variable map:

X·Y' + X·Y = X·(Y'+Y) = X
Cell (minterm) ordering: m0, m1, m3, m2
Karnaugh Map (Map Method)

Three Variables map:


Karnaugh Map (Map Method)

Four Variables map:

Gray code
Karnaugh Map (Map Method)
Karnaugh Map (K- Map) Steps
1. Sketch a Karnaugh map grid for the given problem: 2^N squares for N variables.
2. Fill in the 1's and 0's from the truth table (or the SOP/POS form) of the Boolean function.
3. Circle groups of 1's.
 Circle the largest groups of 2, 4, 8, etc. first.
 Minimize the number of circles, but make sure that every 1 is in a circle.
4. Write an equation using these circles.

F(X,Y,Z) = Σm(2,3,4,5) = X'Y + XY'        F(X,Y,Z) = Σm(0,2,4,6) = X'Z' + XZ' = Z'(X'+X) = Z'


Karnaugh Map (Map Method)
Four-Variable K-Map : 16 minterms : m0 ~ m15
Rectangle groups:
– 2 squares (minterms) : 3-literal product term
– 4 squares : 2-literal product term
– 8 squares : 1-literal product term
– 16 squares : logic 1
Karnaugh Map (Map Method)

F(W, X,Y,Z)=m(0,2,7,8,9,10,11) = WX’ + X’Z’ + W’XYZ

Minimize the following expression


AB’C+A’BC+A’B’C+A’B’C’+AB’C’
= B’+A’C
Karnaugh Map (Map Method)

Minimize the following expression


B’C’D’+A’BC’D’+ABC’D’+A’B’CD+AB’CD+A’B’CD’+A’BCD’+ABCD’+AB’CD’
= D’+B’C

B’C’D’ = (A’+A)B’C’D’
=A’B’C’D’+AB’C’D’
Karnaugh Map (Map Method)
Don’t Care Conditions
• For input combinations that can never occur, the output value does not matter: it may be taken as either '0' or '1'
• The don’t care terms can be used to advantage on the Karnaugh
map
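As a concrete illustration of exploiting don't-care terms (a standard textbook-style example, not taken from these slides), SymPy's SOPform can be handed the minterms and the don't-care minterms together:

from sympy import symbols
from sympy.logic import SOPform

w, x, y, z = symbols('w x y z')

def bits(m, n=4):                       # minterm number -> list of bits, MSB first
    return [(m >> i) & 1 for i in reversed(range(n))]

minterms  = [bits(m) for m in (1, 3, 7, 11, 15)]
dontcares = [bits(m) for m in (0, 2, 5)]

print(SOPform([w, x, y, z], minterms, dontcares))
# Expected: a 2-term result such as (y & z) | (~w & ~x); the don't cares enlarge
# the groups (w'x' instead of the 3-literal w'x'z the minterms alone would allow).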
Karnaugh Map (Map Method)
K- Map for POS
(B+C+D)(A+B+C'+D)(A'+B+C+D')(A+B'+C+D)(A'+B'+C+D)
(B+C+D) = (A'·A+B+C+D) = (A'+B+C+D)(A+B+C+D)
Zero cells (maxterms, ABCD): 1000, 0000, 0010, 1001, 0100, 1100
= (C+D)(A'+B+C)(A+B+D)

(C'D')' = C+D
(AB'C')' = A'+B+C
(A'B'D')' = A+B+D
Karnaugh Map (Map Method)

Minimize the following expression


F=B’C’D’+A’BC’D’+ABC’D’+A’B’CD+AB’CD+A’B’CD’+A’BCD’+ABCD’+AB’CD’
= D’+B’C

Product of Sum form:


F’ = A’B’C’D+A’BC’D+A’BCD+
ABC’D+ABCD+AB’C’D
= C’D + BD
(F’)’ = F = (C’D +BD)’
= (C+D’).(B’+D’)
Karnaugh Map (Map Method)
Converting Between POS and
SOP Using the K-map
(A’+B’+C+D)(A+B’+C+D)(A+B+C+D’)(A+B+C’+D’)
(A’+B+C+D’)(A+B+C’+D)
Karnaugh Map (Map Method)
F=A’B’C’D’+AB’C’C’+A’BC’D’+ABC’D’+A’B’CD+AB’CD+A’B’CD’+A’BCD’+ABC
D’+AB’CD’ (Sum-of-Minterms)
F = D’+B’C (Sum-of-Product)
F’ = A’B’C’D+A’BC’D+A’BCD+ABC’D+ABCD+AB’C’D
F=(F’)’ = (A’B’C’D+A’BC’D+A’BCD+ABC’D+ABCD+AB’C’D)’
=(A+B+C+D’).(A+B’+C+D’).(A+B’+C’+D’).(A’+B’+C+D’).(A’+B’+C’+D’).(A’+B+C
+D’) (Product-of-Maxterms)
F = (C+D’).(B’+D’) (Product-of-Sum)

F = m(0,2,3,4,6,8,10,11,12,14)
F = πM(1,5,7,9,13,15)
Reference

Text Book:

Digital Design – M. Morris Mano (Third Edition)

Section 2.4: Page No.: 40 – 51

Chapter 3: Page No. 64 – 89


CS 322M Digital Logic & Computer Architecture

Combinational Circuits

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Logic Blocks
[Block diagram] Binary digital input signals → logic gates → binary digital output signals
Types of Basic Logic Blocks

 Combinational Logic Block


Logic Blocks whose output logic value depends
only on the input logic values

 Sequential Logic Block


Logic Blocks whose output logic value depends
on the input values and the state (stored
information) of the blocks
Combinational vs Sequential Logic

(a) Combinational:  In → Logic Circuit → Out;           Output = f(In)
(b) Sequential:     In → Logic Circuit ↔ State → Out;   Output = f(In, Previous In)

 A combinational circuit consists of input variables, logic gates, and output variables.
Example of Combinational Logic Circuits

Design of a logic circuit for control of water pumping
Design of a building alarm device
Multiplexers

2-to-1 MUX:  S = 0 → Y = I0;  S = 1 → Y = I1;   Y = S'·I0 + S·I1

Truth Table:   S | Y
               0 | I0
               1 | I1
Multiplexers
 Multiplexer
2^n data inputs (figure: 4-data-input MUX)
n selection inputs
a single output

 Selection input
determines the input
that should be
connected to the output
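A behavioral sketch of the selection function (Python, names illustrative): for a 4-data-input MUX the two selection inputs pick which data input drives the output, Y = S1'S0'·I0 + S1'S0·I1 + S1S0'·I2 + S1S0·I3.

def mux4(i0, i1, i2, i3, s1, s0):
    # the selection inputs form an index that chooses one data input
    return (i0, i1, i2, i3)[(s1 << 1) | s0]

assert mux4(1, 0, 0, 0, 0, 0) == 1      # S1 S0 = 00 selects I0
assert mux4(0, 0, 1, 0, 1, 0) == 1      # S1 S0 = 10 selects I2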
Multiplexers
4-data input MUX implementation
Quadruple 2-to-1 Line Multiplexer
 Multiplexer circuits can be combined with common selection
inputs to provide multiple-bit selection logic.

Demultiplexers
Multiplexer & Demultiplexer
Decoders
 Decoder: 2-to-4-line decoder
Decoders

Logic function implementation

(Full Adder)
Decoders to Index in Memory
Decoders to Index in Memory
Encoders
 An encoder is the inverse operation of a decoder.
C = D1 + D3 + D5 + D7
B = D2 + D3 + D6 + D7
A = D4 + D5 + D6 + D7
Encoders

Truth table for Octal-to-Binary Encoder


Inputs Outputs
D0 D1 D2 D3 D4 D5 D6 D7 ABC
1 0 0 0 0 0 0 0 000
0 1 0 0 0 0 0 0 001
0 0 1 0 0 0 0 0 010
0 0 0 1 0 0 0 0 011
0 0 0 0 1 0 0 0 100
0 0 0 0 0 1 0 0 101
0 0 0 0 0 0 1 0 110
0 0 0 0 0 0 0 1 111
Decoders & Encoders
Reference

Text Book:

Digital Design – M. Morris Mano (Third Edition)

Section 4.8: Page No.: 134 – 145


CS 322M Digital Logic & Computer Architecture

Binary Number System and Information


Representation

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Digital Systems

DIGITAL
CIRCUITS
Why Binary Arithmetic?

3+5 =8

0011 + 0101 = 1000


Why Binary Arithmetic?
 Hardware can only deal with binary digits, 0 and 1.
 Must represent all numbers, integers or floating point,
positive or negative, by binary digits, called bits.
 Can devise electronic circuits to perform arithmetic
operations: add, subtract, multiply and divide, on binary
numbers.
Positive Integers

 Decimal system: made of 10 digits, {0,1,2, . . . , 9}


41 = 4×10^1 + 1×10^0
255 = 2×10^2 + 5×10^1 + 5×10^0
 Binary system: made of two digits, {0,1}
00101001 = 0×2^7 + 0×2^6 + 1×2^5 + 0×2^4 + 1×2^3 + 0×2^2 + 0×2^1 + 1×2^0
         = 32 + 8 + 1 = 41
 11111111 = 255, the largest number with 8 binary digits, 2^8 − 1
 LSB and MSB
Base or Radix
 For decimal system, 10 is called the base or radix.
 Decimal 41 is also written as 41₁₀ or 41ten
 Base (radix) for binary system is 2.
 41ten = 101001₂ or 101001two
111ten = 1101111two
111two = 7ten
Base or Radix
 For Hexadecimal system, 16 is the base or radix.
 Needs 16 symbols: 0, 1, …..,9, A, B, C, D, E, F

111ten = 01101111two = 6F₁₆ = 6×16^1 + 15×16^0


Number Systems
Representation of positive numbers same in most systems
What about negative numbers?
Major differences are in how negative numbers are
represented
Three major schemes:
 sign and magnitude
 ones complement
 twos complement
Sign and Magnitude Representation
 Use fixed length binary representation
 Use left-most bit (called most significant bit or MSB) for sign:
 0 for positive
 1 for negative

 Example: +18ten = 00010010two


–18ten = 10010010two
Sign and Magnitude Representation
[Figure: 4-bit sign-and-magnitude number wheel, codes 0000–1111, +0…+7 and –0…–7]

0 100 = +4
1 100 = –4

High order bit is the sign: 0 = positive (or zero), 1 = negative
Three low order bits are the magnitude: 0 (000) thru 7 (111)
Number range for n bits = ±(2^(n−1) − 1)
Two representations for 0
Difficulties with Signed Magnitude
 Sign and magnitude bits should be differently treated in
arithmetic operations.
 Addition and subtraction require different logic circuits.
 Overflow is difficult to detect.
 “Zero” has two representations:
 + 0ten = 00000000two
 – 0ten = 10000000two
 Sign-and-magnitude integers are not used in modern computers.
Addition and Subtraction of Numbers
Sign and Magnitude Form
 When the signs are the same, add the magnitudes; the result's sign bit is the same as the operands' sign:
    4 (0100) + 3 (0011) = 7 (0111);      –4 (1100) + (–3) (1011) = –7 (1111)
 When the signs differ, the operation is a subtraction; the sign of the result depends on the sign of the number with the larger magnitude:
    4 (0100) + (–3) (1011) = 1 (0001);   –4 (1100) + 3 (0011) = –1 (1001)
Integers With Sign – Two Ways

 Use fixed-length representation, but no explicit sign bit:


1’s complement: To form a negative number, complement
each bit in the given number.
2’s complement: To form a negative number, start with the
given number, subtract one, and then complement each
bit, or first complement each bit, and then add 1.
 2’s complement is the preferred representation.
Ones Complement
[Figure: 4-bit 1's complement number wheel, +0…+7 and –7…–0]

0 100 = +4
1 011 = –4

Subtraction implemented by addition & 1's complement
Still two representations of 0! This causes some problems
1’s Complement Numbers

Magnitude   Positive   Negative
    0         0000       1111
    1         0001       1110
    2         0010       1101
    3         0011       1100
    4         0100       1011
    5         0101       1010
    6         0110       1001
    7         0111       1000

Negation rule: invert the bits.
Problem: +0 ≠ –0 (two codes for zero)
Addition and Subtraction of Numbers
Ones Complement Calculations
  4   0100            –4    1011
 +3   0011          +(–3)   1100
  7   0111            –7  1 0111  → end-around carry: 0111 + 1 = 1000 = –7

  4   0100            –4    1011
 –3   1100            +3    0011
  1  1 0000            –1    1110
     → end-around carry: 0000 + 1 = 0001 = 1
Twos Complement
[Figure: 4-bit 2's complement number wheel - like the 1's complement wheel except shifted one position clockwise]

0 100 = +4
1 100 = –4

Only one representation for 0
One more negative number than positive number

2’s Complement Numbers

Magnitude   Positive   Negative
    0         0000       (0000)
    1         0001       1111
    2         0010       1110
    3         0011       1101
    4         0100       1100
    5         0101       1011
    6         0110       1010
    7         0111       1001
    8          —         1000

Negation rule: invert the bits and add 1.
2’s Complement Numbers
N* = 2^n − N

Example: two's complement of 7 (n = 4):   2^4 = 10000
  10000 − 0111 = 1001   (representation of –7)

Example: two's complement of –7:          2^4 = 10000
  10000 − 1001 = 0111   (representation of 7)

Shortcut method:
Two's complement = bitwise complement + 1
0111 -> 1000 + 1 -> 1001 (representation of –7)
1001 -> 0110 + 1 -> 0111 (representation of 7)
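Both definitions above can be checked directly against each other. A minimal Python sketch (illustrative) comparing N* = 2^n − N against the invert-and-add-1 shortcut for 4-bit values:

def twos_complement(value, n=4):            # N* = 2**n - N
    return (2**n - value) % 2**n

def twos_complement_shortcut(value, n=4):   # bitwise complement, then add 1
    mask = 2**n - 1
    return ((value ^ mask) + 1) & mask

for v in range(16):
    assert twos_complement(v) == twos_complement_shortcut(v)

print(format(twos_complement(7), '04b'))    # 1001, the 4-bit representation of -7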
Addition and Subtraction of Numbers
Twos Complement Calculations
  4   0100            –4    1100
 +3   0011          +(–3)   1101
  7   0111            –7  1 1001   (discard the carry)

  4   0100            –4    1100
 –3   1101            +3    0011
  1  1 0001            –1    1111

If the carry into the sign bit equals the carry out, ignore the carry; if they differ, overflow has occurred.

 A simpler addition scheme makes two's complement the most common choice for integer number systems within digital systems
Three Systems (n = 4)

[Figure: the three 4-bit number circles side by side]

1010 = –2 in signed magnitude;  1010 = –5 in 1's complement;  1010 = –6 in 2's complement
Three Representations

Sign-magnitude 1’s complement 2’s complement

000 = +0 000 = +0 000 = +0


001 = +1 001 = +1 001 = +1
010 = +2 010 = +2 010 = +2
011 = +3 011 = +3 011 = +3
100 = - 0 100 = - 3 100 = - 4
101 = - 1 101 = - 2 101 = - 3
110 = - 2 110 = - 1 110 = - 2
111 = - 3 111 = - 0 111 = - 1
(Preferred)
Overflow Conditions
Add two positive numbers to get a negative number, or two negative numbers to get a positive number

[Figure: two 4-bit 2's complement number wheels showing the wrap-around]

5 + 3 = –8!      –7 – 2 = +7!
Overflow: An Error
 Examples: Addition of 3-bit integers (range - 4 to +3)

• –2 – 3 = –5:   110 (–2) + 101 (–3) = 1011 → 011 = 3   (error)
• 3 + 2 = 5:     011 (3) + 010 (2) = 101 = –3            (error)

[Figure: 3-bit 2's complement number wheel, –4 … +3, marking the overflow crossing]

 Overflow rule: If two numbers with the same sign bit (both
positive or both negative) are added, the overflow occurs if
and only if the result has the opposite sign.
Overflow and Finite Universe

[Figure] In the infinite universe of integers ( … 1111, 0000, 0001, 0010, 0011, 0100, 0101, … ) values can increase or decrease without overflow; in the finite universe of 4-bit binary integers the codes form a circle, and stepping across the boundary between 0111 and 1000 (the "forbidden fence") is an overflow.
Overflow Conditions
 carries: 0111                  carries: 1000
   5   0101                       –7   1001
   3   0011                       –2   1110
  –8   1000   Overflow             7  1 0111   Overflow

 carries: 0000                  carries: 1111
   5   0101                       –3   1101
   2   0010                       –5   1011
   7   0111   No overflow         –8  1 1000   No overflow

Overflow when the carry into the sign bit does not equal the carry out
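The rule "overflow when the carry into the sign bit differs from the carry out" is easy to mimic in software. A 4-bit Python sketch (illustrative):

def add4(a, b):
    c_in_sign  = ((a & 0b0111) + (b & 0b0111)) >> 3    # carry into the sign position
    total      = a + b
    c_out_sign = total >> 4                            # carry out of the sign position
    return total & 0b1111, c_in_sign != c_out_sign     # (4-bit result, overflow?)

print(add4(0b0101, 0b0011))   # 5 + 3     -> (8, True)  : 1000 = -8, overflow
print(add4(0b0101, 0b0010))   # 5 + 2     -> (7, False) : 0111 = +7, no overflow
print(add4(0b1001, 0b1110))   # -7 + (-2) -> (7, True)  : 0111 read as +7, overflow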
Real Numbers

• Numbers with fractions


• Could be done in pure binary
– 1001.1010 = 2^3 + 2^0 + 2^-1 + 2^-3 = 9.625
• Where is the binary point?
• Fixed?
– Very limited
• Moving?
– How do you show where it is?
Floating Point

• ± .significand × 2^exponent


• Misnomer
• Point is actually fixed between sign bit and body of
mantissa
• Exponent indicates place value (point position)
Signs of Floating Point

• Mantissa is stored in 2's complement


• Exponent is in excess or biased notation
– e.g. Excess (bias) 128 means
– 8 bit exponent field
– Pure value range 0-255
– Subtract 128 to get correct value
– Range -128 to +127
Normalization

• FP numbers are usually normalized


• i.e. exponent is adjusted so that leading bit
(MSB) of mantissa is 1
• Since it is always 1 there is no need to
store it
• (c.f. scientific notation, where numbers are normalized to give a single digit before the decimal point, e.g. 3.123 × 10^3)
Floating Point Examples

Exponent is presented in biased-127 format


FP Ranges

• For a 32 bit number


– 8 bit exponent
– ±2^127 ≈ 1.7 × 10^38
• Accuracy
– The effect of changing lsb of mantissa
– 23-bit mantissa: 2^-23 ≈ 1.2 × 10^-7
– About 6 decimal places
IEEE 754 Formats

• Standard for floating point storage


• 32 and 64 bit standards
• 8 and 11 bit exponent respectively
IEEE 754 Formats
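A quick way to look at the 32-bit IEEE 754 field layout from a program is to reinterpret a float's bits. A Python sketch (illustrative) for the single-precision format (1 sign bit, 8 exponent bits, 23 fraction bits):

import struct

def ieee754_single_fields(x):
    (bits,) = struct.unpack('>I', struct.pack('>f', x))
    sign     = bits >> 31
    exponent = (bits >> 23) & 0xFF         # stored in biased (excess-127) form
    fraction = bits & 0x7FFFFF             # the leading 1 of the mantissa is implicit
    return sign, exponent, fraction

print(ieee754_single_fields(1.0))    # (0, 127, 0)       : +1.0    x 2^(127-127)
print(ieee754_single_fields(-6.5))   # (1, 129, 5242880) : -1.101b x 2^(129-127)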
Other Codes

• Excess Code (Excess-128)


• Gray Code
• BCD (Binary Coded Decimal)
Character Representation

• ASCII (American Standard Code for


Information Interchange)
• EBCDIC (Extended Binary Coded Decimal
Interchange Code)
• UNICODE
Reference
Text Book:
Digital Design – M. Morris Mano, Third Edition
Chapter 1: Page No. 1 – 24

Computer Organization and Architecture


Designing for Performance – William Stallings,
Seventh Edition
Chapter 9: Page No.: 289 – 301, 312 – 319
CS 322M Digital Logic & Computer Architecture

Examples: Combinational Circuits

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Comparator

 Used to implement comparison operators (=, >, <, ≥, ≤)


Comparator

 Used to implement comparison operators (=, >, <, ≥, ≤)


Magnitude Comparator

 We inspect the relative magnitudes of pairs of digits, starting from the most significant pair. If they are equal, we compare the next lower significant pair of digits, until a pair of unequal digits is reached.
 If the corresponding digit of A is 1 and that of B is 0, we conclude that A > B.
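The MSB-first procedure above translates directly into code. A Python sketch (illustrative) for equal-length unsigned operands:

def compare(a_bits, b_bits):
    # a_bits, b_bits: equal-length bit strings such as '1010'; returns '>', '<' or '='
    for a, b in zip(a_bits, b_bits):       # scan from the most significant bit
        if a != b:
            return '>' if a == '1' else '<'
    return '='                             # all pairs equal

print(compare('1010', '1001'))   # '>' : the first unequal pair decides (A's bit is 1)
print(compare('0110', '0110'))   # '='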
Comparator

4-bit magnitude comparator chip


Comparator

Serial construction of an 8-bit comparator


Addition

 Adding bits:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
 1 + 1 = (1) 0   (sum 0, carry 1)
 Adding integers:

  carries:       1  1  0
    0 0 0 . . . 0 1 1 1 two =  7ten
 +  0 0 0 . . . 0 1 1 0 two =  6ten
 =  0 0 0 . . . 1 1 0 1 two = 13ten
Subtraction

 Direct subtraction

    0 0 0 . . . 0 1 1 1 two = 7ten
 –  0 0 0 . . . 0 1 1 0 two = 6ten
 =  0 0 0 . . . 0 0 0 1 two = 1ten

 Two's complement subtraction by adding

  carries:  1 1 1 . . . 1 1 0
    0 0 0 . . . 0 1 1 1 two =  7ten
 +  1 1 1 . . . 1 0 1 0 two = –6ten
 =  0 0 0 . . . 0 0 0 1 two =  1ten   (final carry discarded)
Adding Two Bits

s = a + b

 a  b | Decimal | Binary (CARRY SUM)
 0  0 |    0    |   00
 0  1 |    1    |   01
 1  0 |    1    |   01
 1  1 |    2    |   10
Half-Adder

• Adding two bits:

 a  b | a+b
 0  0 |  00
 0  1 |  01
 1  0 |  01
 1  1 |  10

HA:  carry = a AND b;  sum = a XOR b
Full Adder

s = a + b + c

 a  b  c | Decimal value | Binary value (CARRY SUM)
 0  0  0 |      0        |   00
 0  0  1 |      1        |   01
 0  1  0 |      1        |   01
 0  1  1 |      2        |   10
 1  0  0 |      1        |   01
 1  0  1 |      2        |   10
 1  1  0 |      2        |   10
 1  1  1 |      3        |   11
Full-Adder

[Circuit] A full adder (FA) built from two half adders (HA) and an OR gate:
sum = (a XOR b) XOR c
carry = a·b + c·(a XOR b)
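The gate equations above map one-for-one onto code. A Python sketch (illustrative) of a half adder, a full adder built from two half adders plus an OR, and a ripple-carry chain:

def half_adder(a, b):
    return a ^ b, a & b                          # sum = a XOR b, carry = a AND b

def full_adder(a, b, c):
    s1, c1 = half_adder(a, b)
    s, c2 = half_adder(s1, c)
    return s, c1 | c2                            # carry = a·b + c·(a XOR b)

def ripple_add(a_bits, b_bits):
    # LSB-first lists of bits; returns (LSB-first sum bits, final carry)
    carry, out = 0, []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry

# 7 + 6 = 13 : 0111 + 0110 = 1101
print(ripple_add([1, 1, 1, 0], [0, 1, 1, 0]))    # ([1, 0, 1, 1], 0)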
Ripple Carry Adder

All 2n input bits available at the same time


Carries propagate from the FA in position 0 (with inputs
x0 and y0) to position i before that position produces
correct sum and carry-out bits
Carries ripple through all n FAs before we can claim that
the sum outputs are correct and may be used in further
calculations
Ripple Carry Adder

Ripple-carry adders can be slow


Delay proportional to number of bits
32-bit Ripple-Carry Adder

    c32  c31 . . . c2  c1  (c0 = 0)
         a31 . . . a2  a1  a0
  +      b31 . . . b2  b1  b0
  =      s31 . . . s2  s1  s0        (c32 is discarded)

FA0 adds a0, b0, c0 and produces s0 and c1; FA1 adds a1, b1, c1; … ; FA31 adds a31, b31, c31 and produces s31 and c32.
How Fast is Ripple-Carry Adder?
 Longest delay path (critical path) runs from (a0, b0) to
sum31.
 Suppose delay of full-adder is 100ps.
 Critical path delay = 3,200ps
 Clock rate cannot be higher than 1/(3,200×10^-12) Hz ≈ 312 MHz.
 Must use more efficient ways to handle carry.
Speeding Up the Adder

[Carry-select scheme] a0–a15 and b0–b15 feed a 16-bit ripple-carry adder (c0 = 0) producing s0–s15. a16–a31 and b16–b31 feed two more 16-bit ripple-carry adders, one with carry-in 0 and one with carry-in 1; a multiplexer, steered by the carry out of the low adder, selects which set of results becomes s16–s31.
Binary Subtractor
M = 1subtractor ; M = 0adder
Concept of Fast Adders

Carry lookahead adders


Eliminate the delay of ripple-carry adders
Carry-ins are generated independently
C0 = A0·B0
C1 = A0·B0·A1 + A0·B0·B1 + A1·B1
. . .
Requires complex circuits
Carry Propagation Issue
 Because the propagation delay will affect the output signals
on different time, so the signals are given enough time to get
the precise and stable outputs.
 The most widely used technique employs the principle of
carry look-ahead to improve the speed of the algorithm.
Carry-Look-Ahead Adders

 Objective - generate all incoming carries in parallel


 Feasible - carries depend only on xn-1, xn-2, ..., x0 and yn-1, yn-2, …, y0 - information available to all stages for calculating the incoming carry and sum bit
 Requires large number of inputs to each stage of adder -
impractical
 Number of inputs at each stage can be reduced - find out
from inputs whether new carries will be generated and
whether they will be propagated
Carry Propagation
 If Ai = Bi = 1 - a carry-out is generated regardless of the incoming carry - no additional information needed
 If Ai,Bi = 10 or Ai,Bi = 01 - the incoming carry is propagated
 If Ai = Bi = 0 - no carry propagation
 Gi = Ai·Bi - generated carry ;
 Pi = Ai ⊕ Bi - propagated carry
 Ci+1 = Ai·Bi + Ci·(Ai ⊕ Bi) = Gi + Ci·Pi
Carry Propagation
 Gi = Ai·Bi - generated carry ;
 Pi = Ai ⊕ Bi - propagated carry
 Ci+1 = Ai·Bi + Ci·(Ai ⊕ Bi) = Gi + Ci·Pi
 Substituting Ci = Gi-1 + Ci-1·Pi-1:
 Ci+1 = Gi + Gi-1·Pi + Ci-1·Pi-1·Pi
 All carries can be calculated in parallel from xn-1, xn-2, ..., x0, yn-1, yn-2, …, y0, and the forced carry c0
Carry look-ahead generator

 C3 is propagated at the same time as C2 and C1.


Carry Propagation – 4bit Adder

PG: Group Propagate, GG: Group Generate


4-bit adder with carry lookahead
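A behavioral sketch of the 4-bit lookahead idea (Python, illustrative): Gi = Ai·Bi, Pi = Ai ⊕ Bi, and each carry is computed from G, P and c0 rather than waiting for a ripple. The loop below is sequential in software; in hardware the substituted expressions are evaluated in parallel.

def cla4(a, b, c0=0):
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    G = [p & q for p, q in zip(A, B)]            # generate
    P = [p ^ q for p, q in zip(A, B)]            # propagate
    c = [c0]
    for i in range(4):
        c.append(G[i] | (P[i] & c[i]))           # Ci+1 = Gi + Pi·Ci (expanded in hardware)
    s = [P[i] ^ c[i] for i in range(4)]
    return sum(bit << i for i, bit in enumerate(s)), c[4]

print(cla4(0b0111, 0b0110))   # (13, 0) : 7 + 6 = 13, no carry out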
Reference

Text Book:

Digital Design – M. Morris Mano, Third Edition

Chapter 4: Page No. 111 – 134


CS 322M Digital Logic & Computer Architecture

Sequential Circuits: Latch and Flip-Flop

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Sequential Circuit

[Block diagram] Inputs → Combinational circuit → Outputs; the combinational circuit also produces the Next state, which is held in Storage elements and fed back as the Present state; a Timing signal (clock) drives the storage.

Clock
a periodic external event (input)
Clock
S-R Latch
S-R Latch
S-R Latch with control input
D Flip-Flop

[D latch built from an S-R latch with control input C: D drives S, D' drives R]

S R C | Q  Q'                 D C | Q  Q'
0 0 1 | Q0 Q0'  Store         0 1 | 0  1
0 1 1 | 0  1    Reset         1 1 | 1  0
1 0 1 | 1  0    Set           X 0 | Q0 Q0'
1 1 1 | 1  1    Disallowed
X X 0 | Q0 Q0'  Store
D Flip-Flop

[D latch symbol: data input D, enable input E (control C), output Q]

The D flip flop stores data indefinitely, regardless of input D values, if C = 0

Forms basic storage element


Master-Slave D Flip-Flop

Consider two latches combined together


Only one C value active at a time
Output changes on low level of the clock
D Flip-Flop

Positive edge triggered

D  C | Q  Q'
0  ↑ | 0  1
1  ↑ | 1  0
X  0 | Q0 Q0'

[Flip-flop symbols: Hi-Lo (negative) edge and Lo-Hi (positive) edge clock inputs]


Positive edge-triggered D flip-flop
Positive & Negative edge-triggered D flip-flop

Lo-Hi edge Hi-Lo edge


J-K flip-flop

J K CLK | Q   Q'
0 0  1  | Q0  Q0'
0 1  1  | 0   1
1 0  1  | 1   0
1 1  1  | Q0' Q0
J-K flip-flop

Created from a D flip-flop:
J sets, K resets, J = K = 1 → invert (toggle) the output

J K CLK | Q   Q'
0 0  1  | Q0  Q0'
0 1  1  | 0   1
1 0  1  | 1   0
1 1  1  | Q0' Q0
Edge Triggered J-K flip-flop
Edge Triggered T flip-flop

C  T | Q   Q'
↑  0 | Q0  Q0'
↑  1 | Toggle
Asynchronous Input
Reference

Text Book:

Digital Design – M. Morris Mano, Third Edition

Chapter 5: Page No. 167 – 179


CS 322M Digital Logic & Computer Architecture

Sequential Circuits

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Sequential Circuit: Analysis

[Circuit] Input x and the flip-flop outputs Q1, Q0 feed combinational logic that produces D0, D1 and the output y; two D flip-flops, clocked by Clk, hold Q0 and Q1.

y(t) = x(t)Q1(t)Q0(t)
Q0(t+1) = D0(t) = x(t)Q1(t)
Q1(t+1) = D1(t) = x(t) + Q0(t)
Sequential Circuit: Analysis
State Table

Present      Next State       Output
State       x=0     x=1      x=0  x=1
 00          00      10       0    0
 01          10      10       0    0
 10          00      11       0    0
 11          10      11       0    1

(Present state = Q1(t) Q0(t); next state = Q1(t+1) Q0(t+1))

y(t) = x(t)Q1(t)Q0(t)
Q0(t+1) = D0(t) = x(t)Q1(t)
Q1(t+1) = D1(t) = x(t) + Q0(t)
Sequential Circuit: Analysis
State Table and State Diagram

Let:
s0 = 00, s1 = 01, s2 = 10, s3 = 11

Present      Next State       Output
State       x=0     x=1      x=0  x=1
 s0          s0      s2       0    0
 s1          s2      s2       0    0
 s2          s0      s3       0    0
 s3          s2      s3       0    1

[State diagram: states S0–S3 with arcs labelled input/output according to the table, e.g. S2 –1/0→ S3 and S3 –1/1→ S3]
Sequential Circuit: Example
Mealy vs Moore
Mealy Model

Inputs → Input Logic (combinational) → Memory Element → Output Logic (combinational) → Outputs; the inputs also feed the output logic directly, so the outputs depend on the present state and the present inputs.

Moore Model

Inputs → Input Logic (combinational) → Memory Element → Output Logic (combinational) → Outputs; only the memory element feeds the output logic, so the outputs depend on the present state alone.
Design Example

Design a circuit that detects three or more


consecutive 1’s in a String of bits coming through
an Input line

Input:
0100011011101111110
Output:
0000000000100011110
Design Example
0

Design a circuit that detects three


or more consecutive 1’s in a
String of bits coming through an
Input line

Input:
0100011011101111110
Output:
0000000000100011110
Design Example

State Assignment:  S0: 00,  S1: 01,  S2: 10,  S3: 11

Present State  Input | Next State | Output        D flip-flop excitation:
    A  B         x   |   A  B     |   y           Q(t)  Q(t+1) | D
    0  0         0   |   0  0     |   0             0     0    | 0
    0  0         1   |   0  1     |   0             0     1    | 1
    0  1         0   |   0  0     |   0             1     0    | 0
    0  1         1   |   1  0     |   0             1     1    | 1
    1  0         0   |   0  0     |   0
    1  0         1   |   1  1     |   0
    1  1         0   |   0  0     |   1
    1  1         1   |   1  1     |   1

A(t+1) = DA(A,B,x) = ∑(3,5,7)
B(t+1) = DB(A,B,x) = ∑(1,5,7)
y(A,B,x) = ∑(6,7)
Design Example: with D flip-flop

A(t+1) = DA(A,B,x) = ∑(3,5,7)


B(t+1) = DB(A,B,x) = ∑(1,5,7)
y(A,B,x) = ∑(6,7)
Logic Diagram with D flip-flop

DA = Ax + Bx
DB = Ax + B’x
Y = AB
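Simulating the designed equations (DA = Ax + Bx, DB = Ax + B'x, y = AB) confirms the behaviour; below is a Python sketch (illustrative) run on the example input string. Note that y here depends only on the stored state, so it goes high in the clock cycle after the third consecutive 1 has been seen.

def detect(bits):
    A = B = 0
    out = []
    for ch in bits:
        x = int(ch)
        out.append(A & B)                        # y = AB (present state only)
        A, B = x & (A | B), x & (A | (1 - B))    # DA = Ax + Bx, DB = Ax + B'x
    return ''.join(map(str, out))

print(detect("0100011011101111110"))             # -> 0000000000010001111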
Implementation by other flip-flop
Characteristic Table
 D | Q(t+1)      T | Q(t+1)      J K | Q(t+1)
 0 |   0         0 |  Q(t)       0 0 |  Q(t)
 1 |   1         1 |  Q'(t)      0 1 |   0
                                 1 0 |   1
                                 1 1 |  Q'(t)
Excitation Table

 Q(t) Q(t+1) | D      Q(t) Q(t+1) | T      Q(t) Q(t+1) | J K
  0     0    | 0       0     0    | 0       0     0    |
  0     1    | 1       0     1    | 1       0     1    |
  1     0    | 0       1     0    | 1       1     0    |
  1     1    | 1       1     1    | 0       1     1    |
Implementation by other flip-flop
Characteristic Table
 D | Q(t+1)      T | Q(t+1)      J K | Q(t+1)
 0 |   0         0 |  Q(t)       0 0 |  Q(t)
 1 |   1         1 |  Q'(t)      0 1 |   0
                                 1 0 |   1
                                 1 1 |  Q'(t)
Excitation Table

 Q(t) Q(t+1) | D      Q(t) Q(t+1) | T      Q(t) Q(t+1) | J K
  0     0    | 0       0     0    | 0       0     0    | 0 X
  0     1    | 1       0     1    | 1       0     1    | 1 X
  1     0    | 0       1     0    | 1       1     0    | X 1
  1     1    | 1       1     1    | 0       1     1    | X 0
Design: with J-K Flip-Flop
Design: with J-K Flip-Flop
Design: with J-K Flip-Flop
Design: with J-K Flip-Flop
Reference

Text Book:

Digital Design – M. Morris Mano, Third Edition

Chapter 5: Page No. 168 – 190


Page No. 203 – 211
CS 322M Digital Logic & Computer Architecture

Counters and Registers

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Counter Design: T Flip-Flop
3-bit counter
Counter Design: T Flip-Flop
Counter Design: T Flip-Flop
Counter
n-bit counter: Range: 0 to 2^n − 1

Decade counter: counts 0 through 9 (mod-10)

Mod-m counter: counts 0 through m − 1
Binary Ripple Counter

• Reset signal sets all outputs


to 0
• Count signal toggles output
of low-order flip flop
• Low-order flip flop provides
trigger for adjacent flip flop
• Not all flops change value
simultaneously
– Lower-order flops change first
Asynchronous Ripple Counter
Synchronous Counter
Synchronous Up/Down Counter

Up Down Function
1 X count up
0 1 count down
0 0 no change

Function Table
Counter with Parallel Load

Clear Clk Load Count Function


0 X X X Clear to 0
1 ↑ 1 X Load inputs
1 ↑ 0 1 Count
1 ↑ 0 0 No Change

Function Table
Register with Parallel Load
Register with Parallel Load
Shift Register
Serial Transfer

Time    Reg A    Reg B
 T0     1011     0011
 T1     1101     1001
 T2     1110     1100
 T3     0111     0110
 T4     1011     1011
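The table above can be reproduced with a small simulation (Python, illustrative): on each clock, both 4-bit registers shift right, and register A's serial output (its LSB) is fed back into A's MSB and into B's MSB.

def shift_once(a, b):
    out_a = a & 1                     # A's serial output
    a = (out_a << 3) | (a >> 1)       # A rotates right
    b = (out_a << 3) | (b >> 1)       # B shifts right, taking A's output
    return a, b

a, b = 0b1011, 0b0011                 # T0
for t in range(4):
    a, b = shift_once(a, b)
    print(f"T{t+1}: A={a:04b} B={b:04b}")
# T1: A=1101 B=1001   T2: A=1110 B=1100   T3: A=0111 B=0110   T4: A=1011 B=1011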
Universal Shift Register
Reference

Text Book:

Digital Design – M. Morris Mano, Third Edition

Chapter 6: Page No. 217 – 244


CS 322M Digital Logic & Computer Architecture

Computer Fundamentals

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Model of Computer

Why do we use a computer?

How does a computer work?

-- Model of a computer
Computer Model
Computer Model
Algorithm: Procedure/Method to achieve
desired result

Computer Program:
- Set of Instructions
- Executes in Sequence

Body --- Hardware

Life --- Software


• Operating System, Compiler, editor,
other tools
Computer Programming Languages

High Level Language: User Readable and understandable


( C, Pascal, Java, Cobol……)

Assembly Language: (instruction as: add, mov, mul, div, etc…)

Machine Language: sequence of 0s & 1s


Von Neumann Principle
• Stored Program concept
• Main memory storing programs and data
• ALU operating on binary data
• Control unit interpreting instructions from
memory and executing
• Input and output equipment operated by
control unit
• Princeton Institute for Advanced Studies
– IAS
• Completed 1952
Structure of Von Neumann machine
Computer : Structure & Function
• Structure is the way in which components
relate to each other
• Function is the operation of individual
components as part of the structure
Function
• All computer functions are:
– Data processing
– Data storage
– Data movement
– Control
Functional View
Operations (a) Data movement
Operations (b) Storage
Operation (c) Processing from/to storage
Operation (d) Processing from storage to I/O
Structure - Top Level

[Top-level structure] The Computer comprises the Central Processing Unit, Main Memory, Input/Output, and the Systems Interconnection that ties them together; peripherals and communication lines attach from outside.
Structure - The CPU

[CPU structure] The CPU comprises Registers, the Arithmetic and Logic Unit, the Control Unit, and an Internal CPU Interconnection; it connects to the rest of the computer (I/O, memory) over the System Bus.
Structure - The Control Unit

[Control Unit structure] Within the CPU, the Control Unit comprises Sequencing Logic, Control Unit Registers and Decoders, and Control Memory, connected to the ALU and registers over the Internal Bus.
Von Neumann Principle
• Stored Program concept
• Main memory storing programs and data
• ALU operating on binary data
• Control unit interpreting instructions from
memory and executing
• Input and output equipment operated by
control unit
• Princeton Institute for Advanced Studies
– IAS
• Completed 1952
Structure of Von Neumann machine
CPU Internal Structure
What is a program?

• A sequence of steps (instructions)


• For each step, an arithmetic or logical
operation is done
• For each operation, a different set of control
signals is needed
Instruction Cycle

• Two steps:
– Fetch
– Execute
Instruction Cycle with Indirect
Structure of Von Neumann machine
Computer Components: Top Level View
CPU With System Bus
Reference

Computer Organization and Architecture –


Designing for Performance
William Stallings, Seventh Edition

Chapter 1: Page no. 7 – 15


CS 322M Digital Logic & Computer Architecture

Computer Fundamentals

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Von Neumann Principle
• Stored Program concept
• Main memory storing programs and data
• ALU operating on binary data
• Control unit interpreting instructions from
memory and executing
• Input and output equipment operated by
control unit
• Princeton Institute for Advanced Studies
– IAS
• Completed 1952
Structure of Von Neumann machine
CPU Internal Structure
What is a program?

• A sequence of steps (instructions)


• For each step, an arithmetic or logical
operation is done
• For each operation, a different set of control
signals is needed
Instruction Cycle

• Two steps:
– Fetch
– Execute
Instruction Cycle with Indirect
Structure of Von Neumann machine
Computer Components: Top Level View
CPU With System Bus
What is a Bus?
• A communication pathway connecting two
or more devices
• Usually broadcast
• Often grouped
– A number of channels in one bus
– e.g. 32 bit data bus is 32 separate single bit
channels
Advantage of Bus System

• Multiple devices may be connected to a bus


• Transfer data between devices through bus
Data Bus

• Carries data
– Remember that there is no difference between
“data” and “instruction” at this level
• Width is a key determinant of performance
– 8, 16, 32, 64 bit
Address bus

• Identify the source or destination of data


• e.g. CPU needs to read an instruction
(data) from a given location in memory
• Bus width determines maximum memory
capacity of system
– e.g. 8080 has 16 bit address bus giving 64k
address space
Address and Data of a Memory Location

Address of a Memory Location


Data of a memory Location
Address and Data of a Memory Location

Byte organized memory


Control Bus

• Control and timing information


– Memory read/write signal
– Interrupt request
– Clock signals
Interface between CPU and Memory

MAR: Memory Address Register


MBR: Memory Buffer Register

Data movement: Memory and CPU register


Interface between CPU and Memory

IR: Instruction Register


PC: Program Counter

Data movement: Between CPU registers


Fetch Cycle
• Program Counter (PC) holds address of
next instruction to fetch
• Processor fetches instruction from memory
location pointed to by PC
• Increment PC
– Unless told otherwise
• Instruction loaded into Instruction Register
(IR)
• Processor interprets instruction and
performs required actions
Indirect Cycle

• May require memory access to fetch


operands
• Indirect addressing requires more memory
accesses
• Can be thought of as additional instruction
subcycle
Execute Cycle
• Processor-memory
– data transfer between CPU and main memory
• Processor I/O
– Data transfer between CPU and I/O module
• Data processing
– Some arithmetic or logical operation on data
• Control
– Alteration of sequence of operations
– e.g. jump
• Combination of above
Instruction Cycle with Interrupts
Reference

Computer Organization and Architecture –


Designing for Performance
William Stallings, Seventh Edition

Chapter 3: Page no. 56 – 79


CS 322M Digital Logic & Computer Architecture

Computer Fundamentals

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
What is a program?

• A sequence of steps (instructions)


• For each step, an arithmetic or logical
operation is done
• For each operation, a different set of control
signals is needed
Instruction Cycle

• Two steps:
– Fetch
– Execute
Computer Components: Top Level View
Example of Program Execution
Data Bus and Address Bus
• Size of Address Bus:
SIZE  BINARY                              DEC        HEXA
 8    0000 0000                           0          00
 8    1111 1111                           255        FF
 8    0101 0111                           87         57
 8    0000 0110                           6          06
10    11 1111 1111                        1023       3FF
12    1111 1111 1111                      4095       FFF
16    1111 1111 1111 1111                 2^16 − 1   FFFF
20    1111 1111 1111 1111 1111            2^20 − 1   FFFFF
30    11 ………………………….. 1111                2^30 − 1   3FFFFFFF
32    1111 ………………………… 1111                2^32 − 1   FFFFFFFF
Data Bus and Address Bus
• Size of Address Bus and Memory Capacity:

SIZE  BINARY                              DEC        HEXA       Size
 8    0000 0000                           0          00
 8    1111 1111                           255        FF         256
10    11 1111 1111                        1023       3FF        1K
12    1111 1111 1111                      4095       FFF        4K
16    1111 1111 1111 1111                 2^16 − 1   FFFF       64K
20    1111 1111 1111 1111 1111            2^20 − 1   FFFFF      1M
30    11 ………………………….. 1111                2^30 − 1   3FFFFFFF   1G
32    1111 ……………………..… 1111               2^32 − 1   FFFFFFFF   4G
Data Bus and Address Bus
• Size of Data Bus/Memory Location:
SIZE  BINARY                                  DEC                          HEXA
 8    1111 1111 … 0111 1111                   −127 … +127                  00 – FF
12    1111 1111 1111 … 0111 1111 1111         −2047 … +2047                000 – FFF
16    1111 1111 1111 1111 …                   −(2^15 − 1) … +(2^15 − 1)    0000 – FFFF
20    (20-bit patterns)                       −(2^19 − 1) … +(2^19 − 1)    00000 – FFFFF
32    (32-bit patterns)                       −(2^31 − 1) … +(2^31 − 1)    00000000 – FFFFFFFF
Example of Program Execution
CPU Organization

Fetch Cycle:

MAR <- PC
Read
PC <- PC+1
IR <- MBR
Example of Program Execution
Instruction Execution

Fetch Cycle:        Format of Instruction:      Execute Cycle:
MAR <- PC           4 bits: Operation           MAR <- IRAddress
Read                12 bits: Address            Read
PC <- PC+1                                      AC <- MBR
IR <- MBR                                       (Data Movement)
Machine Instruction
Machine Instruction Format Assembly
Instruction Operation Address Code
1940 0001 1001 0100 0000 LDA M
5941 0101 1001 0100 0001 ADD M
2941 0010 1001 0100 0001 STA M

(LDA M) LOAD AC: Load the accumulator by the contents of memory location
specified in the instruction

(ADD M) ADD AC: Add the contents of memory location specified in the
instruction to accumulator and store the result in accumulator

(STA M) STORE AC: Store the contents of the accumulator in the memory location specified in the instruction
Computer Program
High Level Code Assembly Code Machine Code (HEX)
Y=X+Y LDA X 1940
ADD Y 5941
STA Y 2941

Size of Operation Code (Op Code): 4 bits

16 possible instructions; used here: 1: LDA M, 5: ADD M, 2: STA M

Size of Address Bus: 12 bits

Addressable Memory Locations: 2^12 = 4096 = 4K

Size of Data Bus: 16 bits

Size of each location of memory: 16 bits

Size of Memory Module: 4096 × 16 bits = 4096 × 2 × 8 bits = 8 KB (Kilo Bytes)
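A minimal sketch (Python, illustrative) of this hypothetical accumulator machine: 16-bit words, a 4-bit opcode plus a 12-bit address, opcodes 1 = LDA M, 5 = ADD M, 2 = STA M, with the example program and data placed at the textbook's example addresses.

def run(start, memory, steps):
    pc, ac = start, 0
    for _ in range(steps):
        ir = memory[pc]; pc += 1                      # fetch
        op, addr = ir >> 12, ir & 0xFFF               # decode: 4-bit opcode, 12-bit address
        if op == 0x1:   ac = memory[addr]             # LDA M
        elif op == 0x5: ac = (ac + memory[addr]) & 0xFFFF   # ADD M
        elif op == 0x2: memory[addr] = ac             # STA M
    return memory

mem = {0x300: 0x1940, 0x301: 0x5941, 0x302: 0x2941,   # LDA X; ADD Y; STA Y
       0x940: 3, 0x941: 2}                            # X = 3, Y = 2
print(run(0x300, mem, 3)[0x941])                      # Y = X + Y = 5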


Reference

Computer Organization and Architecture –


Designing for Performance
William Stallings, Seventh Edition

Chapter 3: Page no. 59 – 64


CS 322M Digital Logic & Computer Architecture

CPU Registers

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Components of Computer
• The Control Unit and the Arithmetic and Logic
Unit constitute the Central Processing Unit (CPU)
• CPU has temporary storage space - Registers
• Data and instructions need to get into the system
and results out
– Input/output
• Temporary storage of code and results is needed
– Main memory
Connecting

• All the units must be connected


• Different type of connection for different
type of unit
– Memory
– Input/Output
– CPU
Memory Connection

• Receives and sends data


• Receives addresses (of locations)
• Receives control signals
– Read
– Write
– Timing
Input/Output Connection

• Similar to memory from computer’s


viewpoint
• Output
– Receive data from computer
– Send data to peripheral
• Input
– Receive data from peripheral
– Send data to computer
Computer Components: Top Level View
CPU Internal Structure
ALU
• ALU: Arithmetic and Logic Unit
Registers

• CPU must have some working space


(temporary storage)
• Called registers
• Number and function vary between
processor designs
• One of the major design decisions
• Top level of memory hierarchy
User Visible Registers

• General Purpose
• Data
• Address
• Condition Codes
How Many GP Registers?
• Between 8 - 32
• Fewer = more memory references
• More does not reduce memory references and
takes up processor space
• Large enough to hold full address
• Large enough to hold full word
• Often possible to combine two data registers
– C programming
– double int a;
– long int a;
Control & Status Registers

• Program Counter
• Instruction Decoding Register
• Memory Address Register
• Memory Buffer Register
Condition Code Registers

• Sets of individual bits


– e.g. result of last operation was zero
• Can be read (implicitly) by programs
– e.g. Jump if zero
• Can not (usually) be set by programs
• Needs for conditional instructions
Condition Code Registers

• Sets of individual bits


– e.g. result of last operation was zero
For example:

for (i=10; i > 0; i--)


a[i] = a[i]+10;
next instruction
Program Status Word
• A set of bits (Flag bits)
• Includes Condition Codes
• Sign of last result
• Zero
• Carry
• Equal
• Overflow
• Interrupt enable/disable
• Supervisor
Program Status Word
• Condition Flag Bits (Depends on ALU
Operation)
– Sign, Zero, Carry, Equal, Overflow
• Flag bits set by programmer
– Interrupt enable/disable
– Supervisor

X Sup IE OV E C Z S
Reference

Computer Organization and Architecture –


Designing for Performance
William Stallings, Seventh Edition

Chapter 12: Page no. 416 – 423


CS 322M Digital Logic & Computer Architecture

Instruction Sets: Characteristics and


Functions

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
What is an Instruction Set?
• The complete collection of instructions that are
understood by a CPU
• Format of the instruction
• Machine Code
– Binary
• Usually represented by assembly codes

High Level Code Assembly Code Machine Code in HEX (Binary)


Y=X+Y LDA 940 1940 (0001 1001 0100 0000)
ADD 941 5941 (0101 1001 0100 0001)
STA 941 2941 (0010 1001 0100 0001)
What is an Instruction Set?
• Instruction set of a CPU with opcode size of 4 bits.
• CPU has three GPRs: AC, R0 and R1

OPCODE Operation OPCODE Operation

0000 NO OPERATION (NO-OP) 1000 MOV R0, AC (R0 = AC)

0001 LDA M (AC = [M]) 1001 MOV R1, AC (R1 = AC)

0010 STA M ( [M] = AC) 1010 MOV AC, R0 (AC = R0)

0011 DEC AC (AC=AC-1) 1011 MOV AC, R1 (AC = R1)

0100 INC AC (AC=AC+1) 1100 SUB M (AC = AC-[M])

0101 ADD M (AC=AC+[M]) 1101 SUB R0 (AC = AC-R0)

0110 ADD R0 (AC = AC+R0) 1110 SUB R1 (AC = AC-R1)

0111 ADD R1 (AC = AC+R1) 1111 HALT


What is an Instruction Set?
• Instruction set of a CPU with opcode size of 4 bits.
• CPU has three GPRs: AC, R0 and R1

OPCODE Operation
1000 MOV R0, AC (R0 = AC)

1001 MOV R1, AC (R1 = AC)

1010 MOV AC, R0 (AC = R0)

1011 MOV AC, R1 (AC = R1)


Elements of an Instruction
• Operation code (Op code)
– Do this
• Source Operand reference
– To this
• Result Operand reference
– Put the answer here
• Next Instruction Reference
– When you have done that, do this...
Instruction Representation
• In machine code each instruction has a
unique bit pattern
• For human consumption (well,
programmers anyway) a symbolic
representation is used
– e.g. ADD, SUB, LOAD
• Operands can also be represented in this
way
– ADD A,B
Simple Instruction Format
Instruction Types
• Data processing
• Data storage (main memory)
• Data movement (I/O)
• Program flow control
Number of Addresses (a)
• 3 addresses
– Operand 1, Operand 2, Result (2 sources,
one destination)
– a = b + c;
– ADD a, b, c
– May be a fourth - the next instruction (usually implicit)
– Not common
– Needs very long words to hold everything
Number of Addresses (b)
• 2 addresses
– One address doubles as operand and result
(source as well as destination)
–a=a+b
– ADD a, b
– Reduces length of instruction
– Requires some extra work
• Temporary storage to hold some results
Number of Addresses (c)
• 1 address
– Implicit address of one operand
– Usually a register (accumulator)
– Common on early machines
Number of Addresses
Number of Addresses (d)
• 0 (zero) addresses
– All addresses implicit
– Uses a stack
– e.g. push a
– push b
– add
– pop c

–c=a+b
How Many Addresses
• More addresses
– More complex (powerful?) instructions
– More registers
• Inter-register operations are quicker
– Fewer instructions per program
• Fewer addresses
– Less complex (powerful?) instructions
– More instructions per program
– Faster fetch/execution of instructions
Design Decisions (1)
• Operation repertoire
– How many ops?
– What can they do?
– How complex are they?
• Data types
• Instruction formats
– Length of op code field
– Number of addresses
Design Decisions (2)
• Registers
– Number of CPU registers available
– Which operations can be performed on which
registers?
• Addressing modes
Types of Operand
• Addresses
• Numbers
– Integer/floating point
• Characters
– ASCII etc.
• Logical Data
– Bits or flags
Specific Data Types
• General - arbitrary binary contents
• Integer - single binary value
• Ordinal - unsigned integer
• Unpacked BCD - One digit per byte
• Packed BCD - 2 BCD digits per byte
• Near Pointer - offset within segment
• Bit field
• Byte String
• Floating Point
Integer Representation
• Only have 0 & 1 to represent everything
• Positive numbers stored in binary
– e.g. 41=00101001
• Sign-Magnitude
• Two’s complement
Floating Point

• ± .significand × 2^exponent


• Misnomer
• Point is actually fixed between sign bit and body of
mantissa
• Exponent indicates place value (point position)
• IEEE 754 single format
• IEEE 754 double format (64bits = 1+11+52)
Types of Operation

• Data Transfer
• Arithmetic
• Logical
• Conversion
• I/O
• System Control
• Transfer of Control
Data Transfer
• Specify
– Source
– Destination
– Amount of data
• May be different instructions for different
movements
– e.g. IBM 370
• Or one instruction and different addresses
– e.g. VAX
Arithmetic
• Add, Subtract, Multiply, Divide
• Signed Integer
• Floating point
• May include
– Increment (a++)
– Decrement (a--)
– Negate (-a)
Shift and Rotate Operations
SHR

SHL

ASR

ASL

ROR

ROL
Logical
• Bitwise operations
• AND, OR, NOT
Input/Output
• May be specific instructions
• May be done using data movement
instructions (memory mapped)
• May be done by a separate controller
(DMA)
Systems Control
• Privileged instructions
• CPU needs to be in specific state
– Kernel mode
• For operating systems use
Transfer of Control
• Branch
– e.g. branch to x if result is zero
• Skip
– e.g. increment and skip if zero
• Conditional Instruction
– ISZ Register1
– Branch xxxx
– BNZ xxxx
– BP xxxx
• Subroutine call
– interrupt call
Branch Instruction
Nested Procedure Calls
Reference

Computer Organization and Architecture –


Designing for Performance
William Stallings, Seventh Edition

Chapter 10: Page no. 335 – 359


CS 322M Digital Logic & Computer Architecture

Instruction Sets: Addressing Modes


and Formats

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Instructions
• Instruction Set
• Format of Instructions
– Single Operand
– Two Operands
– Three Operands
Addressing Modes
• Immediate
• Direct
• Indirect
• Register
• Register Indirect
• Displacement (Indexed)
• Stack
Immediate Addressing
• Operand is part of instruction
• Operand = address field
• e.g. ADD 5
– Add 5 to contents of accumulator
– 5 is operand
• No memory reference to fetch data
• Fast
• Limited range
Immediate Addressing Diagram

Instruction
Opcode Operand
Direct Addressing
• Address field contains address of operand
• Effective address (EA) = address field (A)
• e.g. ADD A
– Add contents of cell A to accumulator
– Look in memory at address A for operand
• Single memory reference to access data
• No additional calculations to work out
effective address
• Limited address space
Direct Addressing Diagram

Instruction
Opcode Address A
Memory

Operand
Indirect Addressing

• Memory cell pointed to by address field


contains the address of (pointer to) the
operand
• EA = (A)
– Look in A, find address (A) and look there for
operand
• e.g. ADD (A)
– Add contents of cell pointed to by contents of
A to accumulator
Indirect Addressing
• Large address space
• May be nested, multilevel, cascaded
– e.g. EA = (((A)))
• Multiple memory accesses to find operand
• Hence slower
Indirect Addressing Diagram
Instruction
Opcode Address A
Memory

Pointer to operand

Operand
Register Addressing
• Operand is held in the register named in the address field
• EA = R
• Limited number of registers
• Very small address field needed
– Shorter instructions
– Faster instruction execution
Register Addressing
• No memory access
• Very fast execution
• Very limited address space
Register Addressing Diagram

Instruction
Opcode Register Address R
Registers

Operand
Register Indirect Addressing
• Indirect addressing
• EA = (R)
• Operand is in memory cell pointed to by
contents of register R
• One fewer memory access than indirect
addressing
Register Indirect Addressing Diagram

Instruction
Opcode Register Address R
Memory

Registers

Pointer to Operand Operand


Displacement Addressing
• EA = A + (R)
• Address field hold two values
– A = base value
– R = register that holds displacement
– or vice versa
Displacement Addressing Diagram

Instruction
Opcode Register R Address A
Memory

Registers

Pointer to Operand + Operand


Relative Addressing
• A version of displacement addressing
• R = Program counter, PC
• EA = A + (PC)
• i.e. get operand from A cells from current
location pointed to by PC
• c.f locality of reference & cache usage
Base-Register Addressing
• A holds displacement
• R holds pointer to base address
• R may be explicit or implicit
• e.g. segment registers in 80x86
Indexed Addressing
• A = base
• R = displacement
• EA = A + R
• Good for accessing arrays
– EA = A + R
– R++
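A toy effective-address sketch (Python; the memory contents and register values are made up for illustration) for a few of the modes above:

memory    = {100: 200, 200: 7, 104: 55}
registers = {'R1': 200, 'PC': 104}

def ea_direct(a):             return a                   # EA = A
def ea_indirect(a):           return memory[a]           # EA = (A)
def ea_register_indirect(r):  return registers[r]        # EA = (R)
def ea_displacement(a, r):    return a + registers[r]    # EA = A + (R)

print(memory[ea_direct(100)])              # 200
print(memory[ea_indirect(100)])            # 7  (location 100 holds the address 200)
print(memory[ea_register_indirect('R1')])  # 7
print(ea_displacement(4, 'PC'))            # 108 (relative addressing: EA = A + (PC))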
Stack Addressing
• Operand is (implicitly) on top of stack
• e.g.
– ADD Pop top two items from stack
and add
Instruction Formats

• Layout of bits in an instruction


• Includes opcode
• Includes (implicit or explicit) operand(s)
• Usually more than one instruction format in
an instruction set
Instruction Length
• Affected by and affects:
– Memory size
– Memory organization
– Bus structure
Allocation of Bits

• Number of addressing modes


• Number of operands
• Register versus memory
• Number of registers
Instruction Formats
• MOV R1, M : R1<- [M]
– Instruction format: one word
Instruction Formats
• MOV R1, M : R1<- [M]
– Instruction format: one word

– RTL description of instruction cycle


Instruction Fetch:       Execute:
MAR <- PC                MAR <- IRAddress
Read                     Read
PC <- PC+1               R1 <- MBR
IR <- MBR
Instruction Formats
• MOV R1, M : R1<- [M]
– Instruction format: two words, first word is the
opcode and second word is the address of the
operand.
Instruction Formats
• MOV R1, M : R1<- [M]
– Instruction format: two words, first word is the
opcode and second word is the address of the
operand.
– RTL description
Instruction Fetch:       Execute:
MAR <- PC                MAR <- PC
Read                     Read
PC <- PC+1               PC <- PC+1
IR <- MBR                R1 <- MBR
Reference

Computer Organization and Architecture –


Designing for Performance
William Stallings, Seventh Edition

Chapter 11: Page no. 386 – 398


CS 322M Digital Logic & Computer Architecture

8085 Microprocessor

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Intel 8085 CPU Block Diagram
8085 Microprocessor
• Data Bus: 8 bits
– Size of register: 8 bits
– can handle 16 bits
• Address Bus: 16 bits
– memory is byte organized
– Memory size: 64 KB (216)
• Data bus and address bus are multiplexed
Intel 8085 Pin Configuration
Instruction Fetch Cycle
Memory Read and Write Cycle
8085 Instruction Format
• One Byte
– D7 D6 D5 D4 D3 D2 D1 D0 Byte 1
• Two Bytes
– D7 D6 D5 D4 D3 D2 D1 D0 Byte 1
D7 D6 D5 D4 D3 D2 D1 D0 Byte 2

• Three Bytes
D7 D6 D5 D4 D3 D2 D1 D0 Byte 1
– D7 D6 D5 D4 D3 D2 D1 D0 Byte 2
D7 D6 D5 D4 D3 D2 D1 D0 Byte 3
8085 Instruction Set

• RETURN (return)
PCL <- (SP)
PCH <- ((SP) + 1)
(SP) <- (SP) + 2
Instruction: CALL/RETURN
Nested Procedure Calls
Instruction Cycle
• Two steps:
—Fetch
—Execute
Instruction Cycle with Interrupts
Subroutine/Procedure/Function
• Independent unit of code to perform a
subtask of the main task.
• Used in modular programming
• How to provide facility for procedure call
Procedure Call
• Tasks to be performed before procedure
CALL
— Retain the current status of the processor
— After returning from procedure/interrupt
routine, we must restart the execution from
the point where we have stopped.
• Current status of the processor
— Program Counter
— Program Status Word (PSW)
• How to Retain these information
• Any other information need to be saved?
Provision in Organization
• Store the relevant information in main
memory
— Implement a stack in MM (Control Stack)
• Need to keep the address of TOP of stack
— Use of a register, SP: Stack Pointer
— To keep the address of the Top of the Stack
• After completion of the procedure, restore
the information from stack
Instructions
• PUSH R
— source is the register R
• POP R
— destination is the register R
• CALL address
— starting address of the procedure
• RETURN

• Four different ways for implementation


PUSH (Execute)
• PUSH Ri
— MAR <- SP
— MDR <- Ri
— Write
— SP <- SP - 1
POP (Execute)
• POP Ri
— SP <- SP +1
— MAR <- SP
— Read
— Ri <- MDR
CALL (Execute)
• CALL
— MAR <- SP
— MDR <- PC
— Write
— SP <- SP -1
— MAR <- SP
— MDR <- PSW
— Write
— SP <- SP -1
— PC <- IRaddress
CALL (Execute)
• CALL address
— MAR <- PC
— Read
— PC <- PC + 1
— TEMP <- MDR
— MAR <- SP
— MDR <- PC
— Write
— SP <- SP -1
— MAR <- SP
— MDR <- PSW
— Write
— SP <- SP -1
— PC <- TEMP
RETURN (Execute)
• RETURN
— SP <- SP + 1
— MAR <- SP
— Read
— PSW <- MDR
— SP <- SP + 1
— MAR <- SP
— Read
— PC <- MDR
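The PUSH/POP/CALL/RETURN register-transfer sequences above can be mimicked with a memory-resident stack. A Python sketch (illustrative; the initial SP and PC and the call target are made-up values) in which the stack grows toward lower addresses:

class Machine:
    def __init__(self):
        self.mem, self.sp, self.pc, self.psw = {}, 0x0FFF, 0x0100, 0

    def push(self, value):          # MAR <- SP; MDR <- value; Write; SP <- SP - 1
        self.mem[self.sp] = value
        self.sp -= 1

    def pop(self):                  # SP <- SP + 1; MAR <- SP; Read
        self.sp += 1
        return self.mem[self.sp]

    def call(self, target):         # save PC, then PSW, then jump
        self.push(self.pc)
        self.push(self.psw)
        self.pc = target

    def ret(self):                  # restore PSW, then PC (reverse order of CALL)
        self.psw = self.pop()
        self.pc = self.pop()

m = Machine()
m.call(0x0500)
print(hex(m.pc))    # 0x500
m.ret()
print(hex(m.pc))    # 0x100 : execution resumes where it left off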
Simple Interrupt
Processing
Changes in Memory and Registers
for an Interrupt
CS 322M Digital Logic & Computer Architecture

Main Memory

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Structure - Top Level

[Top-level structure] The Computer comprises the Central Processing Unit, Main Memory, Input/Output, and the Systems Interconnection that ties them together; peripherals and communication lines attach from outside.
Structure - The CPU

[CPU structure] The CPU comprises Registers, the Arithmetic and Logic Unit, the Control Unit, and an Internal CPU Interconnection; it connects to the rest of the computer (I/O, memory) over the System Bus.
CPU Internal Structure
Computer Components
Semiconductor Memory
• RAM (Random Access Memory)
– Misnamed as all semiconductor memory is
random access
– Read/Write
– Volatile
– Temporary storage
– Static or dynamic
Memory Cell Operation
Memory Cell
Memory Module
Memory Module
Parameters of Memory Module:
• Addressable Memory Space
– Size of Address Bus
• Size of Memory Location
– Size of Data Bus
• Memory Capacity
• Memory Organization
– Byte organized
Dynamic RAM
• Bits stored as charge in capacitors
• Charges leak
• Need refreshing even when powered
• Simpler construction
• Smaller per bit
• Less expensive
• Need refresh circuits
• Slower
• Main memory
• Essentially analogue
– Level of charge determines value
Dynamic RAM Structure
DRAM Operation
• Address line active when bit read or written
– Transistor switch closed (current flows)
• Write
– Voltage to bit line
• High for 1 low for 0
– Then signal address line
• Transfers charge to capacitor
• Read
– Address line selected
• transistor turns on
– Charge from capacitor fed via bit line to sense amplifier
• Compares with reference value to determine 0 or 1
– Capacitor charge must be restored
Static RAM
• Bits stored as on/off switches
• No charges to leak
• No refreshing needed when powered
• More complex construction
• Larger per bit
• More expensive
• Does not need refresh circuits
• Faster
• Cache
• Digital
– Uses flip-flops
Static RAM Structure

State 1
C1 high, C2 low
T1 T4 off, T2 T3 on

State 0
C1 low, C2 high
T1 T4 on, T2 T3 off
Static RAM Operation
• Transistor arrangement gives stable logic
state
• State 1
– C1 high, C2 low
– T1 T4 off, T2 T3 on
• State 0
– C2 high, C1 low
– T2 T3 off, T1 T4 on
• Address line transistors T5 T6
• Write – apply the value to line B and its complement to line B'
• Read – value is on line B
SRAM v DRAM
• Both volatile
– Power needed to preserve data
• Dynamic cell
– Simpler to build, smaller
– More dense
– Less expensive
– Needs refresh
– Larger memory units
• Static
– Faster
– Less dense
– Cache
Reference
Computer Organization and Architecture –
Designing for Performance
William Stallings, Seventh Edition

Chapter 05: Internal Memory

Computer Organization
Hamacher, Vranesic and Zaky, Fifth Edition

Chapter 05: Page No.: 291 - 314


CS 322M Digital Logic & Computer Architecture

Main Memory

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
SRAM v DRAM
• Both volatile
– Power needed to preserve data
• Dynamic cell
– Simpler to build, smaller
– More dense
– Less expensive
– Needs refresh
– Larger memory units
• Static
– Faster
– Less dense
– Cache
Read Only Memory (ROM)
• ROM
– Semiconductor memory
– Random Access
• Permanent storage
– Nonvolatile
• Microprogramming
• Library subroutines
• Systems programs (BIOS)
• Function tables
Types of ROM
• Written during manufacture
– ROM
• Programmable (once)
– PROM
– Needs special equipment to program
• Programmable (Read “mostly”)
– Erasable Programmable (EPROM)
• Erased by UV
– Electrically Erasable (EEPROM)
• Takes much longer to write than read
Organisation in detail
• Memory chip of 16Mbit = 16Mx1
• A 16Mbit chip can be organised as 1M of
16 bit words
• A bit-per-chip system uses 16 chips of 1 Mbit
each, with bit 1 of each word in chip 1, and
so on
Organisation in detail
• A 16Mbit chip can be organised as a 4096
x 4096 array
– Reduces number of address pins
• Multiplex row address and column address
• 12 pins to address (2^12 = 4096)
• Adding one more address pin doubles both the row
and column ranges, so capacity increases by a factor of 4
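
A small sketch of the row/column multiplexing, assuming the 4096 x 4096 organisation above: the 24-bit cell index is split into a 12-bit row and a 12-bit column that share the same 12 address pins (RAS phase, then CAS phase). The function name is mine.

# Split a cell index into the row and column addresses that are
# time-multiplexed over the same 12 pins (RAS phase, then CAS phase).
def split_row_col(cell_index, bits_per_half=12):
    row = (cell_index >> bits_per_half) & ((1 << bits_per_half) - 1)
    col = cell_index & ((1 << bits_per_half) - 1)
    return row, col

row, col = split_row_col(0xABC123)        # 24-bit index -> (0xABC, 0x123)
assert (row, col) == (0xABC, 0x123)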
Organisation in detail
Organisation in detail
Dynamic RAM Structure
Refreshing
• Refresh circuit included on chip
• Disable chip
• Count through rows
• Read & Write back
• Takes time
• Slows down apparent performance
Typical 16 Mb DRAM (4M x 4)
Memory Module Organisation
256Kbit memory chip
Memory Module Organisation
256KByte module organisation
Memory Module Organisation
1MByte Module Organisation
Memory Module Organization

Consider memory chip of capacity 1 MB (1Mx8)

- Construct a memory module of capacity 4MB (2Mx16)


(Size of Address bus and size of data bus)

- Construct a memory module of capacity 16 MB (8Mx16)


(Size of Address bus and size of data bus)

- Construct a memory module of capacity 32 MB (8Mx32)


(Size of Address bus and size of data bus)

- Construct a memory module of capacity 128 MB (32Mx32)


(Size of Address bus and size of data bus)
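
One way to work these exercises out (a hedged sketch, not the only presentation): the number of chips is the module capacity divided by the chip capacity, the data-bus width equals the word size of the module, and the address-bus width is log2 of the number of addressable words. The helper below is illustrative.

import math

# Module sizing from a 1M x 8 chip (illustrative helper; the parameters are
# the module's word count and word width).
CHIP_WORDS, CHIP_BITS = 1 * 2**20, 8

def module(words, bits_per_word):
    chips       = (words * bits_per_word) // (CHIP_WORDS * CHIP_BITS)
    address_bus = int(math.log2(words))        # bits needed to address one word
    data_bus    = bits_per_word
    return chips, address_bus, data_bus

print(module(2 * 2**20, 16))    # 4 MB as 2M x 16  -> (4 chips, 21, 16)
print(module(8 * 2**20, 16))    # 16 MB as 8M x 16 -> (16 chips, 23, 16)
print(module(32 * 2**20, 32))   # 128 MB as 32M x 32 -> (128 chips, 25, 32)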
Memory Module Organization

Consider memory chip of capacity 1 MB (1Mx8)

- Construct a memory module of capacity 8 MB (4Mx2)


(Size of Address bus and size of data bus)
Memory Module Organization

Consider memory chip of capacity 1 MB (1Mx8)

- Construct a memory module of capacity 8 MB (4Mx2)


- Byte organized
(Size of Address bus and size of data bus)
Computer Components: Top Level View
Fetch Sequence (symbolic)
• t1: MAR <- PC
• t2: MBR <- (memory)
• PC <- PC +1
• t3: IR <- MBR

– (tx = time unit/clock cycle)


– (Speed of CPU and Memory)
Memory Read(symbolic)
• t1: MAR <- R1
• t2: MBR <- (memory)
• t3: R2 <- MBR

– (tx = time unit/clock cycle)

Address of the memory location is in
register R1 and the data read from memory
is placed in register R2
Memory Write(symbolic)
• t1: MAR <- R1
• t2: MBR <- R2
• t3: (memory) <- MBR

– (tx = time unit/clock cycle)

Address of the memory location is in


register R1 and data is in register R2
Reference
Computer Organization and Architecture –
Designing for Performance
William Stallings, Seventh Edition

Chapter 05: Internal Memory

Computer Organization
Hamacher, Vranesic and Zaky, Fifth Edition

Chapter 05: Page No.: 291 - 314


CS 322M Digital Logic & Computer Architecture

Cache Memory

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
So you want fast?
• It is possible to build a computer which
uses only static RAM
• This would be very fast
• This would be very expensive

• Alternatives??
Locality of Reference
• During the course of the execution of a
program, memory references tend to
cluster
Locality of Reference
• During the course of the execution of a
program, memory references tend to
cluster

• e.g. loops
Cache
• Small amount of fast memory
• Sits between normal main memory and
CPU
• May be located on CPU chip or module
Memory Hierarchy
• Registers
• L1 Cache
• L2 Cache
• Main memory
• Disk cache
• Disk
• Optical
• Tape
Cache/Main Memory Structure
Cache operation – overview
• CPU requests contents of memory location
• Check cache for this data
• If present, get from cache (fast)
• If not present, read required block from
main memory to cache
• Then deliver from cache to CPU
• Cache includes tags to identify which block
of main memory is in each cache slot
Cache Read Operation - Flowchart
Cache Design

• Size
• Mapping Function
• Replacement Algorithm
• Write Policy
• Block Size
• Number of Caches
Size does matter
• Cost
– More cache is expensive
• Speed
– More cache is faster (up to a point)
– Checking cache for data takes time
Typical Cache Organization
Write Policy

• Must not overwrite a cache block unless


main memory is up to date
• Multiple CPUs may have individual caches
• I/O may address main memory directly
Write through
• All writes go to main memory as well as
cache
• Multiple CPUs can monitor main memory
traffic to keep local (to CPU) cache up to
date
• Lots of traffic
• Slows down writes

• Remember bogus write through caches!


Write back

• Updates initially made in cache only


• Update bit for cache slot is set when update
occurs
• If block is to be replaced, write to main
memory only if update bit is set
• N.B. 15% of memory references are writes
CS 322M Digital Logic & Computer Architecture

Cache Memory

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Cache Design
• Size
• Mapping Function
• Replacement Algorithm
• Write Policy
• Block Size
• Number of Caches
Mapping Function
• Cache of 64kByte
• Cache block of 16 bytes
– i.e. cache is 4k (2^12) lines of 16 bytes
• 16MBytes main memory
• 24 bit address
– (2^24 = 16M)
Direct Mapping
• Each block of main memory maps to only
one cache line
– i.e. if a block is in cache, it must be in one
specific place
• Address is in two parts
• Least Significant w bits identify unique word
• Most Significant s bits specify one memory
block
• The MSBs are split into a cache line field r
and a tag of s-r (most significant)
Direct Mapping Address Structure

Tag s-r Line or Slot r Word w


8 12 4

• 24 bit address
• 4 bit word identifier (16 byte block)
• 20 bit block identifier
– 8 bit tag (=20-12)
– 12 bit slot or line
• No two blocks in the same line have the same Tag field
• Check contents of cache by finding line and checking Tag
Direct Mapping Function

• Direct mapping function:


– i = j modulo m
• Where
– i = cache line number
– j = main memory block number
– m = number of lines in the cache
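
A short sketch of the direct-mapping address split, using the running example above (64 KB cache, 16-byte blocks, 24-bit addresses, so w = 4 word bits, r = 12 line bits and an 8-bit tag); the field widths come from the slides, the function name is mine.

# Direct mapping: word-within-block, cache line and tag from a 24-bit address.
W, R = 4, 12                      # word bits, line bits (tag = 24 - W - R = 8)

def direct_map(address):
    word = address & ((1 << W) - 1)
    line = (address >> W) & ((1 << R) - 1)     # i = j modulo m, with m = 2**R
    tag  = address >> (W + R)
    return tag, line, word

tag, line, word = direct_map(0x16339C)
print(hex(tag), hex(line), hex(word))          # 0x16 0x339 0xc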
Direct Mapping Cache Line Table

• Cache line     Main memory blocks held
• 0              0, m, 2m, 3m, …, 2^s - m
• 1              1, m+1, 2m+1, …, 2^s - m + 1
• …
• m-1            m-1, 2m-1, 3m-1, …, 2^s - 1


Direct Mapping Cache Organization
Direct Mapping Summary
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words
or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w)/2^w
= 2^s
• Number of lines in cache = m = 2^r
• Size of tag = (s – r) bits
Direct Mapping pros & cons
• Simple
• Inexpensive
• Fixed location for given block
– If a program accesses 2 blocks that map to the
same line repeatedly, cache misses are very
high
Associative Mapping
• A main memory block can load into any line
of cache
• Memory address is interpreted as tag and
word
• Tag uniquely identifies block of memory
• Every line’s tag is examined for a match
• Cache searching gets expensive
Fully Associative Cache Organization
Associative Mapping Address Structure

Word
Tag 20 bit 4 bit

• 20 bit tag stored with each 16 byte block of data


• Compare tag field with tag entry in cache to check for hit
• Least significant 4 bits of address identify which byte is
required from 16 byte data
Associative Mapping Summary
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words
or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w)/2^w
= 2^s
• Number of lines in cache = cache size/2^w
• Size of tag = s bits
Direct and Associative Mapping
Set Associative Mapping
• Cache is divided into a number of sets
• Each set contains a number of lines
• A given block maps to any line in a given
set
– e.g. Block B can be in any line of set i
• e.g. 2 lines per set
– 2 way associative mapping
– A given block can be in one of 2 lines in only
one set
Set Associative Mapping
• The cache is divided in v sets
• Each set consists of k lines
• Number of lines in the cache
– m=vxk
• The mapping function:
– i = j modulo v
• Where
– i = cache set number
– j = main memory block number
K way set associative Mapping

Cache Line Table


• Set no      Main memory blocks held
• 0           0, v, 2v, 3v, …, 2^s - v
• 1           1, v+1, 2v+1, …, 2^s - v + 1
• …
• v-1         v-1, 2v-1, 3v-1, …, 2^s - 1


K Way Set Associative Cache Organization
Set Associative Mapping Address Structure

Tag: 8 bit    Set: 12 bit    Word: 4 bit

• Use the set field to determine which cache set to look in
• Compare the tag field to see if we have a hit
• e.g.
– Address 1F17EB: Tag 1F, Set 17E, Word B
– Address 2017EC: Tag 20, Set 17E, Word C
– (two blocks with different tags can reside in the same set)
Set Associative Mapping Summary
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words
or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^s
• Number of lines in set = k
• Number of sets = v = 2^d
• Number of lines in cache = kv = k * 2^d
• Size of tag = (s – d) bits
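
The same kind of address split works for a set-associative cache (a sketch using the slide's field sizes: tag of s - d bits, set of d bits, word of w bits); here d_bits and w_bits are parameters, and the mapping i = j modulo v falls out of taking the low d bits of the block number. The function name is mine.

# Set-associative mapping: extract tag, set number and word offset.
def set_assoc_map(address, d_bits, w_bits):
    word = address & ((1 << w_bits) - 1)
    block = address >> w_bits                  # main memory block number j
    set_no = block & ((1 << d_bits) - 1)       # i = j modulo v, v = 2**d_bits
    tag = block >> d_bits
    return tag, set_no, word

# With the 12-bit set / 4-bit word split shown above:
print([hex(x) for x in set_assoc_map(0x1F17EB, 12, 4)])   # ['0x1f', '0x17e', '0xb']
print([hex(x) for x in set_assoc_map(0x2017EC, 12, 4)])   # ['0x20', '0x17e', '0xc']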
CS 322M Digital Logic & Computer Architecture

Cache Memory

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Cache Mapping
• Direct Mapping
• Associative Mapping
• Set Associative Mapping
– K way Set Associative Mapping
Mapping: Example
• A block set associative cache consists of a
total of 64 lines divided into 4-line sets. The
main memory contains 4096 blocks, each
consisting of 128 words.
– What is the size of main memory and cache
memory
Mapping: Example
• A block set associative cache consists of a
total of 64 lines divided into 4-line sets. The
main memory contains 4096 blocks, each
consisting of 128 words.
– How many bits are there in a main memory
address
– How many bits are there in each of the TAG,
SET and WORD fields.
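
One possible way to work the exercise out (a hedged sketch: it assumes a word-addressable main memory, which is how the block and word counts above read).

import math

blocks, words_per_block = 4096, 128
lines, lines_per_set    = 64, 4

main_memory_words = blocks * words_per_block        # 524288 words
cache_words       = lines * words_per_block         # 8192 words
sets              = lines // lines_per_set          # 16 sets

address_bits = int(math.log2(main_memory_words))    # 19
word_bits    = int(math.log2(words_per_block))      # 7
set_bits     = int(math.log2(sets))                 # 4
tag_bits     = address_bits - set_bits - word_bits  # 8

print(main_memory_words, cache_words, address_bits, tag_bits, set_bits, word_bits)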
Replacement Algorithms
• Direct mapping
– No choice
– Each block only maps to one line
– Replace that line
Replacement Algorithms
• Associative & Set Associative
– Hardware implemented algorithm (speed)
– Least Recently used (LRU)
• e.g. in 2 way set associative
• Which of the 2 blocks is the LRU?
– First in first out (FIFO)
• replace block that has been in cache longest
– Least frequently used
• replace block which has had fewest hits
– Random
Replacement Algorithms
• Least Recently Used (LRU)
– Program usually stays in localized area for a
reasonable period of time.
– There is a high probability that the blocks that
have been referenced recently will be
referenced again soon.
– When a block is to be overwritten, it is sensible
to overwrite the one that has gone the longest
time without being referenced.
– This block is called the Least Recently Used
(LRU) block and the technique is called the
LRU replacement policy.
Least Recently Used (LRU)
• Consider four-line set in a set-associative
cache
• Control bits:
– TAG bits
– 2-bit counter for each line (to track the
LRU block)
– d_bit: dirty bit
– f_bit: occupied bit
• Initially reset all the counters, d_bit and
f_bit
Least Recently Used (LRU)
• A cache hit occurs:
– set the counter value of this cache line to 0
– for the other lines, if the counter value is less
than the referenced line's previous counter value,
increment it, provided the line's f_bit is 1
– otherwise, do not change the counter value
Least Recently Used (LRU)
• A cache miss occurs:
– Set is not full
– Set is full

• Set is not full:
– set the counter value of the incoming line to 0
– set its f_bit to 1
– increment the counter value of the other lines
whose f_bit is 1
Least Recently Used (LRU)
• Set is full:
– the line with highest counter value is removed;
write back if d_bit is 1
– new block is transferred to this line
– reset d_bit and the counter
– other counter values are incremented by 1
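
A compact sketch of the 2-bit-counter LRU policy described above, for one four-line set (the class and function names are mine; the d_bit handling is omitted for brevity).

# One four-line set with per-line TAG, 2-bit counter and occupied (f) bit.
class Line:
    def __init__(self):
        self.tag, self.ctr, self.f = None, 0, 0

def reference(lines, tag):
    hit = next((l for l in lines if l.f and l.tag == tag), None)
    if hit:                                   # cache hit
        for l in lines:
            if l.f and l is not hit and l.ctr < hit.ctr:
                l.ctr += 1                    # bump counters below the hit line's
        hit.ctr = 0
        return "hit"
    free = next((l for l in lines if not l.f), None)
    victim = free if free else max(lines, key=lambda l: l.ctr)   # LRU line if full
    for l in lines:
        if l.f and l is not victim:
            l.ctr += 1
    victim.tag, victim.ctr, victim.f = tag, 0, 1
    return "miss"

set0 = [Line() for _ in range(4)]
print([reference(set0, t) for t in ["A", "B", "C", "D", "A", "E"]])
# ['miss', 'miss', 'miss', 'miss', 'hit', 'miss']; "E" evicts "B", the LRU block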
Example
• Consider the following code segment
for (i = 0; i < 100; i++) {
for (j = 0; j < 100; j++)
B[i][j] = A[i][j];
}

• Cache organization:
– 16 lines/blocks in the cache
– block size is 512
• Data in Main memory:
– block0 contains i, j, etc
– block1 onward: Array A
– block25 onward: Array B
Example
Array stored in memory:

- Row Major Order


- Column Major Order
Direct Mapping
• Mapping function
– cache line = i mod 16 (for ith block)
– line 0 : 0, 16, 32, ….
– line 1 : 1, 17, 33, …..
• Mapping for Array A
• Mapping for Array B
• Find:
– Cache hits
– Cache misses
2-way set associative mapping
• Mapping function
– i = j mod v (v is number of set = 8)
– m = v * k ( k = 2, lines in a set)
– set0: line 0, 1: 0, 8, 16, 24, 32, …
– set1: line 2, 3: 1, 9, 17, 25, 33, …..
• Mapping for array A
• Mapping for array B
• Replacement policy:
– FIFO (cache hits and cache misses)
– LRU (cache hits and cache misses)
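
A rough simulation of the direct-mapped case (hedged: it assumes row-major storage, one word per array element, A starting at block 1 and B at block 25 as stated, write-allocate behaviour for the writes to B, and that i and j live in registers so only the array accesses touch the cache).

# Rough direct-mapped simulation of the loop above.
BLOCK, LINES = 512, 16
A_BASE, B_BASE = 1 * BLOCK, 25 * BLOCK          # word addresses of A and B

cache = [None] * LINES                          # tag (block number) per line
hits = misses = 0

def access(word_address):
    global hits, misses
    block = word_address // BLOCK
    line = block % LINES                        # direct mapping: i = j mod 16
    if cache[line] == block:
        hits += 1
    else:
        misses += 1
        cache[line] = block

for i in range(100):
    for j in range(100):
        access(A_BASE + 100 * i + j)            # read  A[i][j]
        access(B_BASE + 100 * i + j)            # write B[i][j]

print(hits, misses)                             # 20000 accesses in total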
Reference
Computer Organization and Architecture –
Designing for Performance
William Stallings, Seventh Edition

Chapter 04: Cache Memory

Computer Organization
Hamacher, Vranesic and Zaky, Fifth Edition

Chapter 05: Page No.: 314 - 329


CS 322M Digital Logic & Computer Architecture

Control Unit

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Computer Components: Top Level View
CPU Internal Structure
Micro-Operations
• A computer executes a program
• Program – set of instructions
• Fetch/execute cycle
• Each cycle has a number of steps
• Called micro-operations
• Each step does very little
Constituent Elements of Program Execution

Fetch:
- MAR <- PC, Read
- MBR <- Memory
- IR <- MBR
Single Bus Organization of CPU
Fetch - 4 Registers
• Memory Address Register (MAR)
– Connected to address bus
– Specifies address for read or write op
• Memory Buffer Register (MBR)
– Connected to data bus
– Holds data to write or last data read
• Program Counter (PC)
– Holds address of next instruction to be fetched
• Instruction Register (IR)
– Holds last instruction fetched
Fetch Sequence
• Address of next instruction is in PC
• Address (MAR) is placed on address bus
• Control unit issues READ command
• Result (data from memory) appears on
data bus
• Data from data bus copied into MBR
• PC incremented by 1 (in parallel with data
fetch from memory)
• Data (instruction) moved from MBR to IR
• MBR is now free for further data fetches
Fetch Sequence
• t1: MAR <- PC
• t2: MBR <- memory
PC <- PC +1
• t3: IR <- MBR
– (tx = time unit/clock cycle)
Fetch Sequence – Is it correct

• t1: MAR <- PC


• t2: MBR <- memory
• t3: PC <- PC +1
IR <- MBR

(tx = time unit/clock cycle)


Single Bus Organization of CPU
Rules for Clock Cycle Grouping
• Proper sequence must be followed
– MAR <- PC must precede MBR <- memory
• Conflicts must be avoided
– Must not read & write same register at same
time
– MBR <- memory & IR <- MBR must not be in
same cycle
• Also: PC <- PC +1 involves addition
– Use ALU
– May need additional micro-operations
Indirect Cycle
• MAR <- IRaddress - address field of IR
• MBR <- memory
Execute Cycle (ADD)
• Different for each instruction

• e.g. ADD R1,X - add the contents of


location X to Register R1 , result in R1

– t1: MAR <- IRaddress


– t2: MBR <- memory
– t3: R1 <- R1 + MBR
Single Bus Organization of CPU
Instruction Cycle
• Each phase decomposed into sequence of
elementary micro-operations
• E.g. fetch, indirect, and interrupt cycles
• Execute cycle
– One sequence of micro-operations for each
opcode
• Assume new 2-bit register
– Instruction cycle code (ICC) designates which part
of cycle processor is in
• 00: Fetch
• 01: Indirect
• 10: Execute
• 11: Interrupt
Flowchart for Instruction Cycle

Functional Requirements
• Define basic elements of processor
• Describe micro-operations that processor
performs
• Determine functions that control unit must
perform
Basic Elements of Processor
• ALU
• Registers
• Internal data paths
• External data paths
• Control Unit
Functions of Control Unit
• Sequencing
– Causing the CPU to step through a series of
micro-operations
• Execution
– Causing the performance of each micro-op
• This is done using Control Signals
Control Signals
• Clock
– One micro-instruction (or set of parallel micro-
instructions) per clock cycle
• Instruction register
– Op-code for current instruction
– Determines which micro-instructions are
performed
• Flags
– State of CPU
– Results of previous operations
• From control bus
– Interrupts
– Acknowledgements
Model of Control Unit
Reference
Computer Organization and Architecture –
Designing for Performance
William Stallings, Seventh Edition

Chapter 16: Control Unit Operation

Computer Organization
Hamacher, Vranesic and Zaky, Fifth Edition

Chapter 07: Page No.: 411 - 429


CS 322M Digital Logic & Computer Architecture

Control Unit

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Model of Control Unit
Control Signals - output
• Within CPU
– Cause data movement
– Activate specific functions
• Via control bus
– To memory
– To I/O modules
Example Control Signal Sequence - Fetch

• MAR <- PC
– Control unit activates signal to open gates
between PC and MAR
• MBR <- memory
– Open gates between MAR and address bus
– Memory read control signal
– Open gates between data bus and MBR
Single Bus Organization of CPU
Universal Shift Register
CPU with Internal Bus

ALU
Register and Bus Connection
Internal and External Bus
Read and Write Signal
Read and Write Signal
Timing Diagram
Control Step for Execution
• ADD R1, R2, R3
– Add the contents of Register R1 and R2 and
store the result in R3
Single Bus Organization of CPU

Operation:

R3 ← R1+R2

Steps:

Y ← R1
Z ← Y+R2
R3 ← Z
Control Step for Execution
• ADD R1, R2, R3
– Add the contents of Register R1 and R2 and
store the result in R3

Steps:

Y ← R1
Z ← Y+R2
R3 ← Z
Clock Timing
• Time needed for micro-operation 2
– R2out, ADD, Zin
Instruction Fetch and Execute
• ADD (R3), R1
– Add the content of Register R1 to the content
of memory location whose memory address is
in register R3 and store the result in R1

Addressing Mode:
(R3) : Register Indirect
R1: Register Direct
Single Bus Organization of CPU

PC contains the
Address of the
Instruction.

Issues for PC
Updates:
When and how

Assumption:
Instruction length
- one word
Instruction Fetch and Execute
• ADD (R3), R1
– Add the content of Register R1 to the content
of memory location whose memory address is
in register R3 and store the result in R1
Instruction Fetch and Execute
• ADD (R3), R1
– Add the content of Register R1 to the content
of memory location whose memory address is
in register R3 and store the result in R1
Fetch Phase:

t1: MAR <- PC, Read


t2: MDR <- Memory
PC <- PC + 1
t3: IR <- MDR
Instruction Fetch and Execute
• ADD (R3), R1
– Add the content of Register R1 to the content
of memory location whose memory address is
in register R3 and store the result in R1
Execute Phase:

t1: MAR <- R3, Read


t2: MDR <- Memory
Y <- R1
t3: Z <- Y + MDR
t4: R1 <- Z
Reference
Computer Organization and Architecture –
Designing for Performance
William Stallings, Seventh Edition

Chapter 16: Control Unit Operation

Computer Organization
Hamacher, Vranesic and Zaky, Fifth Edition

Chapter 07: Page No.: 411 - 429


CS 322M Digital Logic & Computer Architecture

Control Unit

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Single Bus Organization of CPU

PC contains the
Address of the
Instruction.

Issues for PC
Updates:
When and how

Assumption:
Instruction length
- one word
Instruction Fetch and Execute
• ADD (R3), R1
– Add the content of Register R1 to the content
of memory location whose memory address is
in register R3 and store the result in R1
Instruction Fetch and Execute
• ADI data, R1
– Add the content of Register R1 to the data
specified in the program and store the result in
R1
Addressing Mode:
data : Immediate
R1: Register Direct
Instruction Format: Two words
1st Word: Opcode
2nd Word: data
Single Bus Organization of CPU

PC contains the
Address of the
Instruction.

Issues for PC
Updates:
When and how

Assumption:
Instruction length
- two word
Instruction Fetch and Execute
• ADI data, R1
– Add the content of Register R1 to the data
specified in the program and store the result in
R1
Fetch Phase:

t1: MAR <- PC, Read


t2: MDR <- Memory
PC <- PC + 1
t3: IR <- MDR
Instruction Fetch and Execute
• ADI data, R1
– Add the content of Register R1 to the data
specified in the program and store the result in
R1
Execute Phase:

t4 (step 1): MAR <- PC, Read


t5 (step 2): MDR <- Memory
PC <- PC + 1
t6: Y <- MDR
Yin, MDRout
t7: Z <- Y + R1
R1out, Add, Zin
t8: R1 <- Z
Zout, R1in, End
Instruction Fetch and Execute
• ADD M, R1
– Add the content of Register R1 to the content
of memory location whose memory address is
specified in the program and store the result in
R1
Addressing Mode:
M : Memory Direct
R1: Register Direct
Instruction Format: Two words
1st Word: Opcode
2nd Word: Memory address for data
Instruction Fetch and Execute
• ADD M, R1
– Add the content of Register R1 to the content
of memory location whose memory address is
specified in the program and store the result in
R1
Tasks:

- Fetch Opcode
- Fetch the memory address
(part of the program)
- Fetch the data from Memory
- perform addition
Instruction Fetch and Execute
• ADD M, R1
– Add the content of Register R1 to the content
of memory location whose memory address is
specified in the program and store the result in
R1
Execute Phase:

t4 (step 1): MAR <- PC, Read


t5 (step 2): MDR <- Memory
PC <- PC + 1
t6: MAR <- MDR, Read
Tasks: t7: MDR <- Memory
- Fetch Opcode Y<- R1
- Fetch the memory address t8: Z <- Y + MDR
(part of the program) t9: R1 <- Z
- Fetch the data from Memory
- perform addition
Instruction Fetch and Execute
• ADD (R3), R1
– Add the content of Register R1 to the content
of memory location whose memory address is
in register R3 and store the result in R1
Unconditional Branch
• Control sequence for an unconditional
Branch Instruction

Instruction Format: Op-code : offset of target Address


Conditional Branch

• Branch on Negative: BRN


– Use of condition codes/flags
– depends on flag bit: N – Negative
– N: set to 0 if the ALU result is not negative
– N: set to 1 if the ALU result is negative
Conditional Branch
• Branch on Negative: BRN
– Step 4 is replaced by:
– PCout, Yin, If N == 0 then End
Model of Control Unit

Synchronization of slow and fast devices


Hardwired Implementation
• Clock
– Repetitive sequence of pulses
– Useful for measuring duration of micro-ops
– Must be long enough to allow signal
propagation
– Need a counter with different control signals
for t1, t2 etc.
Generation of Timing Signal
Instruction Fetch and Execute
• ADD (R3), R1
– Add the content of Register R1 to the content
of memory location whose memory address is
in register R3 and store the result in R1
Control Signal
• Generation of signal Zin
– Zin = T1 + T6.ADD + T5.BR + ……….
Control Signals
• Generation of signal: End
– End = T7.ADD + T6.BR + (T6.N + T4.~N).BRN
+ …….
Control Signals
• Generation of signal: WMFC
– WMFC = T2 + T5.ADD + …….

• Generation of signal: PCout


– PCout = T1 + T4.BR + T4.BRN + …….

• Similarly, we need all the control signals


– MARin, MDRin, MDRout, …..
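
These signal equations are just Boolean sums of product terms over the timing signals, the decoded instruction and the flags; the sketch below (illustrative names only, mirroring the partial equations above, with the trailing "..." terms omitted) shows how Zin, End, WMFC and PCout could be expressed as such functions.

# Hardwired control as Boolean equations over timing signals (T1..Tn),
# decoded instructions (ADD, BR, BRN) and the N flag.
def control_signals(T, instr, N):
    ADD, BR, BRN = (instr == "ADD"), (instr == "BR"), (instr == "BRN")
    Zin   = (T == 1) or (T == 6 and ADD) or (T == 5 and BR)
    End   = (T == 7 and ADD) or (T == 6 and BR) or \
            ((T == 6 and N) or (T == 4 and not N)) and BRN
    WMFC  = (T == 2) or (T == 5 and ADD)
    PCout = (T == 1) or (T == 4 and BR) or (T == 4 and BRN)
    return dict(Zin=Zin, End=End, WMFC=WMFC, PCout=PCout)

print(control_signals(T=6, instr="ADD", N=False))   # Zin asserted in step 6 of ADD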
Hardwired Implementation
• Control unit inputs
• Flags and control bus
– Each bit means something
• Instruction register
– Op-code causes different control signals for
each different instruction
– Unique logic for each op-code
– Decoder takes encoded input and produces
single output
– n binary inputs and 2^n outputs
Control Unit Organization
Decoder and encoder
Problems With Hard Wired Designs
• Complex sequencing & micro-operation
logic
• Difficult to design and test
• Inflexible design
• Difficult to add new instructions
Reference
Computer Organization and Architecture –
Designing for Performance
William Stallings, Seventh Edition

Chapter 16: Control Unit Operation

Computer Organization
Hamacher, Vranesic and Zaky, Fifth Edition

Chapter 07: Page No.: 411 - 429


CS 322M Digital Logic & Computer Architecture

Control Unit

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Control Unit
Single Bus Organization of CPU

PC contains the
Address of the
Instruction.

Issues for PC
Updates:
When and how

Assumption:
Instruction length
- one word
Single Bus Organization: another version

Instruction comprises 4 bytes

Stored in one memory word

The memory is byte addressable


Fetch and Execute Instruction
Control Steps: Fetch and Execute

ADD (R3), R1: Add the content of register R1 and memory


location pointed by R3; and store the result in R1
Some Other Organizations

• Already discussed
– Single Bus Organization

• Other Possibilities
– Two Bus
– Three Bus
CPU Organization: Two Internal Buses
Control Step for Execution

• ADD R1, R2, R3


– Add the contents of Register R1 and R2 and
store the result in R3

Step Action
1 R1out, Genable, Yin
2 R2out, ADD, ALUout, R3in
CPU Organization: Two Internal Buses

For Execution

Step Action
1 R1out, Genable, Yin
2 R2out, ADD, ALUout, R3in
Control Step for Execution
• ADD R1, R2, R3
– Add the contents of Register R1 and R2 and
store the result in R3

Step Action
1 R1out, Genable, Yin
2 R2out, ADD, ALUout, R3in

Two Bus Single Bus


CPU organization: Three Internal Buses
Additional Features
• One special circuit to increment the PC:
Incrementer
• IncPC increments the value of PC by 4
– Need timing control to load the new value to
PC
• Special arrangement to pass the value
through the ALU
– R = B control signal to pass the value of bus B
to bus C through ALU
Control Steps: Fetch and Execute

ADD R4, R5, R6: Add the content of register R4 and R5;
and store the result in R6
Additional Features
• Special circuit to increment the PC
• CALL/RETURN
Control Unit: Hardwired Control
Problems With Hard Wired Designs
• Complex sequencing & micro-operation
logic
• Difficult to design and test
• Inflexible design
• Difficult to add new instructions
Reference
Computer Organization and Architecture –
Designing for Performance
William Stallings, Seventh Edition

Chapter 16: Control Unit Operation

Computer Organization
Hamacher, Vranesic and Zaky, Fifth Edition

Chapter 07: Page No.: 411 - 429


CS 322M Digital Logic & Computer Architecture

Micro-programmed Control

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Control Unit: Hardwired Control
Problems With Hard Wired Designs
• Complex sequencing & micro-operation
logic
• Difficult to design and test
• Inflexible design
• Difficult to add new instructions

• Alternative approach:
– Micro-Programmed Controlled
Implementation: Control Unit
• The control unit generates a set of control
signals
• Each control signal is on or off
• Represent each control signal by a bit
• Have a control word for each micro-
operation
• Have a sequence of control words for each
machine code instruction
• Add an address to specify the next micro-
instruction, depending on conditions
Single Bus Organization: another version
Control Steps: Fetch and Execute
ADD (R3), R1: Add the content of register R1 and memory
location pointed by R3; and store the result in R1
Micro-programmed Control
• Use sequences of instructions to control
complex operations
• Called micro-programming or firmware
Program Execution

Memory Location Instruction


01000 MOV R1, R2
01001 ADD R1, M
01002 DEC R2
Micro Program
Memory Location Instruction Machine Code
01000 MOV R1, R2
01001 ADD R1, M
01002 DEC R2

Control Micro-Operations for MOV R1, R2

Step
0 PCout, MARin, Read, Select4, Add, Zin
1 Zout, PCin, Yin, WMFC
2 MDRout, IRin
3 R2out, R1in, End
4
Micro Program
Control Micro-Operations for MOV R1, R2
Step
0 PCout, MARin, Read, Select4, Add, Zin
1 Zout, PCin, Yin, WMFC
2 MDRout, IRin
3 R2out, R1in, End
4

Step  PCin PCout MARin Zin Zout IRin Read End
0     0    1     1     1   0    0    1    0
1     1    0     0     0   1    0    0    0
2     0    0     0     0   0    1    0    0
3     0    0     0     0   0    0    0    1
Micro Program
Control Micro-Operations for MOV R1, R2
Step
0 PCout, MARin, Read, Select4, Add, Zin
1 Zout, PCin, Yin, WMFC
2 MDRout, IRin
3 R2out, R1in, End
4

Memory    PCin PCout MARin Zin Zout IRin Read End
Location
2000      0    1     1     1   0    0    1    0
2001      1    0     0     0   1    0    0    0
2002      0    0     0     0   0    1    0    0
2003      0    0     0     0   0    0    0    1
2004
Micro Program
Memory Location Instruction Machine Code
(Main Memory)
01000 MOV R1, R2
01001 ADD R1, M
01002 DEC R2

Memory      PCin PCout MARin Zin Zout IRin Read End
Location
(Control
Store)
2000        0    1     1     1   0    0    1    0
2001        1    0     0     0   1    0    0    0
2002        0    0     0     0   0    1    0    0
2003        0    0     0     0   0    0    0    1
2004
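
The control-store rows above are just bit vectors; the sketch below (hypothetical field order, matching the eight columns shown) packs a control step into a word and reads the active signals back out.

# Pack/unpack a horizontal control word whose bit positions follow the
# eight columns shown above (PCin, PCout, MARin, Zin, Zout, IRin, Read, End).
FIELDS = ["PCin", "PCout", "MARin", "Zin", "Zout", "IRin", "Read", "End"]

def pack(active):
    return sum(1 << i for i, name in enumerate(FIELDS) if name in active)

def unpack(word):
    return [name for i, name in enumerate(FIELDS) if word & (1 << i)]

control_store = {
    2000: pack({"PCout", "MARin", "Zin", "Read"}),   # step 0 of the fetch
    2001: pack({"PCin", "Zout"}),                    # step 1
    2002: pack({"IRin"}),                            # step 2
    2003: pack({"End"}),                             # step 3 (MOV R1, R2 execute)
}
print(unpack(control_store[2000]))   # ['PCout', 'MARin', 'Zin', 'Read']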
Control Unit Organization: Micro Programmed
Control Unit Organization
Fetch and Execute
Micro-program Word Length

• Based on 3 factors
– Maximum number of simultaneous micro-
operations supported
– The way control information is represented or
encoded
– The way in which the next micro-instruction
address is specified
Micro-instruction Types

• Each micro-instruction specifies single (or


few) micro-operations to be performed
– (vertical micro-programming)
• Each micro-instruction specifies many
different micro-operations to be performed
in parallel
– (horizontal micro-programming)
Vertical Micro-programming
• Width is narrow
• n control signals encoded into log2 n bits
• Limited ability to express parallelism
• Considerable encoding of control
information requires external memory word
decoder to identify the exact control line
being manipulated
Horizontal Micro-programming

• Wide memory word


• High degree of parallel operations possible
• Little encoding of control information
Reference
Computer Organization and Architecture –
Designing for Performance
William Stallings, Seventh Edition

Chapter 16: Control Unit Operation


Page No.: 596 - 617

Computer Organization
Hamacher, Vranesic and Zaky, Fifth Edition

Chapter 07: Page No.: 429 - 446


CS 322M Digital Logic & Computer Architecture

Micro-programmed Control

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Micro-instruction Types

• Each micro-instruction specifies single (or


few) micro-operations to be performed
– (vertical micro-programming)
• Each micro-instruction specifies many
different micro-operations to be performed
in parallel
– (horizontal micro-programming)
Single Bus Organization: another version
Control Steps: Fetch and Execute
ADD (R3), R1: Add the content of register R1 and memory
location pointed by R3; and store the result in R1
Horizontal Micro-Programming
Compromise

• Divide control signals into disjoint groups


• Implement each group as separate field in
memory word
• Supports reasonable levels of parallelism
without too much complexity
How to Encode
• K different internal and external control signals
• Not all used
– Two sources cannot be gated to same destination
– Register cannot be source and destination
– Only one pattern presented to ALU at a time
– Only one pattern presented to external control bus at a time
• Require Q < 2^K, which can be encoded with log2 Q < K bits
• Compromises
– More bits than necessary used
– Some combinations that are physically allowable are not possible to
encode
Specific Encoding Techniques
• Microinstruction organized as set of fields
• Each field contains code
• Activates one or more control signals
• Organize format into independent fields
– Field depicts set of actions (pattern of control
signals)
– Actions from different fields can occur
simultaneously
• Alternative actions that can be specified by a
field are mutually exclusive
– Only one action specified for field could occur at a
time
Horizontal Micro-Programming
Vertical Micro-Programming
Organization of Control Memory
Control Unit Function
• Sequence logic unit issues read command
• Word specified by control address register is read into
control buffer register
• Control buffer register contents generates control signals
and next address information
• Sequence logic loads new address into control address
register based on next address information from control
buffer register and ALU flags
Next Address Decision
• Depending on ALU flags and control buffer
register
– Get next instruction
• Add 1 to control address register
– Jump to new routine based on jump
microinstruction
• Load address field of control buffer register into
control address register
– Jump to machine instruction routine
• Load control address register based on opcode in
IR
Functioning of Microprogrammed Control Unit
Tasks Done

• Microinstruction sequencing
• Microinstruction execution
• Must consider both together
Design Considerations
• Size of microinstructions
• Address generation time
– Determined by instruction register
• Once per cycle, after instruction is fetched
– Next sequential address
• Common in most designs
– Branches
• Both conditional and unconditional
Sequencing Techniques

• Based on current microinstruction,


condition flags, contents of IR, control
memory address must be generated
• Based on format of address information
– Two address fields
– Single address field
– Variable format
Branch Control Logic: Two Address Fields
Branch Control Logic: Single Address Field
Branch Control Logic: Variable Format
Advantages and Disadvantages
• Simplifies design of control unit
– Cheaper
– Less error-prone
– Easy to correct the error
– Easily extendable
• Slower
Reference
Computer Organization and Architecture –
Designing for Performance
William Stallings, Seventh Edition

Chapter 16: Control Unit Operation


Page No.: 596 - 617

Computer Organization
Hamacher, Vranesic and Zaky, Fifth Edition

Chapter 07: Page No.: 429 - 446


CS 322M Digital Logic & Computer Architecture

Micro-programmed Control

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Sequencing Techniques

• Based on current microinstruction,


condition flags, contents of IR, control
memory address must be generated
• Based on format of address information
– Two address fields
– Single address field
– Variable format
Branch Control Logic: Single Address Field
Branch Control Logic: Single Address Field
Starting Address of Micro-Routine
• Common Fetch Cycle
– Four control step for fetch
– Memory Location of control store: 0, 1, 2, 3
• After fetch IR contains the opcode
• This opcode is used to find the starting
address of the micro-routine of the execute
phase of the instruction
Starting Address of Micro-Routine
• Size of IR: 8 bits
• 256 instructions can be designed
• Control steps for execution phase varies
from 4 to 12
• Two possibilities are considered for the size
of micro-routine
– Equal memory location for each opcode
– Required number of memory location
Starting Address of Micro-Routine
• Equal memory
location for
each opcode:
– Control steps
for execution
phase varies
from 4 to 12
– Use 12
memory
location for
each micro-
routine
Starting Address of Micro-Routine
• Equal memory
location for
each opcode:
– Control steps
for execution
phase varies
from 4 to 12
– Use 16
memory
location for
each micro-
routine
Starting Address of Micro-Routine
• Required number of memory location:
– Control steps for execution phase varies from 4
to 12
Size of Control Store
• Consider a processor with Instruction Register
(IR) of size 8 bits.
• In the micro-program controlled control unit,
single address field is used for branch control
logic.
• Same micro code segment is used for the fetch
phase of all instructions and this is stored in
memory location 0, 1, 2 and 3 of the control store.
• The average number of steps needed for the
execution phase of an instruction is 8 control
steps.
• Number of control signals: 87
Size of Control Store
• Consider a processor with Instruction
Register (IR) of size 8 bits.
• In the micro-program controlled
control unit, single address field is
used for branch control logic.
• Same micro code segment is used for
the fetch phase of all instructions and
this is stored in memory location 0, 1,
2 and 3 of the control store.
• The average number of steps needed
for the execution phase of an
instruction is 8 control steps.
• Number of control signals: 87
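
One plausible way to size the control store under these figures (hedged: it assumes every one of the 256 instructions takes the stated average of 8 execute micro-instructions, that the single next-address field is wide enough to address the whole store, and that no extra branch-condition select bits are counted).

import math

opcodes         = 2 ** 8        # 8-bit IR -> up to 256 instructions
fetch_words     = 4             # locations 0..3 hold the common fetch routine
avg_exec_words  = 8
control_signals = 87

micro_words  = fetch_words + opcodes * avg_exec_words     # 2052 micro-instructions
address_bits = math.ceil(math.log2(micro_words))          # 12-bit next-address field
word_width   = control_signals + address_bits             # 99 bits per micro-word
store_bits   = micro_words * word_width

print(micro_words, address_bits, word_width, store_bits)  # 2052 12 99 203148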
Reference
Computer Organization and Architecture –
Designing for Performance
William Stallings, Seventh Edition

Chapter 16: Control Unit Operation


Page No.: 596 - 617

Computer Organization
Hamacher, Vranesic and Zaky, Fifth Edition

Chapter07: Page No.: 429 - 446


CS 322M Digital Logic & Computer Architecture

Input/Output

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Computer Components: Top Level View
Input/Output Problems

• Wide variety of peripherals


– Delivering different amounts of data
– At different speeds
– In different formats
• All slower than CPU and RAM
• Need I/O modules
Input/Output Module
• Interface to CPU and Memory
• Interface to one or more peripherals
Generic Model of I/O Module
External Devices
• Human readable
– Screen, printer, keyboard
• Machine readable
– Monitoring and control
• Storage devices
– Hard Disk, optical Disk, etc.
• Communication
– Modem
– Network Interface Card (NIC)
External Device Block Diagram
I/O Module Diagram
I/O Module Function
• Control & Timing
• CPU Communication
• Device Communication
• Data Buffering
• Error Detection
I/O Steps
• CPU checks I/O module device status
• I/O module returns status
• If ready, CPU requests data transfer
• I/O module gets data from device
• I/O module transfers data to CPU
• Variations for output, DMA, etc.
I/O Module Diagram

• CPU checks I/O module


device status
• I/O module returns status
• If ready, CPU requests data
transfer
• I/O module gets data from
device
• I/O module transfers data to
CPU
• Variations for output, DMA,
etc.
Input Output Techniques
• Programmed
• Interrupt driven
• Direct Memory Access (DMA)
Three Techniques
Programmed I/O
• CPU has direct control over I/O
– Sensing status
– Read/write commands
– Transferring data
• CPU waits for I/O module to complete
operation
• Wastes CPU time
Programmed I/O - detail
• CPU requests I/O operation
• I/O module performs operation
• I/O module sets status bits
• CPU checks status bits periodically
• I/O module does not inform CPU directly
• I/O module does not interrupt CPU
• CPU may wait or come back later
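
A sketch of the busy-wait pattern that programmed I/O implies (illustrative only; read_status, read_data and issue_read_command stand in for accessing the module's status, data and command registers, whether memory-mapped or through isolated I/O instructions).

# Busy-wait (polling) input under programmed I/O: the CPU repeatedly reads
# the module's status register until the READY bit is set, then reads data.
READY = 0x01

def programmed_read(read_status, read_data, issue_read_command):
    issue_read_command()                  # CPU requests the I/O operation
    while not (read_status() & READY):    # CPU checks status bits, wasting cycles
        pass
    return read_data()                    # transfer one word to the CPU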
I/O Commands
• CPU issues address
– Identifies module (& device if >1 per module)
• CPU issues command
– Control - telling module what to do
• e.g. spin up disk
– Test - check status
• e.g. power? Error?
– Read/Write
• Module transfers data via buffer from/to device
Addressing I/O Devices
• Under programmed I/O data transfer is very
like memory access (CPU viewpoint)
• Each device given unique identifier
• CPU commands contain identifier (address)
I/O Mapping
• Memory mapped I/O
– Devices and memory share an address space
– I/O looks just like memory read/write
– No special commands for I/O
• Large selection of memory access commands available
• Isolated I/O
– Separate address spaces
– Need I/O or memory select lines
– Special commands for I/O
• Limited set
Reference
Computer Organization and Architecture –
Designing for Performance
William Stallings, Seventh Edition

Chapter 7: Input/Output
Page No.: 200 - 227
CS 322M Digital Logic & Computer Architecture

Input/Output

J. K. Deka
Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati, Assam.
Three Techniques
Interrupt Driven I/O
• Overcomes CPU waiting
• No repeated CPU checking of device
• I/O module interrupts when ready
Interrupt Driven I/O Basic Operation

• CPU issues read command


• I/O module gets data from peripheral whilst
CPU does other work
• I/O module interrupts CPU
• CPU requests data
• I/O module transfers data
Simple Interrupt Processing
Program Status Word
• A set of bits
• Includes Condition Codes
– Sign of last result
– Zero
– Carry
– Equal
– Overflow
• Interrupt enable/disable
• Supervisor
Changes in Memory and Registers for an Interrupt
CPU Viewpoint
• Issue read command
• Do other work
• Check for interrupt at end of each
instruction cycle
• If interrupted:-
– Save context (registers)
– Process interrupt
• Fetch data & store
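
A sketch of this viewpoint as a loop (the cpu object and its methods are illustrative names, not a real API): the interrupt line is sampled only at the end of each instruction cycle, and if it is asserted the context is saved and the handler runs before the next fetch.

# Instruction cycle with an interrupt check at the end of each cycle.
def run(cpu, interrupt_pending, service_interrupt):
    while cpu.running:
        instruction = cpu.fetch()
        cpu.execute(instruction)
        if interrupt_pending():          # checked once per instruction cycle
            cpu.save_context()           # save PC, PSW and registers
            service_interrupt()          # run the interrupt service routine
            cpu.restore_context()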
Instruction Cycle with Interrupts
Design Issues

• How do you identify the module issuing the


interrupt?
• How do you deal with multiple interrupts?
– i.e. an interrupt handler being interrupted
Identifying Interrupting Module
• Different line for each module
– Limits number of devices
• Software poll
– CPU asks each module in turn
– Slow
Identifying Interrupting Module
• Software poll
– CPU branches to an interrupt service routine
– Poll each I/O module to determine which
module caused the interrupt
– Can be done by separate command line,
TESTI/O
• The processor raises TESTI/O and places the
address
• The I/O module responds positively if it set the
interrupt
– Alternatively, by an addressable status register
• The processor reads the status register to identify
the interrupting module
Identifying Interrupting Module
• Daisy Chain or Hardware poll
– Interrupt Acknowledge sent down a chain
– Module responsible places vector on bus
– CPU uses vector to identify handler routine
• Bus Master
– Module must claim the bus before it can raise
interrupt
– e.g. PCI & SCSI
Identifying Interrupting Module
• Daisy Chain or Hardware poll
– Interrupt Acknowledge sent down a chain
– Module responsible places vector on bus
– CPU uses vector to identify handler routine
Multiple Interrupts
• Each interrupt line has a priority
• Higher priority lines can interrupt lower
priority lines
Example - PC Bus
• 80x86 has one interrupt line
• 8086 based systems use one 8259A
interrupt controller
• 8259A has 8 interrupt lines
Sequence of Events
• 8259A accepts interrupts
• 8259A determines priority
• 8259A signals 8086 (raises INTR line)
• CPU Acknowledges
• 8259A puts correct vector on data bus
• CPU processes interrupt
82C59A Interrupt Controller
Three Techniques
Direct Memory Access (DMA)
• Interrupt driven and programmed I/O
require active CPU intervention
– Transfer rate is limited
– CPU is tied up
• DMA is the answer
DMA Function
• Additional Module (hardware) on bus
• DMA controller takes over from CPU for I/O
Typical DMA Module Diagram
DMA Operation
• CPU tells DMA controller:-
– Read/Write
– Device address
– Starting address of memory block for data
– Amount of data to be transferred
• CPU carries on with other work
• DMA controller deals with transfer
• DMA controller sends interrupt when
finished
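
A sketch of what the CPU hands to the DMA controller and what the controller then does on its own (all names are hypothetical and the model is simplified to one word per bus cycle; the real 8237A programming model described later differs).

# What the CPU programs into the DMA controller, and the transfer the
# controller then performs without further CPU involvement.
def dma_transfer(memory, device_read, start_address, count, raise_interrupt):
    address = start_address
    for _ in range(count):               # CPU is free to do other work meanwhile
        memory[address] = device_read()  # controller moves data directly to memory
        address += 1
    raise_interrupt()                    # tell the CPU the whole block is done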
DMA Transfer

• DMA controller takes over bus


• Transfer of data
• Not an interrupt
– CPU does not switch context
• CPU suspended just before it accesses bus
– i.e. before an operand or data fetch or a data
write
• Slows down CPU but not as much as CPU
doing transfer
DMA and Interrupt Breakpoints During an
Instruction Cycle
DMA Configurations (1)

• Single Bus, Detached DMA controller


• Each transfer uses bus twice
– I/O to DMA then DMA to memory
• CPU is suspended twice
DMA Configurations (2)

• Single Bus, Integrated DMA controller


• Controller may support >1 device
• Each transfer uses bus once
– DMA to memory
• CPU is suspended once
DMA Configurations (3)

• Separate I/O Bus


• Bus supports all DMA enabled devices
• Each transfer uses bus once
– DMA to memory
• CPU is suspended once
Intel 8237A DMA Controller
• Interfaces to 80x86 family and DRAM
• When DMA module needs buses it sends HOLD signal to processor
• CPU responds HLDA (hold acknowledge)
– DMA module can use buses
• E.g. transfer data from memory to disk
1. Device requests service of DMA by pulling DREQ (DMA request) high
2. DMA puts high on HRQ (hold request),
3. CPU finishes present bus cycle (not necessarily present instruction) and
puts high on HLDA (hold acknowledge). HOLD remains active for
duration of DMA
4. DMA activates DACK (DMA acknowledge), telling device to start transfer
5. DMA starts transfer by putting address of first byte on address bus and
activating MEMR; it then activates IOW to write to peripheral. DMA
decrements counter and increments address pointer. Repeat until count
reaches zero
6. DMA deactivates HRQ, giving bus back to CPU
8237 DMA Usage of Systems Bus
Reference
Computer Organization and Architecture –
Designing for Performance
William Stallings, Seventh Edition

Chapter 7: Input/Output
Page No.: 200 - 227
