unit-3 CPU Control unit Design
Introduction to CPU
1. Introduction to CPU
a. Introduction to CPU
b. Introduction to CPU
c. Register Organization
d. Control and Status Register
e. Processor Status Word
f. Concept of Program Execution
g. Interrupts
2. Processor Organization
a) Processor Organization
b) Storing a word into memory
c) Register Transfer Operation
d) Performing the arithmetic or logic operation
e) Multiple Bus Organization
f) Two bus structure
g) Three bus structure
Introduction to CPU
The operations or tasks that the CPU must perform are:
• Fetch Instruction: The CPU reads an instruction from memory.
• Fetch Data: The execution of an instruction may require reading data from
memory or I/O module.
The CPU is connected to the rest of the system through the system bus, over which data and
information are transferred between the CPU and the other components of the system. The
system bus may have three components:
Data Bus:
The data bus is used to transfer data between main memory and the CPU.
Address Bus:
The address bus is used to access a particular memory location by placing the address of that
location on the bus.
Control Bus:
The control bus carries the control signals generated by the CPU to the different parts of
the system. For example, memory read is a signal generated by the CPU to indicate that a memory
read operation has to be performed. This signal is transferred over the control bus to the memory
module to indicate the required operation.
There are three basic components of the CPU: the register bank, the ALU and the control unit. Data
moves between these units over an internal CPU bus, which transfers data between the various
registers and the ALU.
The internal organization of the CPU at a more abstract level is shown in Figure 5.1 and Figure
5.2.
Register Organization
A computer system employs a memory hierarchy. At the highest level of the hierarchy, memory is
faster, smaller and more expensive. Within the CPU, there is a set of registers which can be
treated as memory at the highest level of the hierarchy. The registers in the CPU can be
categorized into two groups:
• User-visible registers: These enable the machine- or assembly-language
programmer to minimize main memory references by optimizing the use of registers.
• Control and status registers: These are used by the control unit to control
the operation of the CPU. Operating system programs may also use these in
privileged mode to control the execution of programs.
User-visible Registers:
The user-visible registers can be categorized as follows:
General Purpose Registers
Data Registers
Address Registers
Condition Codes
General-purpose registers can be assigned to a variety of functions by the programmer. In some
cases, general-purpose registers can be used for addressing functions (e.g., register indirect,
displacement).
In other cases, there is a partial or clean separation between data registers and address registers.
Data registers may be used to hold only data and cannot be employed in the calculation of an
operand address.
Address registers may be somewhat general purpose, or they may be devoted to a particular
addressing mode. Examples include the following:
• Segment pointer: In a machine with segment addressing, a segment
register holds the address of the base of the segment. There may be multiple
registers, one for the code segment and one for the data segment.
• Index registers: These are used for indexed addressing and may be
autoindexed.
• Stack pointer: If there is user visible stack addressing, then typically the
stack is in memory and there is a dedicated register that points to the top of the
stack.
Condition codes (also referred to as flags) are bits set by the CPU hardware as the result of
operations. For example, an arithmetic operation may produce a positive, negative, zero or
overflow result. In addition to the result itself being stored in a register or memory, a condition
code is also set. The code may subsequently be tested as part of a conditional branch operation.
Condition code bits are collected into one or more registers.
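As a rough illustration of how condition codes follow from an arithmetic result, the sketch below adds two bit patterns and derives four flags. The flag names N, Z, C and V, the 8-bit default width and the function name add_with_flags are assumptions made for this example; the notes do not fix a particular flag set.

```python
def add_with_flags(a, b, bits=8):
    """Add two bit patterns of the given width and derive condition codes.

    Hypothetical helper for illustration: N = negative, Z = zero,
    C = carry out, V = signed (two's-complement) overflow.
    """
    mask = (1 << bits) - 1          # e.g. 0xFF for 8 bits
    sign = 1 << (bits - 1)          # position of the sign bit
    a &= mask
    b &= mask
    full = a + b                    # unbounded sum, used to detect carry
    result = full & mask            # value actually stored in the register
    flags = {
        "N": 1 if result & sign else 0,
        "Z": 1 if result == 0 else 0,
        "C": 1 if full > mask else 0,
        # Overflow: both operands have the same sign, the result does not.
        "V": 1 if (~(a ^ b) & (a ^ result) & sign) else 0,
    }
    return result, flags


result, flags = add_with_flags(0x7F, 0x01)
print(hex(result), flags)
```

A later conditional branch would then test one of these bits instead of re-examining the result itself.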
Control and Status Register
There are a variety of CPU registers that are employed to control the operation of the CPU. Most
of these, on most machines, are not visible to the user.
Different machines will have different register organizations and use different terminology. We
will discuss here the most commonly used registers which are part of most of the machines.
Four registers are essential to instruction execution:
Program Counter (PC): Contains the address of an instruction to be fetched. Typically, the PC is
updated by the CPU after each instruction fetch so that it always points to the next instruction
to be executed. A branch or skip instruction will also modify the contents of the PC.
Instruction Register (IR): Contains the instruction most recently fetched. The fetched instruction
is loaded into the IR, where the opcode and operand specifiers are analyzed.
Memory Address Register (MAR): Contains the address of the main memory location from
which information has to be fetched or to which information has to be stored. The MAR is connected
directly to the address bus.
Memory Buffer Register (MBR): Contains a word of data to be written to memory or the word
most recently read. The MBR is connected directly to the data bus. It is also known as the
Memory Data Register (MDR).
Apart from these specific registers, there may be some temporary registers which are not visible
to the user. For example, there may be temporary buffering registers at the boundary to the ALU;
these registers serve as input and output registers for the ALU and exchange data with the MBR
and user-visible registers.
3. Carry out the actions specified by the instruction stored in the IR.
The first two steps are usually referred to as the fetch phase, and step 3 is known as the
execution phase. The fetch cycle involves reading the next instruction from memory into
the CPU and updating the contents of the program counter. In the execution phase,
the CPU interprets the opcode and performs the indicated operation. The instruction fetch and
execution phases together are known as the instruction cycle. The basic instruction cycle is shown
in Figure 5.3.
Figure 5.3: Basic Instruction cycle
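The fetch and execution phases can be sketched as a small loop. This is only a toy model under stated assumptions: the opcodes LOAD, ADD, JUMP and HALT, the single accumulator acc, and the tuple encoding of instructions are all invented for the example and are not part of the notes.

```python
def run(memory, max_steps=100):
    """Toy instruction cycle: fetch, update PC, then execute.

    memory holds instructions as (opcode, operand) tuples and data as ints.
    The instruction set here is hypothetical.
    """
    pc, acc = 0, 0
    for _ in range(max_steps):
        ir = memory[pc]            # fetch phase: read instruction addressed by PC
        pc += 1                    # fetch phase: PC now points to the next instruction
        opcode, operand = ir       # execution phase: interpret the opcode
        if opcode == "LOAD":
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "JUMP":
            pc = operand           # a branch overwrites the PC
        elif opcode == "HALT":
            return acc, pc
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    raise RuntimeError("program did not halt")


program = [("LOAD", 3), ("ADD", 4), ("HALT", 0), 7, 35]
print(run(program))
```

Each pass through the loop body corresponds to one instruction cycle.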
In cases where an instruction occupies more than one word, step 1 and step 2 can be repeated as
many times as necessary to fetch the complete instruction. In these cases, the execution of an
instruction may involve one or more operands in memory, each of which requires a memory
access. Further, if indirect addressing is used, then additional memory accesses are required.
The fetched instruction is loaded into the instruction register. The instruction contains bits that
specify the action to be performed by the processor. The processor interprets the instruction and
performs the required action. In general, the actions fall into four categories:
• Processor-memory: Data may be transferred from processor to memory or
from memory to processor.
The execution cycle of a particular instruction may involve more than one reference to memory.
Also, instead of memory references, an instruction may specify an I/O operation. With these
additional considerations, the basic instruction cycle can be expanded into the more detailed view
shown in Figure 5.4. The figure is in the form of a state diagram.
Figure 5.4: Instruction cycle state diagram.
Interrupts
Virtually all computers provide a mechanism by which other modules (I/O, memory, etc.) may
interrupt the normal processing of the processor. The most common classes of interrupts are:
Program: Generated by some condition that occurs as a result of an instruction
execution, such as arithmetic overflow, division by zero, attempt to
execute an illegal machine instruction, and reference outside the user's
allowed memory space.
Timer: Generated by a timer within the processor. This allows the operating
system to perform certain functions on a regular basis.
Consider an I/O operation, say an output operation such as printing some information on a printer.
A printer is a much slower device than the CPU. The CPU puts some information in the output buffer.
While the printer is busy printing this information from the output buffer, the CPU would otherwise
lie idle. During this time the CPU can perform some other task which does not involve the memory bus.
When the external device becomes ready to be serviced, that is, when it is ready to accept more
data from the processor, the I/O module for that external device sends an interrupt request signal
to the processor. The processor responds by suspending operation of the current program,
branching off to a program to service the particular I/O device (known as an interrupt handler),
and resuming the original execution after the device is serviced.
From the point of view of the user program, an interrupt is just that: an interruption of the
normal sequence of execution. When the interrupt processing is completed, execution resumes.
To accommodate interrupts, an interrupt cycle is added to the instruction cycle, as shown
in Figure 5.5. In the interrupt cycle, the processor checks whether any interrupts have occurred,
indicated by the presence of an interrupt signal. If no interrupts are pending, the processor
proceeds to the fetch cycle and fetches the next instruction of the current program. If an interrupt
is pending, the processor does the following:
1. It suspends execution of the current program and saves its context.
This means saving the address of the next instruction to be executed (current contents of
the program counter) and any other data relevant to the processor's current activity.
2. It sets the program counter to the starting address of an interrupt handler routine.
3.
Processor Organization
There are several components inside a CPU, namely the ALU, control unit, general-purpose registers,
instruction register, etc. We will now see how these components are organized inside the CPU.
There are several ways to place these components and interconnect them. One such organization
is shown in Figure 5.6.
In this case, the arithmetic and logic unit (ALU) and all CPU registers are connected via a single
common bus. This bus is internal to the CPU and is used to transfer information
between the different components of the CPU. This organization is termed a single-bus
organization, since only one internal bus is used for transferring information between the different
components of the CPU. There are also external buses that connect the CPU with the
memory module and I/O devices. The external memory bus is also shown in Figure 5.6,
connected to the CPU via the memory data register (MDR) and memory address register (MAR).
The number and function of registers R0 to R(n-1) vary considerably from one machine to
another. They may be made available to the programmer for general-purpose use. Alternatively,
some of them may be dedicated as special-purpose registers, such as index registers or stack
pointers.
In this organization, two registers, namely Y and Z, are used which are transparent to the user.
The programmer cannot directly access these two registers. They serve as input and output buffers
for the ALU during ALU operations, and the CPU uses them as temporary
storage for some instructions.
Figure 5.6 : Single bus organization of the data path inside the CPU
For the execution of an instruction, we need to perform an instruction cycle. An instruction
cycle consists of two phases:
• Fetch cycle and
• Execution cycle.
Most of the operation of a CPU can be carried out by performing one or more of the following
functions in some prespecified sequence:
1. Fetch the contents of a given memory location and load them into a CPU
register.
2. Store a word of data from a CPU register into a given memory location.
3. Transfer a word of data from one CPU register to another or to the ALU.
4. Perform an arithmetic or logic operation, and store the result in a CPU
register.
Now we will examine the way in which each of the above functions is implemented in a
computer.
Fetching a Word from Memory:
Information is stored in memory locations identified by their addresses. To fetch a word from
memory, the CPU has to specify the address of the memory location where this information is
stored and request a Read operation. The information may be either the data for an operation
or an instruction of a program held in main memory.
As an example, assume that the address of the memory location to be accessed is kept in register
R2 and that the memory contents are to be loaded into register R1. This is done by the following
sequence of operations:
1. MAR ← [R2]
2. Read
3. Wait for MFC (Memory Function Completed) signal
4. R1 ← [MDR]
The time required for step 3 depends on the speed of the memory unit. In general, the time
required to access a word from the memory is longer than the time required to perform any
operation within the CPU.
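The four-step read sequence can be sketched as follows. This is a simplified model, not a description of real hardware: registers are held in a dictionary, memory is a Python list, and the latency parameter (the number of CPU cycles before MFC arrives) is an assumption introduced for the example.

```python
def fetch_word(regs, memory, latency):
    """Model the memory read sequence; returns the cycles spent waiting.

    regs, memory and latency are illustrative stand-ins for the CPU
    registers, main memory and the memory access time.
    """
    regs["MAR"] = regs["R2"]            # 1. MAR <- [R2]
    waited = 0                          # 2. Read request issued
    while waited < latency:             # 3. wait for MFC; the CPU only counts cycles here
        waited += 1
    regs["MDR"] = memory[regs["MAR"]]   #    memory has delivered the word into MDR
    regs["R1"] = regs["MDR"]            # 4. R1 <- [MDR]
    return waited


regs = {"R1": 0, "R2": 3, "MAR": 0, "MDR": 0}
memory = [10, 20, 30, 40]
print(fetch_word(regs, memory, latency=5), regs["R1"])
```

The loop in step 3 is where the memory speed shows up: a larger latency lengthens the whole transfer even though the register moves themselves are unchanged.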
The scheme that is used here to transfer data from one device (memory) to another device (CPU)
is referred to as an asynchronous transfer.
This asynchronous transfer enables transfer of data between two independent devices that have
different speeds of operation. The data transfer is synchronised with the help of some control
signals. In this example, Read request and MFC signal are doing the synchronization task.
An alternative scheme is synchronous transfer. In this case all the devices are controlled by a
common clock (a continuously running clock of fixed frequency). Its pulses provide a
common timing signal to the CPU and the main memory. A memory operation is completed
during every clock period. Though the synchronous data transfer scheme leads to a simpler
implementation, it is difficult to accommodate devices with widely varying speeds. In such
cases, the clock period must be matched to the slowest device, which reduces the
speed of all the devices to that of the slowest one.
Storing a word into memory
The procedure for writing a word into a memory location is similar to that for reading one from
memory. The only difference is that the data word to be written is first loaded into the MDR, and
then the write command is issued.
As an example, assume that the data word to be stored in the memory is in register R1 and that
the memory address is in register R2. The memory write operation requires the following
sequence:
1. MAR ← [R2]
2. MDR ← [R1]
3. Write
4. Wait for MFC
In this case, steps 1 and 2 are independent and so can be carried out in any order. In
fact, steps 1 and 2 can be carried out simultaneously if the architecture allows it, that is,
if these two data transfers (memory address and data) do not use the same data path.
For both memory read and memory write operations, the total time depends on
the wait for the MFC signal, which in turn depends on the speed of the memory module.
There is scope to improve the performance of the CPU if it is allowed to perform some
other operation while waiting for the MFC signal. During this period, the CPU can execute other
instructions which do not require the use of the MAR and MDR.
For example, consider the instruction: "Add contents of memory location NUM to the
contents of register R1 and store the result in register R1." For simplicity, assume that the address
NUM is given explicitly in the address field of the instruction. That is, direct
addressing mode is used in this instruction.
Execution of this instruction requires the following actions:
1. Fetch the instruction
2. Fetch the first operand (contents of the memory location pointed at by the
address field of the instruction)
3. Perform the addition
4. Load the result into R1.
The following sequence of control steps is required to implement the above operation on the
single-bus architecture discussed in the earlier section.
Steps Actions
1. PCout, MARin, Read, Clear Y, Set carry-in to ALU, Add, Zin
2. Zout, PCin, Wait for MFC
3. MDRout, IRin
4. Address-field-of-IRout, MARin, Read
5. R1out, Yin, Wait for MFC
6. MDRout, Add, Zin
7. Zout, R1in
8. END
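To see how these eight steps move data around the single bus, the sketch below replays them on a very small datapath model. Several simplifications are assumed and are not from the notes: an instruction word is a Python tuple whose second element is the address field, the signal name IRaddr_out stands for Address-field-of-IRout, WMFC stands for Wait for MFC, and a memory read is treated as complete at the step that waits for MFC.

```python
ADD_MD_STEPS = [
    {"PCout", "MARin", "Read", "ClearY", "SetCarryIn", "Add", "Zin"},
    {"Zout", "PCin", "WMFC"},
    {"MDRout", "IRin"},
    {"IRaddr_out", "MARin", "Read"},
    {"R1out", "Yin", "WMFC"},
    {"MDRout", "Add", "Zin"},
    {"Zout", "R1in"},
    {"End"},
]


def step(cpu, memory, signals):
    """Apply one control step to the register dictionary cpu."""
    # Exactly one "out" signal may drive the shared internal bus.
    sources = {
        "PCout": lambda: cpu["PC"],
        "Zout": lambda: cpu["Z"],
        "MDRout": lambda: cpu["MDR"],
        "R1out": lambda: cpu["R1"],
        "IRaddr_out": lambda: cpu["IR"][1],   # address field of the instruction
    }
    drivers = [s for s in signals if s in sources]
    assert len(drivers) <= 1, "bus conflict: two registers gated onto the bus"
    bus = sources[drivers[0]]() if drivers else None
    if "ClearY" in signals:
        cpu["Y"] = 0
    alu = None
    if "Add" in signals:                       # ALU adds Y, the bus and the carry-in
        alu = cpu["Y"] + bus + (1 if "SetCarryIn" in signals else 0)
    for sig, reg in (("MARin", "MAR"), ("PCin", "PC"), ("IRin", "IR"),
                     ("Yin", "Y"), ("R1in", "R1")):
        if sig in signals:
            cpu[reg] = bus                     # "in" signals latch the bus value
    if "Zin" in signals:
        cpu["Z"] = alu
    if "WMFC" in signals:                      # read started earlier completes here
        cpu["MDR"] = memory[cpu["MAR"]]


def execute(cpu, memory, steps):
    for signals in steps:
        if "End" in signals:
            break
        step(cpu, memory, signals)
    return cpu


cpu = {"PC": 0, "IR": None, "MAR": 0, "MDR": 0, "Y": 0, "Z": 0, "R1": 5}
memory = {0: ("ADD_MD", 100), 100: 37}
execute(cpu, memory, ADD_MD_STEPS)
print(cpu["R1"], cpu["PC"])
```

Note how step 1 uses the ALU with Y cleared and the carry-in set to compute PC + 1 while the instruction read is already in progress, which is why the updated PC is available in Z by step 2.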
Branching
With the help of a branch instruction, control of the execution of the program is transferred
from one position to some other position, so that the sequential flow of control is
broken. Branching is accomplished by replacing the current contents of the PC with the branch
address, that is, the address of the instruction to which branching is required.
Consider a branch instruction in which the branch address is obtained by adding an offset X, which
is given in the address field of the branch instruction, to the current value of the PC.
Consider the following unconditional branch instruction:
JUMP X
The control sequence that enables execution of an unconditional branch instruction using the
single-bus organization is as follows:
Steps Actions
1. PCout, MARin, Read, Clear Y, Set carry-in to ALU, Add, Zin
2. Zout, PCin, Wait for MFC
3. MDRout, IRin
4. PCout, Yin
5. Address-field-of-IRout, Add, Zin
6. Zout, PCin
7. End
Execution starts as usual with the fetch phase, ending with the instruction being loaded into the
IR in step 3. To execute the branch instruction, the execution phase starts in step 4.
In Step 4
The contents of the PC are transferred to register Y.
In Step 5
The offset X of the instruction is gated to the bus and the addition operation is performed.
In Step 6
The result of the addition, which represents the branch address is loaded into the PC.
In Step 7
It generates the End signal to indicate the end of execution of the current instruction.
Now consider a conditional branch instruction instead of an unconditional branch. In this case,
we need to check the status of the condition codes between steps 3 and 4, i.e., before adding the
offset value to the PC contents.
For example, if the instruction decoding circuitry interprets the contents of the IR as a Branch on
Negative (BRN) instruction, the control unit proceeds as follows. First the condition code register
is checked. If bit N (negative) is equal to 1, the control unit proceeds with steps 4 through 7
of the control sequence of the unconditional branch instruction.
If, on the other hand, N is equal to 0, an End signal is issued.
This in effect terminates execution of the branch instruction and causes the instruction
immediately following the branch instruction to be fetched when a new fetch operation is
performed.
Therefore, the control sequence for the conditional branch instruction BRN can be obtained
from the control sequence of the unconditional branch instruction by replacing step 4 with
4. If N = 0 then End
If N = 1 then PCout, Yin
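The effect of the modified step 4 on the program counter can be summarised in a few lines. The function name and arguments below are hypothetical; pc_after_fetch stands for the PC value after the fetch phase has already advanced it.

```python
def branch_on_negative(pc_after_fetch, offset, n_flag):
    """Return the PC value after a BRN instruction completes.

    Illustrative model only: register transfers are reduced to assignments.
    """
    if not n_flag:
        return pc_after_fetch      # step 4: N = 0, End issued, PC left unchanged
    y = pc_after_fetch             # step 4: PCout, Yin
    z = y + offset                 # step 5: offset gated to the bus and added to Y
    return z                       # step 6: Zout, PCin


print(branch_on_negative(11, 20, n_flag=1))
print(branch_on_negative(11, 20, n_flag=0))
```

When the condition fails, the next fetch simply uses the PC left behind by the fetch phase, which is the instruction following the branch.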
Consider the sequence of control signals required to execute the ADD instruction
that was explained in the previous lecture. It is obvious that eight non-overlapping time
slots are required for proper execution of the instruction represented by this
sequence.
Each time slot must be at least long enough for the function specified in the
corresponding step to be completed. The control unit is implemented with
hardwired devices, and every device has a propagation delay, so some time is
needed after an input signal is applied before a stable signal appears at the
output. Determining the length of each time slot is therefore a complicated design task.
For the moment, for simplicity, let us assume that all slots are equal in
time duration. The required controller may then be implemented using
a counter driven by a clock.
Each state, or count, of this counter corresponds to one of the steps of the
control sequence of the instructions of the CPU.
In the previous lecture, we gave the control sequences for execution
of only two instructions (one for add and the other for branch). In the same way, we
need to design the control sequences of all the instructions.
Looking at the design of the CPU, we can see that there are various
instructions for the add operation. For example,
ADD NUM R1    Add the contents of the memory location specified by NUM to the contents of register R1.
The control sequences for execution of these two ADD instructions are
different. Of course, the fetch phase of all the instructions remains the same.
It is clear that the control signals depend on the instruction, i.e., the contents of the
instruction register. It is also observed that execution of some instructions
depends on the contents of the condition code or status flag register, as is the case
for a conditional branch instruction.
Hence, the required control signals are uniquely determined by the following information:
o Contents of the control counter.
o Contents of the instruction register.
o Contents of the condition code and other status flags.
The structure of the control unit can be represented in a simplified view as a block
diagram. The detailed hardware involved may be explored step by step. The simplified view of
the control unit is given in Figure 5.10.
The decoder/encoder block is simply a combinational circuit that generates the required control
outputs depending on the state of all its inputs.
The decoder part of the decoder/encoder block provides a separate signal line for each control step, or
time slot, in the control sequence. Similarly, the output of the instruction decoder consists of a
separate line for each machine instruction. For the instruction loaded in the IR, one of the output
lines INS1 to INSm is set to 1 and all other lines are set to 0.
The detailed view of the control unit organization is shown in the Figure 5.11.
Figure 5.11: Detailed view of Control Unit organization
All input signals to the encoder block should be combined to generate the individual control
signals.
In the previous section, we gave the control sequence of the instruction
"Add contents of memory location addressed in memory direct mode to register R1" (ADD_MD)
and the control sequence for an unconditional branch instruction (BR);
we also discussed Branch on Negative (BRN).
Consider these three CPU instructions: ADD_MD, BR and BRN.
The control unit is required to generate many control signals. These come out
of the encoder circuit of the control signal generator. The control signals are: PCin, PCout, Zin,
Zout, MARin, ADD, END, etc.
By looking at the above three instructions, we can write the logic function for Zin as:
Zin = T1 + T6 . ADD_MD + T5 . BR + T5 . BRN + ...
For all instructions, the control signal Zin is needed in time step T1 to enable the input to register
Z. It is also needed in time cycle T6 of the ADD_MD instruction, in time cycle T5 of the BR
instruction, and so on.
Similarly, the Boolean logic function for the ADD signal is
ADD = T1 + T6 . ADD_MD + T5 . BR + ...
These logic functions can be implemented by a two-level combinational circuit of AND and
OR gates.
Similarly, the END control signal is generated by the logic function:
END = T8 . ADD_MD + T7 . BR + (T7 . N + T4 . N') . BRN + ...
where N' denotes the complement of N. The END signal indicates the end of the execution of an
instruction, so it can be
used to start a new instruction fetch cycle by resetting the control step counter to its starting
value.
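The sum-of-products expressions for Zin and END can be checked by writing them directly as Boolean code. The sketch covers only the three instructions discussed (ADD_MD, BR, BRN), so the trailing terms of each expression are omitted. It also assumes that the T4 term of END is paired with the complement of N, which is consistent with the BRN sequence issuing End in step 4 when N = 0. The function name control_signals is made up for the example.

```python
def control_signals(step, instr, n):
    """Evaluate partial Zin and END logic for one control step.

    step  : current count of the control step counter (1..8)
    instr : decoded instruction name, one of "ADD_MD", "BR", "BRN"
    n     : value of the N (negative) condition code
    """
    T = {k: step == k for k in range(1, 9)}                  # step decoder outputs
    I = {name: instr == name for name in ("ADD_MD", "BR", "BRN")}
    zin = (T[1]
           or (T[6] and I["ADD_MD"])
           or (T[5] and I["BR"])
           or (T[5] and I["BRN"]))
    end = ((T[8] and I["ADD_MD"])
           or (T[7] and I["BR"])
           or (((T[7] and n) or (T[4] and not n)) and I["BRN"]))
    return {"Zin": bool(zin), "END": bool(end)}


print(control_signals(4, "BRN", n=False))
print(control_signals(6, "ADD_MD", n=False))
```

Each `and` corresponds to an AND gate on the first level and each `or` to an input of the second-level OR gate.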
The partial circuit diagrams for generating the Zin and END signals are shown in Figure 5.12 and
Figure 5.13 respectively.
The signals ADD_MD, BR, BRN, etc. come from the instruction decoder circuit, which
depends on the contents of the IR.
The signals T1, T2, T3, etc. come from the step decoder, which depends on the control step counter.
The signal N (Negative) comes from the condition code register.
When the wait for MFC (WMFC) signal is generated, the CPU does no work and waits
for an MFC signal from the memory unit. In this case, the desired effect is to delay the initiation of
the next control step until the MFC signal is received from the main memory. This can be
achieved by inhibiting the advancement of the control step counter for the required period.
Let us assume that the control step counter is controlled by a signal called RUN.
By looking at the control sequences of all the instructions, the WMFC signal is generated as:
WMFC = T2 + T5 . ADD_MD + ...
The RUN signal is generated with the help of WMFC signal and MFC signal. The arrangement
is shown in the Figure 5.14.
Figure 5.14: Generation of RUN signal
The MFC signal is generated by the main memory, whose operation is independent of the CPU
clock. Hence MFC is an asynchronous signal that may arrive at any time relative to the CPU
clock. It can be synchronized with the CPU clock with the help of a D flip-flop.
When the WMFC signal is high, the RUN signal is low. The RUN signal is combined with the master
clock pulse (MCLK) through an AND gate. When RUN is low, the CLK signal remains low, and
the control step counter does not advance.
When the MFC signal is received, the RUN signal becomes high, the CLK signal follows
the MCLK signal, and the control step counter advances. In
the next control step, the WMFC signal goes low and the control unit operates normally until the next
memory access signal is generated.
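The stalling behaviour can be sketched in a few lines. The model assumes RUN = NOT (WMFC AND NOT MFC), which matches the description above but is not taken from the figure itself, and it ignores the D flip-flop used for synchronization. The memory_latency parameter (master-clock cycles before MFC arrives) and both function names are illustrative.

```python
def run_signal(wmfc, mfc):
    """RUN is low only while the CPU is waiting and MFC has not yet arrived."""
    return not (wmfc and not mfc)


def trace_steps(wmfc_steps, memory_latency, n_cycles):
    """Return the control step seen in each master-clock cycle.

    wmfc_steps     : set of step numbers in which WMFC is asserted
    memory_latency : cycles a WMFC step is held before MFC is received
    """
    step, waited, trace = 1, 0, []
    for _ in range(n_cycles):
        trace.append(step)
        wmfc = step in wmfc_steps
        mfc = wmfc and waited >= memory_latency
        if run_signal(wmfc, mfc):
            step += 1          # CLK = MCLK AND RUN: the counter advances
            waited = 0
        else:
            waited += 1        # clock gated off: the counter holds its value
    return trace


print(trace_steps(wmfc_steps={2}, memory_latency=2, n_cycles=6))
```

In the printed trace, step 2 is repeated while the counter is inhibited, and counting resumes normally once MFC is received.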
The timing diagram for an instruction fetch operation is shown in the Figure 5.15.
Figure 5.15: Timing of control signals during instruction fetch
Programmable Logic Array
In this discussion, we have presented a simplified view of the way in which the sequence of
control signals needed to fetch and execute instructions may be generated.
It is observed from the discussion that as the number of instructions increases, the number of
required control signals also increases.
In VLSI technology, structures that involve regular interconnection patterns are much easier to
implement than random connections.
One such regular structure is the PLA (programmable logic array). A PLA is an array
of AND gates followed by an array of OR gates. If the control signals are expressed in
sum-of-products form, they can be implemented with a PLA.
The PLA implementation of the control unit is shown in Figure 5.16.
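The AND-array followed by OR-array structure can be mimicked with two small tables. The sketch below is a software analogy rather than a hardware description: the names pla, AND_PLANE and OR_PLANE are invented, and only the product terms that appear explicitly in the partial Zin, ADD and WMFC expressions of these notes are included.

```python
def pla(inputs, and_plane, or_plane):
    """Evaluate a PLA given as an AND plane and an OR plane.

    and_plane : list of product terms, each a list of (input name, required value)
    or_plane  : dict mapping an output name to the indices of its product terms
    """
    products = [all(inputs[name] == want for name, want in term)
                for term in and_plane]
    return {out: any(products[i] for i in terms)
            for out, terms in or_plane.items()}


AND_PLANE = [
    [("T1", 1)],                      # P0 = T1
    [("T6", 1), ("ADD_MD", 1)],       # P1 = T6 . ADD_MD
    [("T5", 1), ("BR", 1)],           # P2 = T5 . BR
    [("T2", 1)],                      # P3 = T2
    [("T5", 1), ("ADD_MD", 1)],       # P4 = T5 . ADD_MD
]
OR_PLANE = {
    "Zin": [0, 1, 2],                 # Zin  = P0 + P1 + P2 (partial)
    "ADD": [0, 1, 2],                 # ADD  = P0 + P1 + P2 (partial)
    "WMFC": [3, 4],                   # WMFC = P3 + P4      (partial)
}

inputs = dict.fromkeys(["T1", "T2", "T5", "T6", "ADD_MD", "BR"], 0)
inputs.update(T5=1, ADD_MD=1)         # control step 5 of ADD_MD
print(pla(inputs, AND_PLANE, OR_PLANE))
```

Because product terms such as P0 are shared between Zin and ADD, the regular array needs each AND term only once, which is part of what makes the structure compact.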
Microprogrammed Control
In hardwired control, we saw how all the control signals required inside the CPU can be
generated using a state counter and a PLA circuit.
There is an alternative approach by which the control signals required inside the CPU can be
generated. This alternative approach is known as the microprogrammed control unit.
In microprogrammed control unit, the logic of the control unit is specified by a microprogram.
A microprogram consists of a sequence of instructions in a microprogramming language. These
are instructions that specify microoperations.
A microprogrammed control unit is a relatively simple logic circuit that is capable of (1)
sequencing through microinstructions and (2) generating control signals to execute each
microinstruction.
The concept of a microprogram is similar to that of a computer program. In a computer program, the
complete set of instructions is stored in main memory, and during execution the instructions are
fetched from main memory one after another. The sequence of instruction fetches is controlled
by the program counter (PC).
Microprograms are stored in microprogram memory, and their execution is controlled by the
microprogram counter (µPC).
A microprogram consists of microinstructions, which are strings of 0s and 1s. At any
instant, we read the contents of one location of microprogram memory, which is
a microinstruction. Each output line (data line) of the microprogram memory
corresponds to one control signal. If the content of the memory cell is 0, the
signal is not generated; if the content of the memory cell is 1, that control
signal is generated at that instant of time.
First let me define the different terms related to the microprogrammed control unit.
Control Word (CW) :
A control word is defined as a word whose individual bits represent the various control signals.
Therefore each of the control steps in the control sequence of an instruction defines a unique
combination of 0s and 1s in the CW.
A sequence of control words (CWs) corresponding to the control sequence of a machine
instruction constitutes the microprogram for that instruction.
The individual control words in this microprogram are referred to as microinstructions.
The microprograms corresponding to the instruction set of a computer are stored in a special
memory which will be referred to as the microprogram memory. The control words related to an
instruction are stored in microprogram memory.
The control unit can generate the control signals for any instruction by sequentially reading the
CWs of the corresponding microprogram from the microprogram memory. To read the control
words sequentially from the microprogram memory, a microprogram counter (µPC) is needed.
The basic organization of a microprogrammed control unit is shown in the Figure 5.17.
The "starting address generator" block is responsible for loading the starting address of the
microprogram into the µPC every time a new instruction is loaded into the IR.
The µPC is then automatically incremented by the clock, and it reads the successive
microinstructions from memory.
Figure 5.17: Basic organization of a microprogrammed control
Each microinstruction basically provides the required control signal at that time step. The
microprogram counter ensures that the control signal will be delivered to the various parts of the
CPU in correct sequence.
We have some instructions whose execution depends on the status of condition codes and status
flags, for example, the branch instruction. During execution of a branch instruction, a
decision must be made between alternative actions.
To handle such instructions with microprogrammed control, the design of the control unit is
based on the concept of conditional branching in the microprogram. For that, it is required to
include some conditional branch microinstructions.
A conditional microinstruction must specify the address in microprogram
memory to which control is to be directed. This is known as the branch address. Apart from the branch
address, these microinstructions can specify which of the status flags, condition codes, or
possibly bits of the instruction register should be checked as a condition for branching to take
place.
To support microprogram branching, the organization of control unit should be modified to
accommodate the branching decision.
To generate the branch address, it is required to know the status of the condition codes and status
flag. To generate the starting address, we need the instruction which is present in IR. But for
branch address generation we have to check the content of condition codes and status flag.
The organization of control unit to enable conditional branching in the microprogram is shown in
the Figure 5.18.
Figure 5.18: Organization of microprogrammed control with conditional branching.
The control bits of the microinstruction word which specify the branch conditions and address
are fed to the "Starting and branch address generator" block.
This block performs the function of loading a new address into the µPC when the condition of the
branch instruction is satisfied.
In a computer program, we have seen that execution of every instruction consists of two parts:
the fetch phase and the execution phase of the instruction. It is also observed that the fetch phase
of all instructions is the same.
In a microprogrammed control unit, a common microprogram is used to fetch the
instruction. This microprogram is stored in a specific location, and execution of each instruction
starts from that memory location.
At the end of the fetch microprogram, the starting address generator unit calculates the appropriate
starting address of the microprogram for the instruction which is currently present in the IR. After
that, the µPC controls the execution of the microprogram, which generates the appropriate control
signals in the proper sequence.
During the execution of a microprogram, the microprogram counter (MPC) is incremented every time
a new microinstruction is fetched from the microprogram memory, except in the following situations:
1. When an End microinstruction is encountered, the MPC is
loaded with the address of the first control word (CW) in the microprogram for
the instruction fetch cycle.
2. When a new instruction is loaded into the IR, the MPC is loaded with the starting
address of the microprogram for that instruction.
3. When a branch microinstruction is encountered, and the branch condition is
satisfied, the MPC is loaded with the branch address.
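The update rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name next_mpc, its argument names and the order in which the three exceptions are tested are assumptions, not part of the design described in these notes.

```python
# Illustrative sketch of the microprogram counter update rule.
# All names and the priority order of the checks are assumptions.
FETCH_START = 0  # address of the first CW of the fetch microprogram


def next_mpc(mpc, end=False, ir_loaded=False, start_addr=None,
             branch=False, cond=False, branch_addr=None):
    """Return the address of the next microinstruction."""
    if end:                      # End encountered: go back to the fetch microprogram
        return FETCH_START
    if ir_loaded:                # new instruction in IR: jump to its microprogram
        return start_addr
    if branch and cond:          # branch microinstruction with condition satisfied
        return branch_addr
    return mpc + 1               # default: next sequential microinstruction
```

For example, next_mpc(10, branch=True, cond=False, branch_addr=30) simply returns 11, because an unsatisfied branch condition leaves the counter on its sequential path.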
Let us examine the contents of the microprogram memory and how the microprogram of each
instruction is organized in it. Consider the two examples that were used in the previous
lecture. The first example is the control sequence for execution of the
instruction "Add contents of memory location addressed in memory direct mode to register R1".
Steps Actions
1. PCout, MARin, Read, Clear Y, Set carry-in to ALU, Add, Zin
2. Zout, PCin, Wait For MFC
3. MDRout, IRin
4. Address-field-of-IRout, MARin, Read
5. R1out, Yin, Wait for MFC
6. MDRout, Add, Zin
7. Zout, R1in
8. END
3. MDRout, IRin
4. PCout, Yin
6. Zout, PCin
7. End
First consider the control signals required for the instruction fetch, which are the same for all
instructions. We list them in a particular order. The control words for the first three steps of
the above two instructions (the fetch cycle of each instruction) are as follows:
        PCout  MARin  Read  Clear Y  Set carry-in  Add  Zin  Zout  PCin  WMFC  MDRout  IRin
Step 1  1      1      1     1        1             1    1    0     0     0     0       0     ---
Step 2  0      0      0     0        0             0    0    1     1     1     0       0     ---
Step 3  0      0      0     0        0             0    0    0     0     0     1       1     ---
We store these three CWs in memory locations 0, 1 and 2. Each instruction starts from
memory location 0. The control word at microprogram memory location 2 (the third step)
stores the instruction in IR. The starting address generator circuit
then calculates the starting address of the microprogram for the instruction which is available in
IR.
Assume that the microprogram for the add instruction is stored from memory location 50 of the
microprogram memory. The partial contents from memory location 50 are as
follows:
Location 50:  0 1 1 0 0 0 0 0 0 0 0 0 ---
Location 51:  0 0 0 0 0 0 0 0 0 1 0 0 ---
and so on.
When the microprogram executes the End microinstruction of an instruction, it generates
the End control signal. This End control signal is used to load the microprogram counter (MPC)
with the starting address of the fetch microprogram (in our case, address 0 of the microprogram
memory). Now the CPU is ready to fetch the next instruction from main memory.
From the discussion, it is clear that microprograms are similar to computer programs, but they are
one level lower; that is why they are called microprograms.
For each instruction of the instruction set of the CPU, we will have a microprogram.
While executing a computer program, we fetch instruction by instruction from main memory,
which is controlled by the program counter (PC).
When we fetch an instruction from main memory, we execute the microprogram for that instruction
in order to execute it. A microprogram is nothing but a collection of
microinstructions. These microinstructions are fetched from the microprogram memory one after
another, and their sequence is maintained by the microprogram counter (MPC). Fetching a
microinstruction basically provides the required control signals at that time instant.
From the previous discussion, to design a microprogrammed control unit, we have to do the
following:
• For each instruction of the CPU, we have to write a microprogram to generate the control
signals. The microprograms are stored in microprogram memory (control store). The
starting address of each microprogram is known to the designer.
• Each microinstruction is nothing but a combination of 0’s and 1’s, which is known as a
control word. Each position of the control word specifies a particular control signal. A 0 in the
control word means that a low signal value is generated for that control signal at that
particular instant of time; similarly, a 1 indicates a high signal.
• To incorporate branching within the microprogram, a
branch address generator unit must be included. Both unconditional and conditional
branching can be achieved with the help of the microprogram. To incorporate conditional
branching, it is required to check the contents of the condition codes and status
flags.
• A microprogrammed control unit is very similar to a CPU. In the
CPU, the PC is used to fetch instructions from main memory; in the
control unit, the microprogram counter is used to fetch microinstructions from the control
store.
• But there are some differences between the two. When fetching an
instruction from main memory, we use two signals, MFC and WMFC. These
two signals are required to synchronize the CPU with main
memory. In general, main memory is a slower device than the CPU.
• In microprogrammed control the need for such signals is less obvious. The
control store is smaller than main memory, so it is possible to implement
the control store with a faster memory whose speed is almost the same as that of
the CPU.
• Since control stores are usually relatively small, it is feasible to
speed them up through costly circuits.
• If we could implement main memory with a faster device, it would also be possible to
eliminate the signals MFC and WMFC. But, in general, main memory is very
large, and it is not economically feasible to replace the whole main memory by a faster
memory to eliminate MFC and WMFC.
Grouping of control signals:
• It is observed that we need to store the information of each control signal
in control store. The status of a particular control signal is either high or low at a
particular instant of time.
• It is possible to reserve one bit position for each control signal. If there are
n control signals in a CPU, then the length of each control word is n. Since we
have one bit for each control signal, a large number of resources can be
controlled with a single microinstruction. This organization of microinstructions is
known as horizontal organization.
• If the machine structure allows parallel use of a number of resources,
then horizontal organization has an advantage: since more resources
can be accessed in parallel, the operating speed is also higher.
• If only a few resources can be accessed simultaneously, then most of
the contents of the control store are 0. Since the machine architecture does not provide
parallel access to resources, the corresponding control signals cannot be generated
simultaneously anyway. In such a situation, we can combine some control signals and group them
together. This will reduce the size of the control word. If we use a compact code to
specify only a small number of control functions in each microinstruction, then it
is known as vertical organization of microinstructions.
• In horizontal organization the control word is longest,
which is one extreme; in vertical organization the
control word is shortest, which is the other extreme.
• In case of horizontal organization, the implementation is simple, but in case of vertical
organization, implementation complexity increases due to the required decoder circuits.
Also the complexity of decoder depends on the level of grouping and encoding of the
control signal.
• Horizontal and Vertical organization represent the two organizational extremes in
microprogrammed control. Many intermediate schemes are also possible, where the
degree of encoding is a design parameter.
• We will explain the grouping of control signal with the help of an example. Grouping of
control signals depends on the internal organization of CPU.
• Assigning individual bits to each control signal is certain to lead to long microinstruction,
since the number of required control signals is normally large.
• However, only a few bits are set to 1 and therefore used for active gating in any given
microinstructions. This obviously results in low utilization of the available bit space.
• If we group the control signals into non-overlapping groups, then the size of the control
word is reduced.
The single bus architecture of CPU is shown in the Figure 5.19.
Figure 5.19: Single bus architecture of CPU
This CPU contains four general purpose registers R0, R1, R2 and R3. In addition there are three
other registers called SOURCE, DESTIN and TEMP. These are used for temporary storage
within the CPU and are completely transparent to the programmer; a computer programmer cannot
use these three registers.
For the proper functioning of this CPU, we need altogether 24 gating signals for the transfer of
information between the internal CPU bus and other resources like registers.
In addition to these register gating signals, we need some other control signals, which include the
Read, Write, Clear Y, Set carry-in, WMFC, and End signals. (Here we are restricting the control
signals for ease of discussion; in reality, the number of signals is larger.)
It is also necessary to specify the function to be performed by the ALU. Depending on the power of
the ALU, we need several control lines, one control signal for each function. Assume that the ALU
used in the design can perform 16 different operations such as ADD, SUBTRACT, AND,
OR, etc. So we need 16 different control lines.
The above discussion indicates that 46 (24 + 6 + 16) distinct signals are required. We therefore
need 46 bits in each microinstruction, so the size of the control word is 46.
Consider the microprogram pattern that is shown for the Add instruction. On average, 4 to 5
bits are set to 1 in each microinstruction and the rest of the bits are 0. Therefore, the bit
utilization is poor, and there is scope to improve it.
It is observed that most signals are not needed simultaneously and many signals are mutually
exclusive.
For example, only one function of the ALU can be activated at a time. In our case we are
considering 16 ALU operations. Instead of using 16 different signals for the ALU operation, we can
group them together and reduce the number of control signals. From digital logic, it follows
that instead of 16 different signals, we can use only 4 control bits for the ALU operation
and then use a 4 X 16 decoder to generate the 16 different ALU signals. Due to the use of a decoder,
there is a reduction in the size of the control word.
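The behaviour of such a decoder can be sketched as follows. This is a software model under simple assumptions, not a circuit description; the function name decode and its parameter n_inputs are hypothetical.

```python
def decode(code, n_inputs=4):
    """Model of an n-to-2^n decoder: exactly one output line is high.

    With n_inputs=4 this models the 4 X 16 decoder that turns a 4-bit
    ALU field into 16 mutually exclusive ALU control lines.
    """
    n_outputs = 1 << n_inputs
    if not 0 <= code < n_outputs:
        raise ValueError(f"code must be in 0..{n_outputs - 1}")
    return [1 if line == code else 0 for line in range(n_outputs)]
```

Calling decode(5) returns a list of 16 values in which only line 5 is 1, so 4 stored bits replace 16 stored bits for the ALU function.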
Another possibility for grouping control signals is the following: the source for a data transfer
must be unique, which means that it is not possible to gate the contents of two different registers
onto the bus at the same time. Similarly, the Read and Write signals to the memory cannot be
activated simultaneously.
This observation suggests grouping the signals so that all signals that are
mutually exclusive are placed in the same group. Thus a group can specify one micro-operation
at a time.
We then use a binary coding scheme to represent a given signal within a group.
For example, for 16 ALU functions, four bits are enough to encode the appropriate function.
A possible grouping of the 46 control signals that are required for the above-mentioned CPU is
given in Table 5.1.
Table 5.1: Grouping of the control signals
F1                  F2                 F3                F4                F5
(4 bits)            (3 bits)           (2 bits)          (2 bits)          (4 bits)
0000: No Transfer   000: No Transfer   00: No Transfer   00: No Transfer   0000: Add
0001: PCout         001: PCin          01: MARin         01: Yin           0001: Sub
0010: MDRout        010: IRin          10: MDRin         10: SOURCEin      0010: MULT
0011: Zout          011: Zin           11: TEMPin        11: DESTINin      0011: Div
0100: R0out         100: R0in                                              ...
0101: R1out         101: R1in
0110: R2out         110: R2in
0111: R3out         111: R3in
1000: SOURCEout
1001: DESTINout
1010: TEMPout
1011: ADDRESSout                                                           1111: XOR

F6              F7             F8              F9             F10
(2 bits)        (1 bit)        (1 bit)         (1 bit)        (1 bit)
00: no action   0: no action   0: carry-in=0   0: no action   0: continue
01: read        1: clear Y     1: carry-in=1   1: WMFC        1: end
10: write
This is one possible grouping of the signals; other groupings are also possible. Here all
out-gating signals of registers are placed in one group, because the contents of
only one register are allowed to go onto the internal bus at a time; otherwise there would be a
conflict of data.
The in-gating signals of registers, however, are divided into three different groups. This implies
that the contents of the bus may be stored into three different registers simultaneously; for
example, the contents of PC can be transferred to MAR and Z in the same step.
Due to this grouping, we are using 7 bits (3+2+2) for the in-gating signals. If we had
grouped them into one group, then only 4 bits would have been enough, but it would take more time
during execution. In that situation, two clock cycles would have been required to transfer the
contents of PC to MAR and Z.
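The bit counts quoted above can be checked with a short calculation. The sketch assumes, following Table 5.1, that every group reserves code 0 for "no transfer", so a group of k signals needs enough bits for the codes 0 to k; the helper name field_bits is hypothetical.

```python
def field_bits(n_signals):
    """Bits needed for n_signals mutually exclusive signals plus a code 0 for 'no transfer'.

    The largest code used is n_signals itself, so its bit length is the field width.
    """
    return n_signals.bit_length()


# In-gating signals per group in Table 5.1: F2 has 7, F3 has 3, F4 has 3.
three_groups = field_bits(7) + field_bits(3) + field_bits(3)   # 3 + 2 + 2
one_group = field_bits(7 + 3 + 3)                              # 13 signals in one field
```

Here three_groups evaluates to 7 and one_group to 4, which matches the trade-off described above: three bits are saved, at the price of only one in-gating signal per microinstruction.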
Therefore, the grouping of signals is a critical design parameter. If speed of operation is also a
design parameter, then less compression of the control word can be achieved.
In this grouping, the 46 control signals are grouped into 10 different groups (F1, F2, ..., F10)
and the size of the control word is 21. So the size of the control word is reduced from 46 to 21
bits, a reduction of more than 50%.
For proper decoding, we need the following decoders:
Groups F1 and F5: 4 X 16 decoder
Group F2: 3 X 8 decoder
Groups F3, F4 and F6: 2 X 4 decoder
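To make the grouped format concrete, the following sketch packs a set of active signals into a 21-bit control word using the fields of Table 5.1. It is only an illustration under stated assumptions: the identifiers FIELDS, encode and WORD_LENGTH are hypothetical, signal names are written without spaces, IRin is taken to have code 010 (the codes within a group must be distinct), and only the ALU operations named in the table are included.

```python
# (field name, width in bits, {signal: code}); code 0 is the default of every field.
FIELDS = [
    ("F1", 4, {"PCout": 1, "MDRout": 2, "Zout": 3, "R0out": 4, "R1out": 5,
               "R2out": 6, "R3out": 7, "SOURCEout": 8, "DESTINout": 9,
               "TEMPout": 10, "ADDRESSout": 11}),
    ("F2", 3, {"PCin": 1, "IRin": 2, "Zin": 3, "R0in": 4, "R1in": 5,
               "R2in": 6, "R3in": 7}),
    ("F3", 2, {"MARin": 1, "MDRin": 2, "TEMPin": 3}),
    ("F4", 2, {"Yin": 1, "SOURCEin": 2, "DESTINin": 3}),
    ("F5", 4, {"Add": 0, "Sub": 1, "MULT": 2, "Div": 3, "XOR": 15}),
    ("F6", 2, {"Read": 1, "Write": 2}),
    ("F7", 1, {"ClearY": 1}),
    ("F8", 1, {"CarryIn": 1}),
    ("F9", 1, {"WMFC": 1}),
    ("F10", 1, {"End": 1}),
]

WORD_LENGTH = sum(width for _, width, _ in FIELDS)


def encode(signals):
    """Pack the active signals into one control word (string of 0/1, F1 first)."""
    signals = set(signals)
    known = set().union(*(codes for _, _, codes in FIELDS))
    if signals - known:
        raise ValueError(f"unknown signals: {sorted(signals - known)}")
    word = ""
    for name, width, codes in FIELDS:
        chosen = [s for s in codes if s in signals]
        if len(chosen) > 1:  # signals in one group are mutually exclusive
            raise ValueError(f"{name}: {chosen} cannot be active together")
        code = codes[chosen[0]] if chosen else 0
        word += format(code, f"0{width}b")
    return word
```

With these assumptions, the first fetch step (PCout, MARin, Read, ClearY, CarryIn, Add, Zin) is encoded as 000101101000000011100, while asking for PCout and MDRout together is rejected because both belong to group F1.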
Microprogram Sequencing:
In a microprogrammed control unit,
• Each machine instruction can be implemented by a microroutine.
Sequencing Techniques:
Based on the current microinstruction, condition flags and the contents of the instruction
register, a control memory address must be generated for the next microinstruction. A wide
variety of techniques have been used; they can be grouped into three general categories:
• Two address fields
• Single address field
• Variable format
Two Address fields:
The branch control logic with two-address field is shown in the Figure 5.20.
Single address field:
The address selection signals determine which option is to be selected. This approach reduces the
number of address fields to one.
Variable format:
In variable format branch control logic one bit designates which format is being used. In one
format, the remaining bits are used to activate control signals. In the other format, some bits drive
the branch logic module, and the remaining bits provide the address. With the first format, the
next address is either the next sequential address or an address derived from the instruction
register. With the second format, either a conditional or unconditional branch is being specified.
The approach is shown in the Figure 5.22.
Figure 5.22: Branch control logic with variable format
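The variable format idea can be sketched as a small decoding routine. The bit layout used here is purely hypothetical (format bit first, then either control bits or a 2-bit branch logic field followed by the address); the notes do not fix the positions or widths, and the name decode_variable_format is made up for illustration.

```python
def decode_variable_format(word, cond_bits=2):
    """Split a microinstruction (string of 0/1) according to its leading format bit.

    Assumed layout: bit 0 is the format bit. Format 0 carries control bits only.
    Format 1 carries cond_bits bits for the branch logic, then the branch address.
    """
    fmt, rest = word[0], word[1:]
    if fmt == "0":
        return {"format": "control", "control_bits": rest}
    return {
        "format": "branch",
        "branch_logic_bits": rest[:cond_bits],
        "address": int(rest[cond_bits:], 2),
    }
```

For an 8-bit word such as 10100110, the routine reports a branch format with branch logic bits 01 and address 6; the same bits after a leading 0 would instead be treated as control signals.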
Address Generation:
We have looked at the sequencing problem from the point of view of format consideration and
general logic requirements. Another viewpoint is to consider the various ways in which the next
address can be derived or computed.
Various address generation techniques:
Explicit                Implicit
Two-field               Mapping
Unconditional branch    Addition
Conditional branch      Residual control
The address generation techniques can be divided into two classes: explicit and implicit.
In an explicit technique, the address is explicitly available in the microinstruction.
In an implicit technique, an additional logic circuit is used to generate the address.
With the two address field approach, a single address field or a variable format, various branch
instructions can be implemented with the explicit approaches.
In the implicit technique, mapping is required to get the address of the next microinstruction: the
opcode portion of a machine instruction must be mapped into a microinstruction address.
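A minimal sketch of the mapping idea is a lookup table from opcode to the starting address of its microprogram. Address 50 for the add instruction is the value used earlier in this unit; the opcode mnemonics, the second address and the names MAPPING and starting_address are made up for illustration.

```python
# Opcode -> starting address of the microprogram in the control store.
MAPPING = {
    "ADD": 50,     # starting address used in the earlier example
    "BRANCH": 60,  # hypothetical address, for illustration only
}


def starting_address(opcode):
    """Return the control store address at which the microprogram for opcode begins."""
    try:
        return MAPPING[opcode]
    except KeyError:
        raise ValueError(f"no microprogram for opcode {opcode!r}") from None
```

In hardware this table would typically be realized by a small ROM or logic network rather than by a dictionary; the sketch only shows the input and output of the mapping step.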
Q 1: What are the major components of CPU?
Q 2: What is the overall function of a processor's control unit?
Q 3: Provide a typical list of the inputs and outputs of a control unit.
Q 4: What are the basic tasks that must be performed by a CPU?
Q 5: Why are registers used in the CPU?
Q 6: Explain the use of the following registers-
a. Program counter
b. Instruction register
c. Memory address register
d. Memory buffer register
Q 7: What do you mean by flag bits? Explain the use of the following flags: sign, zero, carry,
overflow and equal.
Q 8: What are the main two phases of instruction execution?
Q 9: Give and explain the instruction cycle state diagram.
Q 10: Explain the tasks that can be performed during fetch phase of an instruction execution.
Q 11: Consider the single bus organization of the CPU that is explained in the lecture note.
Write the sequence of control steps required for each of the following instructions-
a. Subtract the number NUM from register R1
b. Subtract contents of memory location NUM from register R1
c. Subtract contents of memory location whose address is at memory location NUM from
register R1
Q 12: What is the use of control signal Memory Function Complete(MFC) and Wait for
Memory Function Complete(WMFC)?
Q 13: Give the organization of the control unit and explain each component.
Q 14: What is the relationship between instructions and micro-operation?
Q 15: What do you mean by horizontal and vertical organization of micro instruction?
Q 16: Why is a micro-program counter (MPC) needed in a micro-programmed control
architecture?
Q 17: Explain the following sequencing techniques for micro-programs: two address fields,
single address field and variable format.
Q 18: What are the advantages and disadvantages of hardwired and micro-programmed control?
Why is micro-programmed control becoming increasingly popular?