Computer Organization Notes
1. Input unit
2. Output unit
3. Memory unit
4. Arithmetic and Logic unit (ALU).
5. Control unit.
1. Instructions/machine instructions:
These govern the transfer of information within a computer and also
between the computer and its I/O devices.
They also specify the arithmetic and logic operations to be performed.
2. Data:
These are the numbers and encoded characters that can be used as
operands by the instructions.
An entire program is treated as data if it is processed by
another program, such as a compiler (a compiler takes a high-level
language program as input and produces a machine language
program called the object program).
Numbers are usually represented in positional binary notation. BCD (Binary Coded
Decimal), in which each decimal digit is encoded by four bits, is sometimes used instead.
OPCODE OPERAND/s
Step 1: Fetch the instruction from main memory into the processor
Step 2: Fetch the operand at location LOCA from main memory into
the processor Register R1
Step 3: Add the content of Register R1 and the contents of register R0
Step 4: Store the result (sum) in R0.
Figure 1.2 below shows how the memory and the processor are
connected. As shown in the diagram, in addition to the ALU and the
control circuitry, the processor contains a number of registers used for
several different purposes. The instruction register (IR) holds the
instruction that is currently being executed. The program counter
(PC) keeps track of the execution of the program. It contains the
memory address of the next instruction to be fetched and executed.
There are n general purpose registers R0 to Rn-1 which programmers can use
when writing programs.
The execution of the program starts when the address of the first
instruction of the program is stored in PC. The contents of PC are
transferred to MAR and a Read control signal is issued to the memory.
After the memory access time has elapsed, the contents of the addressed
memory location are loaded into MDR. Next, the contents of MDR are
transferred into IR. Then the instruction is decoded and executed.
Any operand required by the instruction is fetched from the memory
by sending its address to MAR and initiating a Read cycle. When the
operand is brought into MDR, it is transferred to the ALU. Once all
the required operands are fetched, the ALU performs the desired
operation. If the result is to be stored into the memory, it is transferred
to MDR, the address of the destination memory location is
transferred to MAR, and then a Write cycle is initiated. At some point
during the execution of the current instruction, the PC is
incremented to point to the next instruction to be executed. Hence, the
execution of the next instruction can start as soon as the current
instruction has been executed.
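The cycle described above can be traced with a small simulation. This is only a minimal sketch in Python, not a model of any real processor: the tuple instruction encoding, the use of a single register R0, word addressing with the PC advancing by 1, and the Halt instruction are all assumptions made for illustration.

```python
# Hypothetical toy machine: the encoding and the Halt instruction are assumptions.
def run(memory, start):
    """Step through the fetch/execute cycle until a Halt instruction."""
    regs = {"R0": 0}
    pc = start
    while True:
        mar = pc                  # contents of PC go to MAR, Read is issued
        mdr = memory[mar]         # after the access time the word is in MDR
        ir = mdr                  # MDR is copied to IR and decoded
        pc += 1                   # PC now points to the next instruction
        opcode, operand = ir
        if opcode == "Load":      # operand fetch: address -> MAR, Read cycle
            mar = operand
            mdr = memory[mar]
            regs["R0"] = mdr
        elif opcode == "Add":
            mar = operand
            mdr = memory[mar]
            regs["R0"] = regs["R0"] + mdr   # ALU operation
        elif opcode == "Store":   # result -> MDR, address -> MAR, Write cycle
            mar, mdr = operand, regs["R0"]
            memory[mar] = mdr
        elif opcode == "Halt":
            return regs, pc


memory = {
    0: ("Load", 10),
    1: ("Add", 11),
    2: ("Store", 12),
    3: ("Halt", None),
    10: 7, 11: 5, 12: 0,
}
regs, pc = run(memory, start=0)
print(regs["R0"], memory[12], pc)
```

Keeping MAR and MDR as explicit variables mirrors the text: every memory access, whether for an instruction or an operand, passes through the same two registers.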
Some devices may require urgent services from the processor which
will affect the normal execution of the programs. In order to handle
such situations, the execution of the current program must be
interrupted. This happens when the device raises an interrupt signal.
An interrupt is nothing but a request for services from an I/O device
to the processor. The processor executes an interrupt service routine
to provide the required service to the I/O device.
1. Low cost.
2. Flexible to attach peripheral devices.
Numbers, character operands and instructions are stored in the memory of a computer.
The memory consists of millions of storage cells, each of which can store one bit of information,
either the value 0 or 1. In memory, a group of n bits is referred to as a memory word and n is
called the word length. The memory of a computer can be represented as shown in Figure 2.1.
Modern computers have word lengths that typically range from 16 to 64 bits. A unit
of 8 bits is called a byte. A group of 4 bytes (for 32-bit computers) or 8 bytes (for 64-bit
computers) is referred to as a single word. A memory word of a 32-bit computer can hold 4
ASCII characters, as shown in Figure 2.2.
To retrieve/store information from/to memory, either one word or one byte (8 bits) at a time,
an address is needed for each location. A memory with k-bit addresses has 2^k memory locations,
with addresses 0 to 2^k - 1, called the memory space.
Eg. 32-bit addresses: 2^32 memory locations, addressed 0 to 2^32 - 1 (4G locations)
64-bit addresses: 2^64 memory locations, addressed 0 to 2^64 - 1
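The relation between address width and memory space can be checked with a few lines of Python. The function name address_space is hypothetical and used only for this illustration.

```python
def address_space(k):
    """Return (number of locations, highest address) for a k-bit address."""
    locations = 2 ** k
    return locations, locations - 1


for k in (16, 32, 64):
    count, last = address_space(k)
    print(f"{k}-bit address: {count} locations, addresses 0 to {last}")
```

For k = 32 the count is 4 x 2^30, which is the 4G figure quoted above.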
The data operands and the program instructions are stored in the memory. The
execution of any instruction results in movement of operands and results between the
memory and the processor.
The two basic memory operations are Load (or Read or Fetch) and Store (or Write).
1. Load (or Read or Fetch): The Load operation sends a copy of the contents of a
memory location to the processor.
Step 1: The processor sends the address of the desired location to the memory.
Step 2: The memory reads the data stored at that address and sends them to the
processor.
Here the contents of the location are not changed.
2. Store (or Write): The Store operation sends an item from the processor to a memory location.
Step 1: The processor sends both the address of the location and the data to be written to
that location.
Here the previous contents of the location are destroyed.
The processing of a computer program consists of a sequence of small steps, such as adding
two numbers, testing for a particular condition, reading a character from the keyboard, or
sending a character to be displayed on a display screen. A computer is provided with
instructions that perform the following four types of operations:
• Data transfers between the memory and the processor registers
• Arithmetic and logic operations on data
• Program sequencing and control
• I/O transfers
Note: Memory locations are identified by names A to Z and Processor registers are identified
by R1 to RN
Three-address instructions
Two-address instructions
An instruction which has 2 address fields (operands) is called a two-address instruction.
General format : Operation source,destination
Ex: Add A,B
Here the contents of A and B are added and stored in B.
One-address instructions
An instruction which has 1 address field (operand) is called a one-address instruction.
Ex1: Add A
It means add the contents of memory location A to the contents of the accumulator register and
store the result in the accumulator. Here the accumulator is implicit in the instruction.
(The accumulator is a register used to perform arithmetic and logical operations.)
Ex2: Load A
It copies the contents of memory location A into accumulator.
Ex3: Store A
It copies the contents of accumulator into memory location A.
Zero-address instructions
An instruction without address fields is called a zero-address instruction.
Ex1. Push
Ex2. Pop
The above operations push contents onto the stack or pop them from the stack.
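To compare the two-address and one-address formats, the sketch below computes C = A + B in both styles. It is a hedged illustration in Python: the tuple encoding of instructions and the Move operation used in the two-address version are assumptions, and memory is modelled as a dictionary keyed by location name.

```python
def two_address(memory, program):
    # Each instruction is (operation, source, destination).
    for op, src, dst in program:
        if op == "Add":
            memory[dst] = memory[src] + memory[dst]   # result replaces destination
        elif op == "Move":
            memory[dst] = memory[src]
    return memory


def one_address(memory, program):
    acc = 0   # the accumulator is implicit in every instruction
    for op, addr in program:
        if op == "Load":
            acc = memory[addr]
        elif op == "Add":
            acc = acc + memory[addr]
        elif op == "Store":
            memory[addr] = acc
    return memory


m1 = two_address({"A": 3, "B": 4, "C": 0},
                 [("Move", "B", "C"), ("Add", "A", "C")])
m2 = one_address({"A": 3, "B": 4, "C": 0},
                 [("Load", "A"), ("Add", "B"), ("Store", "C")])
print(m1["C"], m2["C"])
```

The one-address version needs three instructions instead of two because every operand has to pass through the accumulator.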
The above three instructions of the program are in successive word locations, starting at
location i. Since each instruction is 4 bytes long, the second and third instructions will start at
addresses i + 4 and i + 8.
2.3.5 BRANCHING
Consider the example shown in Figure 2.5(a), which adds N numbers and stores the sum in
a memory location. The N numbers are placed in memory
locations num1, num2, num3, num4, ... An Add instruction is used to add the first two
numbers, and the sum is kept in a register. A second Add instruction is then used
to add the third number to the previous sum, and so on until all N numbers are added.
This method needs a separate Add instruction for every number, so the program becomes lengthy.
This can be avoided by using a loop (a loop is a straight-line sequence of
instructions executed as many times as needed), so that a single Add instruction can be used, as
shown in Figure 2.5(b).
Figure 2.5(a): Straight-line program to add N numbers. Figure 2.5(b): Using a loop to add N
numbers.
Let the number of entries in the list be N, which is stored in memory location N, as
shown in the above figure. Register R1 is used as a counter to determine the number of times
the loop is repeated. The contents of location N are loaded into register R1 at the beginning
of the program. Then, within the body of the loop, the instruction
Decrement R1
decrements the contents of R1 by 1 each time the loop is executed.
Execution of the loop is repeated as long as the result of the decrement operation is greater
than zero, as tested by the instruction Branch > 0 LOOP. This is a conditional branch instruction:
the branch to LOOP is taken only while the specified condition holds. (An unconditional assembly
language branch statement, such as GOTO LABEL, always branches.)
When the branching condition becomes false, the loop is terminated. The result will
be available in R0. The instruction Move R0,SUM moves the sum from R0 to memory
location SUM.
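The loop described above can be mirrored step by step in Python. This is a sketch under stated assumptions: memory is a dictionary, the numbers are assumed to sit in consecutive locations starting at first_loc, the list is assumed to have at least one entry, and the function name add_list is hypothetical.

```python
def add_list(memory, n_loc, first_loc, sum_loc):
    r1 = memory[n_loc]       # Move N,R1 : R1 is the loop counter
    r0 = 0                   # Clear R0  : R0 holds the running sum
    addr = first_loc         # address of the next number (assumed bookkeeping)
    while True:
        r0 += memory[addr]   # Add the next number to R0
        addr += 1
        r1 -= 1              # Decrement R1
        if not r1 > 0:       # Branch>0 LOOP is not taken once R1 reaches 0
            break
    memory[sum_loc] = r0     # Move R0,SUM
    return r0


memory = {"N": 4, 100: 5, 101: 10, 102: 20, 103: 7, "SUM": 0}
print(add_list(memory, "N", 100, "SUM"), memory["SUM"])
```

As in the assembly version, the condition is tested at the bottom of the loop, so the body always executes at least once.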
The N and Z flags indicate whether the result of an arithmetic or logic operation is
negative or zero. The N and Z flags may also be affected by instructions that transfer data, such as
Move, Load, or Store.
The V flag indicates whether overflow has taken place. Overflow occurs when the result of
an arithmetic operation is outside the range of values that can be represented by the number
of bits available for the operands.
The C flag is set to 1 if a carry occurs from the most significant bit position during
an arithmetic operation.
How the flags are set or reset is shown in the below example.
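As a minimal sketch of how the four flags could be derived for an addition (an independent illustration, assuming 8-bit two's-complement operands; the function name add_with_flags is hypothetical):

```python
def add_with_flags(a, b, bits=8):
    """Add two operands and return (result, flags) for the given word size."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    total = a + b
    result = total & mask
    flags = {
        "N": int(bool(result & sign)),   # sign bit of the result
        "Z": int(result == 0),           # result is zero
        # Overflow: operands have the same sign, result has the other sign.
        "V": int(bool(~(a ^ b) & (a ^ result) & sign)),
        "C": int(total > mask),          # carry out of the most significant bit
    }
    return result, flags


print(add_with_flags(0x70, 0x70))   # two positive numbers, sum does not fit
print(add_with_flags(0xFF, 0x01))   # -1 + 1 in two's complement
```

The first call sets N and V (the sum of two positive numbers appears negative), while the second sets Z and C without overflow.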
The methods by which the location of an operand is specified in an instruction are referred
to as addressing modes. (Or) It is a method of finding the EA(Effective address) of the
operands in the instructions. The list of addressing modes is shown in the table 2.1
Register addressing mode: In this mode the operand will be a processor register, that is the
name of the register specified in the operand field of the instruction.
Move R1,R2
Absolute addressing mode (Direct addressing mode): In this mode the operand is in a
memory location, and the address of that location is given explicitly in the
instruction.
Move LOC,B
Address and data constants can be represented in assembly language using the Immediate
mode.
Immediate addressing mode — The operand is specified explicitly in the instruction, i.e., it is
used to specify the value of a source operand. Common convention used is # sign prefix to
the operand.
Indirect addressing mode: The effective address of the operand is the contents of a
register or memory location whose address appears in the instruction.
The indirection is denoted by placing the name of the register or the memory address in
parentheses, as shown in the example below.
Add (R1),R0
Here R1 holds the address of a memory location, so (R1) refers indirectly to the contents of
that memory location.
Indirect mode through a general purpose register: Add (R1),R0 is executed as shown in
Figure 2.6(a). The processor uses the value B, held in register R1, as the effective address.
The contents of location B are read and added to the contents of register R0.
Indirect mode through a memory location: Add (A),R0 is executed as shown in Figure 2.6(b).
The processor first reads the contents of memory location A, which is the address B. A second
read operation is then requested, using B as the address, to read the operand.
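The addressing modes described so far differ only in how the operand is located. The following Python sketch makes that explicit; the mode names, the function fetch_operand, and the numeric addresses 100 and 200 (standing in for A and B) are assumptions made for illustration.

```python
def fetch_operand(mode, value, regs, memory):
    """Return the operand selected by an addressing mode."""
    if mode == "immediate":           # #value : operand is in the instruction
        return value
    if mode == "register":            # Ri : operand is the register contents
        return regs[value]
    if mode == "absolute":            # LOC : EA is given in the instruction
        return memory[value]
    if mode == "indirect_register":   # (Ri) : EA is the contents of Ri
        return memory[regs[value]]
    if mode == "indirect_memory":     # (LOC) : EA is the contents of LOC
        return memory[memory[value]]
    raise ValueError(f"unknown mode: {mode}")


regs = {"R0": 10, "R1": 200}
memory = {100: 200, 200: 32}   # location A = 100 holds the address B = 200

# Add (R1),R0 : one memory read, using the address held in R1
regs["R0"] += fetch_operand("indirect_register", "R1", regs, memory)
print(regs["R0"])

# (A) : two memory reads, first the address B, then the operand
print(fetch_operand("indirect_memory", 100, regs, memory))
```

Indirection through memory costs one extra read compared with indirection through a register, which matches the two-step description of Add (A),R0.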
The figure 2.8 below represents a two-dimensional array having n rows and four columns.
Each row contains the entries for one student, and the columns give the IDs and test scores.
Auto-increment mode:
The Effective Address of the operand is the contents of a register specified in the instruction.
After accessing the operand, the contents of this register are automatically incremented to point
to the next item in the list.
It is symbolically represented as (Ri)+
Ex: Add (R2)+,R0
Here, the operand at the address held in R2 is added to R0, and then R2 is incremented to point
to the next item.
Auto-decrement mode:
The contents of a register specified in the instruction are first automatically decremented
and are then used as the Effective Address of the operand.
It is symbolically represented as -(Ri)
Ex: Add -(R2),R0
Here, the contents of R2 are first decremented to point to the next item in the list. The operand
at that address is then added to R0.
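The difference between the two modes is the order of the register update and the memory access. A short Python sketch, assuming a step of 1 per item (the real step depends on the operand size) and hypothetical helper names:

```python
def autoincrement(regs, memory, r, step=1):
    operand = memory[regs[r]]   # EA is the current contents of the register
    regs[r] += step             # afterwards the register is advanced
    return operand


def autodecrement(regs, memory, r, step=1):
    regs[r] -= step             # the register is decremented first
    return memory[regs[r]]      # the new contents are the EA


memory = {100: 5, 101: 6, 102: 7}
regs = {"R0": 0, "R2": 100}

for _ in range(3):                                   # Add (R2)+,R0 three times
    regs["R0"] += autoincrement(regs, memory, "R2")
print(regs["R0"], regs["R2"])

regs["R0"] += autodecrement(regs, memory, "R2")      # Add -(R2),R0
print(regs["R0"], regs["R2"])
```

Used together, the two modes behave like pop and push on a stack: autodecrement steps back to the item that autoincrement last moved past.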
An assembly language instruction is written as an opcode followed by at least one
blank space and then by the operands.
Ex : ADD #5,R3
Ex : MOVE A,B
Assembler directives are commands written in an assembly language program which
instruct the assembler how to carry out the translation. They are not translated into machine
instructions in the object program.
Some of the assembler directives are :
S EQU 150
EQU informs the assembler that the symbolic name S must be replaced with the value 150
(here used as a memory address).
ORIGIN 201
Instructs the assembler to place the data block in main memory starting at location 201.
N DATAWORD 40
Informs the assembler that the data value 40 is to be placed in memory location 201,
which is given the name N.
ORIGIN 100
Instructs the assembler to place the machine instructions of the object program in
main memory starting at location 100.
END START
Marks the end of the program text and gives the label (START) of the instruction where execution begins.
N1 RESERVE 400
Reserves a memory block of 400 bytes, named N1.
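The effect of these directives on the assembler's symbol table and location counter can be sketched as follows. This is a simplified illustration in Python, not a real assembler: the function name first_pass, the 4-byte word size, and the order of the input lines are assumptions.

```python
def first_pass(lines, word_size=4):
    """Process EQU, ORIGIN, DATAWORD and RESERVE; word_size is an assumption."""
    symbols, memory = {}, {}
    location = 0
    for line in lines:
        parts = line.split()
        if parts[0] == "ORIGIN":
            location = int(parts[1])          # move the location counter
        elif parts[1] == "EQU":
            symbols[parts[0]] = int(parts[2])  # name stands for a value
        elif parts[1] == "DATAWORD":
            symbols[parts[0]] = location       # name stands for this address
            memory[location] = int(parts[2])
            location += word_size
        elif parts[1] == "RESERVE":
            symbols[parts[0]] = location
            location += int(parts[2])          # skip the reserved bytes
    return symbols, memory, location


symbols, memory, end = first_pass([
    "S EQU 150",
    "ORIGIN 201",
    "N DATAWORD 40",
    "N1 RESERVE 400",
])
print(symbols, memory, end)
```

None of the four directives produces a machine instruction; they only update the symbol table, the initial memory contents and the location counter.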
I/O operations are the means by which data are transferred between the processor and the I/O
devices (outside world). A simple way to transfer data between an I/O device and the processor is
program-controlled I/O. The rate of data transfer from the keyboard
to the computer is limited by the typing speed of the user, whereas the rate of transfer from the
processor to the display is determined by the processor's speed. Because of
this difference in speed, a mechanism is needed to synchronize the transfer of data
between the processor and the I/O devices.
The mechanism of data transfer between processor, keyboard and display can be explained as
below (figure 2.9)
Keyboard/display Example:
Whenever a character is typed on the keyboard, it is sent to an 8-bit buffer register
called DATAIN in the keyboard. The availability of data in DATAIN is indicated to the
processor through a status flag SIN (SIN is a 1-bit flag). When data is stored in
DATAIN, SIN is automatically set to 1. When SIN is 1, the processor reads the
contents of DATAIN. Once the character is transferred to the processor, SIN is cleared to 0
automatically.
Similarly, a buffer register DATAOUT and a status control flag SOUT are used in the
display unit. When SOUT is equal to 1, the display is ready to receive a character. When the
processor transfers a character to DATAOUT, SOUT is cleared to 0 and the character is displayed.
SOUT is set to 1 again when the display device is ready to receive the next character.
To perform I/O transfers, we need machine instructions that check the status flags and
transfer data between the processor and the I/O devices. To transfer a character from
DATAIN to processor register R1, the following sequence of machine instructions is used.
READWAIT Branch to READWAIT if SIN = 0
Input from DATAIN to R1
To transfer a character from processor register R1 to the display unit register DATAOUT, the
following sequence of machine instructions is used.
WRITEWAIT Branch to WRITEWAIT if SOUT = 0
Output from R1 to DATAOUT
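The READWAIT loop can be imitated in Python with a small device model. This is a sketch only: the Keyboard class, its tick method (which stands in for the user typing while the processor polls) and the function read_char are hypothetical, and the example assumes at least one character is waiting to be typed.

```python
class Keyboard:
    """Hypothetical device model: DATAIN buffer plus SIN status flag."""

    def __init__(self, typed):
        self.pending = list(typed)
        self.datain = None
        self.sin = 0

    def tick(self):
        # The device stores a character in DATAIN and sets SIN to 1.
        if self.sin == 0 and self.pending:
            self.datain = self.pending.pop(0)
            self.sin = 1

    def read(self):
        self.sin = 0          # reading DATAIN clears SIN automatically
        return self.datain


def read_char(kbd):
    polls = 0
    while kbd.sin == 0:       # READWAIT: Branch to READWAIT if SIN = 0
        kbd.tick()            # stands in for time passing while the user types
        polls += 1
    return kbd.read(), polls  # Input from DATAIN to R1


kbd = Keyboard("hi")
r1, polls = read_char(kbd)
print(r1, polls, kbd.sin)
```

The polls counter shows the cost of this method: the processor does no useful work while it waits in the loop.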
Memory-mapped I/O: In this method, the memory address space will also refer to peripheral
device buffer registers such as DATAIN and DATAOUT. (The I/O addresses are available in
memory address space).
I/O mapped I/O : In this method, separate address space is available for I/O addresses and
memory addresses.
Accessing I/O Devices, Interrupts, Interrupt Hardware, Enabling and Disabling Interrupts,
Handling Multiple Devices, Controlling Device requests, Exceptions, Direct Memory Access,
Bus arbitration, Buses, Synchronous bus, Asynchronous bus, Interface Circuits, Parallel port
and Serial port (Basic concept only), Standard I/O Interfaces (Basic concepts only),
Peripheral Component Interconnect (PCI) Bus , SCSI Bus( Basic concepts only), Universal
Serial Bus (USB) ( Basic concepts only)
The ALU and all the registers are interconnected via a single common
bus. This bus is the internal bus to the processor. An external bus is
used to connect the processor to the memory and I/O devices.
The data and address lines of the external memory bus are connected to
the internal processor bus through MDR and MAR respectively.
Register MDR has two inputs and two outputs. Data may be loaded
into MDR either from the memory bus or from the internal processor
bus. The data stored in MDR may be placed on either bus.
The input of MAR is connected to the internal bus, and its output is
connected to the external bus. The control lines of the memory bus are
connected to the instruction decoder and control logic. Decoder and
control logic unit is responsible for issuing the signals that control the
operation of all the units inside the processor and for interacting with
the memory bus. The registers, the ALU and the interconnecting bus
are collectively called the datapath.
The number of registers used varies from one processor to another.
Some of the registers are provided to the users for general purpose.
Some others may be dedicated as special purpose registers like index
registers or stack pointers.
The three registers Y, Z and TEMP shown in the figure are transparent
to the programmer and are not explicitly referenced by any instruction.
They are used only by the processor for temporary storage during the
execution of some instructions. The MUX selects either the output of
register Y (input from the bus) or a constant value 4 (assuming that each
instruction occupies 4 bytes) to be provided as input A to the ALU,
based on the control input Select. The constant 4 is used to increment
the contents of the program counter.
An instruction is executed (with few exceptions) by performing one or more
of the following operations in some specified sequence:
1. Transfer a word of data from one of the processor registers to the
ALU or to another processor register.
2. Perform an arithmetic or logic operation and store the result in a
processor register.
3. Fetch the contents of a required memory location and load them
into a processor register.
4. Store a word of data from a processor register into a required
memory location.
3.1.1 Register transfers
• For each register, two control signals are used: one to place the
contents of that register on the bus and one to load the data on the bus
into the register (as shown in Figure 3.2).
• The input and output of register Ri are connected to the bus via
switches controlled by Riin and Riout respectively.
• When Riin is set to 1, the data on the bus are loaded into register
Ri.
Example: transferring the contents of register R1 to register R4
(R1out and R4in are set to 1)
All operations and data transfers within the processor take place
within time periods defined by the processor clock. The control
signals that govern a particular transfer are asserted at the start of
the clock cycle. The registers consist of edge-triggered flip-flops.
Hence, at the next active edge of the clock, the flip-flops of
register R4 will load the data present at their inputs. At the same
time, the control signals R1out and R4in will return to zero.
There are other schemes that are possible. For example, data
transfers may use both the rising and falling edge of the clock. If
edge-triggered flip-flops are not used, two or more clock signals may
be needed to transfer the data. This is known as multiphase
clocking.
The Figure 3.3 shows the implementation of one bit of register Ri. A
two input multiplexer is used to select the data given to the input of
an edge-triggered D flip-flop. When Riin is equal to 1, the
multiplexer selects the data on the bus and loads into the flip-flop at
the rising edge of the clock. When Riin is equal to zero, the
multiplexer feeds back the value that is stored currently in the flip-
flop. The output Q of the flip-flop is connected to the bus through a
tri-state gate. When Riout is equal to zero, the gate’s output is in the
high impedance (electrically disconnected) state which corresponds
to an open circuit state of a switch. When Riout is equal to 1, the gate
drives the bus to 1 or zero, based on the value of Q.
Figure 3.3. Input and output gating for one register bit.
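The behaviour of one register bit can be modelled in a few lines of Python. This is a behavioural sketch, not a circuit description: the class name RegisterBit is hypothetical, and the value None is used to stand for the high-impedance state of the tri-state gate.

```python
class RegisterBit:
    """One bit of register Ri: input multiplexer, D flip-flop, tri-state gate."""

    def __init__(self):
        self.q = 0

    def clock_edge(self, bus, ri_in):
        # Multiplexer: take the bus value when Ri_in = 1, else feed Q back.
        self.q = bus if ri_in else self.q

    def drive(self, ri_out):
        # Tri-state gate: drive the bus with Q only when Ri_out = 1.
        return self.q if ri_out else None


r1, r4 = RegisterBit(), RegisterBit()
r1.q = 1
bus = r1.drive(ri_out=1)        # R1out = 1 places R1's bit on the bus
r4.clock_edge(bus, ri_in=1)     # R4in = 1 loads it at the clock edge
r1.clock_edge(bus, ri_in=0)     # R1in = 0, so R1 keeps its value
print(r4.q, r1.q, r4.drive(ri_out=0))
```

Because a register with Riin = 0 simply reloads its own output, every flip-flop can be clocked on every cycle without a gated clock.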
3.1.2 Performing an Arithmetic or Logic operation
1. R1out, Yin
2. R2out, SelectY, Add, Zin
3. Zout, R3in
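In step 1, the contents of R1 are placed on the bus and loaded into register Y.
In step 2, the contents of R2 are placed on the bus, the multiplexer selects Y as
input A of the ALU, and the sum is loaded into Z. In step 3, the contents of Z
are transferred to R3.
This last transfer cannot be carried out during step 2, because only
one register output can be connected to the bus during any clock
cycle.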
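The three-step sequence above (R1out, Yin; R2out, SelectY, Add, Zin; Zout, R3in) can be traced with a small single-bus simulation. This is a hedged sketch in Python: the function run_sequence and the convention of recognising signals by their "in" and "out" suffixes are assumptions, and only the Add operation is modelled.

```python
def run_sequence(regs, steps):
    """Apply single-bus control steps; each step is a list of signal names."""
    for signals in steps:
        drivers = [s for s in signals if s.endswith("out")]
        # Only one register output may be connected to the bus per clock cycle.
        assert len(drivers) <= 1, "bus conflict"
        bus = regs[drivers[0][:-3]] if drivers else None
        alu = None
        if "Add" in signals:
            a = regs["Y"] if "SelectY" in signals else 4   # MUX: Y or constant 4
            alu = a + bus                                  # input B comes from the bus
        for s in signals:
            if s.endswith("in"):
                name = s[:-2]
                regs[name] = alu if name == "Z" else bus   # Z is loaded from the ALU
    return regs


regs = {"R1": 5, "R2": 7, "R3": 0, "Y": 0, "Z": 0}
steps = [
    ["R1out", "Yin"],
    ["R2out", "SelectY", "Add", "Zin"],
    ["Zout", "R3in"],
]
run_sequence(regs, steps)
print(regs["R3"])
```

Merging steps 2 and 3 into one list would put both R2out and Zout in the same cycle and trip the bus-conflict check, which is the restriction stated in the text.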
To fetch a word from the memory, the processor has to specify the
address of the memory location and request a Read operation. This
applies both to an instruction in a program and to an operand specified by an
instruction. The processor transfers the required address to the MAR,
whose output is connected to the address lines of the memory bus. At
the same time, the processor uses the control lines of the memory bus
to indicate that a Read operation is needed.
1. MAR ← [R1]
2. Start a Read operation on the memory bus
Note: This is only Basic Information for students. Please refer
“Reference Books” prescribed as per syllabus
DEPARTMENT OF TECHNICAL EDUCATION E-CONTENT
For simplicity, let us assume that the output of MAR is enabled all
the time. Thus the contents of MAR are always available on the
address lines of the memory bus.
[Timing diagram: Clock, MARin, Address, Read, MR, MDRinE, Data, MFC and MDRout over steps 1 to 3]
To write a word into the memory, the desired address is loaded into MAR and the data to be
written are loaded into MDR, and a Write command is issued.
Example
Executing the instruction Move R2,(R1) requires the following
steps-
1. R1out,MARin
2. R2out,MDRin,Write
3. MDRoutE,WMFC
Figure 3.6. Control sequence for execution of the instruction Add (R3),R1.
[Figure 3.8: three-bus organization of the datapath, showing the incrementer, PC, register file,
constant 4, MUX, ALU, instruction decoder, IR, MDR, MAR, and the memory bus data and address lines]
The ALU may simply pass one of its two operands unmodified
to bus C. R=A or R=B is the control signal for
such an operation. Using the incrementer to update the PC eliminates
the need to add 4 to the PC using the main ALU. The
source for constant 4 at the multiplexer can be used to
increment other addresses in case of LoadMultiple and
StoreMultiple instructions.
Figure 3.9 Control sequence for the instruction Add R4,R5,R6 for
the three-bus organization in Figure 3.8
Control sequence for the instruction Add (R3),R1 (single-bus organization):
Step  Action
1     PCout, MARin, Read, Select4, Add, Zin
2     Zout, PCin, Yin, WMFC
3     MDRout, IRin
4     R3out, MARin, Read
5     R1out, Yin, WMFC
6     MDRout, SelectY, Add, Zin
7     Zout, R1in, End
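[Figure 3.12: control unit organization, showing the clock (CLK), control step counter with Reset,
step decoder (T1, T2, ..., Tn), IR, instruction decoder (INS1, INS2, ..., INSm), external inputs,
condition codes, encoder, Run and End signals, and the control signal outputs]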
The step decoder provides a separate signal line for each step.
In the same way, the output of the instruction decoder consists
of separate line for each machine instruction. For any
instruction loaded in IR, one of the output lines INS1 through
INSm is set to 1, and rest of the lines are set to zero. The
encoder block combines all inputs to generate individual
control signals such as Yin, PCout, Add, End and so on.
Generating Zin
Zin = T1 + T6 • ADD + T4 • BR + …
Figure 3.13 Generation of the Zin control signal for the single-bus
organization of the processor.
Zin is set to 1, during the time slot T1, for all instructions,
during T6 for an Add instruction, during T4 for an
unconditional branch instruction and so on.
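The logic function for Zin translates directly into a Boolean expression. The Python sketch below covers only the three terms written out in the equation; the terms hidden behind the "…" are not modelled, and the function name z_in is hypothetical.

```python
def z_in(t, instr):
    """Zin = T1 + T6.ADD + T4.BR, using only the terms shown in the text."""
    return int(t == 1 or (t == 6 and instr == "ADD") or (t == 4 and instr == "BR"))


for instr in ("ADD", "BR"):
    active = [t for t in range(1, 8) if z_in(t, instr)]
    print(instr, active)
```

T1 appears without an instruction term because step 1 belongs to the instruction fetch, which is the same for every instruction.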
As another example, the control signal End, generated by the
encoder for a single-bus organization of a processor, is as
shown in Figure 3.14.
[Figure 3.14: Generating End, a logic circuit combining time slots T7, T5, T4 and T5 to produce End]
The End signal starts the fetch cycle of a new instruction by resetting
the control step counter to its starting value.
Figure 3.12 has a control signal RUN. When RUN is set to 1, it causes
the counter to be incremented by 1 at the end of every clock cycle.
When RUN is zero, the counter stops counting. This is required
whenever the WMFC signal is issued, to make the processor wait for
the reply from the memory.
The control hardware shown in Figures 3.11 and 3.12 can be
viewed as a state machine that changes from one state to another in
every clock cycle, based on the contents of the IR, the condition codes
and the external inputs. The outputs of this state machine are the
control signals. The wiring of the logic elements determines the
sequence of operations carried out by this machine. Hence it is called
hardwired control. A hardwired system can operate at high speed, but
with little flexibility, and the complexity of the instruction set it can
implement is limited.
[Figure: block diagram showing the instruction cache, data cache, system bus, main memory and input/output]
Micro-instruction  PCin  PCout  MARin  Read  MDRout  IRin  Yin  Select  Add  Zin  Zout  R1out  R1in  R3out  WMFC  End
1                  0     1      1      1     0       0     0    1       1    1    0     0      0     0      0     0
2                  1     0      0      0     0       0     1    0       0    0    1     0      0     0      1     0
3                  0     0      0      0     1       1     0    0       0    0    0     0      0     0      0     0
4                  0     0      1      1     0       0     0    0       0    0    0     0      0     1      0     0
5                  0     0      0      0     0       0     1    0       0    0    0     1      0     0      1     0
6                  0     0      0      0     1       0     0    0       1    1    0     0      0     0      0     0
7                  0     0      0      0     0       0     0    0       0    0    1     0      1     0      0     1
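[Figure 3.17: basic organization of a microprogrammed control unit, showing the IR, starting address
generator, clock, µPC, control store and control word (CW)]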
Figure 3.16 gives the control words corresponding to the seven steps for the
execution of Add (R3),R1 (shown in Figure 3.6). Each of the control steps in
the control sequence of an instruction defines a unique combination of 1s and
0s in the control word.
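The mapping from a control step to its control word can be sketched in Python. The field order used here (PCin, PCout, MARin, Read, MDRout, IRin, Yin, Select, Add, Zin, Zout, R1out, R1in, R3out, WMFC, End) is an assumption chosen to agree with the bit patterns listed for the seven steps, with the Select bit taken as 1 for Select4 and 0 for SelectY; the function name control_word is hypothetical.

```python
# Assumed field order of the control word (one bit per control signal).
FIELDS = ["PCin", "PCout", "MARin", "Read", "MDRout", "IRin", "Yin", "Select",
          "Add", "Zin", "Zout", "R1out", "R1in", "R3out", "WMFC", "End"]

# The seven control steps for Add (R3),R1.
STEPS = [
    ["PCout", "MARin", "Read", "Select4", "Add", "Zin"],
    ["Zout", "PCin", "Yin", "WMFC"],
    ["MDRout", "IRin"],
    ["R3out", "MARin", "Read"],
    ["R1out", "Yin", "WMFC"],
    ["MDRout", "SelectY", "Add", "Zin"],
    ["Zout", "R1in", "End"],
]


def control_word(signals):
    """Encode one control step as a string of 16 bits."""
    # Select4 sets the Select bit; SelectY leaves it at 0.
    active = {"Select" if s == "Select4" else s for s in signals if s != "SelectY"}
    return "".join("1" if field in active else "0" for field in FIELDS)


for number, signals in enumerate(STEPS, start=1):
    print(number, control_word(signals))
```

Each control word has one bit per signal, so the control store needs one 16-bit row for every step of every microroutine in this simple scheme.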
Figure 3.19 shows the organization of the control unit that allows
conditional branching in the microprogram. The starting address
generator block of Figure 3.17 becomes the starting and branch
address generator. This block loads a new address into the µPC when a
microinstruction instructs it to do so. Conditional branching is
implemented using the external inputs and condition codes as well as the
contents of the IR. The µPC is incremented every time a new
microinstruction is fetched from the microprogram memory, except in the
following situations:
1. When a new instruction is loaded into the IR, the starting
address of the microroutine of that instruction is loaded into
µPC.
2. When a branch microinstruction is encountered, the µPC is
loaded with branch address if the branch condition is satisfied.
3. When an End microinstruction is encountered, the address of the
first control word in the microroutine for the instruction fetch
cycle is loaded into the µPC.
Figure 3.19. Organization of the control unit to allow the conditional branching in the
microprogram.
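The three exceptions to simple incrementing can be summarised as a next-address function for the µPC. The Python sketch below is an illustration under assumptions: microinstructions are modelled as tuples whose first element names their kind, and the names next_upc, load_start, start_addr and fetch_addr are hypothetical.

```python
def next_upc(upc, micro, start_addr=None, cond=False, fetch_addr=0):
    """Return the next µPC value for one microinstruction."""
    kind = micro[0]
    if kind == "load_start":      # a new instruction has been loaded into IR
        return start_addr         # starting address of its microroutine
    if kind == "branch":          # branch microinstruction
        return micro[1] if cond else upc + 1
    if kind == "end":             # End: go back to the instruction fetch routine
        return fetch_addr
    return upc + 1                # normal case: increment


print(next_upc(5, ("normal",)))
print(next_upc(5, ("branch", 20), cond=True))
print(next_upc(5, ("branch", 20), cond=False))
print(next_upc(5, ("load_start",), start_addr=40))
print(next_upc(5, ("end",)))
```

In hardware the cond input would be derived from the external inputs and condition codes, and start_addr from the contents of the IR.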
The hardware required to connect an I/O device to the bus (I/O Interface) is shown in the
figure 4.2
I/O interface is the hardware present between the I/O device and the bus. The interface
consists of 3 parts – address decoder, control circuits and data and status registers. The address
Program-controlled I/O: A simple way to transfer data between an I/O device and the processor is
program-controlled I/O. With a keyboard, the rate of data transfer
to the computer is limited by the speed of the user. In this method, the processor
repeatedly checks the status flags to synchronize itself with the I/O
device, and communicates with the device accordingly, i.e. the processor polls the device.
There are 2 mechanisms to implement I/O operations.
1. Interrupts
4.2 INTERRUPTS
When a program needs the help of an I/O device during computation, it constantly checks the
I/O status, and the processor may spend a long time waiting. To avoid this
waiting, a facility is provided for an I/O device to alert the processor only when
it is ready. It does so by sending a hardware signal called an interrupt to the
processor. The processor services the interrupt and then resumes its earlier processing.
Interrupt service Routine : The routine executed in response to an interrupt request is called
the interrupt-service routine, which is the PRINT routine in our example. Interrupts bear
considerable resemblance to subroutine calls.
Consider an example program having two routines, COMPUTE and PRINT. The
COMPUTE routine produces a set of lines of output, to be printed by the PRINT routine. This
can be done by repeatedly executing the COMPUTE routine and then the PRINT routine. The printer
accepts only one line of text at a time. Hence the PRINT routine must send one line of text,
wait for it to be printed, then send the second line and so on, until all the lines are printed. In this
case the processor spends a considerable amount of time waiting for the printer. This can be overcome
by overlapping printing and computation. First the COMPUTE routine is executed to produce the first
n lines of output, then the PRINT routine is executed to send the first line to the printer. Here,
instead of waiting for the line to be printed, the PRINT routine is temporarily suspended and
execution of COMPUTE is continued. Whenever the printer becomes ready, it alerts the
processor by sending an interrupt-request signal. In response, the processor interrupts
execution of the COMPUTE routine and transfers control to the PRINT routine.
Figure 4.4 : an equivalent circuit for an open-drain bus used to implement a common
interrupt-request line.
An I/O device can request an interrupt by activating a bus line called interrupt-request. A
single interrupt-request line can be used to serve n devices, as in the above figure. All devices
are connected to the line through switches to ground. To request an interrupt, a device
closes its associated switch. If all the switches are open, all interrupt-request signals
INTR1 to INTRn are inactive, and the voltage on the interrupt-request line is equal to
Vdd, which is the inactive state of the line. Whenever a device requests an interrupt by
closing its switch, the voltage on the line drops to 0. This causes the interrupt-request
signal INTR, which is received by the processor, to go to 1. The value of INTR is the logical
OR of the individual device requests, since closing one or more switches is enough to drop the line
voltage to 0. In the above figure, special gates called open-collector (for bipolar
circuits) or open-drain (for MOS circuits) gates are used to drive the INTR line.
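The wired-OR behaviour of the common interrupt-request line can be checked with a short model. This Python sketch idealises the circuit: the line voltage is normalised to VDD = 1 or 0, and the function names line_voltage and intr are hypothetical.

```python
VDD = 1   # assumption: inactive (high) line voltage, normalised to 1


def line_voltage(requests):
    """Voltage on the shared line; any closed switch pulls it to ground."""
    return 0 if any(requests) else VDD


def intr(requests):
    """INTR seen by the processor: 1 whenever the line has been pulled low."""
    return int(line_voltage(requests) == 0)


print(intr([0, 0, 0]))   # all switches open, line at Vdd
print(intr([0, 1, 0]))   # one device requests service
print(intr([1, 1, 0]))   # several devices request service at once
```

Since each device can only pull the line down, several devices may close their switches at the same time without any electrical conflict.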
The second possibility, with only one interrupt-request line is explained as below. The
processor automatically disables interrupts before the execution of interrupt-service routine.
Next it saves the contents of PC and the program status register (PS) on the stack. One bit in
Vectored Interrupt:
To reduce the time taken by the polling process to identify the device, a device requesting
an interrupt may identify itself directly to the processor, so that the processor can immediately start
executing the corresponding interrupt-service routine. This approach is called vectored
interrupts.
Here the device requesting an interrupt identifies itself to the processor by sending a
special code over the bus, and then the processor starts executing the ISR.
The code (length typically in the range of 4 to 8 bits) supplied by the device
indicates the starting address of the ISR for that device.
The location pointed to by the interrupting device is used to store the starting address of the
ISR.
The processor reads this address, called the interrupt vector, and loads it into the PC.
The interrupt vector may also include a new value for the Processor Status Register.
When the processor is ready to receive the interrupt-vector code, it activates the interrupt
acknowledge (INTA) line. The I/O device responds by sending its interrupt-vector
code and then turns off the INTR signal.
Interrupt Nesting:
For some devices, long delay in responding to an interrupt request may lead to errors in
the operation of computer. Such interrupts are acknowledged and serviced even though
processor is executing an ISR for another device. This mechanism is called interrupt nesting.
a) Multiple-level priority:
We can assign a priority level to the processor that can be changed under program
control.
The priority level of the processor is the priority of the program that is currently being
executed.
b)Privileged Instruction:
The processor status word is used, in which the processor's priority is encoded in a few
bits. This can be changed by program instructions that write into the PS. These instructions,
called privileged instructions, can be executed only while the processor is running in supervisor
mode.
The processor is in supervisor mode only when executing OS routines.
Before beginning to execute application programs, it switches to user mode.
While in user mode, an attempt to execute a privileged instruction
leads to a special type of interrupt called a privilege exception.
Simultaneous requests :
When interrupt requests arrive from two or more devices simultaneously, the processor has to
decide which request should be serviced first and which should be delayed.
When each device has its own interrupt-request line, the requests received over these lines are
sent to a priority arbitration circuit in the processor.
A request is accepted only if it has a higher priority level than that currently assigned to the
processor.
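The acceptance rule can be illustrated with a small Python sketch. The function name select_request, the device names and the numeric priority levels are assumptions made for illustration; a real arbitration circuit is implemented in hardware.

```python
def select_request(pending, processor_priority):
    """Pick the request to accept, or None if none qualifies.

    pending maps a device name to its priority level (higher = more urgent).
    """
    # Only requests above the processor's current priority are eligible.
    eligible = {dev: level for dev, level in pending.items()
                if level > processor_priority}
    if not eligible:
        return None
    # Among eligible requests, the highest priority level wins.
    return max(eligible, key=eligible.get)
```

For example, with the processor running at level 3, a pending disk request at level 5 is accepted, while a keyboard request at level 2 has to wait.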
Daisy chain:
When vectored interrupts are used, we must ensure that only one device is selected to send its
interrupt vector code. A widely used scheme for this is a daisy chain, as shown in
figure 4.6 below.
The interrupt request line INTR is common to all devices. The interrupt acknowledge line
INTA is connected in a daisy chain fashion such that the INTA signal propagates serially through
the devices.
When several devices raise an interrupt request, INTR is activated and the processor
responds by setting the INTA line to 1. This signal is received by device 1.
Device 1 passes the signal on to device 2 only if it does not require any service.
If device 1 has a pending interrupt request, it blocks the INTA signal and proceeds to put its
identification code on the data lines.
Therefore, the device that is electrically closest to the processor has the highest priority.
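The propagation of INTA along the chain can be modelled as a simple loop. The sketch below is a software analogy only; the function name daisy_chain_grant and the list representation of the chain are assumptions made for illustration.

```python
def daisy_chain_grant(requests):
    """Return the position of the device that captures INTA, or None.

    requests is a list of booleans ordered from the device electrically
    closest to the processor (index 0) to the farthest.
    """
    for position, wants_service in enumerate(requests):
        if wants_service:
            # This device blocks INTA and puts its code on the data lines.
            return position
        # Otherwise the device passes INTA on to the next device.
    return None
```

With requests [False, True, True], the device at index 1 is selected even though index 2 is also requesting, which shows why position in the chain determines priority.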
Usually an I/O device interface generates an interrupt request whenever it is ready for an
I/O transfer, for example whenever the SIN flag is set to 1. Interrupt requests should be generated
only by those I/O devices that are being used by a given program. Devices that are idle must not be
allowed to generate interrupt requests, even though they may be ready to participate in I/O transfer
operations. Hence, we need some mechanism in the interface circuits of individual devices to
control whether a device is allowed to generate an interrupt request.
This control is usually provided in the form of an interrupt-enable bit
in the device's interface circuit. The keyboard interrupt-enable flag KEN and the display
interrupt-enable flag DEN in register CONTROL perform this function. If either of
these flags is set, the interface circuit generates an interrupt request whenever the
corresponding status flag in register STATUS is set. At the same time, the interface circuit sets
bit KIRQ or DIRQ to indicate that the keyboard or the display unit, respectively, is requesting
an interrupt. If an interrupt-enable bit is reset to 0, the interface circuit will not generate an
interrupt request, regardless of the state of the status flag.
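The effect of the enable bits can be written as a small truth function. This is a minimal sketch: the function name update_irq_bits is hypothetical, and SOUT is assumed here to be the display's status flag, corresponding to SIN for the keyboard.

```python
def update_irq_bits(sin, sout, ken, den):
    """Return (KIRQ, DIRQ) for the given status and enable flags (0 or 1)."""
    # A request is raised only when the status flag AND its enable bit are set.
    kirq = 1 if (sin and ken) else 0
    dirq = 1 if (sout and den) else 0
    return kirq, dirq
```

For instance, a ready keyboard (SIN = 1) with KEN = 0 produces no request, however long the status flag stays set.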
4.2.5 EXCEPTIONS
An interrupt is an event that causes the execution of one program to be suspended and
the execution of another program to start.
The term exception is used to refer to any event that causes an interruption.
Kinds of exception:
Recovery from errors
Debugging
Privileged Exception
Debugging:
System software usually includes a program called a debugger, which helps to find errors in a program.
The debugger uses exceptions to provide two important facilities.
They are
Trace
Breakpoint
Trace Mode:
When the processor is in trace mode, an exception occurs after the execution of every instruction,
with the debugging program used as the exception-service routine.
Breakpoint:
Here the program being debugged is interrupted only at specific points selected by the user.
An instruction called Trap (or software interrupt) is used for this purpose.
While debugging, the user may wish to interrupt program execution after instruction i.
When the program is executed and reaches that point, it is interrupted and the debugging routine is
activated.
Privilege Exception:
To protect the OS of a computer from being corrupted by a user program, certain
instructions can be executed only when the processor is in supervisor mode. These instructions
are called privileged instructions. When the processor is in user mode, it will not execute
these instructions, and an attempt to do so raises a privilege exception. When the processor is
in supervisor mode, it will execute them.
*When a block of data is transferred, the DMA controller increments the memory address for
successive words and keeps track of the number of words transferred. When the transfer is
complete, it informs the processor by raising an interrupt signal.
*While a DMA transfer is taking place, the program that requested the transfer
cannot continue, and the processor can be used to execute another program.
*After the DMA transfer is completed, the processor returns to the program that requested the
transfer.
An example of a computer system showing how DMA controllers can be used is shown in
figure 4.9. A DMA controller connects a high-speed network to the computer bus. A disk
controller that controls two disks also has DMA capability and provides two DMA channels.
It can perform two independent DMA operations at the same time.
To initiate a DMA transfer of a block of data from main memory to one of the disks, a
program loads the address and word count information into the registers of the corresponding
channel of the disk controller.
After the DMA transfer is completed, this fact is recorded in the status and control
register of the DMA channel by setting the Done bit. At the same time, if the IE bit is set, the
DMA controller raises an interrupt request to the processor and sets the IRQ bit.
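The register-level behaviour of one DMA channel can be sketched as below. This is a simplified software model under stated assumptions: the class name DMAChannel and its attribute names are hypothetical, main memory is a Python list of words, and the whole block is moved in one call rather than one bus cycle at a time.

```python
class DMAChannel:
    """Toy model of one DMA channel: address, word count and status bits."""

    def __init__(self, start_address, word_count, interrupt_enable=False):
        self.address = start_address   # starting memory address register
        self.count = word_count        # word count register
        self.ie = interrupt_enable     # IE bit
        self.done = False              # Done bit
        self.irq = False               # IRQ bit

    def transfer_to_device(self, memory):
        """Move the block from memory to the device and update the status."""
        words = []
        while self.count > 0:
            words.append(memory[self.address])
            self.address += 1          # next word of the block
            self.count -= 1            # one word fewer remaining
        self.done = True               # record completion
        if self.ie:
            self.irq = True            # interrupt request to the processor
        return words


memory = list(range(100, 116))         # illustrative memory contents
channel = DMAChannel(start_address=4, word_count=3, interrupt_enable=True)
sent = channel.transfer_to_device(memory)
```

The program only loads the address and word count; everything inside transfer_to_device stands for work done by the controller without processor involvement.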
Requests by DMA devices for using the bus are always given higher priority than
processor requests. Among different DMA devices, the highest priority is given to high-speed
peripherals such as disks, high-speed network interfaces or graphics display devices. Since the
processor originates most memory access cycles, the DMA controller can be said to steal
memory cycles from the processor. This interweaving technique
is called “cycle stealing”. Alternatively, the DMA controller may be given exclusive access to the
main memory to transfer a block of data without interruption. This is known as block or burst mode.
During data transfers, the processor and several DMA controllers
may compete for access to the system bus. Hence conflicts can arise when more than one
device tries to use the bus at the same time. The device that is allowed to initiate
data transfers on the bus at any given time is called the bus master. Bus arbitration is the process by
which the next device to become the bus master is selected.
There are 2 approaches to bus arbitration. They are,
Centralized arbitration (a single bus arbiter performs the arbitration)
Distributed arbitration (all devices participate in the selection of the next bus master).
Centralized arbitration:
Figure 4.10 A simple arrangement for bus arbitration using daisy chain.
Here the processor is normally the bus master, and it may grant bus mastership to one of the DMA
controllers.
A DMA controller indicates that it needs to become the bus master by activating the Bus Request
line (BR), which is an open-drain line.
The signal on BR is the logical OR of the bus requests from all devices connected to it.
When BR is activated, the processor activates the Bus Grant signal (BG1), indicating to the DMA
controllers that they may use the bus when it becomes free.
This signal is connected to all devices using a daisy chain arrangement.
Distributed Arbitration:
It means that all devices waiting to use the bus have equal responsibility in carrying out the
arbitration process.
4.4 BUSES
A bus protocol is the set of rules that govern the behaviour of the various devices connected to
the bus, i.e., when to place information on the bus, when to assert control signals, etc.
The bus lines used for transferring data are grouped into 3 types. They are,
Address lines
Data lines
Control lines.
Control signals specify whether a read or a write operation is to be performed. They also carry
timing information, i.e., they specify the times at which the processor and the I/O devices place
data on the bus or receive data from the bus.
During a data transfer operation, one device plays the role of a 'Master'.
The master device initiates the data transfer by issuing a read or write command on the bus. Hence
it is also called the 'Initiator'.
The device addressed by the master is called the Slave or Target.
Types of Buses:
There are 2 types of buses. They are,
Synchronous Bus
Asynchronous Bus.
Note: Strobe means capturing the values of the data at a given instant and storing them in a
buffer.
Once the master places the device address and command on the bus, it takes time for
this information to propagate to the devices.
Also, all the devices have to be given enough time to decode the address and control
signals, so that the addressed slave can place data on the bus.
At the end of the clock cycle, at time t2, the master strobes the data on the data lines
into its input buffer if it is a Read operation.
When data are to be loaded into a storage buffer register, the data should be available
for a period longer than the setup time of the device.
In case of a Write operation, the master places the data on the bus along with the
address and commands at time t0.
The slave strobes the data into its input buffer at time t2.
Note: Synchronous buses are used in memory-processor buses
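The timing constraints above imply a lower bound on the clock period of a synchronous bus. The sketch below adds the delays for a Read operation under a simple additive model. The function name and all the nanosecond values are assumptions chosen only to illustrate the arithmetic; they are not taken from any real bus.

```python
def min_clock_period_ns(propagation, decode_and_access, return_propagation, setup):
    """Smallest clock period (ns) that lets a Read complete in one cycle.

    Assumed additive model: the address travels to the slave, the slave
    decodes it and fetches the data, the data travels back, and it must be
    stable for the master's setup time before the strobe at t2.
    """
    return propagation + decode_and_access + return_propagation + setup


# Hypothetical delays in nanoseconds.
period = min_clock_period_ns(propagation=2, decode_and_access=13,
                             return_propagation=2, setup=1)
```

With these assumed values the clock period must be at least 18 ns, so the slowest device on the bus determines the speed of every transfer.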
Example 1: Figure 4.12 shows the hardware components needed for connecting a
keyboard to a processor. The keyboard consists of mechanical switches that are normally open.
When a key is pressed, its switch closes and establishes a path for an electrical signal. This
signal is detected by an encoder circuit, which generates the ASCII code for the corresponding
character. Because mechanical switches bounce when pressed, a debouncing circuit is used so that
each key press is detected as a single press rather than as repeated presses.
The mechanism of data transfer between the keyboard and the processor is explained below.
Example 2: Consider another example of a parallel interface circuit, i.e., a printer connected to
a processor.
This printer operates under the control of the handshake signals Valid and Idle. When the printer
is ready to accept a character, it asserts its Idle signal. The interface circuit then places
a new character on the data lines and activates the Valid signal. In response, the printer starts
printing the new character and negates the Idle signal, which in turn causes the interface to
deactivate the Valid signal.
The interface contains a data register DATAOUT and a status flag SOUT. SOUT is set
to 1 when the printer is ready to accept another character and is cleared to 0 when a new character
is loaded into DATAOUT by the processor.
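The behaviour of DATAOUT and SOUT can be sketched as a small class. This is an illustrative software model only: the class name PrinterInterface and its method names are hypothetical, and the printer is assumed to be idle when the interface is created.

```python
class PrinterInterface:
    """Toy model of the printer interface registers DATAOUT and SOUT."""

    def __init__(self):
        self.dataout = None
        self.sout = 1            # assumed: printer is ready at start-up

    def write(self, char):
        """Processor side: load a character if the printer is ready."""
        if not self.sout:
            return False         # printer busy, processor must wait
        self.dataout = char
        self.sout = 0            # cleared when DATAOUT is loaded
        return True

    def printer_idle(self):
        """Printer side: Idle asserted, so another character can be accepted."""
        self.sout = 1
```

A program would typically test SOUT (or wait for an interrupt) before each write, which is exactly the check that write performs here.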
A standard I/O interface is required so that an I/O device and its interface circuit can be used
with different computers.
The processor bus is the bus defined by the signals on the processor chip itself.
Devices that require a very high speed connection to the processor, such as the main
memory, may be connected directly to this bus.
A bridge connects two buses and translates the signals and protocols of one bus into
those of the other.
The bridge circuit introduces a small delay in data transfers between the processor and the devices.
Bus standards are developed or approved by
organizations such as IEEE (Institute of Electrical and Electronics Engineers), ANSI (American
National Standards Institute) and ISO (International Organization for Standardization). Three
widely used standards are,
PCI (Peripheral Component Interconnect)
SCSI (Small Computer System Interface)
USB (Universal Serial Bus)
The way in which these standards are used in a computer system is illustrated in figure 4.15.
Initiator:
It has the ability to select a particular target and to send commands specifying the operation to
be performed.
Initiators are the controllers on the processor side.
Target:
The disk controller operates as a target.
It carries out the commands it receives from the initiator. The initiator establishes a logical
connection with the intended target.
Port limitations:
Normally a system has only a few ports.
To add new ports, the user must open the computer box to gain access to the internal expansion
bus and install a new interface card.
The user may also need to know how to configure the device and the software.
Device Characteristics:-
The kinds of devices that may be connected to a computer cover a wide range of functionality.
The speed, volume and timing constraints associated with data transfers to and from devices vary
significantly.
The processor writes the data into a memory location by loading the
address of this location into MAR and loading the data into MDR. It
indicates that a write operation is involved by setting the R/W line to
0.
[Figure 5.2: memory cell array with address decoder and flip-flop (FF) memory cells]
The circuit in Figure 5.2 stores 128 bits and needs 16 external
connections: 4 for the address, 8 for data, 2 for control
lines, and 2 for the power supply and ground.
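The connection count can be checked with a short calculation. The sketch assumes the chip is organised as 16 words of 8 bits (which matches 4 address and 8 data connections for 128 bits); the function name external_connections is hypothetical.

```python
def external_connections(words, bits_per_word, control_lines=2, power_lines=2):
    """Count the external connections of a memory chip of the given organisation."""
    # Address lines needed to select one of `words` locations (a power of two).
    address_lines = (words - 1).bit_length()
    return address_lines + bits_per_word + control_lines + power_lines


pins_16x8 = external_connections(words=16, bits_per_word=8)
```

The same 128 bits arranged differently would need a different number of connections, which is why the organisation of a chip matters as much as its capacity.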
Read operation: In order to read the state of the SRAM cell, the word line is
activated to close switches T1 and T2. Sense/Write circuits at the
bottom monitor the state of b and b’ and set the output accordingly.
Write operation: The data to be written into the cell is placed on bit
line b and its complement on b’, and the word line is activated. The
required signals on the bit lines are generated by the Sense/Write circuits.
Transistor pairs (T3, T5) and (T4, T6) form the inverters in the latch.
The read and write operations are performed as explained above.
Static RAMs are fast, but costly. Less expensive RAMs can be
implemented if simpler cells are used. However, such cells do not retain
their state indefinitely. Hence, they are called dynamic RAMs (DRAMs).
During the read operation, the transistor in a selected cell is turned on.
The sense amplifier connected to the bit line recharges the capacitor
to the full charge to represent the logic 1, if the charge stored on the
capacitor is above the threshold value. If the sense amplifier detects
that the charge on the capacitor is below the threshold value, it pulls
the bit line to ground level to represent a logic zero. Thus, the
contents may be refreshed while accessing the cells for reading.
The 4096 cells in each row are divided into 512 groups of 8 bits.
Hence, a row can store 512 bytes. Therefore, 12 address bits are used
to select a row, and 9 address bits are used to select a group in a row.
As a result, a total of 21 address bits are needed to access a byte. The
higher order 12 bits form the row address and the lower order 9 bits
form the column address. To reduce the number of pins needed for
external connections, the row and column addresses are multiplexed
on 12 pins.
During a read or write operation, the row address is applied first, and
the RAS signal latches it into the row address latch. Then the
column address is applied, and the CAS signal latches it into the column
address latch. If a read operation is initiated, the information in
the column address latch is decoded and the appropriate group of 8
Sense/Write circuits is selected. The output values of the selected
circuits are transferred to the data lines, D7-0. For a write operation,
the information on the D7-0 lines is transferred to the selected circuits
and is used to overwrite the contents of the selected cells in the
corresponding 8 columns.
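The division of the 21-bit byte address into a 12-bit row address and
a 9-bit column address can be sketched with shift and mask operations.
The function name split_address and the sample address are assumptions
made for illustration.

```python
ROW_BITS = 12   # selects one of 4096 rows
COL_BITS = 9    # selects one of 512 byte groups within a row


def split_address(address):
    """Split a 21-bit byte address into (row, column) fields."""
    row = address >> COL_BITS                 # higher-order 12 bits
    column = address & ((1 << COL_BITS) - 1)  # lower-order 9 bits
    return row, column


# Illustrative address: row 5, column 3.
row, column = split_address((5 << COL_BITS) | 3)
```

On a real chip the two fields are sent one after the other over the
same 12 pins, with RAS and CAS indicating which field is present.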
DRAMs are low cost, and high density memories. Hence, they are
widely used in computers. They range in size from 1M to 256 M bits
and even larger chips are being developed.
During a Read operation, the contents of the cells in a row are loaded
onto the latches. During a refresh operation, the contents of the cells
are refreshed without changing the contents of the latches. Data held
data is transferred and the width of the data bus. Memory chips are
designed to meet the speed requirements of popular buses.
Double-Data-Rate SDRAM:
The SDRAM performs all actions on the rising edge of the clock
signal. There is a similar device that accesses the cell array in the
same way, but transfers data on both edges of the clock.
Since data is transferred on both edges of the clock, the bandwidth
is essentially doubled for long burst transfers. Such devices are called
double-data-rate SDRAMs (DDR SDRAMs).
For faster access of the data, the cell array is organized into two
banks, each of which can be accessed separately. Consecutive words
of a block are stored on different banks, so that the two words, which
are transferred on successive edges of the clock, can be
simultaneously accessed.
Refresh overhead:
Rambus provides 8 data lines, and a 9th data line is used for parity
checking. A two-channel Rambus, known as Direct RDRAM, has 18
data lines to transfer 2 bytes of data at a time. There are no separate
address lines.
SRAM and SDRAM chips are volatile, as they lose their contents when
the power is turned off. Many applications need memory devices that
retain their contents after the power is turned off. For example, when
a computer is turned on, the operating system must be loaded from the
disk into the memory. The instructions that load the OS from the disk
must be stored in a memory that will not lose its contents
after the power is turned off. Hence, we need to store these
instructions in a non-volatile memory. Non-volatile memory is read in
the same manner as volatile memory, but a separate writing process is
needed to place information in it. Since normal operation involves only
reading of data, this type of memory is called Read-Only Memory
(ROM).
5.3.1 ROM
Figure 5.12 shows a possible configuration for a ROM cell. When the
transistor is connected to ground at point P, a logic value 0 is stored in
the cell; otherwise a logic 1 is stored. The bit line is connected to the
power supply through a resistor. To read the contents of the cell, the
word line is activated. A sense circuit at the end of the bit line
generates the proper output value. Data are written into a ROM when it
is manufactured.
5.3.2 PROM
Some ROMs allow the data to be loaded by the user. Such ROMs are
called Programmable ROMs (PROMs). A PROM is made
programmable by inserting a fuse at point P in Figure 5.12. A PROM
contains all 0s before it is programmed. The user can insert 1s at the
required locations by burning out the fuses at those locations using high
current pulses. This process is irreversible.
Storing information specific to a user in a ROM is expensive. PROMs
provide a faster and considerably less expensive approach, as they can
be programmed directly by the user.
5.3.3 EPROM
Another type of ROM chip allows the stored data to be erased and
new data to be loaded. Such a chip is called an Erasable Programmable
ROM (EPROM). It provides considerable flexibility during the
development phase of digital systems. An EPROM cell has a structure
similar to the ROM cell shown in Figure 5.12. However, the
connection to ground at P is made using a special transistor, which
can be turned off. This transistor can be programmed to behave like a
permanently open switch by injecting charge into it that becomes
trapped inside. An advantage of EPROM chips is that their contents can
be erased, by dissipating the charges trapped in the transistors, and
then reprogrammed. For this reason, EPROM chips are mounted in
packages that have transparent windows through which the chips can
be exposed to UV light for erasing the data.
5.3.4 EEPROM
Single flash chips are not sufficiently large, so flash cards and flash
drives are used to implement larger memory modules.
Flash cards
Flash Drives
Flash drives are designed to fully emulate hard disks. However, flash
drives have lower storage capacity. At the time these notes were written,
flash drives had capacities of less than one gigabyte.
Advantages:
1. Flash drives are solid-state electronic devices with no
movable parts. This leads to faster response.
2. Flash drives consume less power, making them attractive for
battery-driven applications.
3. Flash drives are insensitive to vibration.
Disadvantages
1. Flash drives have smaller capacity and higher cost per bit than
hard disks.
2. Flash memory deteriorates after it has been written a number
of times, say one million times.
The next level in the hierarchy is the main memory, which is
implemented using DRAMs in the form of SIMMs, DIMMs, or
RIMMs. The main memory is much larger, but slower, than the cache
memory. The access time of the main memory is about 10 times longer
than that of the L1 cache. Since magnetic disks provide inexpensive
storage, they are used as secondary storage, as shown in Figure 5.13.
5.5 Cache Memories
At any given point of time, only some blocks in the main memory can
be held in the cache. The correspondence between the blocks in the
main memory and those in the cache is determined by a “mapping
function”.
The processor need not know about the existence of the cache. It simply
issues Read or Write requests. The cache control hardware determines
whether the requested word is currently in the cache. If the
word exists in the cache, the Read or Write operation is performed on
the cached copy. In this case, a read hit or write hit is said to have occurred.
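The lookup performed by the cache control hardware can be sketched as
follows. The notes do not fix a particular mapping function, so this
example assumes direct mapping (each memory block can go to exactly one
cache block); the class name DirectMappedCache and the sizes used are
also assumptions made for illustration.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that only tracks which block is resident."""

    def __init__(self, num_blocks, block_size):
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.tags = [None] * num_blocks   # one tag per cache block

    def access(self, address):
        """Return 'hit' or 'miss' for a word address, loading on a miss."""
        block_number = address // self.block_size
        index = block_number % self.num_blocks   # assumed mapping function
        tag = block_number // self.num_blocks
        if self.tags[index] == tag:
            return "hit"
        self.tags[index] = tag                   # bring the block into the cache
        return "miss"


cache = DirectMappedCache(num_blocks=4, block_size=16)
results = [cache.access(a) for a in (0, 8, 64, 0)]
```

Addresses 0 and 8 lie in the same block, so the second access hits;
address 64 maps to the same cache block and displaces it, so the final
access to address 0 misses again.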
If the data is not present in the cache, then a Read miss or Write miss
occurs. When a read miss occurs, the block of words containing this
requested word is transferred from the memory. After the block is
loaded, the desired word is forwarded to the processor. Alternatively,
6.1 PROCESSORS
A processor is the logic circuitry that responds to and processes the basic instructions that
drive a computer. The four primary functions of a processor are fetch, decode, execute and
write back.
Most processors today are multi-core, which means that the IC (Integrated Circuit) contains
two or more processors for enhanced performance, reduced power consumption and more
efficient simultaneous processing of multiple tasks. Multi-core
setups are similar to having multiple, separate processors installed in the same computer, but
because the processors are actually plugged into the same socket, the connection between
them is faster.
Various processor families can be categorised according to clock rate versus cycles
per instruction (CPI), as shown in the figure.
RISC and CISC processors designed for multi-core chips, embedded applications,
low cost or low power consumption tend to have lower clock speeds. However,
high-performance processors must necessarily be designed to operate at high clock
speeds.
VLIW (very long instruction word) processors use more functional units than
superscalar processors.
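The trade-off between clock rate and CPI can be made concrete with the standard relation: execution time = instruction count x CPI / clock rate. The sketch below applies it to two hypothetical processors; the function name and all the numbers are assumptions chosen only to illustrate the arithmetic.

```python
def execution_time_us(instruction_count, cpi, clock_mhz):
    """Execution time in microseconds: total cycles divided by cycles per microsecond."""
    return instruction_count * cpi / clock_mhz


# Hypothetical processors running the same one-million-instruction program.
low_cpi_time = execution_time_us(1_000_000, cpi=1, clock_mhz=500)
high_clock_time = execution_time_us(1_000_000, cpi=4, clock_mhz=1000)
```

In this made-up case the processor with half the clock rate still finishes first, because its CPI is four times lower. Neither clock rate nor CPI alone determines performance.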
as the boundary between software and hardware. ISA specifies the addressing modes used
for accessing data operands and the processor registers available for use by the instructions.
The first two philosophies of instruction set design were reduced (RISC) and complex (CISC).
Figure 6.2
Figure 6.1
RISC processors aim for a CPI (cycles per instruction) close to one. This is due to the
optimization of each instruction on the CPU and a technique called pipelining.
RISC instructions are simpler and can execute as fast as microinstructions on CISC
machines.
RISC families include DEC Alpha, AMD Am29000, ARC, ARM, Atmel
AVR, Blackfin, Intel i860 and i960, MIPS, Motorola 88000,
PA-RISC, Power (including PowerPC), RISC-V, SuperH, and SPARC.
CISC is based on the concept of using a very large instruction set containing simple as well
as complex instructions, keeping the program length as small as possible.
CISC instructions vary in size, specify a sequence of operations, and require serial
(slow) decoding algorithms.
CISC processors tend to have few registers, and the registers may be special purpose.
To add flexibility to the instruction set, they support more numerous and more complex
addressing modes.
Examples of CISC are Intel x86, Motorola 68000 series, DEC VAX etc.
RISC architectures use separate instruction and data caches. Hence their access paths are
different. In CISC processors, by contrast, there is a unified cache for holding both instructions
and data. Therefore instructions and data have to share the same path.
Hardwired control is found in most RISC processors, while traditional CISC
processors use microprogrammed control. Thus a control memory is needed in CISC
processors. Modern CISC processors may also use hardwired control.
placement of fields
Super-pipelining is the breaking of stages of a given pipeline into smaller stages (thus
making the pipeline deeper) in an attempt to shorten the clock period
Simple pipelined system performs only one pipeline stage per clock cycle.
Super pipeline is capable of performing two pipeline stages per clock cycle.
Superscalar performs only one pipeline stage per clock cycle in each parallel pipeline.
Advantages:
1. Hardware detects parallelism between instructions.
2. Hardware tries to issue as many instructions as possible in parallel.
Disadvantage:
1. Very complex, much hardware is needed for run time detection.
2. Power consumption can be very large
When a processor has more than one core to execute all necessary functions of a computer,
it is known as a multi-core processor. In other words, it is a chip with more than
one CPU.
A multi-core CPU design requires much less printed circuit board space than multi-chip
designs.
A dual-core processor uses slightly less power than two coupled single-core
processors. Multi-core processors are used in high-performance scientific applications and
also in mobile handsets.
Arithmetic Pipeline :
The arithmetic logic units of a computer can be segmented for pipeline
operations in various data formats. Processors can have multiple arithmetic
logic units. Well-known arithmetic pipeline examples are the four-stage pipes
used in the Star-100, the eight-stage pipes used in the TI-ASC, up to 14 pipeline
stages used in the Cray-1, and up to 26 stages per pipe in the Cyber-205.
Instruction Pipelining :
The execution of a stream of instructions can be pipelined by overlapping the
execution of the current instruction with the fetch, decode, and operand fetch of
subsequent instructions. This technique is also known as instruction lookahead.
Almost all high-performance computers are now equipped with instruction-
execution pipelines.
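The benefit of overlapping can be estimated with a simple cycle count. The
sketch below assumes an ideal four-stage pipeline (fetch, decode, operand
fetch, execute) in which every stage takes one clock cycle and there are no
stalls. The function names are hypothetical, and real pipelines lose some
of this gain to hazards.

```python
def unpipelined_cycles(stages, instructions):
    """Each instruction passes through every stage before the next one starts."""
    return stages * instructions


def pipelined_cycles(stages, instructions):
    """The first instruction fills the pipe; each later one finishes a cycle apart."""
    return stages + instructions - 1


serial = unpipelined_cycles(4, 100)
overlapped = pipelined_cycles(4, 100)
```

For 100 instructions the ideal pipeline needs 103 cycles instead of 400,
so the speedup approaches the number of stages as the instruction
stream gets longer.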
Processor Pipelining :
This refers to the pipeline processing of the same data stream by a cascade of
processors, each of which processes a specific task. The data stream passes the
first processor with results stored in a memory block which is also accessible
by the second processor. The second processor then passes the refined results
to the third, and so on.
A static pipeline may assume only one functional configuration at a time. Static
pipelines can be either unifunctional or multi-functional. Pipelining is made
possible in static pipes only if instructions of the same type are to be executed
continuously. The function performed by a static pipeline should not change
frequently. Otherwise, its performance may be very low. A dynamic pipeline
processor permits several functional configurations to exist simultaneously. In
this sense, a dynamic pipeline must be multifunctional. On the other hand, a
unifunctional pipe must be static. The dynamic configuration needs much more
elaborate control and sequencing mechanisms than those for static pipelines.
Most existing computers are equipped with static pipes, either unifunctional or
multifunctional.