
Digital Electronics and Computer Architecture (DECA)

CPU and its Organization (Unit-4):


(Complete Unit)
Central Processing Unit: Introduction, General Register Organization,
Operation of Control Unit, Control Word, Stack Organization, Instruction Format, Various
Addressing Modes, RISC and CISC Characteristics, Introduction to Parallel Processing,
Flynn’s Classification of Computers, Pipelining, Pipeline Hazards, Direct Memory Access
(DMA), DMA Transfer, Input-Output Processor (IOP), General Introduction to Computer
memory: Memory hierarchy, Main Memory: RAM and ROM, Auxiliary Memory: Magnetic
Disks and Tape, Cache Memory

B.E.-CSE 1st Sem.

Department of Interdisciplinary Courses in Engineering (DICE)


&
Department of Computer Science and Engineering
Central Processing Unit (CPU)

The CPU is the brain of the computer: all data-processing operations and all the important functions of a computer are performed by the CPU.

The CPU consists of 3 major units, which are:


1. Memory or Storage Unit
2. Control Unit
3. ALU (Arithmetic Logic Unit)

2
Central Processing Unit (CPU)

The part of the computer that performs the bulk of data-processing operations is
called the central processing unit (CPU).

The CPU is made up of three major parts: the register set, the control unit, and the arithmetic logic unit (ALU).

The register set stores intermediate data used during the execution of instructions.

The ALU performs the required micro-operations for executing the instructions.

The control unit generates the sequence of various micro-operations, supervises the transfer of information among the registers, and instructs the ALU as to which operation to perform.

3
Central Processing Unit (Cont..)

4
Central Processing Unit (Cont..)

Memory or Storage Unit:

∙ The data and instructions required for processing are stored in the memory unit.
∙ It also stores the intermediate results of any calculation or task while they are in process.
∙ The final results of processing are stored in the memory unit before they are released to an output device.
∙ All inputs and outputs are transmitted through the memory unit.

5
Central Processing Unit (Cont..)

Control Unit:

∙ The control unit controls the transfer of data and instructions among the other parts of the computer.
∙ The control unit is responsible for managing all the units of the computer.
∙ The main task of the control unit is to obtain the instructions or data input from the memory unit, interpret them, and then direct the operation of the computer accordingly.
∙ The control unit is responsible for communication with Input and output devices
for the transfer of data or results from memory.
∙ The control unit is not responsible for the processing of data or storing data.

6
Central Processing Unit (Cont..)

ALU (Arithmetic Logic Unit) :

∙ Arithmetic Section
∙ Logic Section

7
MAJOR COMPONENTS OF CPU
• Storage Components
Registers
Flags

• Execution (Processing) Components


Arithmetic Logic Unit (ALU)
Arithmetic calculations, logical computations, shifts/rotates

• Transfer Components
Bus

• Control Components
Control Unit

[Diagram: Register File, ALU, and Control Unit]
REGISTERS
• In Basic Computer, there is only one general purpose register, the
Accumulator (AC)
• In modern CPUs, there are many general purpose registers
• It is advantageous to have many registers
– Transfers between registers within the processor are relatively fast
GENERAL REGISTER ORGANIZATION
- In the basic computer, memory locations are needed to store addresses, operands, pointers, and temporary results.
- Having to refer to memory locations for such applications is time consuming, because memory access is the most time-consuming operation in a computer.
- It is more convenient and more efficient to store these intermediate values in processor registers.
[Figure: Bus organization of seven CPU registers. Registers R1-R7 (with clock and load inputs) feed two multiplexers selected by SELA and SELB, which drive the A bus and B bus into the ALU; OPR selects the ALU operation, and a 3x8 decoder driven by SELD selects the destination register for the ALU output.]
OPERATION OF CONTROL UNIT
The control unit :
Directs the information flow through ALU by
- Selecting various Components in the system
- Selecting the Function of ALU

Example: R1 ← R2 + R3
[1] MUX A selector (SELA): BUS A ← R2
[2] MUX B selector (SELB): BUS B ← R3
[3] ALU operation selector (OPR): ALU to ADD
[4] Decoder destination selector (SELD): R1 ← Out Bus

Control word format (14 bits):

SELA (3) | SELB (3) | SELD (3) | OPR (5)
Encoding of register selection fields:

Binary Code   SELA    SELB    SELD
000           Input   Input   None
001           R1      R1      R1
010           R2      R2      R2
011           R3      R3      R3
100           R4      R4      R4
101           R5      R5      R5
110           R6      R6      R6
111           R7      R7      R7
ALU CONTROL - Microoperations

Encoding of ALU operations (OPR):

OPR Select   Operation        Symbol
00000        Transfer A       TSFA
00001        Increment A      INCA
00010        Add A + B        ADD
00101        Subtract A - B   SUB
00110        Decrement A      DECA
01000        AND A and B      AND
01010        OR A and B       OR
01100        XOR A and B      XOR
01110        Complement A     COMA
10000        Shift right A    SHRA
11000        Shift left A     SHLA

Examples of ALU microoperations:

Microoperation    SELA    SELB   SELD   OPR    Control Word
R1 ← R2 − R3      R2      R3     R1     SUB    010 011 001 00101
R4 ← R4 OR R5     R4      R5     R4     OR     100 101 100 01010
R6 ← R6 + 1       R6      -      R6     INCA   110 000 110 00001
R7 ← R1           R1      -      R7     TSFA   001 000 111 00000
Output ← R2       R2      -      None   TSFA   010 000 000 00000
Output ← Input    Input   -      None   TSFA   000 000 000 00000
R4 ← shl R4       R4      -      R4     SHLA   100 000 100 11000
R5 ← 0            R5      R5     R5     XOR    101 101 101 01100
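The field packing above can be sketched in code. A minimal illustrative Python sketch (not part of the slides; the register and operation encodings are taken from the tables above) that assembles the 14-bit control word:

```python
# Illustrative sketch (not from the slides): packing SELA, SELB, SELD and OPR
# into the 14-bit control word SELA(3) | SELB(3) | SELD(3) | OPR(5).
SEL = {"Input": 0b000, "None": 0b000, "R1": 0b001, "R2": 0b010, "R3": 0b011,
       "R4": 0b100, "R5": 0b101, "R6": 0b110, "R7": 0b111}
OPR = {"TSFA": 0b00000, "INCA": 0b00001, "ADD": 0b00010, "SUB": 0b00101,
       "DECA": 0b00110, "AND": 0b01000, "OR": 0b01010, "XOR": 0b01100,
       "COMA": 0b01110, "SHRA": 0b10000, "SHLA": 0b11000}

def control_word(sela, selb, seld, opr):
    """Pack the four fields, with SELA in the most significant bits."""
    return (SEL[sela] << 11) | (SEL[selb] << 8) | (SEL[seld] << 5) | OPR[opr]

# R1 <- R2 - R3  ->  010 011 001 00101
print(format(control_word("R2", "R3", "R1", "SUB"), "014b"))
```

Running it reproduces the first row of the examples table.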
REGISTER STACK ORGANIZATION

Stack
- A very useful feature
- Also efficient for arithmetic expression evaluation
- Storage which can be accessed in LIFO (last-in, first-out) order
- Pointer: SP (stack pointer)
- Only PUSH and POP operations are applicable

[Figure: 64-word register stack. A 6-bit stack pointer (SP) addresses words 0-63; FULL and EMPTY flag bits mark the stack state, and DR is the data register. Items A, B, C occupy addresses 1-3, with SP pointing to the top item.]

PUSH and POP operations (for insertion and deletion):

/* Initially, SP = 0, EMPTY = 1, FULL = 0 */

PUSH                              POP
SP ← SP + 1                       DR ← M[SP]
M[SP] ← DR                        SP ← SP − 1
If (SP = 0) then (FULL ← 1)       If (SP = 0) then (EMPTY ← 1)
EMPTY ← 0                         FULL ← 0
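As an illustrative sketch (not part of the slides), the PUSH and POP microoperations can be modeled in Python. The wrap-around of the 6-bit SP is what lets SP = 0 signal FULL after a push and EMPTY after a pop:

```python
# Sketch of a 64-word register stack with SP, FULL and EMPTY flags,
# following the slide's microoperations; DR is the data being moved.
class RegisterStack:
    def __init__(self, size=64):
        self.mem = [0] * size
        self.sp, self.full, self.empty = 0, False, True

    def push(self, dr):
        if self.full:
            raise OverflowError("stack full")
        self.sp = (self.sp + 1) % len(self.mem)   # SP <- SP + 1 (6-bit wrap)
        self.mem[self.sp] = dr                    # M[SP] <- DR
        if self.sp == 0:
            self.full = True                      # if (SP = 0) FULL <- 1
        self.empty = False                        # EMPTY <- 0

    def pop(self):
        if self.empty:
            raise IndexError("stack empty")
        dr = self.mem[self.sp]                    # DR <- M[SP]
        self.sp = (self.sp - 1) % len(self.mem)   # SP <- SP - 1
        if self.sp == 0:
            self.empty = True                     # if (SP = 0) EMPTY <- 1
        self.full = False                         # FULL <- 0
        return dr

# The instruction sequence from the practice question below:
s = RegisterStack()
s.push(10); s.pop()
s.push(5); s.push(10); s.pop()
s.push(15); s.pop()
print(s.mem[s.sp])   # -> 5 (the only value left on the stack)
```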
INSTRUCTION FORMAT
• Instruction Fields
OP-code field - specifies the operation to be performed
Address field - designates memory address(es) or a processor register(s)
Mode field - determines how the address field is to be interpreted (to
get effective address or the operand)

• The number of address fields in the instruction format depends on the internal organization of the CPU

• The three most common CPU organizations:

Single accumulator organization:
ADD X            /* AC ← AC + M[X] */

General register organization:
ADD R1, R2, R3   /* R1 ← R2 + R3 */
ADD R1, R2       /* R1 ← R1 + R2 */
MOV R1, R2       /* R1 ← R2 */
ADD R1, X        /* R1 ← R1 + M[X] */

Stack organization:
PUSH X           /* TOS ← M[X] */
ADD
THREE- AND TWO-ADDRESS INSTRUCTIONS
• Three-Address Instructions
Program to evaluate X = (A + B) * (C + D) :
ADD R1, A, B    /* R1 ← M[A] + M[B] */
ADD R2, C, D    /* R2 ← M[C] + M[D] */
MUL X, R1, R2   /* M[X] ← R1 * R2 */

- Results in short programs (advantage)
- Instructions become long, requiring many bits (disadvantage)

• Two-Address Instructions
Program to evaluate X = (A + B) * (C + D) :

MOV R1, A    /* R1 ← M[A] */
ADD R1, B    /* R1 ← R1 + M[B] */
MOV R2, C    /* R2 ← M[C] */
ADD R2, D    /* R2 ← R2 + M[D] */
MUL R1, R2   /* R1 ← R1 * R2 */
MOV X, R1    /* M[X] ← R1 */
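The two-address program can be traced in ordinary code. A small Python sketch (not from the slides; the memory values are invented for illustration) mirroring each instruction:

```python
# Tracing X = (A + B) * (C + D) with a register dict R and a memory dict M.
# The operand values are made up for this example.
M = {"A": 2, "B": 3, "C": 4, "D": 5, "X": None}
R = {}

R["R1"] = M["A"]      # MOV R1, A
R["R1"] += M["B"]     # ADD R1, B
R["R2"] = M["C"]      # MOV R2, C
R["R2"] += M["D"]     # ADD R2, D
R["R1"] *= R["R2"]    # MUL R1, R2
M["X"] = R["R1"]      # MOV X, R1

print(M["X"])         # -> 45, i.e. (2 + 3) * (4 + 5)
```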
ONE-ADDRESS INSTRUCTIONS (Program)

ZERO-ADDRESS INSTRUCTIONS (Program)

A stack-organized computer uses the zero-address instruction format; in this type, the operands are stored on a push-down stack.
Practice Questions

Q. What will be the control word for R4 ← R6 OR R7?
Q. Write the control word for the following micro-operation: Output ← Input.
Q. What contents will be left in the stack after executing the following instructions?
PUSH 10
POP
PUSH 5
PUSH 10
POP
PUSH 15
POP
Answer: 5
ADDRESSING MODES

• Addressing Modes

* Specify a rule for interpreting or modifying the address field of the instruction (before the operand is actually referenced)

* A variety of addressing modes exist:
- to give programming flexibility to the user
- to use the bits in the address field of the instruction efficiently
TYPES OF ADDRESSING MODES

• Implied Mode
The address of the operand is specified implicitly in the definition of the instruction
- No need to specify an address in the instruction
- EA = AC, or EA = Stack[SP]
- Examples from the Basic Computer: CLA, CME, INP

• Immediate Mode
Instead of specifying the address of the operand, the operand itself is specified
- No need to specify an address in the instruction
- However, the operand itself needs to be specified
- Sometimes requires more bits than an address
- Fast to acquire an operand
TYPES OF ADDRESSING MODES
• Register Mode
The address specified in the instruction is a register address
- The designated operand needs to be in a register
- Shorter address than a memory address
- Saves address-field bits in the instruction
- Faster to acquire an operand than with memory addressing
- EA = IR(R)   (IR(R): register field of IR)

• Register Indirect Mode
The instruction specifies a register which contains the memory address of the operand
- Saves instruction bits, since a register address is shorter than a memory address
- Slower to acquire an operand than with either register addressing or memory addressing
- EA = [IR(R)]   ([x]: content of x)

• Autoincrement or Autodecrement Mode
- When the address in the register is used to access memory, the value in the register is incremented or decremented by 1 automatically
Addressing Modes

TYPES OF ADDRESSING MODES

• Direct Address Mode
The instruction specifies a memory address which can be used directly to access the memory
- Faster than the other memory addressing modes
- Too many bits are needed to specify the address for a large physical memory space
- EA = IR(addr)   (IR(addr): address field of IR)

• Indirect Addressing Mode
The address field of the instruction specifies the address of a memory location that contains the address of the operand
- When an abbreviated address is used, a large physical memory can be addressed with a relatively small number of bits
- Slow to acquire an operand because of the additional memory access
- EA = M[IR(address)]
Addressing Modes

TYPES OF ADDRESSING MODES


• Relative Addressing Modes

In this mode, the content of a designated register (such as the program counter) is added to the address part of the instruction in order to obtain the effective address.

The address field of the instruction specifies part of the address (an abbreviated address) which is used along with a designated register to calculate the address of the operand
- The address field of the instruction is short
- A large physical memory can be accessed with a small number of address bits
- EA = f(IR(address), R), where R is sometimes implied

Three different relative addressing modes, depending on R:
* PC-relative addressing mode (R = PC): EA = PC + IR(address)
* Indexed addressing mode (R = IX, where IX is an index register): EA = IX + IR(address)
* Base-register addressing mode (R = BAR, where BAR is the base address register): EA = BAR + IR(address)
ADDRESSING MODES - EXAMPLE and Practice Question

[Figure: A two-word instruction "Load to AC" with a mode field at address 200, its address field (= 500) at address 201, and the next instruction at address 202. PC = 200, R1 = 400, XR = 100. Memory contents: M[399] = 450, M[400] = 700, M[500] = 800, M[600] = 900, M[702] = 325, M[800] = 300.]

Addressing Mode     Effective Address   Content of AC
Direct address      500                 800    /* AC ← M[500] */
Immediate operand   201                 500    /* AC ← 500 */
Indirect address    800                 300    /* AC ← M[M[500]] */
Relative address    702                 325    /* AC ← M[PC + 500], PC = 202 */
Indexed address     600                 900    /* AC ← M[XR + 500] */
Register            -                   400    /* AC ← R1 */
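The example can be checked in code. A small Python sketch (not from the slides) that recomputes each effective address and AC value from the memory contents shown in the figure:

```python
# Memory contents from the slide's example; addresses 200-202 hold the
# two-word instruction and the next instruction.
memory = {200: "load AC", 201: 500, 202: "next instr", 399: 450, 400: 700,
          500: 800, 600: 900, 702: 325, 800: 300}
R1, XR = 400, 100
addr = memory[201]      # address field of the instruction = 500
next_pc = 202           # PC after fetching the two-word instruction

modes = {
    "direct":    memory[addr],            # AC <- M[500]       = 800
    "immediate": addr,                    # AC <- 500
    "indirect":  memory[memory[addr]],    # AC <- M[M[500]]    = 300
    "relative":  memory[next_pc + addr],  # AC <- M[202 + 500] = 325
    "indexed":   memory[XR + addr],       # AC <- M[100 + 500] = 900
    "register":  R1,                      # AC <- R1           = 400
}
for mode, ac in modes.items():
    print(f"{mode:9s} AC = {ac}")
```

Note that the relative mode uses the updated PC (202), since the PC has already moved past the two-word instruction.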
Practice Question-Cont…

25
Practice Question-Cont…

26
RISC and CISC

• Reduced Instruction Set Computer (RISC):
The main idea is to make the hardware simpler by using an instruction set composed of a few basic operations for loading, evaluating, and storing; for example, a load command loads data and a store command stores data.

• Complex Instruction Set Computer (CISC):
The main idea is that a single instruction performs all loading, evaluating, and storing operations; for example, a multiplication command also loads data, evaluates, and stores the result, hence it is complex.

27
Continue..

Both approaches try to increase CPU performance:

• RISC: Reduces the cycles per instruction at the cost of the number of instructions per program.

• CISC: Attempts to minimize the number of instructions per program at the cost of an increase in the number of cycles per instruction.

A RISC processor is comparatively faster and consumes less power.

28
Characteristic of RISC

Earlier, when programming was done in assembly language, there was a need to make instructions do more tasks, because assembly programming was tedious and error-prone; this led to the CISC architecture. With the rise of high-level languages, dependency on assembly reduced and the RISC architecture prevailed.

Characteristic of RISC –
1.Simpler instruction, hence simple instruction decoding.
2.Instruction comes in the size of one word.
3.Instruction takes a single clock cycle to get executed.
4.More general-purpose registers.
5.Simple Addressing Modes.
6.Fewer Data types.
7. Pipelining can be achieved.

29
Characteristic of CISC

Characteristic of CISC –
1.Complex instruction, hence complex instruction decoding.
2.Instructions are larger than one-word size.
3.Instruction may take more than a single clock cycle to get executed.
4. Fewer general-purpose registers, as operations are performed in memory itself.
5.Complex Addressing Modes.
6.More Data types.

CISC is a processor developed with a full (complex) set of instructions, aimed at providing the full processor capability in the most efficient manner.

30
Continue..

Example: Suppose we have to add two 8-bit numbers.

• CISC approach: There will be a single command or instruction, such as ADD, which performs the whole task.

• RISC approach: Here the programmer first writes a load command to move the data into registers, then applies a suitable operator, and then stores the result in the desired location.

So the add operation is divided into parts (load, operate, store), due to which RISC programs are longer and require more memory, but the processor requires fewer transistors because the commands are less complex.
31
Difference

RISC                                              CISC
Focus on software                                 Focus on hardware
Uses only a hardwired control unit                Uses both hardwired and microprogrammed control units
Transistors are used for more registers           Transistors are used for storing complex instructions
Fixed-size instructions                           Variable-size instructions
Performs only register-to-register arithmetic     Performs REG-to-REG, REG-to-MEM, or MEM-to-MEM operations
Requires more registers                           Requires fewer registers
Code size is large                                Code size is small
An instruction executes in a single clock cycle   An instruction takes more than one clock cycle
An instruction fits in one word                   Instructions are larger than one word

32
Parallel Processing

33
Parallel Processing

• Parallel processing can be described as a class of techniques which enable a system to perform simultaneous data-processing tasks in order to increase its computational speed.

• The primary purpose of parallel processing is to speed up the computer's processing capability and increase its throughput, i.e. the amount of processing that can be accomplished during a given interval of time.

• A parallel processing system is able to perform concurrent data processing to achieve faster execution time. For example, while one instruction is being executed in the ALU, the next instruction can be read from memory.

34
Parallel Processing
The following diagram shows one possible way of separating the execution unit into eight functional units operating in parallel. The operation performed in each functional unit is indicated in each block of the diagram:

35
Parallel Processing
• The adder and integer multiplier perform arithmetic operations on integer numbers.
• The floating-point operations are separated into three circuits operating in parallel.
• The logic, shift, and increment operations can be performed concurrently on different data. All units are independent of each other, so one number can be shifted while another number is being incremented.
36
Flynn's Classification of Computers

M.J. Flynn proposed a classification for the organization of a computer system by the
number of instructions and data items that are manipulated simultaneously.
• The sequence of instructions read from memory constitutes an instruction
stream.
• The operations performed on the data in the processor constitute a data stream.
• Parallel processing may occur in the instruction stream, in the data stream, or
both.
Flynn's classification divides computers into four major groups that are:
1. Single instruction stream, single data stream (SISD)
2. Single instruction stream, multiple data stream (SIMD)
3. Multiple instruction stream, single data stream (MISD)
4. Multiple instruction stream, multiple data stream (MIMD)

37
Flynn's Classification of Computers
SISD stands for 'Single Instruction and Single Data Stream'. It represents the organization of a single computer containing a control unit, a processor unit, and a memory unit. Most conventional computers have the SISD architecture, like the traditional von Neumann computers.

Where CU = Control Unit, PE = Processing Element, and M = Memory.

Instructions are decoded by the control unit, which then sends them to the processing unit for execution.

38
Flynn's Classification of Computers - SIMD
SIMD stands for 'Single Instruction and Multiple Data Stream'. It represents an organization that includes many processing units under the supervision of a common control unit.

All processors receive the same instruction from the control unit but operate on different items of data. The shared memory unit must contain multiple modules so that it can communicate with all the processors simultaneously.

SIMD is mainly dedicated to array-processing machines; however, vector processors can also be seen as part of this group.
Flynn's Classification of Computers

MISD stands for 'Multiple Instruction and Single Data stream'.


MISD structure is only of theoretical interest since no practical system has been
constructed using this organization.
In MISD, multiple processing units operate on one single data stream. Each processing unit operates on the data independently via a separate instruction stream.

40
Flynn's Classification of Computers

MIMD stands for 'Multiple Instruction and Multiple Data Stream'.


In this organization, all processors in a parallel computer can execute different
instructions and operate on various data at the same time.
In MIMD, each processor has a separate program and an instruction stream is
generated from each program.

Example:
Cray T90, Cray T3E, IBM-SP2

41
Pipelining

42
Pipelining
• The term Pipelining refers to a technique of decomposing a sequential process
into sub-operations, with each sub-operation being executed in a dedicated
segment that operates concurrently with all other segments.

• The most important characteristic of a pipeline technique is that several


computations can be in progress in distinct segments at the same time.

• The overlapping of computation is made possible by associating a register with


each segment in the pipeline. The registers provide isolation between each
segment so that each can operate on distinct data simultaneously.

• The structure of a pipeline organization can be represented simply by including an input register for each segment, followed by a combinational circuit.

43
Pipelining

Let us consider an example of a combined multiplication and addition operation to get a better understanding of pipeline organization.

The combined multiplication and addition operation is performed on a stream of numbers:

Ai * Bi + Ci   for i = 1, 2, 3, ..., 7

The operation to be performed on the numbers is decomposed into sub-operations, each implemented in a segment within the pipeline.
44
Pipelining
The sub-operations performed in each segment of the pipeline are defined as:

R1 ← Ai, R2 ← Bi           Input Ai and Bi
R3 ← R1 * R2, R4 ← Ci      Multiply, and input Ci
R5 ← R3 + R4               Add Ci to the product

The following block diagram represents the combined operation as well as the sub-operations performed in each segment of the pipeline.
Pipelining

Ai* Bi + Ci for i = 1, 2, 3, ......., 7

46
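The three-segment flow can be simulated clock by clock. A rough Python sketch (not from the slides; the input values are invented for illustration), where the segment registers pass their contents down one stage per tick:

```python
# Simulate the 3-segment pipeline for Ai*Bi + Ci. Each loop iteration is one
# clock tick; seg1/seg2 model the segment registers (R1,R2) and (R3,R4).
A = [1, 2, 3, 4, 5, 6, 7]
B = [7, 6, 5, 4, 3, 2, 1]
C = [10, 20, 30, 40, 50, 60, 70]

seg1 = seg2 = None
results = []                                # values leaving R5
for tick in range(len(A) + 2):              # 2 extra ticks drain the pipeline
    if seg2 is not None:                    # segment 3: R5 <- R3 + R4
        r3, r4 = seg2
        results.append(r3 + r4)
    if seg1 is not None:                    # segment 2: R3 <- R1*R2, R4 <- Ci
        i = seg1
        seg2 = (A[i] * B[i], C[i])
    else:
        seg2 = None
    seg1 = tick if tick < len(A) else None  # segment 1: R1 <- Ai, R2 <- Bi

print(results)
```

Seven results emerge after 7 + 2 ticks, illustrating how a k-segment pipeline finishes n tasks in k + (n - 1) cycles.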
Pipelining

In general, the pipeline organization is applicable for two areas of computer design
which includes:
1. Arithmetic Pipeline
2. Instruction Pipeline

47
Pipelining

Arithmetic Pipeline
• Arithmetic Pipelines are mostly used in high-speed computers.
• They are used to implement floating-point operations, multiplication of fixed-point
numbers, and similar computations encountered in scientific problems.

To understand the concepts of the arithmetic pipeline in a more convenient way, let us consider an example of a pipeline unit for floating-point addition and subtraction.

The inputs to the floating-point adder pipeline are two normalized floating-point binary numbers, defined as:

X = A * 2^a
Y = B * 2^b

where A and B are two fractions that represent the mantissas and a and b are the exponents. As a decimal example: X = 0.9504 * 10^3 and Y = 0.8200 * 10^2.
48
Pipelining

Arithmetic Pipeline
The combined operation of floating-point addition and subtraction is divided into
four segments. Each segment contains the corresponding suboperation to be
performed in the given pipeline. The suboperations that are shown in the four
segments are:
1. Compare the exponents by subtraction.
2. Align the mantissas.
3. Add or subtract the mantissas.
4. Normalize the result.

49
Pipelining

Arithmetic Pipeline

X = 0.9504 * 10^3
Y = 0.8200 * 10^2

Align the mantissas (equalize the exponents):

X = 0.9504 * 10^3
Y = 0.0820 * 10^3

After addition:

Z = 1.0324 * 10^3

Normalize the result:

Z = 0.10324 * 10^4
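The four suboperations can be traced in code. A hedged Python sketch (not from the slides) that works on decimal (mantissa, exponent) pairs rather than binary hardware; the normalization loop only handles the mantissa-overflow case that occurs in this example:

```python
# Four-segment floating-point addition on (mantissa, exponent) pairs.
def fp_add(x, y):
    (a, ea), (b, eb) = x, y
    diff = ea - eb                    # 1. compare exponents by subtraction
    if diff > 0:                      # 2. align the smaller-exponent mantissa
        b, eb = b / 10**diff, ea
    else:
        a, ea = a / 10**(-diff), eb
    z = a + b                         # 3. add the mantissas
    while abs(z) >= 1:                # 4. normalize (mantissa below 1)
        z, ea = z / 10, ea + 1
    return round(z, 6), ea

print(fp_add((0.9504, 3), (0.8200, 2)))   # -> (0.10324, 4)
```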
Pipelining

Instruction Pipeline
• Pipeline processing can occur not only in the data stream but in the instruction
stream as well.
• Most of the digital computers with complex instructions require instruction
pipeline to carry out operations like fetch, decode and execute instructions.
In general, the computer needs to process each instruction with the following
sequence of steps.
1. Fetch instruction from memory.
2. Decode the instruction.
3. Calculate the effective address.
4. Fetch the operands from memory.
5. Execute the instruction.
6. Store the result in the proper place.
Pipelining

Instruction Pipeline
Each step is executed in a particular segment, and there are times when different
segments may take different times to operate on the incoming information.

Moreover, there are times when two or more segments may require memory
access at the same time, causing one segment to wait until another is finished
with the memory.

The organization of an instruction pipeline will be more efficient if the instruction cycle is divided into segments of equal duration. One of the most common examples of this type of organization is the four-segment instruction pipeline.

52
Pipelining
Instruction Pipeline
• A four-segment instruction pipeline combines two or more different segments
and makes it as a single one.

• For instance, the decoding of the instruction can be combined with the
calculation of the effective address into one segment.

The following block diagram (next slide) shows a typical example of a four-
segment instruction pipeline. The instruction cycle is completed in four segments.

Segment 1: The instruction fetch segment can be implemented using first in, first
out (FIFO) buffer.
Segment 2: The instruction fetched from memory is decoded in the second
segment, and eventually, the effective address is calculated in a separate
arithmetic circuit.
Segment 3: An operand from memory is fetched in the third segment.
Segment 4: The instructions are finally executed in the last segment of the pipeline organization.
Pipelining

Instruction Pipeline

1. FI is the segment that fetches an instruction.
2. DA is the segment that decodes the instruction and calculates the effective address.
3. FO is the segment that fetches the operand.
4. EX is the segment that executes the instruction.

54
Pipelining Hazard

• Pipeline hazards are situations that prevent the next instruction in the instruction stream from executing during its designated clock cycle.
• Any condition that causes a stall in pipeline operation can be called a hazard.

There are primarily three types of hazards:


i. Data Hazards
ii. Control Hazards or instruction Hazards
iii. Structural Hazards

55
Pipelining Hazard

i. Data Hazards:

• A data hazard is any condition in which either the source or the destination
operands of an instruction are not available at the time expected in the
pipeline.

• As a result, some operation has to be delayed and the pipeline stalls. This happens whenever there are two instructions, one of which depends on data obtained from the other:

A = 3 + A
B = A * 4

For the above sequence, the second instruction needs the value of A computed by the first instruction; thus the second instruction is said to depend on the first.


56
Pipelining Hazard

ii. Structural Hazards:

• This situation arises mainly when two instructions require a given hardware
resource at the same time and hence for one of the instructions the pipeline
needs to be delayed.

• The most common case is when memory is accessed at the same time by two instructions: one instruction may need to access the memory as part of its execute or write-back phase while another instruction is being fetched.

• If both the instructions and the data reside in the same memory, the two instructions cannot proceed together, and one of them must be stalled until the other is done with its memory access.

• Thus, in general, sufficient hardware resources are needed to avoid structural hazards.
57
Pipelining Hazard

iii. Control hazards:

• The instruction fetch unit of the CPU is responsible for providing a stream of instructions to the execution unit. The instructions fetched by the fetch unit are in consecutive memory locations, and they are executed in order.

• However, a problem arises when one of the instructions is a branch to some other memory location. All the instructions fetched into the pipeline from consecutive memory locations are then invalid and need to be removed (also called flushing the pipeline). This induces a stall until new instructions are fetched from the memory address specified in the branch instruction.

• The time lost as a result is called the branch penalty. Often, dedicated hardware is incorporated in the fetch unit to identify branch instructions and compute branch addresses as soon as possible, reducing the resulting delay.
58
Example of Pipelining

Q.1 The number of clock cycles it takes to process 200 tasks in a six-segment pipeline is 205.

Explanation:
• Let there be n tasks to be performed in the pipelined processor.
• The first task takes k cycles (one per segment) to exit the pipeline, but each of the other n - 1 tasks takes only 1 additional cycle, i.e. a total of n - 1 cycles.
• So, performing n tasks in a k-segment pipelined processor takes k + (n - 1) cycles.
• In our case, the number of clock cycles = 6 + (200 - 1) = 205.

59
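The timing formula from the explanation above, as a one-line Python sketch:

```python
# A k-segment pipeline needs k cycles for the first task
# and one more cycle for each of the remaining n - 1 tasks.
def pipeline_cycles(k, n):
    return k + (n - 1)

print(pipeline_cycles(6, 200))   # -> 205
```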
Pipelining Question

60
Cont…

61
Practice Questions

62
Cont…
Flynn's classification categorizes computer architectures based on:
A) Memory hierarchy
B) Instruction set architecture
C) Instruction and data streams
D) Pipelining techniques

Ans. C)

Which of the following architectures executes a single instruction on multiple data streams simultaneously?
A) SIMD
B) MIMD
C) MISD
D) SISD
Ans. A)

In MIMD architecture, multiple processors execute:


A) Different instructions on different data
B) Different instructions on the same data
C) Same instructions on different data
D) Same instructions on the same data
Ans. A)

Which type of hazard occurs when the pipeline must be stalled or flushed due to a change in the control flow of instructions?
A) Data hazard
B) Control hazard
C) Structural hazard
D) Pipeline stall
Ans. B)

63
Cont…

64
Direct Memory Access (DMA)

65
Direct Memory Access (DMA)

• The transfer of data between a fast storage device such as a magnetic disk and memory is often limited by the speed of the CPU.
• Removing the CPU from the path and letting the I/O device manage the memory buses directly improves the speed of transfer.
• This transfer technique is called direct memory access (DMA).
• During a DMA transfer, the CPU is idle and has no control of the memory buses.
• A DMA controller takes over the buses to manage the transfer directly between the I/O device and memory.

66
Direct Memory Access (DMA)
• The bus request (BR) input is used by the DMA controller to request that the CPU give up control of the buses.
• When this input is active, the CPU terminates the execution of the current instruction and places the address bus, data bus, and read and write lines into a disabled mode (i.e. their outputs are disconnected).
• The CPU activates the bus grant (BG) output to inform the external DMA controller that the buses are in the disconnected mode, giving it control of the buses to conduct memory transfers.
• When the DMA controller terminates the transfer, it disables the bus request line, informing the CPU to return to normal operation.

67
DMA Controller

A DMA controller is a hardware device that allows I/O devices to directly access memory with little participation of the processor.

68
Direct Memory Access (DMA) Transfer

• DMA controller has to share the bus with the processor to make the data transfer. The device
that holds the bus at a given time is called bus master.

• When a transfer from an I/O device to the memory, or vice versa, has to be made, the processor stops the execution of the current program, increments the program counter, moves data onto the stack, and then sends a DMA select signal to the DMA controller over the address bus.

• If the DMA controller is free, it requests the control of bus from the processor by raising the
bus request signal. Processor grants the bus to the controller by raising the bus grant signal,
now DMA controller is the bus master.

• The processor initiates the DMA controller by sending the memory addresses, number of
blocks of data to be transferred and direction of data transfer.

• After assigning the data-transfer task to the DMA controller, instead of waiting idly until completion of the data transfer, the processor resumes execution of the program after retrieving instructions from the stack.
.
69
Direct Memory Access (DMA) Transfer

• DMA controller now has the full control of buses and can interact directly with memory and
I/O devices independent of CPU. It makes the data transfer according to the control
instructions received by the processor.

• After completion of data transfer, it disables the bus request signal and CPU disables the bus
grant signal thereby moving control of buses to the CPU.

• When an I/O device wants to initiate the transfer then it sends a DMA request signal to the
DMA controller, for which the controller acknowledges if it is free.

• The controller then requests the bus from the processor by raising the bus request signal.
After receiving the bus grant signal, it transfers the data from the device. An n-channel DMA
controller can have n external devices connected to it.
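The request/grant handshake described above can be sketched as a small simulation. Everything here (the Cpu and DmaController classes and their signal methods) is illustrative, not a real device-driver API:

```python
# Minimal sketch of the DMA bus request/grant handshake described above.
# All names (Cpu, DmaController, grant_bus, ...) are illustrative assumptions.

class Cpu:
    def __init__(self):
        self.bus_master = True          # CPU starts as bus master

    def grant_bus(self):
        # CPU raises bus grant: the DMA controller becomes bus master.
        self.bus_master = False
        return True

    def reclaim_bus(self):
        # Bus grant is dropped: control of the buses returns to the CPU.
        self.bus_master = True


class DmaController:
    def __init__(self, cpu):
        self.cpu = cpu
        self.address = None
        self.count = 0

    def setup(self, address, count):
        # CPU initializes the starting address and the word count.
        self.address = address
        self.count = count

    def transfer(self, data):
        # Raise bus request; proceed only once the CPU grants the bus.
        assert self.cpu.grant_bus()
        moved = data[:self.count]       # move 'count' words starting at 'address'
        self.cpu.reclaim_bus()          # drop bus request; CPU is master again
        return moved


cpu = Cpu()
dma = DmaController(cpu)
dma.setup(address=0x2000, count=3)
print(dma.transfer([10, 20, 30, 40]))   # [10, 20, 30]
print(cpu.bus_master)                   # True: control returned to the CPU
```

The CPU keeps bus mastership at all times except inside `transfer`, mirroring the request/grant sequence in the bullets above.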

70
Direct Memory Access (DMA) Transfer
Transfer of Data by the DMA Controller

71
Direct Memory Access (DMA) Transfer

The DMA controller transfers data in one of three modes:

a) Burst Mode: In this mode the DMA controller hands the buses back to the CPU only
after completion of the whole data transfer. Meanwhile, if the CPU requires the bus, it
has to remain idle and wait for the transfer to finish.

b) Cycle Stealing Mode: In this mode, the DMA controller returns control of the buses to
the CPU after the transfer of every byte. It continuously issues a request for bus control,
transfers one byte and returns the bus. This way the CPU does not have to wait long if it
needs the bus for a higher-priority task.

c) Transparent Mode: Here, the DMA controller transfers data only while the CPU is
executing instructions that do not require the use of the buses.
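The difference between burst and cycle-stealing mode above comes down to how often the bus is handed back to the CPU, which can be sketched as follows (the function names and byte values are illustrative):

```python
# Illustrative comparison of burst vs cycle-stealing DMA transfer modes.
# 'releases' counts how many times the bus is handed back to the CPU.

def burst(data):
    """Burst mode: the bus is released to the CPU once, after the whole block."""
    releases = 1
    return list(data), releases

def cycle_stealing(data):
    """Cycle stealing: the bus is returned to the CPU after every byte."""
    transferred, releases = [], 0
    for byte in data:
        transferred.append(byte)   # steal one cycle, move one byte
        releases += 1              # give the bus back to the CPU
    return transferred, releases

block = [0x41, 0x42, 0x43, 0x44]
print(burst(block))            # ([65, 66, 67, 68], 1)
print(cycle_stealing(block))   # ([65, 66, 67, 68], 4)
```

Both modes move the same data; cycle stealing simply interleaves bus ownership byte by byte, which is why the CPU waits less at the cost of a slower overall transfer.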

72
Practice Questions

In DMA transfers, the required signals and addresses are given by the __________
a) Processor
b) Device drivers
c) DMA controllers
d) The program itself
Answer: c) DMA controllers

73
Cont…

The DMA controller has _______ registers.
a) 4
b) 2
c) 3
d) 1
Answer: c) 3 registers (address register, word-count register and control register)

74
Cont…

Which of the following is used to request the bus from the main CPU?
a) data bus
b) address bus
c) bus requester
d) interrupt signal
Answer: c) bus requester (the bus request signal)

75
Input Output Processors (IOP)

76
Input Output Processors (IOP)
• The DMA mode of data transfer reduces CPU’s overhead in handling
I/O operations. It also allows parallelism in CPU and I/O operations.
• Such parallelism is necessary to avoid wastage of valuable CPU time
while handling I/O devices whose speeds are much slower as
compared to CPU.
• The concept of DMA operation can be extended to relieve the CPU
further from getting involved with the execution of I/O operations.
• This gives rise to the development of a special-purpose processor
called the Input-Output Processor (IOP).
• The Input Output Processor (IOP) is just like a CPU that handles the
details of I/O operations.
• It is equipped with more facilities than are available in a typical
DMA controller.
• The IOP can fetch and execute its own instructions that are
specifically designed to characterize I/O transfers.
• In addition to I/O-related tasks, it can perform other processing
tasks like arithmetic, logic, branching and code translation.
77
Block Diagram of IOP (Input output Processor)

78
IOP
• The Input Output Processor is a specialized processor which loads and stores data
into memory along with executing I/O instructions.
• It acts as an interface between the system and its devices. It carries out a sequence
of events to execute I/O operations and then stores the results into memory.

Advantages –
• The I/O devices can directly access main memory without intervention by the
processor in I/O-processor-based systems.
• It is used to address the problems that arise in the direct memory access method.
•Reduced processor workload: With an I/O processor, the main processor doesn’t
have to deal with I/O operations, allowing it to focus on other tasks. This results in more
efficient use of the processor’s resources and can lead to faster overall system
performance.
• Improved data transfer rates: Since the I/O processor can access memory directly,
data transfers between I/O devices and memory can be faster and more efficient than
with other methods.
• Scalability: I/O processor based systems can be designed to scale easily, allowing
for additional I/O processors to be added as needed. This can be particularly useful in
large-scale data centers or other environments where the number of I/O devices is
constantly changing.
79
Disadvantages of IOP

Disadvantages –

1.Cost: I/O processors can add significant cost to a system due to the additional
hardware and complexity required. This can be a barrier to adoption, especially for
smaller systems.
2.Increased complexity: The addition of an I/O processor can increase the overall
complexity of a system, making it more difficult to design, build, and maintain. This
can also make it harder to diagnose and troubleshoot issues.
3.Limited performance gains: While I/O processors can improve system
performance by offloading I/O tasks from the main processor, the gains may not be
significant in all cases. In some cases, the additional overhead of the I/O processor
may actually slow down the system.
4.Synchronization issues: With multiple processors accessing the same memory,
synchronization issues can arise, leading to potential data corruption or other
errors.

80
IOP –CPU Communication

81
IOP-CPU Communication

• There is a communication channel between the IOP and the CPU for carrying out
I/O tasks, which comes under computer architecture.
• This channel defines the commands executed by the IOP and the CPU while
performing programs.
• The CPU does not execute the I/O instructions itself; it only assigns the task of
initiating operations, and the instructions are executed by the IOP.
• An I/O transfer is instructed by the CPU; the IOP signals the CPU through an interrupt.
• The exchange is started by the CPU, which issues a "test IOP path" instruction to the
IOP, and then the communication begins.

• Whenever the CPU gets an interrupt from the IOP to access
memory, it sends a test path instruction to the IOP.
• The IOP executes it and checks its status; if the status returned to
the CPU is OK, the CPU gives a start instruction to the IOP, hands
over some control, and goes back to another (or the same) program.
After that the IOP is able to access memory for its program.
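The test-path/start sequence above can be sketched as follows; the Iop class, its status values, and the program address are all hypothetical illustrations, not a real instruction set:

```python
# Sketch of the CPU-IOP dialogue described above: test path, check status,
# then start I/O. All names and values here are illustrative assumptions.

class Iop:
    def __init__(self):
        self.status = "OK"
        self.log = []

    def test_path(self):
        # CPU sends "test IOP path"; the IOP checks and reports its status.
        self.log.append("test path")
        return self.status

    def start_io(self, program_addr):
        # CPU issues a start instruction; the IOP fetches and executes
        # its own I/O program, then reports completion.
        self.log.append(f"start I/O at {program_addr:#x}")
        return "transfer complete"


def cpu_handles_io(iop):
    if iop.test_path() == "OK":        # 1. test IOP path, check status
        return iop.start_io(0x4000)    # 2. start instruction; CPU then resumes
    return "I/O error"


iop = Iop()
print(cpu_handles_io(iop))   # transfer complete
print(iop.log)
```

After `start_io` is issued, a real CPU would go back to other work and only be interrupted again when the IOP finishes; the return value stands in for that completion interrupt.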
82
Questions on I/O processor

83
Basic Concepts of Memory Hierarchy
In the Computer System Design, Memory Hierarchy is an enhancement to
organize the memory such that it can minimize the access time. The Memory
Hierarchy was developed based on a program behavior known as locality of
references. The figure below clearly demonstrates the different levels of
memory hierarchy:
This Memory Hierarchy Design is divided into 2 main types:
1. External Memory or Secondary Memory – comprising magnetic disk, optical disk and
magnetic tape, i.e. peripheral storage devices which are accessible by the processor
via an I/O module.
2. Internal Memory or Primary Memory – comprising main memory, cache memory and
CPU registers. This is directly accessible by the processor.
We can infer the following characteristics of Memory Hierarchy Design from the
above figure:
1. Capacity: It is the total volume of information the memory can store. As we move
from top to bottom in the hierarchy, the capacity increases.
2. Access Time: It is the time interval between the read/write request and the
availability of the data. As we move from top to bottom in the hierarchy, the access
time increases.
3. Performance: In earlier computer systems designed without a memory hierarchy,
the speed gap between the CPU registers and main memory kept increasing because
of the large difference in access times; the hierarchy narrows this gap.
4. Cost per bit: As we move from bottom to top in the hierarchy, the cost per bit
increases.
Semi-Conductor Memories
A device for storing digital information that is fabricated by using integrated
circuit technology is known as semiconductor memory. Also known as
integrated-circuit memory, large-scale integrated memory, memory chip,
semiconductor storage, transistor memory.
Definition:- Semiconductor memory is the main memory element of a
microcomputer-based system and is used to store programs and data. The main
memory elements are nothing but semiconductor devices that store code and
information. Semiconductor memory is directly accessible by
the microprocessor. And the access time of the data present in the primary
memory must be compatible with the operating time of the microprocessor.
Thus semiconductor devices are preferred as primary memory. With the rapid
growth in the requirement for semiconductor memories there have been a
number of technologies and types of memory that have emerged. Names such
as ROM, RAM, EPROM, EEPROM, Flash memory, DRAM, SRAM, SDRAM, and
the very new MRAM can now be seen in the electronics literature. Each one
has its own advantages and areas in which it may be used.
Types of semiconductor memory: Electronic semiconductor memory
technology can be split into two main types or categories,
according to the way in which the memory operates :
1. RAM - Random Access Memory
2. ROM - Read Only Memory
1. Random Access Memory (RAM) As the name suggests,
RAM or random access memory is a form of semiconductor
memory technology that is used for reading and writing data
in any order - in other words as it is required by the
processor. It is used for such applications as the computer or
processor memory where variables and other storage are
required on a random basis. Data is stored and read many
times to and from this type of memory.
Random access memory is used in huge quantities in
computer applications as current day computing and
processing technology requires large amounts of memory to
enable them to handle the memory hungry applications used
today. Many types of RAM including SDRAM with its DDR3,
DDR4, and soon DDR5 variants are used in huge quantities.
Dynamic RAM is a form of random access memory. DRAM uses a capacitor
to store each bit of data, and the level of charge on each capacitor
determines whether that bit is a logical 1 or 0. However these capacitors
do not hold their charge indefinitely, and therefore the data needs to be
refreshed periodically. As a result of this dynamic refreshing it gains its
name of being a dynamic RAM. DRAM is the form of semiconductor
memory that is often used in equipment including personal computers
and workstations where it forms the main RAM for the computer. The
semiconductor devices are normally available as integrated circuits for use
in PCB assembly in the form of surface mount devices or less frequently
now as leaded components.
Disadvantages of DRAM-
• Complex manufacturing process.
• Data requires refreshing.
• More complex external circuitry required (read and refresh periodically).
• Volatile memory.
• Relatively slow operational speed.
SRAM stands for Static Random Access Memory. This form of semiconductor
memory gains its name from the fact that, unlike DRAM, the data does not need
to be refreshed dynamically. These semiconductor devices support faster read
and write times than DRAM (typically 10 ns against 60 ns for DRAM), and in
addition the cycle time is much shorter because SRAM does not need to pause
between accesses. However, SRAM consumes more power and is less dense and
more expensive than DRAM. As a result, SRAM is normally used for caches, while
DRAM is used as the main semiconductor memory technology.
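The 10 ns versus 60 ns figures quoted above lead directly to an effective access time once a cache hit ratio is assumed; the 0.9 hit ratio below is an arbitrary example value:

```python
# Effective access time for an SRAM cache in front of DRAM main memory,
# using the typical figures quoted above; the hit ratio is an assumption.

t_cache = 10    # ns, typical SRAM (cache) access time
t_main = 60     # ns, typical DRAM (main memory) access time
h = 0.9         # assumed cache hit ratio

# On a hit only the cache is accessed; on a miss main memory is accessed.
t_effective = h * t_cache + (1 - h) * t_main
print(f"effective access time = {t_effective:.0f} ns")   # 15 ns
```

Even a modest hit ratio pulls the average access time close to the SRAM figure, which is exactly why small fast caches in front of slower DRAM pay off.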
Synchronous DRAM- This form of semiconductor memory can run at faster
speeds than conventional DRAM. It is synchronized to the clock of the processor
and is capable of keeping two sets of memory addresses open simultaneously.
By transferring data alternately from one set of addresses, and then the other,
SDRAM cuts down on the delays associated with non-synchronous RAM, which
must close one address bank before opening the next. Within the SDRAM family
there are several types of memory technologies that are seen. These are
referred to by the letters DDR - Double Data Rate. DDR4 is currently the latest
technology, but this is soon to be followed by DDR5 which will offer some
significant improvements in performance.
MRAM This is Magneto-resistive RAM, or Magnetic RAM. It is a non-volatile RAM
memory technology that uses magnetic charges to store data instead of electric
charges. Unlike technologies including DRAM, which require a constant flow of
electricity to maintain the integrity of the data, MRAM retains data even when the
power is removed. An additional advantage is that it only requires low power for
active operation. As a result this technology could become a major player in the
electronics industry now that production processes have been developed to
enable it to be produced.
2. Read Only Memory (ROM) A ROM is a form of semiconductor memory
technology used where the data is written once and then not changed. In view
of this it is used where data needs to be stored permanently, even when the
power is removed - many memory technologies lose the data once the power is
removed. As a result, this type of semiconductor memory technology is widely
used for storing programs and data that must survive when a computer or
processor is powered down. For example, the BIOS of a computer will be stored
in ROM. As the name implies, data cannot be easily written to ROM.
PROM stands for Programmable Read Only Memory. It is a semiconductor memory
which can only have data written to it once; the data written to it is permanent.
These memories are bought in a blank format and are programmed using a special
PROM programmer. Typically a PROM will consist of an array of fusible links, some of which are "blown"
during the programming process to provide the required data pattern. The PROM
stores its data as a charge on a capacitor. There is a charge storage capacitor for each
cell and this can be read repeatedly as required. However it is found that after many
years the charge may leak away and the data may be lost. Nevertheless, this type of
semiconductor memory used to be widely used in applications where a form of ROM
was required, but where the data needed to be changed periodically, as in a
development environment, or where quantities were low.
a. EPROM- This is an Erasable Programmable Read Only Memory. This form of
semiconductor memory can be programmed and then erased at a later time. This is
normally achieved by exposing the silicon to ultraviolet light. To enable this to
happen there is a circular window in the package of the EPROM to enable the light
to reach the silicon of the chip. When the EPROM is in use, this window is normally
covered by a label, especially when the data may need to be preserved for an
extended period.
b. EEPROM- This is an Electrically Erasable Programmable Read Only Memory. Data
can be written to it and it can be erased using an electrical voltage. This is typically
applied to an erase pin on the chip. Like other types of PROM, EEPROM retains the
contents of the memory even when the power is turned off. Also like other types of
ROM, EEPROM is not as fast as RAM. EEPROM memory cells are made from
floating-gate transistors.
c. Flash memory Flash memory may be considered as a development of EEPROM
technology. Data can be written to it and it can be erased, although only in blocks, but
data can be read on an individual cell basis. To erase and re-program areas of the chip,
programming voltages at levels that are available within electronic equipment are
used. It is also non-volatile, and this makes it particularly useful. As a result Flash
memory is widely used in many applications including memory cards for digital
cameras, mobile phones, computer memory sticks and many other applications. Flash
memory stores data in an array of memory cells. The memory cells are made from
floating-gate transistors.
Disadvantages of Flash Memory
• Higher cost per bit than hard drives
• Slower than other forms of memory
• Limited number of write / erase cycles
• Data must be erased before new data can be written
• Data is typically erased and written in blocks
d. Phase Change Random Access Memory, P-RAM, or just Phase Change Memory, PCM. It
is based around a phenomenon where a form of chalcogenide glass changes its state or
phase between an amorphous state (high resistance) and a polycrystalline state (low
resistance). It is possible to detect the state of an individual cell and hence use this for
data storage. Currently this type of memory has not been widely commercialized, but it
shows promise for future applications.
Cache Memory: It is a special very high-speed memory. It is used to speed up
and synchronize with high-speed CPU. Cache memory is costlier than main memory
or disk memory but more economical than CPU registers. Cache memory is an
extremely fast memory type that acts as a buffer between RAM and the CPU. It
holds frequently requested data and instructions so that they are immediately
available to the CPU when needed. Cache memory is used to reduce the average
time to access data from the Main memory. The cache is a smaller and faster
memory that stores copies of the data from frequently used main memory
locations. There are various different independent caches in a CPU, which store
instructions and data.
Levels of Memory:
• Level 1 or Registers – the storage locations inside the CPU itself, where data is held and
acted upon immediately. The most commonly used registers are the accumulator, program
counter, address register, etc.
• Level 2 or Cache memory – It is the fastest memory which has faster access time where data
is temporarily stored for faster access.
• Level 3 or Main Memory – the memory on which the computer currently works. It is small
in size, and once power is off the data no longer stays in this memory.
• Level 4 or Secondary Memory – It is external memory which is not as fast as main memory
but data stays permanently in this memory.
Cache Performance: When the processor needs to read or write a location in main memory, it
first checks for a corresponding entry in the cache.
• If the processor finds that the memory location is in the cache, a cache hit has occurred and
data is read from the cache.
• If the processor does not find the memory location in the cache, a cache miss has occurred.
For a cache miss, the cache allocates a new entry and copies in data from main memory,
then the request is fulfilled from the contents of the cache. The performance of cache
memory is frequently measured in terms of a quantity called Hit ratio.
Hit ratio = hit / (hit + miss) = no. of hits/total accesses
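The hit-ratio formula can be exercised with a minimal fully associative cache simulation; the 4-block capacity and the address trace below are arbitrary assumptions for illustration:

```python
# Minimal fully associative cache with LRU replacement, used to illustrate
# the hit-ratio formula above. Capacity and trace are example assumptions.

def simulate(trace, capacity=4):
    cache, hits = [], 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.remove(block)     # LRU: re-append below as most recent
        elif len(cache) == capacity:
            cache.pop(0)            # evict the least recently used block
        cache.append(block)
    return hits, len(trace)

trace = [1, 2, 3, 1, 2, 4, 1, 2]
hits, total = simulate(trace)
print(f"hit ratio = {hits}/{total} = {hits / total:.3f}")   # 4/8 = 0.500
```

The first reference to each block is always a miss (a compulsory miss); repeated references hit as long as the block has not been evicted, which is exactly what the hit ratio measures.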
We can improve cache performance by using a larger cache block size, higher associativity,
and by reducing the miss rate and the miss penalty.
Application of Cache Memory:
1. Usually, the cache memory can store a reasonable number of blocks at any given
time, but this number is small compared to the total number of blocks in the main
memory.
2. The correspondence between the main memory blocks and those in the cache is
specified by a mapping function.
3. Primary Cache – A primary cache is always located on the processor chip. This cache
is small and its access time is comparable to that of processor registers.
4. Secondary Cache – Secondary cache is placed between the primary cache and the
rest of the memory. It is referred to as the level 2 (L2) cache. Often, the Level 2 cache is
also housed on the processor chip.
5. Spatial Locality of reference – There is a high chance that an element close to the
current reference point will be accessed next, and subsequent accesses tend to move
even closer to that point of reference.
6. Temporal Locality of reference – Here the Least Recently Used (LRU) algorithm is
applied. When a miss occurs for a word, not only that word but the complete block
containing it is loaded, because the locality-of-reference rule says that if a word is
referred to, its neighbouring words are likely to be referred to next.
Auxiliary Memory
An auxiliary memory is the lowest-cost, highest-capacity, and slowest-access storage in a
computer system. It is where programs and data are preserved for long-term storage or
when not in direct use. The most typical auxiliary memory devices used in computer
systems are magnetic disks and tapes.
Magnetic Disks A magnetic disk is a type of memory constructed using a circular
plate of metal or plastic coated with magnetized materials. Usually, both sides of the
disks are used to carry out read/write operations. However, several disks may be
stacked on one spindle with read/write head available on each surface. The following
image shows the structural representation for a magnetic disk-

• The memory bits are stored in the magnetized surface in spots along concentric
circles called tracks.
• The concentric circles (tracks) are commonly divided into sections called sectors.
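From the track/sector geometry above, disk capacity is simply the product surfaces × tracks × sectors × bytes per sector; the figures below are example values, not any particular drive:

```python
# Disk capacity from the track/sector geometry described above.
# All figures are arbitrary example values, not a specific drive.

surfaces = 4              # two platters, both sides used for read/write
tracks_per_surface = 1024
sectors_per_track = 64
bytes_per_sector = 512

capacity = surfaces * tracks_per_surface * sectors_per_track * bytes_per_sector
print(f"capacity = {capacity} bytes = {capacity / 2**20:.0f} MiB")   # 128 MiB
```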
Magnetic Tape
Magnetic tape is a storage medium that allows data archiving, collection, and
backup for different kinds of data. The magnetic tape is constructed using a
plastic strip coated with a magnetic recording medium. The bits are recorded
as magnetic spots on the tape along several tracks. Usually, seven or nine bits
are recorded simultaneously to form a character together with a parity bit.
Magnetic tape units can be halted, started to move forward or in reverse, or
can be rewound. However, they cannot be started or stopped fast enough
between individual characters. For this reason, information is recorded in
blocks referred to as records.
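The parity bit recorded alongside each character, as described above, can be computed with a one-line even-parity function; the 7-bit character used is just an example, and this is a generic parity sketch rather than any specific tape format:

```python
# Even-parity bit for a 7-bit character, as recorded across the tape tracks
# described above. The character chosen ('A') is an arbitrary example.

def even_parity_bit(bits):
    # The parity bit makes the total number of 1s (including itself) even.
    return sum(bits) % 2

char = [1, 0, 0, 0, 0, 0, 1]      # ASCII 'A' (0b1000001) as 7 bits
parity = even_parity_bit(char)
frame = char + [parity]           # 8 bits written simultaneously across 8 tracks
print(frame, "ones:", sum(frame)) # [1, 0, 0, 0, 0, 0, 1, 0] ones: 2
```

On read-back, recomputing the parity over all 8 bits should give 0; a nonzero result flags a single-bit recording error.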
Optical Disc
An optical disc is an electronic data storage medium that is also referred
to as an optical disk, optical storage, optical media, Optical disc drive,
disc drive, which reads and writes data by using optical storage
techniques and technology.
An optical disc, which may be used as a portable and secondary storage
device, was first developed in the late 1960s. James T. Russell invented
the first optical disc, which could store data as micron-sized light and
dark dots.
Optical Disc
An optical disc can store more data and has a longer lifespan than the preceding
generation of magnetic storage medium. To read and write to CDs and DVDs, computers
use a CD writer or DVD writer drive, and to read and write to Blu-ray discs, they require
a Blu-ray drive. MO drives, such as CD-R and DVD-R drives, are used to read and write
information to discs (magneto-optic). The CDs, Blu-ray, and DVDs are the most common
types of optical media, which are usually used to:
⮚ They are used to transfer data to various devices or computers.
⮚ These media are used to deliver the software to others.
⮚ They help users to hold large amounts of data, like videos, photos, music, and
more.
⮚ Also, optical media are used to get back up from a local machine.
With the introduction of an all-new generation of optical media, the storage capacity to
store data has increased. CDs have the potential to store 700 MB of data, whereas DVDs
allow you to store up to 8.4 GB of data. Blu-ray discs, the newest type of optical media,
can hold up to 50 GB of data. This storage capacity is the most convenient benefit as
compared to the floppy disk storage media, which can store up to 1.44 MB of data.
Additionally, a Blu-ray drive, the newest type of optical drive, can read CDs, DVDs, and
Blu-ray discs. In other words, older drives are not able to read newer optical discs, but the
newest drives are backward compatible and can read the older formats.
Advantages of Optical Disk:
• Cost: Only plastics and aluminium foils are used in the production of an optical disk, which
makes their manufacturing cost inexpensive.
• Durability: Compared with volatile and non-volatile semiconductor memories, optical disks are
more durable. They do not lose data due to power failure and are not subject to wear.
• Simplicity: Optical disks make the process of backing up data much easier. The data that needs
to be burnt is placed on the drive icon, and the user can complete the backup simply by clicking
"Burn Disk."
• Stability: Optical disks provide a very high level of stability because, unlike magnetic disks, they
are not vulnerable to electromagnetic fields and other environmental influences.
Disadvantages of Optical Disk:
• Security: Optical disks need to be kept safe from theft when used for backup purposes.
• Reliability: Unlike flash drives, optical disks are easily scratched or damaged, since only a thin
plastic layer protects them.
• Cost per capacity: Compared to other forms of storage drives, the cost of optical disks per
GB/TB is high.
Objective Questions Based on Memory

102
Thank You

103
