Boolean Algebra
Definition
Boolean algebra is a mathematical framework for analyzing and simplifying logic circuits. It involves
operations on binary variables (0 and 1) and is essential for designing digital systems.
Key Operations
A B A AND B
0 0 0
0 1 0
1 0 0
1 1 1
A B A OR B
0 0 0
0 1 1
1 0 1
1 1 1
A NOT A
0 1
1 0
1. Commutative Law: A + B = B + A, A · B = B · A
2. Associative Law: A + (B + C) = (A + B) + C, A · (B · C) = (A · B) · C
3. Distributive Law: A · (B + C) = (A · B) + (A · C)
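Because each variable takes only the values 0 and 1, these laws can be checked by brute force. A minimal Python sketch verifying the distributive law:

from itertools import product

# Boolean operations on binary values (0 and 1)
AND = lambda a, b: a & b
OR = lambda a, b: a | b

# Exhaustively check A . (B + C) = (A . B) + (A . C)
for a, b, c in product((0, 1), repeat=3):
    assert AND(a, OR(b, c)) == OR(AND(a, b), AND(a, c))
print("Distributive law holds for all binary inputs")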
Applications
1. Simplifying logic gates in circuits.
2. Designing combinational and sequential digital circuits.
2. Turing Machine
Definition
Components
Example
Importance
3. Digital System
Definition
A digital system processes digital (discrete) data, typically in binary form (0 and 1). These systems form
the backbone of modern technology.
Components
Examples
1. Digital clocks.
2. Microcontrollers in home appliances.
3. Computers.
Advantages
4. History of Computers
Generations of Computers
Dedicated Systems
Embedded Systems
Definition
General-purpose computers store both instructions and data in the same memory.
Key Features
Examples
1. Personal Computers.
2. Smartphones.
Importance
1. Von Neumann Architecture
Definition
The von Neumann architecture is a computer design model where the program instructions and data
are stored in the same memory space. This design forms the basis of most modern computers.
Key Features
1. Stored-Program Concept: Both instructions and data are stored in the same memory.
2. Single Memory Bus: A single pathway is used for both instructions and data, leading to the "von
Neumann bottleneck."
3. Sequential Execution: Instructions are executed one at a time, in order.
Components
Diagram
Input --> Memory <--> CPU <--> Output
Advantages
1. Simplicity in design.
2. Flexibility for general-purpose computing.
Disadvantages
2. Harvard Architecture
Definition
The Harvard architecture is a computer design where instructions and data are stored in separate
memory spaces, each with its own bus.
Key Features
Advantages
Disadvantages
Example
1. Charles Babbage
o Known as the "Father of the Computer."
o Designed the Analytical Engine, which had key components like memory and a control
unit, precursors to modern computers.
2. Alan Turing
o Conceptualized the Turing Machine, a theoretical model of computation.
o Laid the foundation for algorithm design and computer science.
3. Howard Aiken
o Developed the Harvard Mark I, one of the first electromechanical computers.
o Influenced the development of the Harvard architecture.
4. Konrad Zuse
o Built the Z3, the first programmable digital computer.
o Introduced binary arithmetic in computing.
5. Gordon Moore
o Co-founder of Intel.
o Proposed Moore’s Law, predicting the doubling of transistors on integrated circuits
approximately every two years.
Definition of Bus
A bus is a communication pathway or a group of wires used for data transfer between different
components of a computer system, such as the CPU, memory, and input/output devices.
Types of Buses
1. Data Bus
o Function: Transfers data between the CPU, memory, and I/O devices.
o Width: Measured in bits (e.g., 32-bit or 64-bit); wider buses can transfer more data at
once.
o Example: In a 64-bit processor, the data bus is 64 bits wide, allowing it to transfer 8
bytes simultaneously.
2. Address Bus
o Function: Carries the address of the memory location or I/O device that the CPU wants
to read or write to.
o Width: Determines the maximum addressable memory (e.g., a 32-bit address bus can
address 2^32 memory locations; see the short sketch after this list).
3. Control Bus
o Function: Carries control signals that coordinate operations between the CPU and other
components.
o Examples of Control Signals: Read/Write, Interrupt Request, Memory Access.
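To illustrate how address-bus width bounds memory (as noted in the Address Bus item above), a tiny Python sketch; the widths chosen are arbitrary examples:

# Addressable locations = 2 ** (address bus width in bits)
for width in (16, 32, 64):
    locations = 2 ** width
    print(f"{width}-bit address bus -> {locations:,} addressable locations")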
Working of a Bus
1. Address Phase:
o The CPU places the memory or I/O address on the address bus.
o The control bus specifies whether it’s a read or write operation.
2. Data Phase:
o Data is transferred via the data bus based on the operation specified.
o For a read operation, data flows from memory/I/O to the CPU.
o For a write operation, data flows from the CPU to memory/I/O.
3. Control Signals:
o The control bus ensures synchronization and handles signals like acknowledgment or
interrupts.
Bus Architecture
Bus System: In the von Neumann architecture, a single bus carries both data and instructions; the
resulting contention is known as the von Neumann bottleneck.
Limitation: The shared bus delays execution whenever data and instructions must be accessed
simultaneously.
Harvard Architecture
Bus System: Separate buses for instructions and data, so an instruction fetch and a data access can
proceed at the same time.
1. Front-Side Bus (FSB)
o Connects the CPU to the main memory.
o Used in older systems; now replaced by faster alternatives.
Conflict resolution methods are strategies used to manage and prioritize multiple requests for shared
resources, such as memory, buses, or I/O devices, in a computer system. These methods ensure that the
system operates efficiently without deadlocks or priority inversion.
1. Daisy Chain Method
Definition
The daisy chain method is a priority-based conflict resolution mechanism where devices are connected
sequentially in a chain, and the priority is determined by their position in the chain.
Working
Advantages
Disadvantages
Example
2. Polling Method
Definition
Polling is a method where the CPU actively checks each device in a predefined sequence to see if it
needs attention.
Working
Advantages
Disadvantages
1. Inefficient, as the CPU spends time checking devices even when no request exists.
2. High latency for lower-priority devices.
Example
3. Fixed Priority Method
Definition
This method assigns fixed priorities to each device, and each device has its own dedicated request line
to the CPU.
Working
Advantages
Disadvantages
Example
Used in real-time systems where specific devices (e.g., emergency alarms) need immediate attention.
4. Address Generation and Sequencing
Definition
Address generation refers to the creation of memory or device addresses to access resources, while
sequencing ensures the correct order of handling requests.
Working
1. Address Generation:
o The CPU calculates the memory or I/O address using an address bus.
o This address is then matched with the target device or memory location.
2. Sequencing:
o Ensures that requests are serviced in the correct order.
o Example: In polling, the sequence is predefined, whereas in fixed-priority methods, the
sequence is determined by priority levels.
5. Reactivation of Devices
Definition
Reactivation refers to restarting or re-enabling devices after their requests have been serviced.
Working
Comparison of Methods
Method          Advantages                                Disadvantages
Daisy Chain     Simple, cost-effective                    Starvation of lower-priority devices
Polling         No additional hardware needed             Inefficient, high CPU overhead
Fixed Priority  Fast response for high-priority devices   Requires extra hardware, risk of priority inversion
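As a toy illustration of priority-based arbitration (the device names and their ordering are invented for this sketch), the arbiter always grants the pending request from the highest-priority device:

# Lower index = higher priority (assumed ordering for this example)
devices = ["emergency_alarm", "disk", "keyboard", "printer"]

def grant(pending):
    """Return the highest-priority device with a pending request, or None."""
    for dev in devices:  # scanned in priority order
        if dev in pending:
            return dev
    return None

print(grant({"printer", "disk"}))  # -> "disk"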
Process organization refers to the way a CPU handles processes, including their scheduling, execution,
and communication with memory and I/O devices. It involves the control unit, arithmetic logic unit
(ALU), registers, and their interaction for executing instructions.
1. Control Unit
o Directs the fetching, decoding, and execution of instructions.
2. Arithmetic Logic Unit (ALU)
o Performs arithmetic and logical operations on data.
3. Registers
o Temporary storage within the CPU used for fast data access during processing.
Registers in a CPU
Definition of Registers
Registers are small, fast storage locations inside the CPU that hold data, instructions, and addresses
temporarily during execution.
Types of Registers
Register                         Size (bits)     Function
Accumulator (AC)                 8, 16, 32, 64   Holds intermediate results of arithmetic and logical operations.
Program Counter (PC)             16, 32, 64      Holds the address of the next instruction to be executed.
Instruction Register (IR)        16, 32, 64      Stores the current instruction being executed.
Memory Address Register (MAR)    16, 32, 64      Holds the memory address to be accessed.
Memory Data Register (MDR)       16, 32, 64      Temporarily holds data read from or written to memory.
Stack Pointer (SP)               16, 32          Points to the top of the stack in memory.
Base Register (BR)               16, 32          Holds the base address for memory access.
General-Purpose Registers (GPR)  8, 16, 32, 64   Used for temporary storage of data during processing.
Status Register / Flags          8, 16           Indicates the state of the processor (e.g., zero, carry, overflow flags).
1. Accumulator (AC)
o Size: Depends on the processor architecture (e.g., 8-bit for older CPUs, 32-bit or 64-bit
for modern CPUs).
o Function: Stores the result of operations performed by the ALU.
o Example: After adding two numbers, the result is stored in the accumulator.
o Examples of Flags:
Zero (Z): Set if the result is zero.
Carry (C): Set if there’s a carry out from the MSB in an addition.
Overflow (O): Set if an arithmetic operation overflows.
1. Fetch
o PC sends the address of the next instruction to memory.
o The instruction is fetched and stored in the IR.
2. Decode
o The instruction in the IR is decoded into opcode and operands.
o Relevant addresses or data are loaded into GPRs or the AC.
3. Execute
o The ALU performs the operation using data from the AC or GPRs.
o Results are stored back in the AC, GPRs, or memory.
4. Store
o Results are written to memory or I/O devices via the MDR and MAR.
Definition
The general register organization is a CPU design approach where multiple general-purpose registers
(GPRs) are used for temporary data storage during instruction execution. This organization facilitates
faster data manipulation compared to accessing main memory.
1. Multiple Registers
o Used to store data and intermediate results during program execution.
2. Instruction Flexibility
o Instructions can directly access registers without needing memory access.
o Example: ADD R1, R2 adds contents of R1 and R2, storing the result in R1.
4. Register Addressing
o Registers are typically addressed using a small binary code due to their limited number.
Faster Execution: Reduces memory access time.
Simplified Instructions: Eliminates the need for memory addressing in many operations.
Efficient Use of CPU: Operations are performed directly on registers.
Stack Organization
Definition
The stack organization is a CPU design where data is stored and retrieved in a last-in, first-out (LIFO)
order using a stack. It uses a special register called the stack pointer (SP) to keep track of the top of the
stack.
2. Implicit Addressing
o Instructions do not require explicit operands; operations are performed on the top of
the stack.
o Example: ADD pops the top two values, adds them, and pushes the result.
Types of Stacks
1. Register Stack
o Implemented using a small set of high-speed registers.
o Limited in size but faster than a memory stack.
2. Memory Stack
o Implemented in main memory.
o Larger in size but slower due to memory access time.
Memory Stack
Definition
A memory stack is a stack implemented in main memory, where a portion of memory is reserved to
store stack data. The stack pointer (SP) is used to point to the top of the stack.
Working of Memory Stack
1. Push Operation
o Data is placed at the address indicated by the SP, and the SP is decremented (in a
descending stack) or incremented (in an ascending stack).
2. Pop Operation
o Data is retrieved from the address indicated by the SP, and the SP is adjusted
accordingly.
3. Stack Frame
o A portion of the stack used for storing function-specific data like local variables,
parameters, and return addresses.
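A minimal Python sketch of the descending-stack behaviour described above; the memory size and initial SP value are arbitrary assumptions:

memory = [0] * 16  # toy memory (size is an arbitrary assumption)
sp = 15            # stack grows downward from the highest address

def push(value):
    global sp
    memory[sp] = value  # place data at the address in SP...
    sp -= 1             # ...then decrement SP (descending stack, as described above)

def pop():
    global sp
    sp += 1             # move SP back to the top element
    return memory[sp]

push(5)
push(10)
print(pop(), pop())     # -> 10 5 (last in, first out)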
Simplifies Instruction Set: Operations like push, pop, and arithmetic are straightforward.
Supports Recursion: Efficient handling of nested function calls.
Dynamic Storage Allocation: Automatically adjusts for variable-sized data.
Using Stack
PUSH 5
PUSH 10
ADD
POP RESULT
Using General Registers
LOAD R1, 5
LOAD R2, 10
ADD R1, R2
STORE RESULT, R1
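The stack program above can be traced with a tiny interpreter. In this Python sketch the instruction handling is an assumed, simplified model of a 0-address machine:

stack = []
result = None

program = [("PUSH", 5), ("PUSH", 10), ("ADD", None), ("POP", "RESULT")]

for op, arg in program:
    if op == "PUSH":
        stack.append(arg)
    elif op == "ADD":             # 0-address: operands are implicit
        b, a = stack.pop(), stack.pop()
        stack.append(a + b)
    elif op == "POP":
        result = stack.pop()      # store into the named location

print(result)                     # -> 15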
Addressing Modes
Definition
Addressing modes are techniques used in computer architecture to specify the location of operands
(data or instructions) that the CPU needs to process. These modes determine how the effective address
of the data is calculated during program execution.
Effective Address
The effective address is the actual memory location from which data is fetched or to which data is
written. It is calculated based on the addressing mode used in the instruction.
Types of Addressing Modes
1. Immediate Addressing
o Definition: The operand itself is contained in the instruction.
o Example:
ADD R1, #10
Adds the constant 10 to R1.
2. Direct Addressing
o Definition: The instruction contains the memory address of the operand.
o Example:
LOAD R1, 5000
Loads the data at memory location 5000 into R1.
3. Indirect Addressing
o Definition: The instruction contains the address of a memory location that holds the
address of the operand.
o Example:
LOAD R1, (5000)
Fetches the address from memory location 5000 and retrieves the data from the
resulting address.
4. Implied Addressing
o Definition: The operand is implied by the instruction itself; no address field is needed.
o Example:
CLEAR
5. Register Addressing
o Definition: The operands are held in CPU registers named in the instruction.
o Example:
ADD R1, R2
6. Register Indirect Addressing
o Definition: The register contains the address of the operand.
o Key Information:
Requires one memory access to fetch the operand.
Enables accessing data in memory indirectly.
o Example:
LOAD R1, (R2)
7. Base Addressing
o Definition: The effective address is the sum of a base address and an offset given in the
instruction.
o Example:
LOAD R1, BASE + 10
8. Indexed Addressing
o Definition: The effective address is the sum of a constant in the instruction and the
contents of an index register.
o Example:
LOAD R1, 1000(R2)
9. Relative Addressing
o Definition: The effective address is computed as an offset from the Program Counter (PC).
o Example:
JUMP +5
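A hedged Python sketch of effective-address calculation for several of these modes; the register contents, the BASE value, and the mode names used as arguments are illustrative assumptions:

registers = {"R2": 40, "PC": 100}
BASE = 2000

def effective_address(mode, field):
    # Returns the address the CPU would actually access.
    if mode == "direct":
        return field                     # address is in the instruction
    if mode == "register_indirect":
        return registers[field]          # register holds the address
    if mode == "base":
        return BASE + field              # base + offset
    if mode == "indexed":
        return field + registers["R2"]   # constant + index register
    if mode == "relative":
        return registers["PC"] + field   # PC-relative
    raise ValueError(mode)

print(effective_address("indexed", 1000))  # -> 1040, like LOAD R1, 1000(R2)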
Full Adder
A Full Adder is a combinational logic circuit that adds three bits: two input bits and a carry-in bit,
producing a sum and a carry-out.
Sum (S): S = A ⊕ B ⊕ C_in (XOR of the three inputs)
Carry-Out (C_out): C_out = (A · B) + (B · C_in) + (A · C_in) (OR of the ANDed combinations)
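A minimal Python sketch of these equations, exhaustively printing the full-adder truth table:

def full_adder(a, b, c_in):
    s = a ^ b ^ c_in                           # Sum = A xor B xor C_in
    c_out = (a & b) | (b & c_in) | (a & c_in)  # Carry-out = majority of inputs
    return s, c_out

for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            print(a, b, c, "->", full_adder(a, b, c))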
Logic Gates
Basic Gates
1. AND Gate
o Function: Outputs 1 if all inputs are 1.
o Truth Table:
A B Output
0 0 0
0 1 0
1 0 0
1 1 1
2. OR Gate
o Function: Outputs 1 if at least one input is 1.
o Truth Table:
A B Output
0 0 0
0 1 1
1 0 1
1 1 1
3. NOT Gate
o Function: Inverts the input.
o Truth Table:
A Output
0 1
1 0
Derived Gates
1. NAND Gate
o Function: Outputs 0 only if all inputs are 1.
o Truth Table:
A B Output
0 0 1
0 1 1
1 0 1
1 1 0
2. NOR Gate
o Function: Outputs 0 if at least one input is 1.
o Truth Table:
A B Output
0 0 1
0 1 0
1 0 0
1 1 0
3. XOR Gate
o Function: Outputs 1 if inputs are different.
o Truth Table:
A B Output
0 0 0
0 1 1
1 0 1
1 1 0
4. XNOR Gate
o Function: Outputs 1 if inputs are the same.
o Truth Table:
A B Output
0 0 1
0 1 0
1 0 0
1 1 1
Half Adder
A Half Adder adds two single bits and provides a sum and carry.
Sum: S = A ⊕ B
Carry: C = A · B
Truth Table:
A B Sum Carry
0 0 0 0
0 1 1 0
1 0 1 0
1 1 0 1
An instruction format defines the structure of an instruction in machine language. It specifies how bits
in an instruction are divided into various fields, such as the opcode (operation code), operands, and
other control bits. Understanding instruction formats is essential to understanding how processors
execute programs and carry out operations.
There are different types of instruction formats based on the number of operands (addressing modes)
they use. These are commonly classified into four categories:
1. 3-Address Instruction Format
2. 2-Address Instruction Format
3. 1-Address Instruction Format
4. 0-Address Instruction Format
Each type varies in the number of operands and the structure of the instruction.
Definition:
In a 3-address instruction format, the instruction provides three addresses—two for the operands and
one for the result of the operation. These types of instructions are typically used in load/store
architectures where the operands and result are accessed separately.
Structure:
o Opcode (Operation Code)
o Address 1 (First Operand)
o Address 2 (Second Operand)
o Address 3 (Result/Output)
Characteristics:
This format allows for maximum flexibility in programming because it can handle operations
between two operands and store the result in a third operand location.
It reduces the number of instructions required to perform complex calculations since both
operands and the result can be in memory.
Example:
ADD R1, R2, R3
This instruction means: Add the contents of registers R2 and R3, and store the result in R1.
Key Information:
[Opcode (6 bits)] [Address1 (6 bits)] [Address2 (6 bits)] [Address3 (6 bits)]
Example in Assembly Language:
ADD R1, R2, R3
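To show how the 24-bit layout above packs into a single word, a short Python sketch; the opcode value and register numbers are invented for illustration:

def encode_3addr(opcode, a1, a2, a3):
    """Pack a 3-address instruction: 6-bit opcode + three 6-bit address fields."""
    assert all(0 <= f < 64 for f in (opcode, a1, a2, a3))  # each field fits in 6 bits
    return (opcode << 18) | (a1 << 12) | (a2 << 6) | a3

ADD = 0b000001                        # assumed opcode value
word = encode_3addr(ADD, 1, 2, 3)     # ADD R1, R2, R3
print(f"{word:024b}")                 # 24-bit binary encoding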
2. 2-Address Instruction Format
Definition:
In a 2-address instruction format, the instruction contains two addresses, typically one for the operand
and another for the result. One of the operands usually serves as both an input and an output. This
reduces the number of bits required in the instruction, thus saving memory.
Structure:
o Opcode (Operation Code)
o Address 1 (Operand 1 and Result)
o Address 2 (Operand 2)
Characteristics:
Example:
ADD R1, R2
This means: Add the contents of R2 to R1, and store the result in R1.
Key Information:
[Opcode (6 bits)] [Address1 (6 bits)] [Address2 (6 bits)]
Example in Assembly Language:
ADD R1, R2
3. 1-Address Instruction Format
Definition:
In a 1-address instruction format, the instruction contains only one operand address. The other
operand is implicitly assumed, typically being a constant or a register (often the accumulator). The result
is stored in the same address as the operand.
Structure:
o Opcode (Operation Code)
o Address 1 (Operand)
Characteristics:
The use of an accumulator implies that operations generally happen between the accumulator
and the operand.
This format is often used in early microprocessors or simple processors.
Example:
ADD R1
This means: Add the contents of R1 to the accumulator and store the result in the accumulator.
Key Information:
Accumulator-based architecture.
Common in early processors or simple microcontrollers.
Example Instruction Format (in binary):
[Opcode (6 bits)] [Address1 (6 bits)]
4. 0-Address Instruction Format
Definition:
In a 0-address instruction format, there are no explicit operand addresses in the instruction. The
operands are implicitly assumed to be on a stack or similar implicit memory structure. The operation
typically involves pushing or popping values from the stack.
Structure:
o Opcode (Operation Code)
o Operands are assumed to be on the stack.
Characteristics:
Very efficient in terms of instruction size since no operands are explicitly specified.
The stack-based architecture implies that the operands for the operation are the most recently
pushed values.
Common in stack-based processors and virtual machines like the Java Virtual Machine (JVM).
Example:
ADD
This means: Pop the top two elements from the stack, add them, and push the result back
onto the stack.
Key Information:
[Opcode (6 bits)]
Understanding the timing circuits, instruction cycle, and micro-operations is essential for grasping how
a processor executes instructions and controls its internal operations. These concepts are crucial for the
design and functioning of a computer's Central Processing Unit (CPU).
1. Timing Circuits
Definition:
Timing circuits are responsible for managing the synchronization of various components within a
computer system. These circuits ensure that the different parts of the CPU work in a coordinated
manner by generating timing signals that control the flow of data and operations.
Control Synchronization: Timing circuits provide clock pulses that synchronize the actions of the
various parts of the processor (such as the ALU, registers, memory).
Clock Signals: These circuits generate clock pulses (usually from a central clock or oscillator) that
regulate the timing of operations, ensuring that the right components are activated at the right
time.
Timing Diagrams: Timing circuits work with timing diagrams that visually represent how
operations are performed in sync with the clock cycles.
1. Synchronous Circuits: These circuits operate in sync with a clock signal. Most modern
processors use synchronous circuits to ensure that operations happen in a precise order.
o Example: The CPU’s clock, which controls the timing of fetch, decode, and execute
phases of an instruction cycle.
2. Asynchronous Circuits: These circuits are not driven by a clock signal. Instead, they rely on
events or signals from other components, making them less common in modern processors but
still useful in certain low-level applications.
o Example: Handshaking signals in communication between devices.
2. Instruction Cycle
Definition:
The instruction cycle is the sequence of steps a CPU follows to fetch, decode, and execute an
instruction. It is the fundamental cycle through which a processor operates to perform tasks.
1. Fetch Phase:
o The instruction is fetched from memory.
o The Program Counter (PC) holds the address of the next instruction.
o The instruction is placed into the Instruction Register (IR).
o The PC is then incremented to point to the next instruction.
o Example: Fetching an instruction like MOV A, B.
2. Decode Phase:
o The instruction in the Instruction Register is decoded by the Control Unit (CU).
o The instruction’s opcode (operation code) is identified, and the operands (such as
registers or memory addresses) are extracted.
o Example: Decoding the opcode MOV, which indicates a move operation between two
registers.
3. Execute Phase:
o The operation specified by the instruction is carried out.
o This could involve calculations (in the ALU), data transfer (from registers to memory), or
other operations.
o Example: Performing the MOV operation by transferring data from register B to register
A.
Key Information:
The instruction cycle repeats continuously until the processor is powered down or interrupted.
Each cycle is controlled by the clock signal, with each phase usually taking one or more clock
cycles to complete.
The instruction cycle is often referred to as the fetch-decode-execute cycle.
Timing Diagram for Instruction Cycle:
o A timing diagram shows the relationship between the clock pulse and the activities
performed during the instruction cycle, including when the instruction is fetched,
decoded, and executed.
3. Micro-Operations
Definition:
Micro-operations (also known as micro-ops) are the smallest units of work performed by a processor
during an instruction cycle. A micro-operation usually corresponds to a single action, such as moving
data between registers, performing arithmetic operations, or modifying a flag.
Types of Micro-Operations:
1. Register Transfer Micro-Operations
o Move data from one register to another.
o Example: R2 ← R1 (Copy the contents of R1 into R2).
2. Arithmetic Micro-Operations
o Perform arithmetic such as addition or subtraction on register contents.
o Example: R1 ← R1 + R2.
3. Logic Micro-Operations
o Perform bitwise operations such as AND, OR, and XOR on registers.
o Example: R1 ← R1 AND R2.
4. Shift Micro-Operations
o These involve shifting the bits of a register left or right, used in multiplication or division
by powers of two.
o Example: R1 ← R1 << 1 (Left-shift the bits in register R1).
Micro-Operation Cycle:
Micro-operations are executed during the instruction cycle, and the Control Unit coordinates
them with clock signals.
Each instruction, depending on its complexity, may require several micro-operations.
Micro-operations are responsible for manipulating the control register, status flags, and data
registers.
Key Information:
Micro-operations are often represented using register transfer notation (e.g., R1 ← R2 + R3 for
an addition operation).
Each instruction in the instruction cycle is broken down into several micro-operations. For
example:
o MOV A, B: Can be broken down into multiple micro-operations like transferring data
from memory to register, and vice versa.
o ADD A, B: Can break down into fetching operands, performing the addition, and storing
the result
Let's consider an example where the instruction is ADD R1, R2 (add the contents of R2 to R1).
1. Fetch Phase:
o Micro-Operation: PC → MAR (Program Counter to Memory Address Register).
o Micro-Operation: Memory[MAR] → IR (Fetch instruction into Instruction Register).
o Micro-Operation: PC ← PC + 1 (Increment Program Counter).
2. Decode Phase:
o Micro-Operation: Decode instruction in IR (Identify operation ADD).
3. Execute Phase:
o Micro-Operation: R1 ← R1 + R2 (Perform the addition of registers R1 and R2).
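The micro-operations above can be mimicked in a few lines of Python; the instruction encoding and memory contents are toy assumptions:

registers = {"PC": 0, "IR": None, "MAR": 0, "R1": 7, "R2": 5}
memory = {0: ("ADD", "R1", "R2")}           # toy instruction memory

# Fetch phase
registers["MAR"] = registers["PC"]          # PC -> MAR
registers["IR"] = memory[registers["MAR"]]  # Memory[MAR] -> IR
registers["PC"] += 1                        # PC <- PC + 1

# Decode phase
op, dst, src = registers["IR"]              # identify ADD and its operands

# Execute phase
if op == "ADD":
    registers[dst] = registers[dst] + registers[src]  # R1 <- R1 + R2

print(registers["R1"])                      # -> 12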
Summary
Timing Circuits: Control the synchronization and clock signals to manage the execution of
operations within a CPU.
Instruction Cycle: The sequence of fetch, decode, and execute steps that a CPU follows to
process an instruction.
Micro-Operations: The smallest actions performed during the instruction cycle, including
register transfers, arithmetic operations, logic operations, and shifts.
RISC vs. CISC: Short Notes with Key Information, Advantages, Disadvantages, and Architecture
Formats
1. RISC (Reduced Instruction Set Computer)
Definition: RISC is a type of computer architecture that uses a small, highly optimized set of instructions.
The goal of RISC is to simplify the processor design and improve performance by executing instructions
in a single cycle.
Key Features:
Advantages of RISC:
Faster Execution: Since instructions are simple and mostly take a single cycle, RISC processors
can achieve faster execution.
Simpler Design: With fewer instructions, RISC processors are easier to design and optimize.
Pipelining Efficiency: RISC architectures are well-suited for pipelining, which further speeds up
processing.
Lower Power Consumption: The simplicity and efficiency of RISC lead to lower power usage
compared to CISC processors.
Disadvantages of RISC:
More Instructions Required: Complex tasks require more instructions, which can sometimes
lead to inefficiency in terms of program size.
Memory Bandwidth: Since more instructions are used, this can place greater demand on
memory bandwidth.
Larger Code Size: For complex operations, RISC programs can become large as they require
multiple instructions for a single complex operation.
RISC Instruction Format (example):
| Opcode | Source Register | Destination Register | Immediate Value |
| 6 bits | 5 bits          | 5 bits               | 16 bits         |
2. CISC (Complex Instruction Set Computer)
Definition: CISC is an architecture that uses a large and complex set of instructions, many of which can
perform multiple operations in a single instruction. CISC processors aim to reduce the number of
instructions per program, as each instruction can do more work.
Key Features:
Advantages of CISC:
Fewer Instructions: Complex instructions may require fewer lines of code for a program,
reducing the memory needed to store the program.
Improved Code Density: Since each instruction can perform multiple actions, the total number
of instructions in a program is smaller, potentially saving memory.
Compatibility: Easier to write more compact code, especially for high-level languages.
Disadvantages of CISC:
Slower Execution: Complex instructions often take multiple cycles to execute, which can reduce
performance.
Processor Complexity: The variety of instructions in CISC architectures makes the processor
design more complex.
Inefficient Pipelining: The variable-length instructions and multiple cycles of execution make
pipelining less efficient in CISC architectures.
CISC Instruction Format (example):
| Opcode | Operand(s) |
| 8 bits | Variable   |
The Control Unit (CU) is one of the most important components of a computer system's central
processing unit (CPU). It manages and coordinates the activities of the CPU, directs the operation of the
processor, and controls the flow of data between the processor and other hardware components such
as memory and input/output devices.
The Control Unit does not perform any actual data processing itself but plays a crucial role in
interpreting instructions from memory and sending signals to the other units to execute those
instructions.
1. Instruction Fetching: The CU fetches instructions from memory in the correct sequence.
2. Instruction Decoding: It decodes the fetched instructions to understand what actions are
required.
3. Control Signals Generation: The CU generates control signals that control other components
like the ALU, registers, and memory.
4. Execution Monitoring: It monitors the execution of the instructions and ensures that the correct
sequence of operations is followed.
5. Synchronization: The CU ensures that all components of the CPU work in sync with each other.
1. Hardwired Control Unit
The Hardwired Control Unit uses fixed logic circuits to generate control signals. The logic of the CU is
implemented using combinational circuits that are hardwired and optimized for a specific task.
Key Features:
Example:
The control unit of a simple processor like MIPS or ARM can be hardwired.
+-------------------------+
| Instruction Register |
+-----------+-------------+
|
v
+-------------------------+
| Instruction Decoder |
+-----------+-------------+
|
v
+-------------------------+
| Control Logic (AND, OR) |---> Generates Control Signals
+-------------------------+
|
v
+---------------------------+
| ALU, Registers, Memory |
+---------------------------+
2. Microprogrammed Control Unit
The Microprogrammed Control Unit (MCU) uses a set of stored instructions (micro-operations) in
memory to generate control signals. It is more flexible than the hardwired CU because the control logic
is stored in memory rather than hardwired in circuits.
Key Features:
Complexity: Takes more time to fetch and execute instructions because the control logic is
stored in memory.
Speed: Slower than hardwired CU due to the extra memory access.
Programmed Control: Control signals are generated by sequences of microinstructions, and the
control unit’s behavior can be changed by modifying the microprogram.
Example:
+-------------------------+
| Instruction Register |
+-----------+-------------+
|
v
+-------------------------+
| Instruction Decoder |
+-----------+-------------+
|
v
+-------------------------------+
| Control Memory (Microprogram) |
+-------------------------------+
|
v
+---------------------------+
| ALU, Registers, Memory |
+---------------------------+
Control Signals:
Control signals are essential for the operation of the CPU. These signals control the flow of data between
registers, memory, and the Arithmetic Logic Unit (ALU). Control signals are generated based on the
instruction fetched from memory and decoded by the CU.
1. Memory Read/Write: Determines whether data should be read from or written to memory.
2. ALU Control Signals: Directs the ALU to perform specific operations (e.g., ADD, SUB, AND, OR).
3. Register Control Signals: Controls the reading from and writing to registers.
4. Clock Signals: Synchronizes the operation of the processor.
5. Interrupt Signals: Handle interrupt requests from I/O devices.
Steps in the Design of Control Unit:
2. Instruction Fetch:
o Design the mechanism to fetch instructions from memory using the Program Counter
(PC).
o The fetched instruction is stored in the Instruction Register (IR).
3. Instruction Decode:
o The fetched instruction is decoded to identify the opcode and operands.
o The Control Unit generates the corresponding control signals based on the decoded
instruction.
5. Execution:
o The ALU performs the desired operation, and data is moved between registers or
memory as needed.
o The CU ensures that the operations happen in the correct sequence.
1. Instruction Fetch:
o Fetch the instruction from the instruction memory.
o Increment the Program Counter (PC).
2. Microprogrammed Execution:
o Fetch the corresponding microinstruction from control memory (ROM or RAM).
o Execute the microinstructions to control different parts of the CPU (ALU, registers,
memory).
o The microinstruction is typically composed of a sequence of bits that control each part
of the system.
3. Next Instruction:
o After executing the instruction, the next instruction is fetched, and the process repeats.
Advantages of Hardwired Control Unit:
Faster Execution: The hardwired control unit is faster because it generates control signals
directly using fixed logic.
Simple Design for Fixed Operations: Easier to design for processors with a small or fixed
instruction set.
Disadvantages of Hardwired Control Unit:
Less Flexibility: Changes in the instruction set or operations require redesigning the hardware.
Complex Design for Large ISAs: As the number of instructions increases, the design complexity
also increases.
Advantages of Microprogrammed Control Unit:
Flexibility: The control unit's behavior can be easily changed by modifying the microprogram.
Easier to Design: More convenient for processors with complex instruction sets or flexible
architectures.
Disadvantages of Microprogrammed Control Unit:
Slower Execution: Fetching and executing microinstructions take more time, making the process
slower.
Complexity in Memory Access: Requires more memory and time for accessing
microinstructions.
Memory in computers refers to the storage systems used to store data and instructions that are
required for processing. Memory is essential for the functioning of a computer system as it allows for
the storage of programs and data that can be quickly accessed by the processor.
Types of Memory:
1. Primary Memory
o Definition: Primary memory is directly accessible by the CPU and holds the data and
programs currently in use. It is typically volatile (contents are lost when power is off).
o Examples: RAM (DRAM, SRAM) and ROM.
2. Secondary Memory
o Definition: Secondary memory is used for long-term storage of data and programs.
Unlike primary memory, the data is retained even when the power is turned off.
o Characteristics: Slower than primary memory but has a larger storage capacity.
o Examples:
Hard Disk Drives (HDD): Magnetic storage used for storing large amounts of
data.
Solid-State Drives (SSD): Faster than HDD, uses flash memory for storing data.
Optical Discs (CD, DVD): Used for storing data on optical media.
USB Flash Drives: Portable storage devices using flash memory.
Memory Hierarchy:
Memory hierarchy refers to the arrangement of different types of memory in a computer system based
on speed, cost, and size. The hierarchy is structured so that faster, more expensive memory types are at
the top, and slower, cheaper memory types are at the bottom. The goal is to make the most frequently
used data quickly accessible.
o Example: DRAM (Dynamic RAM), SRAM (Static RAM).
Locality of Reference
Spatial locality refers to the concept that if a particular data element is accessed, it is likely that nearby
data elements will be accessed in the near future. This principle is widely used in the design of memory
systems, particularly in caching mechanisms and memory access patterns.
Types of Locality:
1. Temporal Locality:
o Definition: Temporal locality refers to the reuse of specific data or instructions within a
short time period. If a data item is accessed, it is likely to be accessed again soon.
o Example: Accessing the same variable multiple times in a loop.
2. Spatial Locality:
o Definition: Spatial locality refers to the tendency of a program to access data elements
that are located near each other in memory.
o Example: Accessing elements in an array sequentially or traversing elements of a matrix.
o Relevance in Cache: Spatial locality is utilized in cache memory systems by loading not
just the data requested by the CPU but also the nearby data (in blocks or chunks),
anticipating future access to that data.
1. Hit Ratio:
o Definition: The hit ratio refers to the fraction of memory accesses that result in a cache
hit.
o Formula: Hit Ratio = Number of Cache Hits / Total Number of Memory Accesses
o Impact: A high hit ratio indicates that most memory accesses are being served by the
cache, which results in faster data retrieval.
2. Miss Ratio:
o Definition: The miss ratio refers to the fraction of memory accesses that result in a
cache miss.
o Formula: Miss Ratio = Number of Cache Misses / Total Number of Memory Accesses
o Impact: A lower miss ratio indicates better performance, as fewer accesses need to be
fetched from slower memory levels.
3. Miss Latency:
o Definition: Miss latency is the time taken to retrieve data from the main memory or
secondary storage after a cache miss occurs.
o Impact: High miss latency significantly reduces the performance of the system. Reducing
miss latency is a key design goal in memory systems.
4. Hit Latency:
o Definition: Hit latency is the time taken to retrieve data from the cache when a cache
hit occurs.
o Impact: Hit latency is generally much lower than miss latency, as cache memory is faster
to access than main memory.
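These four quantities combine into the average memory access time, AMAT = hit latency + miss ratio × miss latency (a standard metric, not derived in these notes). A small Python sketch with made-up numbers:

hits, misses = 950, 50               # assumed access counts
hit_latency, miss_latency = 1, 100   # cycles (illustrative values)

total = hits + misses
hit_ratio = hits / total             # fraction of accesses served by the cache
miss_ratio = misses / total

amat = hit_latency + miss_ratio * miss_latency
print(f"hit ratio={hit_ratio:.2f}, AMAT={amat:.1f} cycles")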
To handle cache misses effectively, various cache replacement policies are used to determine which
data should be evicted from the cache when new data needs to be loaded. Here are common cache
replacement strategies:
1. Least Recently Used (LRU):
o Definition: LRU evicts the cache entry that has gone unused for the longest time, on the
assumption that recently used data will be reused soon.
2. First-In-First-Out (FIFO):
o Definition: FIFO replaces the oldest cache entry, regardless of how recently it has been
used.
o Relevance: While simple, FIFO is less efficient than LRU in optimizing cache
performance.
3. Random Replacement:
o Definition: Random replacement evicts a randomly chosen cache entry when a new one
needs to be loaded.
o Relevance: Although it is simple, random replacement can noticeably hurt cache
performance in workloads with strong spatial and temporal locality, since it may evict
data that is about to be reused.
1. Cache Size:
o Definition: The total amount of data that can be stored in the cache.
o Impact: Larger caches can store more data, which can reduce the miss ratio. However,
there is a diminishing return as cache size increases.
3. Associativity:
o Definition: The number of locations in the cache where a given block of data can be
placed.
o Impact: Higher associativity reduces the chance of cache collisions, but it also increases
complexity and latency.
1. Direct-Mapped Cache:
o Definition: In a direct-mapped cache, each block of memory maps to exactly one cache
line.
o Impact: Direct-mapped caches are simple but can have a high miss ratio due to cache
collisions.
2. Fully-Associative Cache:
o Definition: In a fully-associative cache, any block of memory can be placed in any cache
line.
o Impact: This reduces cache collisions but increases the complexity of searching for a
block in the cache.
3. Set-Associative Cache:
o Definition: A set-associative cache is a compromise between direct-mapped and fully-
associative caches. The cache is divided into multiple sets, and each set can have
multiple cache lines.
o Impact: It provides a balance between speed and efficiency, making it a common choice
in modern CPU designs.
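A short Python sketch of how a direct-mapped cache splits an address into tag, index, and offset; the cache geometry is an invented example:

NUM_LINES = 8    # assumed number of cache lines (power of two)
BLOCK_SIZE = 4   # assumed bytes per block

def locate(address):
    """Split an address into (tag, index, offset) for a direct-mapped cache."""
    offset = address % BLOCK_SIZE
    block = address // BLOCK_SIZE
    index = block % NUM_LINES    # each block maps to exactly one line
    tag = block // NUM_LINES     # tag disambiguates blocks sharing a line
    return tag, index, offset

print(locate(100))  # address 100 -> (3, 1, 0)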
Other Relevant Concepts in Memory Systems:
1. Virtual Memory:
o Definition: Virtual memory allows programs to access more memory than is physically
available by using part of the disk space as "virtual" RAM.
o Relevance: Virtual memory relies heavily on the memory hierarchy and cache
management techniques to ensure efficient data access.
2. Page Fault:
o Definition: A page fault occurs when a program attempts to access a page that is not
currently in memory, causing the operating system to load the page from secondary
storage.
o Relevance: Minimizing page faults is crucial for maintaining high system performance.
3. Memory Hierarchy:
o Definition: The arrangement of different types of memory (registers, cache, RAM, etc.)
based on speed and cost.
o Relevance: The memory hierarchy leverages spatial and temporal locality to improve
system performance.
Semiconductor memory refers to a type of memory that uses semiconductor-based devices to store
data. Unlike magnetic memory (e.g., hard drives) or optical memory (e.g., CDs), semiconductor
memories are faster, more reliable, and consume less power. These types of memory are used in various
devices such as computers, smartphones, and embedded systems.
Definition: Static RAM (SRAM) is a type of semiconductor memory that stores data in flip-flops, which
are circuits capable of maintaining a state indefinitely as long as power is supplied.
Key Characteristics:
Faster: SRAM is faster than Dynamic RAM (DRAM) because it does not require refreshing.
More Expensive: Due to its complexity and speed, SRAM is more expensive and has lower
density compared to DRAM.
Low Density: It occupies more physical space per bit of storage compared to DRAM.
Used in: Cache memory, high-speed buffers, and registers.
Working:
Data is stored using a flip-flop circuit that can store a bit in two stable states (high or low
voltage).
It does not need periodic refreshing, which makes it faster.
Advantages:
Disadvantages:
Definition: Dynamic RAM (DRAM) is a type of semiconductor memory that stores each bit of data in a
capacitor within an integrated circuit. Since capacitors tend to lose their charge over time, the data in
DRAM needs to be refreshed periodically.
Key Characteristics:
Slower than SRAM: DRAM has slower access times due to the need for refreshing.
Higher Density: DRAM can store more data in a given space compared to SRAM, making it
cheaper per bit.
Requires Refreshing: Data in DRAM must be periodically refreshed to avoid data loss.
Working:
Data is stored in capacitors. A charged capacitor represents a binary "1," and a discharged
capacitor represents a binary "0."
The capacitors leak charge over time, which is why the data must be refreshed every few
milliseconds.
Advantages:
Disadvantages:
DRAM is commonly used as the main memory in computers (e.g., desktop, laptop) and other
electronic devices.
2D Organization of Memory:
Memory can be organized in various ways to optimize access time, space, and efficiency. One common
organization is the 2D organization of memory, often used in array-based memory configurations such
as cache memory or memory modules.
2D Memory Array:
In a 2D memory organization, memory is arranged in rows and columns, forming a grid-like structure.
This organization helps optimize the addressing and access of memory cells.
Rows and Columns: Memory is organized into two dimensions (rows and columns), making it
easier to access data.
Access Efficiency: This organization improves the efficiency of accessing data in bulk (e.g.,
reading a block of data at once).
Improved Performance: Helps in optimizing speed and reducing latency for large memory
systems, as data can be accessed in parallel.
Example:
In a 2D memory array, if you have a memory block of 8 bytes, you might have it represented as
a 2x4 block of data (2 rows and 4 columns). This structure allows data to be accessed in a more
efficient manner compared to a 1D flat memory layout.
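A brief Python sketch of row/column address decoding for the 2x4 example above:

ROWS, COLS = 2, 4   # the 2x4 block from the example

def decode(address):
    """Map a flat byte address (0..7) to its (row, column) position."""
    return address // COLS, address % COLS

for addr in range(ROWS * COLS):
    print(addr, "->", decode(addr))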
Types of ROM
There are several types of ROM, each with different characteristics regarding how data is written,
erased, and reprogrammed. These include:
1. Masked ROM (MROM)
Definition: Masked ROM is the earliest form of ROM, where data is physically "masked" or
programmed onto the chip during manufacturing.
Data Writing: Data is programmed during the manufacturing process, and it cannot be altered
after the chip is fabricated.
Cost: MROM is cheaper to produce when large quantities of chips are required.
Usage: Used for mass production of chips where the data does not need to change, such as in
CD players, gaming consoles, or embedded systems.
Example:
A gaming console with pre-programmed game data stored in Masked ROM.
2. Programmable ROM (PROM)
Definition: PROM is a type of ROM that allows data to be written to it by the user (once) using a
special device called a programmer.
Data Writing: Once programmed, PROM is permanent and cannot be rewritten. The writing
process involves applying a high-voltage current to certain areas of the chip, which permanently
changes the state of the memory cells.
Usage: Used for applications where data needs to be written once and then stored permanently,
such as firmware updates in older devices.
Example:
A PROM chip used to store the firmware for an embedded system after programming.
3. Erasable Programmable ROM (EPROM)
Definition: EPROM is a type of ROM that can be erased and reprogrammed multiple times using
ultraviolet (UV) light.
Data Writing: Data is written to an EPROM chip using a programmer. The chip can be erased
using UV light, which exposes the chip to ultraviolet rays, removing the data stored on the chip.
After erasure, new data can be written to it.
Advantages:
o Reprogrammable, allowing for updates and changes.
o Can be used in development environments where frequent updates are necessary.
Disadvantages:
o Requires special equipment to erase the data (UV light).
o Slower compared to other types of ROM.
Example:
EPROM used in early BIOS chips in personal computers, which could be updated by erasing and
reprogramming the chip.
4. Electrically Erasable Programmable ROM (EEPROM)
Definition: EEPROM is similar to EPROM, but the data can be erased and reprogrammed
electrically, eliminating the need for UV light.
Data Writing: The chip can be erased and written multiple times using electrical signals, and the
process can be done in-circuit (i.e., without having to remove the chip).
Advantages:
o More convenient than EPROM because it can be erased and reprogrammed electrically.
o Can be done while the chip is still in the device (in-circuit).
Disadvantages:
o Slower write speeds compared to other types of memory.
o Limited number of write/erase cycles (usually around 10,000 to 1,000,000 cycles).
Example:
Used in applications such as storing the settings of a device (like a router’s configuration settings) or in
the BIOS of modern computers.
5. Flash Memory
Definition: Flash memory is a more advanced form of EEPROM that allows for faster data
erasure and rewriting.
Data Writing: Flash memory can store data in an array of memory cells, where each memory cell
stores a bit of data. Flash memory can be erased and reprogrammed in blocks, which allows for
faster data manipulation than EEPROM.
Types:
o NAND Flash: Used for larger storage devices, such as USB drives, memory cards, and
SSDs.
o NOR Flash: Provides faster read speeds and is used in applications requiring fast access
to data, such as embedded systems and firmware storage.
Advantages:
o High speed and efficiency.
o Re-writable and erasable without requiring special equipment.
o Highly durable.
Disadvantages:
o Limited write/erase cycles (though much more than EEPROM).
Example:
Used in USB flash drives, solid-state drives (SSDs), and as storage for firmware in modern electronic
devices.
6. Ferroelectric RAM (FeRAM)
Definition: Ferroelectric RAM is a type of non-volatile memory that stores data using a
ferroelectric layer instead of the normal dielectric layer used in standard RAM. It retains data
even when power is off.
Data Writing: Data is written using an electric field to change the polarization of the
ferroelectric material.
Advantages:
o Faster write times compared to Flash memory.
o Non-volatile, meaning it retains data without power.
o Higher endurance than Flash memory (more write/erase cycles).
Disadvantages:
o Typically more expensive than other non-volatile memory.
Example:
FeRAM can be used in applications like medical devices, smart cards, and automotive electronics.
Applications of ROM:
Firmware Storage: ROM is commonly used to store firmware in devices like printers, routers, or
embedded devices.
Boot ROM: In computers, ROM is used for storing the bootloader or BIOS/UEFI firmware, which
is executed when the system is powered on.
Embedded Systems: ROM is widely used in embedded systems like washing machines,
microwave ovens, and automotive systems for storing configuration and operational data.
Consumer Electronics: Devices such as TVs, gaming consoles, and even digital cameras rely on
ROM for storing their operating systems and software.
Advantages of ROM:
Disadvantages of ROM:
Limited Modifiability: ROM is not easily modifiable, which can be a limitation if software needs
frequent updates.
Slow Write Times: In cases like EEPROM, the write and erase cycles are slower compared to
volatile memories.
Cache Memory:
Definition:
Cache memory is a small-sized type of volatile computer memory that provides high-speed data access
to the processor and stores frequently used program instructions and data. Cache memory sits between
the CPU and the main memory (RAM), improving overall system performance by reducing the time it
takes for the CPU to access data.
Definition:
The cache coherence problem arises in multi-core processors when multiple CPU cores have their own
private caches. If these cores are working on the same data, changes made by one core might not be
immediately reflected in the cache of the other cores, leading to inconsistent or outdated data being
used.
Solution:
This problem is solved using cache coherence protocols such as MESI (Modified, Exclusive, Shared,
Invalid).
1. MESI Protocol:
o Modified (M): The cache has the exclusive copy of the data, and it has been modified.
o Exclusive (E): The cache has the exclusive copy of the data, but it has not been modified.
o Shared (S): The cache contains a copy of the data that is also present in other caches.
o Invalid (I): The cache does not contain valid data.
2. MOESI Protocol:
An extension of MESI, adding an additional "Owned" state to indicate that the cache has the
only valid copy of the data and is responsible for writing it back to main memory.
Write-Through Cache:
Definition:
In a write-through cache, every time data is written to the cache, it is also simultaneously
written to the main memory (RAM). This ensures that the main memory always contains a copy
of the most up-to-date data.
Advantages:
o Ensures data consistency between the cache and main memory.
o Simple design: no need for complex mechanisms to manage the cache coherency.
Disadvantages:
o Slower write operations due to the need to write to both cache and main memory.
o Increased load on the memory bus.
Example:
In a write-through cache, if the CPU writes the value 10 to an address A, it will first update the
cache and then immediately update the main memory at address A to store the value 10.
Write-Back Cache:
Definition:
In a write-back cache, data is only written to the main memory when it is evicted or replaced
from the cache (or when the cache line is marked as dirty). The cache may hold modified data
that hasn’t yet been written to the main memory.
Advantages:
o Faster write operations because the CPU only writes to the cache, avoiding frequent
memory writes.
o Reduces traffic to the system memory, making it more efficient.
Disadvantages:
o Main memory might be inconsistent with the data in the cache, leading to potential data
loss in case of a system crash before the cache is written back.
o More complex system design to handle situations where the cache is evicted or needs to
synchronize with the main memory.
Example:
In a write-back cache, if the CPU writes the value 10 to an address A, it will update the cache,
but the main memory at address A will not be immediately updated. The main memory will be
updated only when the cached data is evicted.
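A compact Python sketch contrasting the two policies; the dictionary-based cache and memory are toy assumptions:

memory = {"A": 0}
cache = {}            # address -> (value, dirty_flag)

def write_through(addr, value):
    cache[addr] = (value, False)
    memory[addr] = value          # memory updated on every write

def write_back(addr, value):
    cache[addr] = (value, True)   # marked dirty; memory untouched for now

def evict(addr):
    value, dirty = cache.pop(addr)
    if dirty:
        memory[addr] = value      # dirty data written back on eviction

write_back("A", 10)
print(memory["A"])    # -> 0 (main memory is stale until eviction)
evict("A")
print(memory["A"])    # -> 10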
Key Differences Between Write-Through and Write-Back Cache:
Aspect               Write-Through                Write-Back
Memory update        On every write               Only when a dirty line is evicted
Write speed          Slower (cache and memory)    Faster (cache only)
Consistency          Memory always up to date     Memory may be stale until write-back
Memory bus traffic   Higher                       Lower
Cache Hit:
Definition: A cache hit occurs when the processor finds the required data in the cache. This
results in faster access to the data since it doesn't need to fetch it from the slower main
memory.
Cache Miss:
Definition: A cache miss occurs when the required data is not found in the cache. The data must
then be fetched from the main memory, which is slower.
Miss Types:
1. Compulsory Miss: Occurs when data is accessed for the first time and is not present in the
cache.
2. Capacity Miss: Occurs when the cache cannot store all the data needed, and some data is
evicted.
3. Conflict Miss: Occurs when multiple data items map to the same cache location, leading to
eviction.
Example:
If the CPU accesses data X and it’s found in the cache, it’s a hit.
If the CPU accesses data Y and it’s not in the cache, it’s a miss, and the data is fetched from the
main memory.
Hit Ratio:
o Definition: The hit ratio is the fraction of memory accesses that result in a cache hit. It is
calculated as the number of cache hits divided by the total number of memory accesses.
Miss Ratio:
o Definition: The miss ratio is the fraction of memory accesses that result in a cache miss.
It is calculated as the number of cache misses divided by the total number of memory
accesses.
Miss Latency:
o Definition: Miss latency is the time it takes to fetch data from the main memory after a
cache miss occurs. It involves retrieving the data from slower memory storage, which
increases the access time.
Hit Latency:
o Definition: Hit latency is the time it takes to access data from the cache when a cache
hit occurs. This is typically much faster than fetching data from the main memory.
Definition:
Input/Output (I/O) device management refers to the processes and techniques that an operating system
(OS) uses to manage interaction with peripheral devices like keyboards, mice, printers, and storage
drives. It includes managing the data transfer, interrupt handling, device control, and communication
between devices and the system's CPU.
1. I/O Devices:
o Definition: Devices that interact with the computer by either sending data to it (input
devices like keyboards and mice) or receiving data from it (output devices like monitors
and printers).
o Examples:
Input Devices: Keyboard, Mouse, Scanner, Microphone
Output Devices: Monitor, Printer, Speakers
Storage Devices: Hard Drive, SSD, USB, Optical Drives
2. I/O Controllers:
o Definition: Hardware components or circuits that manage the communication between
I/O devices and the CPU. Controllers convert data between the CPU's binary format and
the format required by the I/O device.
o Example: Hard disk controller, USB controller
I/O Interface:
Definition: An interface is the boundary between the computer system and the I/O devices,
allowing data transfer between the CPU and the peripheral devices. It provides the necessary
communication protocols and control signals.
Types of I/O Interfaces:
o Parallel Interface: Multiple bits are transmitted simultaneously across multiple lines
(e.g., Printer Port, SCSI).
o Serial Interface: Bits are transmitted one at a time over a single line (e.g., RS-232, USB,
FireWire).
USB (Universal Serial Bus): Widely used for connecting input/output devices like keyboards,
printers, and external storage to a computer.
SATA (Serial Advanced Technology Attachment): Used for connecting storage devices like hard
drives and SSDs.
PCIe (Peripheral Component Interconnect Express): A high-speed interface used for connecting
internal components like graphics cards.
Interrupt:
Definition: An interrupt is a signal to the processor indicating that an event has occurred that
requires immediate attention. Interrupts temporarily halt the current execution of the CPU and
divert its attention to the interrupt service routine (ISR).
Types of Interrupts:
1. Hardware Interrupts:
o Definition: Generated by hardware devices like I/O devices when they require CPU
attention.
o Example: A keyboard generates an interrupt when a key is pressed.
2. Software Interrupts:
o Definition: Generated by software programs to request system services or indicate
errors.
o Example: A program might generate a software interrupt to request memory allocation
from the OS.
3. Maskable Interrupts:
o Definition: Interrupts that can be ignored or "masked" by the CPU if necessary.
o Example: A user pressing a key on the keyboard while the CPU is processing another
interrupt.
4. Non-Maskable Interrupts (NMI):
o Definition: Interrupts that cannot be ignored or masked by the CPU; typically used for
critical system events.
o Example: Power failure or hardware failure.
Interrupt handling involves several key stages, which are explained below:
1. Interrupt Recognition:
o Definition: The CPU recognizes that an interrupt request has been made by checking the
interrupt request lines (IRQ). This is usually done by the interrupt controller.
o Process:
When the interrupt occurs, the CPU pauses its current task and checks if it is
enabled to handle interrupts.
The interrupt controller (like the Programmable Interrupt Controller, or PIC)
sends the interrupt signal to the CPU.
2. Status Saving (Context Saving):
o Definition: The CPU saves its current state (registers, program counter) before handling
the interrupt, so that it can resume normal execution after servicing the interrupt.
o Process: The CPU stores the status of its registers and program counter in memory or a
special area like the stack.
3. Interrupt Masking:
o Definition: The process of disabling or ignoring certain interrupts for a period of time to
prevent interruptions during critical processing.
o Example: The OS may mask certain interrupts to prevent a lower-priority interrupt from
disrupting the handling of a higher-priority interrupt.
4. Interrupt Acknowledgement:
o Definition: The CPU acknowledges the receipt of the interrupt signal by informing the
interrupt controller that it has recognized the interrupt.
o Process: After receiving the interrupt, the CPU sends an acknowledgment signal to the
interrupt controller, which then identifies the interrupt source.
5. Interrupt Service Routine (ISR):
o Definition: The interrupt service routine is a special function or set of instructions
executed by the CPU in response to an interrupt. It handles the interrupting task (e.g.,
reading a keyboard input or processing data from an I/O device).
o Example: If an interrupt occurs due to a keyboard input, the ISR will read the key press
and store the data into memory.
6. Return from Interrupt (Restoring the Context):
o Definition: After the ISR is executed, the CPU must restore its previous state (registers,
program counter) and resume normal execution.
o Process: The CPU retrieves the saved context (status, registers) and continues the
execution from where it left off before the interrupt occurred.
Interrupt Handling Flow:
1. Interrupt occurs.
2. CPU suspends its current task.
3. CPU saves the context (status, register values).
4. Interrupt source is identified.
5. Interrupt service routine (ISR) is executed to handle the interrupt.
6. Context is restored.
7. CPU resumes normal processing.
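To make the seven steps above concrete, here is a minimal C simulation of the flow. It is a sketch, not real interrupt handling: on actual hardware the context save/restore and vector lookup are performed by the CPU and interrupt controller, and all names (cpu_context, vector_table, the IRQ numbers) are hypothetical.
c
#include <stdio.h>

#define NUM_VECTORS 4

typedef struct {
    int program_counter;   /* where execution will resume */
    int registers[4];      /* a tiny register file for the demo */
} cpu_context;

typedef void (*isr_t)(void);

/* Step 5: ISRs that service each interrupt source. */
static void timer_isr(void)    { puts("ISR: update system tick"); }
static void keyboard_isr(void) { puts("ISR: read key press into memory"); }

/* Step 4 uses this table: the interrupt vector maps an IRQ
 * number to the address of its service routine. */
static isr_t vector_table[NUM_VECTORS] = { timer_isr, keyboard_isr, 0, 0 };

static cpu_context saved;  /* stand-in for the stack used in step 3 */

static void handle_interrupt(int irq, cpu_context *cpu) {
    printf("IRQ %d recognized\n", irq);  /* 1-2. recognize, suspend */
    saved = *cpu;                        /* 3. save context          */
    if (irq >= 0 && irq < NUM_VECTORS && vector_table[irq])
        vector_table[irq]();             /* 4-5. vector to the ISR   */
    *cpu = saved;                        /* 6. restore context       */
    puts("resuming interrupted task");   /* 7. resume                */
}

int main(void) {
    cpu_context cpu = { .program_counter = 100 };
    handle_interrupt(1, &cpu);           /* simulate a keyboard IRQ  */
    return 0;
}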
Related Concepts:
1. Interrupt Vector:
o Definition: A table of memory addresses that point to the ISR for each interrupt type.
When an interrupt occurs, the CPU uses the interrupt vector to find the address of the
corresponding ISR.
2. Priority of Interrupts:
o Definition: Some interrupts are more urgent than others. The CPU can prioritize
interrupt handling based on predefined priority levels. High-priority interrupts are
handled before low-priority ones.
3. Interrupt Controller (PIC - Programmable Interrupt Controller):
o Definition: A hardware component that manages multiple interrupt requests and
prioritizes them before sending them to the CPU (a toy arbiter sketch follows this list).
4. Nested Interrupts:
o Definition: A situation where an interrupt occurs while another interrupt is being
serviced. The CPU must handle the higher-priority interrupt before returning to the
lower-priority interrupt.
5. DMA (Direct Memory Access):
o Definition: A feature where peripheral devices can directly transfer data to/from the
main memory without involving the CPU, often used to improve efficiency and speed in
data transfers.
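The arbiter sketch promised above: a toy model of how a PIC-style controller might combine pending requests, the interrupt mask, and fixed priorities to pick the next IRQ to service. The register layout (pending and mask bitmasks, lower IRQ number = higher priority) is an assumption for illustration, not any specific controller's interface.
c
#include <stdio.h>

/* Return the highest-priority pending, unmasked IRQ, or -1 if none.
 * Convention assumed here: lower IRQ number = higher priority. */
static int next_irq(unsigned pending, unsigned mask) {
    unsigned serviceable = pending & ~mask;  /* masked lines are ignored */
    for (int irq = 0; irq < 32; irq++)
        if (serviceable & (1u << irq))
            return irq;
    return -1;
}

int main(void) {
    unsigned pending = (1u << 3) | (1u << 1);  /* IRQs 1 and 3 raised */
    unsigned mask    = (1u << 1);              /* IRQ 1 is masked     */
    printf("service IRQ %d first\n", next_irq(pending, mask)); /* -> 3 */
    return 0;
}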
Interrupt Latency:
o Definition: The time between when an interrupt occurs and when the CPU starts
executing the ISR. This is critical in real-time systems where timely response to
interrupts is required.
Interrupt Prioritization:
o Definition: In systems with multiple interrupt sources, certain interrupts might have
higher priority than others. The interrupt controller ensures that the highest-priority
interrupt is serviced first.
Pipelining
Definition:
Pipelining is a technique used in the design of computer systems where multiple instructions are
processed simultaneously in different stages of execution. It allows for overlapping the execution of
several instructions, improving the throughput of the CPU and enhancing its performance. This concept
is similar to an assembly line in a factory, where different stages of production happen in parallel.
In a non-pipelined processor, instructions are executed one after another, with each instruction passing
through the same stages sequentially. In contrast, pipelining breaks the execution process into several
stages, where each stage performs a part of the task. As a result, multiple instructions can be in different
stages of execution at the same time.
Example:
Consider a simple instruction cycle consisting of 5 stages:
1. Instruction Fetch (IF)
2. Instruction Decode (ID)
3. Execute (EX)
4. Memory Access (MEM)
5. Write Back (WB)
In a pipelined system, while one instruction is in the execute stage, another can be in the decode stage,
and yet another can be in the fetch stage, resulting in better utilization of CPU resources (see the sketch
below).
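The sketch below prints the cycle-by-cycle picture for 4 instructions flowing through the 5 stages, assuming an ideal pipeline with no stalls (a simplification).
c
#include <stdio.h>

#define STAGES 5
#define INSTRS 4

int main(void) {
    const char *stage[STAGES] = { "IF", "ID", "EX", "MEM", "WB" };
    /* In an ideal pipeline, instruction i enters stage s in cycle i + s,
     * so in a given cycle, stage s holds instruction (cycle - s). */
    for (int cycle = 0; cycle < INSTRS + STAGES - 1; cycle++) {
        printf("cycle %d:", cycle + 1);
        for (int s = 0; s < STAGES; s++) {
            int i = cycle - s;
            if (i >= 0 && i < INSTRS)
                printf("  I%d:%s", i + 1, stage[s]);
        }
        putchar('\n');
    }
    return 0;
}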
Pipeline Stages
A typical pipeline involves multiple stages, with each stage handling a specific part of the instruction
cycle. The most common stages are:
1. Fetch (IF): The instruction is read from memory.
2. Decode (ID): The instruction is interpreted and its operands are read.
3. Execute (EX): The operation is performed by the ALU.
4. Memory Access (MEM): Data memory is read or written if the instruction requires it.
5. Write Back (WB): The result is written to the register file.
Pipeline Efficiency
The efficiency of a pipeline is measured by throughput, which is the number of instructions completed
per unit of time. Ideally, pipelining improves throughput by allowing multiple instructions to be
executed simultaneously, but several factors can affect its effectiveness.
Key Concepts:
Throughput: The number of instructions that are completed over a specific period.
Latency: The time it takes for a single instruction to pass through all stages of the pipeline from
start to finish.
Efficiency Example: If each pipeline stage takes 1 clock cycle, and there are 5 stages, theoretically, one
instruction would be completed every clock cycle once the pipeline is full.
Pipelining Hazards
Although pipelining improves performance, it introduces several types of hazards that can affect the
efficiency of the pipeline. These hazards occur when the pipeline stages cannot proceed as expected due
to dependencies between instructions.
1. Data Hazards:
o Definition: Occur when an instruction depends on the result of a previous instruction
that has not yet completed its execution (a small detection sketch follows this list).
o Types of Data Hazards:
Read After Write (RAW) Hazard (True Dependency): An instruction needs to
read a register that is written to by a previous instruction.
Write After Write (WAW) Hazard (Output Dependency): Two instructions write
to the same register.
Write After Read (WAR) Hazard (Anti Dependency): An instruction writes to a
register after a previous instruction reads from it.
2. Control Hazards (Branch Hazards):
o Definition: Occur when the execution of an instruction depends on the outcome of a
branch (conditional jump), and this outcome is not determined until later in the
pipeline.
o Example: A branch instruction can cause the pipeline to fetch the wrong instruction if
the branch decision is not yet made.
3. Structural Hazards:
o Definition: Occur when the hardware resources of the pipeline (such as memory or ALU)
are insufficient to handle multiple instructions simultaneously.
o Example: If two instructions need to access memory at the same time, but the system
only has one memory unit, this creates a structural hazard.
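The detection sketch promised under data hazards: instructions are modeled as one destination and two source registers (an assumed, simplified encoding), and a RAW hazard is flagged when a later instruction reads a register that an earlier one writes.
c
#include <stdio.h>
#include <stdbool.h>

typedef struct { int dest, src1, src2; } instr;

/* RAW hazard: the later instruction reads a register that the
 * earlier one writes, before that write has completed. */
static bool raw_hazard(instr earlier, instr later) {
    return later.src1 == earlier.dest || later.src2 == earlier.dest;
}

int main(void) {
    instr add = { .dest = 1, .src1 = 2, .src2 = 3 };  /* r1 = r2 + r3 */
    instr sub = { .dest = 4, .src1 = 1, .src2 = 5 };  /* r4 = r1 - r5 */
    printf("RAW hazard: %s\n", raw_hazard(add, sub) ? "yes" : "no");
    return 0;
}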
Pipeline Control
Control Signals:
In a pipelined system, control signals are used to manage the flow of instructions through the pipeline.
These signals determine when to fetch, decode, execute, or write back instructions.
Stall: A delay inserted into the pipeline to wait for data or address resolution.
Forwarding (Data Bypassing): A technique used to avoid data hazards by directly passing the
data from one stage to another, skipping intermediate stages.
Performance Considerations
1. Pipeline Speedup:
Speedup due to pipelining is given by the formula:
Speedup = (n × k) / (k + (n − 1))
where k is the number of pipeline stages and n is the number of instructions, assuming each stage
takes one clock cycle.
In the ideal case (large n), the speedup approaches the number of pipeline stages. However,
hazards and stalls can reduce this speedup.
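A quick check of the formula, assuming one cycle per stage and no stalls: for k = 5 stages, the computed speedup climbs toward 5 as the instruction count n grows.
c
#include <stdio.h>

int main(void) {
    const int k = 5;                         /* pipeline stages      */
    const int counts[] = { 5, 100, 100000 }; /* instruction counts n */
    for (int i = 0; i < 3; i++) {
        int n = counts[i];
        double speedup = (double)(n * k) / (k + n - 1);
        printf("n = %6d  ->  speedup = %.2f\n", n, speedup);
    }
    return 0;   /* prints roughly 2.78, 4.81, 5.00 */
}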
Stalls:
A stall is inserted when an instruction cannot proceed because it is waiting for data from a
previous instruction. The pipeline halts for a clock cycle or more to allow the data to become
available.
Forwarding (Bypassing):
Forwarding is a technique used to pass the output of one pipeline stage directly to a later stage
to resolve data hazards without waiting for data to be written back to the register file.
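A sketch of the classic forwarding test between pipeline latches: if the instruction sitting in the EX/MEM register will write the register that the instruction entering EX wants to read, the ALU result is routed back directly. The field names (reg_write, rd, rs1) follow common textbook convention but are assumptions here; the rd != 0 check reflects the RISC convention of a hardwired zero register.
c
#include <stdio.h>
#include <stdbool.h>

typedef struct { bool reg_write; int rd; } ex_mem_reg;  /* EX/MEM latch */
typedef struct { int rs1, rs2; } id_ex_reg;             /* ID/EX latch  */

/* Forward the ALU result to the first source operand when the
 * instruction one stage ahead is about to write the register we read. */
static bool forward_to_src1(ex_mem_reg exmem, id_ex_reg idex) {
    return exmem.reg_write && exmem.rd != 0 && exmem.rd == idex.rs1;
}

int main(void) {
    ex_mem_reg exmem = { .reg_write = true, .rd = 1 };  /* writes r1 */
    id_ex_reg  idex  = { .rs1 = 1, .rs2 = 5 };          /* reads r1  */
    printf("forward ALU result to src1: %s\n",
           forward_to_src1(exmem, idex) ? "yes" : "no");
    return 0;
}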
Types of Pipelining
1. Instruction Pipelining:
o This is the most common form, where multiple instructions are processed at the same
time in different stages.
2. Functional Unit Pipelining:
o The pipeline is applied to specific functional units (e.g., an arithmetic logic unit or
memory). This allows multiple operations to be performed simultaneously, even if they
are of different types.
3. Superscalar Pipelining:
o Involves using multiple pipelines to execute multiple instructions simultaneously,
improving the overall throughput.
4. VLIW (Very Long Instruction Word) Pipelines:
o These pipelines issue multiple instructions in a single cycle. Instructions are grouped
together, and the compiler ensures that the instructions are independent, so they can
be executed in parallel.
Pipelining in Modern CPUs
Modern processors often have multiple pipeline stages that allow for deeper pipelines, improving
performance even further. Advanced CPUs also use techniques like speculative execution, out-of-order
execution, and branch prediction to further enhance performance in pipelined systems.
Example Processors:
Intel Core i7/i9 and AMD Ryzen CPUs: These processors use complex pipelining techniques to
execute multiple instructions in parallel, including out-of-order execution, branch prediction,
and deep pipelines with multiple execution units.
Advantages of Pipelining
1. Increased Throughput:
Pipelining increases the throughput of the system as multiple instructions can be processed in
parallel at different stages of execution.
2. Higher CPU Utilization:
The CPU is kept busy by processing instructions in parallel, reducing idle time.
3. Improved Performance:
In many cases, pipelining leads to a significant improvement in system performance, especially
for tasks with many instructions.
Disadvantages of Pipelining
1. Pipeline Hazards:
Data hazards, control hazards, and structural hazards can degrade performance and require
special handling techniques like stalling or forwarding.
2. Complexity:
The implementation of pipelining adds complexity to the CPU design. It requires additional
hardware, such as forwarding units, hazard detection units, and control logic.
3. Increased Latency for Individual Instructions:
While throughput improves, the latency of a single instruction may increase because of
per-stage register overhead and any stalls caused by hazards.