Computer Architecture

The document discusses key concepts in computer architecture, focusing on interrupts, CPI, MIPS, MFLOPS, control units, and the ALU. It explains how interrupts allow CPUs to respond to events efficiently, the significance of CPI and MIPS in measuring CPU performance, and the differences between hardwired and micro-programmed control units. Additionally, it outlines the role of the ALU in performing arithmetic and logical operations and describes the stages of instruction decoding and execution in a CPU pipeline.


Computer Architecture: Important Questions

1. What are interrupts? How is interrupt handling done in CPUs?


Answer:
Interrupts:
An interrupt is a mechanism by which an event or device can suspend the normal flow of execution
in a CPU to gain its attention. Interrupts are used to respond to external or internal events that require
immediate action, such as hardware malfunctions, user input, or timer expirations. Instead of the CPU
continuously checking (polling) for conditions like hardware status or user input, an interrupt lets
the CPU be notified when these events occur, so it can stop what it's doing and handle the event in
a timely manner.
There are two main types of interrupts:
1. Hardware Interrupts:
These are generated by external hardware devices like I/O peripherals (keyboard, mouse,
network card, etc.) or internal components (timers, etc.). For example, when you press a key on
the keyboard, the keyboard controller generates an interrupt to inform the CPU to process the
key press.
2. Software Interrupts:
These are generated by programs (software) to request services from the operating system or to
handle specific conditions like errors. Software interrupts are used for system calls and exception
handling.

Interrupt Handling Process in CPUs:


When an interrupt occurs, the CPU temporarily halts its current operation and switches to a special
routine to process the interrupt. This process involves several key steps:
1. Interrupt Request (IRQ)
An external device or internal condition generates an interrupt signal, which is sent to the CPU.
This signal is usually a voltage change or a dedicated signal line (interrupt line). Each interrupt
source typically has a specific interrupt number or priority.
2. Interrupt Acknowledgment
The CPU acknowledges the interrupt. If the interrupt is enabled and the interrupt condition meets
certain priority criteria (some systems handle multiple interrupts), the CPU responds to the
interrupt.
3. Context Saving
Before processing the interrupt, the CPU saves the context of the currently running program.
This involves saving the program counter (PC), stack pointer (SP), and possibly other CPU
registers to memory or a dedicated stack. This allows the CPU to resume normal execution after
the interrupt is serviced.
4. Interrupt Vectoring
The CPU uses an interrupt vector (a memory address) to find the appropriate interrupt service
routine (ISR). The vector is a pointer to the code that handles the interrupt. The interrupt vector
table is a pre-defined list that maps interrupt numbers or sources to specific routines in memory.
For example:
 For IRQ 1 (keyboard), the vector might point to the code that processes the keyboard
input.
 For IRQ 14 (hard drive interrupt), the vector would point to the disk I/O handler.

The interrupt vector is often stored at a known location in memory.


5. Interrupt Service Routine (ISR)
Once the CPU has determined the correct ISR, it jumps to that location in memory to execute the
handler code. The ISR is a small function or routine that handles the event (e.g., reading a value
from a keyboard buffer, processing a timer event, etc.).
6. Context Restoration
Once the ISR completes its task, the CPU restores the context of the interrupted program by
loading the saved register values, program counter, and stack pointer. This ensures that the
program continues from where it left off before the interrupt occurred.
7. Interrupt Return (IRET)
Finally, the CPU executes a special instruction like IRET (Interrupt Return), which restores the
state of the program and resumes execution from the point where it was interrupted. The CPU
returns to the task that was interrupted, continuing execution without any disruption to the
overall process.
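The seven steps above can be sketched as a toy software simulation. Everything here (the CPU state dictionary, the vector table layout, the ISR name) is an illustrative assumption, not a real hardware interface:

```python
# Minimal sketch of interrupt handling on a hypothetical CPU model.
saved_context = []

def keyboard_isr(cpu):
    # 5. ISR: read the pending key from the (simulated) keyboard buffer.
    cpu["last_key"] = cpu["keyboard_buffer"].pop(0)

# 4. Interrupt vector table: maps IRQ numbers to service routines.
vector_table = {1: keyboard_isr}

def handle_interrupt(cpu, irq):
    saved_context.append(dict(cpu))   # 3. save context (PC, registers)
    isr = vector_table[irq]           # 4. vector lookup
    isr(cpu)                          # 5. execute the ISR
    saved = saved_context.pop()       # 6. restore context
    cpu["pc"] = saved["pc"]           # 7. resume from the saved PC (IRET)

cpu = {"pc": 100, "keyboard_buffer": ["a"], "last_key": None}
handle_interrupt(cpu, irq=1)
print(cpu["last_key"], cpu["pc"])  # the key was read; the PC is unchanged
```

The interrupted program's state survives the ISR untouched, which is the whole point of steps 3, 6, and 7.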

Interrupt Priority and Masking:


 Priority: Some systems have multiple interrupt sources, and not all interrupts have the
same importance. Interrupts are usually assigned priorities, with higher-priority interrupts
able to "preempt" lower-priority ones. For example, if both a timer interrupt and a
keyboard interrupt occur at the same time, the timer interrupt (with higher priority) would
be handled first.
 Masking: Interrupts can be masked or disabled by the CPU during critical sections of
code to prevent unwanted interruptions. For example, a device might mask all interrupts
while updating a critical data structure, ensuring that no other interrupts disrupt the
operation.

Types of Interrupt Handling:


1. Polling: The CPU actively checks the status of devices at regular intervals to see if they
require service. This is an inefficient method compared to interrupts.
2. Vectored Interrupts: The interrupt vector table provides direct access to interrupt
service routines based on the interrupt type, making the system more efficient.
3. Non-Vectored Interrupts: The CPU must execute a fixed routine to determine which
interrupt occurred and then call the appropriate ISR. This is less efficient compared to
vectored interrupts.

Example: Interrupt in Action (Keyboard Interrupt):


1. You press a key on the keyboard (hardware interrupt).
2. The keyboard controller sends an interrupt signal to the CPU.
3. The CPU acknowledges the interrupt and saves its current state (program counter,
registers).
4. It uses the interrupt vector table to find the interrupt service routine for the keyboard
interrupt.
5. The ISR reads the key press data from the keyboard buffer.
6. The ISR finishes, and the CPU restores the previous program state.
7. The program continues execution as if it were uninterrupted.
Interrupt Latency:
Interrupt latency is the time it takes for the system to respond to an interrupt. It includes:
 Time for the CPU to acknowledge the interrupt.
 Time to execute the interrupt service routine.
 Time to restore the context and resume the interrupted task.
Reducing interrupt latency is critical in real-time systems, where timely responses to external events
are essential.

2. Explain CPI (Cycles per Instruction), MIPS, and MFLOPS.


Answer:

1. CPI (Cycles Per Instruction)


CPI is a measure of the average number of CPU clock cycles required to execute a single
instruction. It is used to assess the performance of a CPU by reflecting how many clock cycles
are spent per instruction, on average.
 Formula: CPI = Total Clock Cycles / Total Instructions Executed

Here, you divide the total number of clock cycles the CPU has used by the total number
of instructions it has executed.

 Interpretation:
o A lower CPI indicates a more efficient CPU because fewer clock cycles are
required per instruction.
o A higher CPI suggests that more clock cycles are needed to execute instructions,
which typically means the CPU is less efficient.

CPI can vary depending on the type of instructions being executed. For example:

 Simple instructions like ADD or MOV might take fewer cycles.


 Complex instructions like DIVIDE or floating-point operations may take more cycles.

In a pipelined processor, CPI can approach 1 because instructions overlap in execution stages;
superscalar processors, which issue multiple instructions per cycle, can achieve CPI below 1.
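The CPI formula can be applied to an instruction mix. The mix fractions and cycle counts below are made-up illustrative numbers, not measurements of any real CPU:

```python
# Average CPI as a weighted sum over instruction classes.
mix = {            # (fraction of instructions, cycles each)
    "ALU":    (0.5, 1),
    "LOAD":   (0.2, 3),
    "STORE":  (0.1, 2),
    "BRANCH": (0.2, 2),
}
cpi = sum(frac * cycles for frac, cycles in mix.values())
print(cpi)  # 0.5*1 + 0.2*3 + 0.1*2 + 0.2*2 = 1.7
```

This also shows why CPI depends on the workload: shift the mix toward LOADs and the average rises.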

2. MIPS (Million Instructions Per Second)


MIPS is a performance metric that measures how many millions of instructions a processor can
execute per second. It gives an indication of the processor's instruction throughput.
 Formula: MIPS = Number of Instructions Executed / (Execution Time in seconds × 10^6)
MIPS essentially gives you the instruction rate, which can be interpreted as how fast the
processor is at executing a stream of instructions.
 Relation to CPI: MIPS is related to CPI in that it depends on how fast the processor runs
(Clock Speed, f) and the CPI value:

MIPS = Clock Speed / (CPI × 10^6)

Where:
o Clock Speed is measured in Hertz (cycles per second).
o CPI is the number of cycles per instruction.
 Interpretation:
o High MIPS suggests the processor can execute many instructions quickly.
o However, MIPS should be considered along with the CPI because two processors
with the same MIPS rate may have very different performance due to differences
in the number of cycles per instruction. For example, a processor with a high
MIPS value but a high CPI might not be as efficient as a processor with lower
MIPS but a lower CPI.

MIPS can be misleading if the program uses a wide variety of instructions that take different
numbers of cycles, so while MIPS tells you the instruction throughput, it doesn't necessarily
reflect the real-world performance of the CPU in all cases.
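The MIPS/CPI relation is easy to check numerically; the 2 GHz clock and CPI of 1.6 below are illustrative values:

```python
# MIPS derived from clock speed and CPI.
clock_hz = 2_000_000_000   # 2 GHz
cpi = 1.6
mips = clock_hz / (cpi * 1e6)
print(mips)  # 2e9 / (1.6 * 1e6) = 1250.0
```

A processor with the same clock but CPI 1.0 would score 2000 MIPS, which is the "same MIPS, different CPI" caveat in concrete form.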

3. MFLOPS (Million Floating Point Operations Per Second):


MFLOPS is a performance metric specifically designed to measure the computational speed of
processors when performing floating-point operations. It is particularly relevant in scientific and
engineering applications where floating-point arithmetic is frequent.
 Formula:
MFLOPS = Number of Floating-Point Operations / (Execution Time in seconds × 10^6)
This metric quantifies how many floating-point operations the processor can perform per
second, in millions.

 Floating-Point Operations: These are operations involving real numbers (i.e., numbers
that can have decimals), which are typically more complex than integer operations.
Floating-point operations include addition, subtraction, multiplication, division, and
square roots, among others.
 Relation to MIPS and CPI:
o If a processor performs many floating-point operations, it may have a high
MFLOPS rate, but this is not always directly correlated with MIPS since floating-
point operations often take more cycles than simple integer operations.
o For instance, an integer-heavy workload might show higher MIPS values
compared to a floating-point-heavy workload, where MFLOPS would be a more
meaningful performance measure.
 Interpretation:
o High MFLOPS indicates strong performance for floating-point-intensive tasks,
often relevant for scientific simulations, 3D rendering, and machine learning.
o Just like with MIPS, MFLOPS doesn't tell the entire performance story—it only
tells you about floating-point performance.
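The MFLOPS formula in use, with invented numbers for the operation count and run time:

```python
# MFLOPS from an operation count and a measured execution time.
flops = 400_000_000   # floating-point operations performed
seconds = 2.0         # measured execution time
mflops = flops / (seconds * 1e6)
print(mflops)  # 400e6 / (2 * 1e6) = 200.0
```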

3. What’s the difference between a hardwired and micro-programmed control unit?


A Control Unit (CU) is an essential part of a CPU that directs the operations of the computer by
generating the necessary control signals. These control signals determine the operation of the CPU’s
functional units like the ALU (Arithmetic Logic Unit), registers, and I/O devices. There are two
primary ways to design a control unit: Hardwired Control and Micro-programmed Control.
Here's a detailed breakdown of the differences between the two:
1. Hardwired Control Unit
A hardwired control unit uses fixed logic circuits, typically composed of combinational logic
(gates, flip-flops, multiplexers, etc.), to generate control signals. These signals are directly based
on the current instruction and the internal states of the CPU. The logic is designed at the
hardware level and is not easily changed or modified.

Key Characteristics of Hardwired Control:


 Control Signals: The control signals are generated by combinational logic that directly
maps the instruction opcodes to control signals.
 Speed: Typically, hardwired control units are faster because they don't involve complex
instruction decoding steps. The control signals are generated almost immediately as the
instruction is fetched.
 Complexity: The design of a hardwired control unit is more complex for CPUs with
complex instruction sets. Adding new instructions may require redesigning the control
logic.
 Flexibility: Hardwired control is less flexible. It’s challenging to modify the control unit
to add new instructions without hardware changes. The control logic is fixed and static
after design.
 Suitability: Best suited for simpler processors or processors with a limited instruction
set (e.g., simple RISC architectures).

Example:
In a simple processor, if the instruction is a load operation, the hardwired control unit directly
generates control signals to:
 Select the appropriate registers.
 Activate the memory read line.
 Update the program counter.
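As a rough software analogy, hardwired decode is a set of boolean functions of the opcode bits. The 2-bit opcode encoding below is invented purely for illustration:

```python
# Hardwired control sketch: control signals are fixed boolean
# functions of the opcode bits (hypothetical 2-bit encoding:
# 0x = ALU op, 10 = LOAD, 11 = STORE).
def hardwired_signals(opcode):
    b1, b0 = bool(opcode & 0b10), bool(opcode & 0b01)
    return {
        "mem_read":   b1 and not b0,   # 10 = LOAD
        "mem_write":  b1 and b0,       # 11 = STORE
        "alu_enable": not b1,          # 0x = ALU operation
    }

print(hardwired_signals(0b10))  # LOAD asserts mem_read only
```

Note how adding a new instruction would mean changing these boolean expressions, i.e. redesigning the "gates."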

2. Micro-programmed Control Unit


A micro-programmed control unit uses a sequence of micro-instructions (collectively, a
microprogram) to generate control signals. These micro-instructions are stored in control
memory, and the control unit fetches them sequentially or conditionally, based on the opcode of
the instruction being executed. The microprogram serves as an intermediary between the
high-level instructions and the low-level control signals.

Key Characteristics of Micro-programmed Control:


 Control Signals: Control signals are generated from a sequence of micro-instructions
stored in control memory (usually ROM, RAM, or EEPROM).
 Speed: Micro-programmed control units are typically slower than hardwired control
units because they require an extra step of fetching micro-instructions from control
memory and decoding them.
 Complexity: The design of a micro-programmed control unit is simpler compared to a
hardwired one. New instructions can be added easily by simply adding new micro-
instructions to the control memory.
 Flexibility: Micro-programmed control is highly flexible because changes in the
instruction set can be implemented by modifying the microprogram (control memory)
without changing the hardware.
 Suitability: Best suited for complex processors (like CISC processors) with a rich
instruction set that may require frequent updates or enhancements.
Example:
In a micro-programmed control unit, if the instruction is a load operation, the control unit:
1. Looks up the opcode of the instruction in the control memory.
2. Fetches a microprogram that includes steps like:
o Activate the memory read signal.
o Set the address for the data to be loaded.
o Move data into the specified register.
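The lookup-and-sequence behavior can be sketched as a table in software. The micro-operation names and the two opcodes here are illustrative assumptions:

```python
# Micro-programmed control sketch: the opcode indexes control memory,
# which holds an ordered sequence of micro-instructions.
control_memory = {
    "LOAD": ["assert_mem_read", "drive_address_bus", "latch_into_register"],
    "ADD":  ["read_operands", "alu_add", "write_back"],
}

def execute(opcode):
    trace = []
    for micro_op in control_memory[opcode]:  # fetch micro-instructions in order
        trace.append(micro_op)               # "issue" each control step
    return trace

print(execute("LOAD"))
```

Adding a new instruction is just adding a new entry to `control_memory`, with no hardware change, which is exactly the flexibility argument made above.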

Key Differences:
Feature           | Hardwired Control Unit                                              | Micro-programmed Control Unit
------------------|---------------------------------------------------------------------|------------------------------------------------------------------
Implementation    | Combinational logic circuits (fixed hardware).                      | Sequence of micro-instructions stored in memory.
Speed             | Faster generation of control signals.                               | Slower due to fetching micro-instructions from control memory.
Flexibility       | Less flexible; requires hardware modification for new instructions. | More flexible; easy to add or modify instructions via microprogramming.
Design complexity | More complex for rich instruction sets.                             | Simpler design, especially for complex instruction sets.
Modifiability     | Difficult to modify; adding instructions requires redesigning hardware. | Easy to modify; new instructions are added by updating the microprogram.
Suitability       | Simple processors (RISC or custom hardware).                        | Complex processors (CISC, or processors with many instructions).
Cost              | Typically cheaper to implement (fewer memory resources).            | Generally requires more memory (control memory).
Example usage     | RISC processors, simple embedded systems.                           | CISC processors, systems requiring frequent updates or a rich instruction set.

Example: Control in Action


 Hardwired Control: Consider a processor executing an ADD instruction. The hardwired
control unit directly generates the signals needed for the ALU to perform the addition,
select the registers, and store the result—all based on the instruction’s opcode and fixed
logic.
 Micro-programmed Control: In a micro-programmed control unit, the ADD instruction
would cause the control unit to fetch a sequence of micro-operations. These micro-
operations might include fetching operands from registers, adding them in the ALU, and
storing the result, all defined in the control memory.
4. Explain the role of the ALU in CPU operations.
The Arithmetic Logic Unit (ALU) is a fundamental component of the Central Processing Unit
(CPU). Its primary role is to perform arithmetic and logical operations, which are crucial for
executing instructions in any program. Here's a breakdown of what the ALU does:

1. Arithmetic Operations
The ALU handles basic mathematical functions such as:
 Addition
 Subtraction
 Multiplication
 Division

2. Logical Operations
It also performs logical operations like:
 AND
 OR
 NOT
 XOR (exclusive OR)

These operations are used in decision-making processes, such as determining whether certain
conditions are true or false.

3. Comparison Operations
The ALU can compare values to check for:
 Equality (==)
 Greater than (>)
 Less than (<)

These are vital for control flow decisions in programs (like if statements and loops).

4. Bitwise Operations
The ALU can manipulate data at the bit level (bitwise shifts, ANDs, ORs, etc.), which is useful
for low-level programming and optimization.

5. Interface with Control Unit and Registers


 The Control Unit sends signals to the ALU to specify which operation to perform.
 The Registers hold the data that the ALU processes and store the results.

5. How are instructions decoded and executed in a typical CPU pipeline?


🔁 CPU Pipeline Stages Overview:
A typical RISC (Reduced Instruction Set Computer) pipeline consists of five main stages:
1. Instruction Fetch (IF)
2. Instruction Decode (ID)
3. Execute (EX)
4. Memory Access (MEM)
5. Write Back (WB)
1. Instruction Fetch (IF)
 The Program Counter (PC) holds the address of the next instruction.
 The CPU fetches the instruction from memory (usually from the Instruction Cache).
 The PC is incremented to point to the next instruction.

🔹 Goal: Get the instruction from memory into the pipeline.

2. Instruction Decode (ID)


 The instruction is decoded by the Instruction Decoder.
 The opcode (operation code) is identified (e.g., ADD, LOAD, JUMP).
 Source registers are read from the register file.
 Immediate values are extracted if present.
 Control signals are generated to guide the rest of the pipeline.

🔹 Goal: Understand what the instruction does and gather operands.

3. Execute (EX)
 The actual operation is performed (e.g., arithmetic via the ALU, address calculation for
memory ops).
 For branch instructions, the branch decision is often made here.

🔹 Goal: Perform the required computation or address calculation.

4. Memory Access (MEM)


 If the instruction is a load or store, memory is accessed here.
 Load: Read from memory into a register.
 Store: Write data from a register into memory.

🔹 Goal: Access memory if needed.

5. Write Back (WB)


 The result of the computation (or a memory load) is written back to the register file.
🔹 Goal: Save the result of the instruction.

How Instructions Move Through the Pipeline:


Each instruction moves one stage per clock cycle. So, multiple instructions are in different
stages simultaneously—this is what gives the pipeline its speed and throughput.

Example:

Clock Cycle:    1    2    3    4    5    6    7
Instruction 1:  IF   ID   EX   MEM  WB
Instruction 2:       IF   ID   EX   MEM  WB
Instruction 3:            IF   ID   EX   MEM  WB
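The staggered timing above can be generated programmatically. This is a sketch of an ideal pipeline with no stalls or hazards assumed:

```python
# Ideal 5-stage pipeline: instruction i enters IF at cycle i, then
# advances one stage per clock cycle.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stage_of(instr, cycle):
    idx = cycle - instr   # how far this instruction has advanced
    return STAGES[idx] if 0 <= idx < len(STAGES) else "--"

for instr in range(3):
    row = [stage_of(instr, c) for c in range(7)]
    print(f"Instr {instr + 1}: " + "  ".join(row))
```

Real pipelines insert stalls (bubbles) for hazards, which this ideal model deliberately leaves out.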
6. What is superscalar architecture?
Superscalar Architecture:
A more aggressive approach is to equip the processor with multiple processing units to handle
several instructions in parallel in each processing stage. With this arrangement, several
instructions start execution in the same clock cycle and the process is said to use multiple issue.
Such processors are capable of achieving an instruction execution throughput of more than one
instruction per cycle. They are known as ‘Superscalar Processors’.

Consider a processor with two execution units: one for integer and one for floating-point
operations. The instruction fetch unit reads two instructions at a time and stores them in the
instruction queue. In each cycle, the dispatch unit retrieves and decodes up to two instructions
from the front of the queue. If there is one integer and one floating-point instruction and no
hazards, both instructions are dispatched in the same clock cycle.
Advantages of Superscalar Architecture:
 The compiler can avoid many hazards through judicious selection and ordering of
instructions.
 The compiler should strive to interleave floating-point and integer instructions. This
enables the dispatch unit to keep both the integer and floating-point units busy most of the
time.
 In general, high performance is achieved when the compiler arranges program
instructions to take maximum advantage of the available hardware units.
Disadvantages of Superscalar Architecture:
 In a superscalar processor, the detrimental effect of hazards on performance becomes
even more pronounced.
 Instruction scheduling becomes more complex with this type of architecture.

7. Describe multicore vs. multithreaded processors.


Multicore Processor:
A multicore processor has multiple independent CPU cores on a single chip.
 Each core can run its own thread or process independently.
 Think of it as multiple full CPUs in one package.

Example: A quad-core CPU has 4 cores and can run at least 4 separate threads truly in
parallel.
Multithreaded Processor (Simultaneous Multithreading - SMT):
A multithreaded processor allows multiple threads to run on a single core, sharing that
core’s execution resources.
 Most common form: Hyper-Threading (Intel) or SMT (used in AMD, IBM).
 The core doesn’t duplicate its full hardware, but can overlap thread execution to use
idle resources more efficiently.

Example: A single core with 2 threads can sometimes run two instructions simultaneously if
they don’t compete for the same execution units.

8. Describe the concept of parallel processing and its types.


Parallel Processing:
Parallel processing is the simultaneous execution of multiple tasks or instructions to solve a
problem faster and more efficiently. It’s all about dividing a problem into smaller chunks that can
be handled at the same time—either on multiple processors, cores, or machines.

🔹 Goal: Improve performance by doing more work in less time.

Q. Why Use Parallel Processing?


 Speed up computation-heavy tasks (e.g., simulations, rendering, AI)
 Utilize multicore and multiprocessor systems effectively
 Handle large-scale problems that would take too long on a single processor

Types of Parallel Processing:


Parallel processing can be classified in several ways. The two most common dimensions are:
1. Based on Hardware Architecture
🔁 SISD (Single Instruction, Single Data)
 Traditional single-core CPU.
 Executes one instruction on one piece of data at a time.
🔁 SIMD (Single Instruction, Multiple Data)
 Same instruction operates on multiple data points at once.
 Used in vector processors, GPUs, and AVX/SSE in CPUs.
 Great for things like image processing or matrix operations.
🔁 MISD (Multiple Instruction, Single Data)
 Rare. Multiple processors perform different operations on the same data.
 Mostly theoretical or used in fault-tolerant systems.
🔁 MIMD (Multiple Instruction, Multiple Data)
 Most modern CPUs/GPUs/servers fall here.
 Each core runs its own thread or process on its own data.
 Ideal for multitasking and distributed systems.

2. Based on Execution Level


✅ Bit-level Parallelism
 Operates on multiple bits at once (e.g., using 64-bit instead of 32-bit instructions).
✅ Instruction-level Parallelism (ILP)
 Multiple instructions from the same thread are executed in parallel.
 Achieved through pipelining, superscalar, and out-of-order execution.
✅ Data Parallelism

 Splitting data across multiple processors and performing the same operation on each part.
 Used heavily in scientific computing, machine learning, and GPU computing.

✅ Task Parallelism
 Different tasks are executed concurrently.
 Each processor runs a different part of a larger problem (e.g., web server threads, video
encoding).
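Data parallelism in particular maps neatly onto a worker pool: the same operation is applied to every chunk of the data. A minimal sketch using Python's standard library (`ThreadPoolExecutor` is chosen for portability; note that Python threads share one interpreter, so this illustrates the programming model rather than a real speedup for CPU-bound work):

```python
# Data parallelism sketch: one operation, many data elements,
# split across a pool of workers.
from concurrent.futures import ThreadPoolExecutor

def square(x):                # the single operation applied to every element
    return x * x

data = list(range(8))
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(square, data))  # data divided among workers

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

For genuine CPU-bound parallelism, the same structure is typically used with `multiprocessing.Pool` or offloaded to a GPU.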

Examples in Practice:
Type          | Example Use Case
--------------|-----------------------------------------------------
SIMD          | GPU shaders for rendering 3D graphics
MIMD          | Multicore CPU running different threads
Data Parallel | Training neural networks on batches of data
Task Parallel | Web browser rendering + downloading + playing media

Parallel Processing Models:


1. Shared Memory Model:
o All processors access the same memory.
o Easier to program, but needs synchronization (e.g., threads in C++, OpenMP).
2. Distributed Memory Model:
o Each processor has its own memory.
o Requires message passing (e.g., MPI, cloud computing clusters).

Parallelism in Programming Languages:


 OpenMP: C/C++ and Fortran for shared memory parallelism
 MPI: Message Passing Interface, for distributed systems
 CUDA/OpenCL: GPU computing
 Python multiprocessing / threading: High-level, easy for prototyping
 Cilk, Go, Rust, Java: Native parallel constructs
