Computer Architecture
Computer scientists must design a computer with the same care that goes into laying the
foundations of a physical structure. The three main pillars they must consider are:
System design - This is what makes up the structure of a computer, including all hardware parts,
such as CPU, data processors, multiprocessors, memory controllers, and direct memory access.
Instruction set architecture (ISA) - This is the interface between hardware and software: it
defines the CPU's functions and capabilities, the instructions available to programmers, the
supported data formats, and the processor register types.
Microarchitecture - This defines the data-processing and storage elements (the data paths) and
how they are organized within the processor.
Despite the rapid advancement of computing, many of the fundamentals of computer architecture
remain the same. There are two main types of computer architecture:
Von Neumann architecture - Named after mathematician and computer scientist John von
Neumann, this features a single memory space for both data and instructions, which are fetched
and executed sequentially. Von Neumann architecture introduced the concept of stored-program
computers, where both instructions and data are stored in the same memory, allowing for flexible
program execution.
Harvard architecture - This, on the other hand, uses separate memory spaces for data and
instructions, allowing for parallel fetching and execution.
While computer architectures can differ greatly depending on the purpose of the computer,
several key components generally contribute to the structure of computer architecture:
Central Processing Unit (CPU) - Often referred to as the "brain" of the computer, the CPU
executes instructions, performs calculations, and manages data. Its architecture dictates factors
such as instruction set, clock speed, and cache hierarchy, all of which significantly impact
overall system performance.
Memory Hierarchy - This includes various types of memory, such as cache memory, random
access memory (RAM), and storage devices. The memory hierarchy plays a crucial role in
optimizing data access times, as data moves between different levels of memory based on their
proximity to the CPU and the frequency of access.
Input/Output (I/O) System - The I/O system enables communication between the computer and
external devices, such as keyboards, monitors, and storage devices. It involves designing
efficient data transfer mechanisms to ensure smooth interaction and data exchange.
Storage Architecture - This deals with how data is stored and retrieved from storage devices
like hard drives, solid-state drives (SSDs), and optical drives. Efficient storage architectures
ensure data integrity, availability, and fast access times.
Instruction Pipelining - Modern CPUs employ pipelining, a technique that breaks down
instruction execution into multiple stages. This allows the CPU to process multiple instructions
simultaneously, resulting in improved throughput (a cycle-count sketch follows this list).
Parallel Processing - This involves dividing a task into smaller subtasks and executing them
concurrently, often on multiple cores or processors. Parallel processing significantly accelerates
computations, making it key to tasks like simulations, video rendering, and machine learning.
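As a rough illustration of the pipelining point above, the following sketch (the stage count and instruction count are invented for the example) compares how many clock cycles n instructions need with and without a k-stage pipeline: unpipelined they take n x k cycles, while a full pipeline finishes them in k + (n - 1) cycles.

    #include <stdio.h>

    /* Hypothetical figures: a 5-stage pipeline executing 100 instructions.
     * Without pipelining each instruction occupies all stages in turn;
     * with pipelining a new instruction enters the pipe every cycle once
     * it is full, so n instructions finish in k + (n - 1) cycles. */
    int main(void) {
        const int k = 5;    /* pipeline stages (assumed) */
        const int n = 100;  /* instructions (assumed)    */

        int unpipelined = n * k;
        int pipelined   = k + (n - 1);

        printf("Unpipelined: %d cycles\n", unpipelined); /* 500 */
        printf("Pipelined:   %d cycles\n", pipelined);   /* 104 */
        return 0;
    }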
Computer Registers
Registers are a type of computer memory used to quickly accept, store, and transfer data and
instructions that are being used immediately by the CPU. The registers used by the CPU are
often termed processor registers.
A processor register may hold an instruction, a storage address, or any data (such as bit sequence
or individual characters).
The computer needs processor registers for manipulating data and a register for holding a
memory address. The register holding the memory location is used to calculate the address of the
next instruction after the execution of the current instruction is completed.
Consider, for example, a basic accumulator-based computer whose memory unit has a capacity of 4096 words, each word containing 16 bits. Its registers are as follows:
The Data Register (DR) contains 16 bits which hold the operand read from the memory location.
The Memory Address Register (MAR) contains 12 bits which hold the address for the memory
location.
The Program Counter (PC) also contains 12 bits which hold the address of the next instruction to
be read from memory after the current instruction is executed.
The Accumulator (AC) register is a general purpose processing register.
The instruction read from memory is placed in the Instruction register (IR).
The Temporary Register (TR) is used for holding the temporary data during the processing.
The Input Register (INPR) holds the input characters given by the user.
The Output Register (OUTR) holds the output after processing the input data.
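As a small illustration, the register set described above can be sketched as a C structure. The widths follow the text; the 8-bit widths chosen for the input and output registers are an assumption, since the text does not state them.

    #include <stdint.h>

    /* Register set of the basic accumulator-based computer described above. */
    typedef struct {
        uint16_t DR;    /* Data Register           - 16 bits                 */
        uint16_t MAR;   /* Memory Address Register - 12 bits (stored in 16)  */
        uint16_t PC;    /* Program Counter         - 12 bits (stored in 16)  */
        uint16_t AC;    /* Accumulator             - 16 bits                 */
        uint16_t IR;    /* Instruction Register    - 16 bits                 */
        uint16_t TR;    /* Temporary Register      - 16 bits                 */
        uint8_t  INPR;  /* Input Register          - width assumed 8 bits    */
        uint8_t  OUTR;  /* Output Register         - width assumed 8 bits    */
    } Registers;

    /* 4096 words x 16 bits, as stated above. */
    static uint16_t memory[4096];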
Memory hierarchy
Memory Hierarchy is a way of organizing memory so that access time is minimized.
The Memory Hierarchy was developed based on a program behavior known as locality of
reference. The different levels of the memory hierarchy are described below.
Memory Hierarchy is one of the most important concepts in computer memory, as it helps in
optimizing the memory available in the computer.
There are multiple levels in the hierarchy, each with a different size, cost, and speed.
Some types of memory, such as cache and main memory, are faster than other types but offer
less capacity and cost more, whereas other types offer larger capacity but are slower.
Data access is not uniform across the types of memory: some provide faster access, while
others are slower.
Types of Memory Hierarchy
External Memory or Secondary Memory: Comprising Magnetic Disk, Optical Disk, and
Magnetic Tape, i.e. peripheral storage devices which are accessible by the processor via an I/O
Module.
Internal Memory or Primary Memory: Comprising Main Memory, Cache Memory & CPU
registers. This is directly accessible by the processor.
1. Registers
Registers are small, high-speed memory units located in the CPU. They are used to store the
most frequently used data and instructions. Registers have the fastest access time and the
smallest storage capacity, typically ranging from 16 to 64 bits.
2. Cache Memory
Cache memory is a small, fast memory unit located close to the CPU. It stores frequently used
data and instructions that have been recently accessed from the main memory. Cache memory is
designed to minimize the time it takes to access data by providing the CPU with quick access to
frequently used data.
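Locality of reference is easy to observe from ordinary code. In the sketch below (the array size is arbitrary), the row-major loop walks memory sequentially and reuses each cache line it loads, so it typically runs noticeably faster than the column-major loop, which strides far ahead on every access.

    #include <stdio.h>

    #define N 1024
    static int a[N][N];   /* stored row by row in memory */

    /* Sequential (row-major) traversal: consecutive accesses fall in the
     * same cache line, so most of them are cache hits. */
    long sum_row_major(void) {
        long s = 0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                s += a[i][j];
        return s;
    }

    /* Strided (column-major) traversal: each access lands N ints away
     * from the previous one, so cache lines are evicted before reuse. */
    long sum_col_major(void) {
        long s = 0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                s += a[i][j];
        return s;
    }

    int main(void) {
        printf("%ld %ld\n", sum_row_major(), sum_col_major());
        return 0;
    }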
3. Main Memory
Main memory, also known as RAM (Random Access Memory), is the primary memory of a
computer system. It has a larger storage capacity than cache memory, but it is slower. Main
memory is used to store data and instructions that are currently in use by the CPU.
Types of Main Memory
Static RAM: Static RAM stores binary information in flip-flops, and the information remains
valid as long as power is supplied. It has a faster access time and is used in implementing cache
memory.
Dynamic RAM: It stores binary information as a charge on a capacitor. It requires refreshing
circuitry to restore the charge on the capacitors every few milliseconds. It packs more memory
cells per unit area than SRAM.
4. Secondary Storage
Secondary storage, such as hard disk drives (HDD) and solid-state drives (SSD), is a non-volatile
memory unit that has a larger storage capacity than main memory. It is used to store data and
instructions that are not currently in use by the CPU. Secondary storage has the slowest access
time and is typically the least expensive type of memory in the memory hierarchy.
5. Magnetic Disk
Magnetic disks are circular platters fabricated from metal or plastic and coated with a
magnetizable material. They spin at high speed inside the computer and are frequently used.
6. Magnetic Tape
Magnetic tape is a magnetic recording medium consisting of a plastic film coated with a
magnetizable material. It is generally used for the backup of data. Because a tape must be read
sequentially, its access time is slower, and reaching a given spot on the strip takes some time.
I/O Management
One of the important jobs of an Operating System is to manage various I/O devices including
mouse, keyboards, touch pad, disk drives, display adapters, USB devices, Bit-mapped screen,
LED, Analog-to-digital converter, On/off switch, network connections, audio I/O, printers etc.
An I/O system is required to take an application I/O request and send it to the physical device,
then take whatever response comes back from the device and send it to the application. I/O
devices can be divided into two categories −
Block devices − A block device is one with which the driver communicates by sending entire
blocks of data. For example, Hard disks, USB cameras, Disk-On-Key etc.
Character devices − A character device is one with which the driver communicates by sending
and receiving single characters (bytes, octets). For example, serial ports, parallel ports, sound
cards, etc.
Device Controllers
Device drivers are software modules that can be plugged into an OS to handle a particular
device. Operating System takes help from device drivers to handle all I/O devices.
The Device Controller works like an interface between a device and a device driver. I/O units
(Keyboard, mouse, printer, etc.) typically consist of a mechanical component and an electronic
component where electronic component is called the device controller.
There is a device controller and a device driver for each device that communicates with the
Operating System. A device controller may be able to handle multiple devices. As an interface,
its main task is to convert a serial bit stream to a block of bytes and perform error correction as
necessary.
Any device connected to the computer is attached by a plug and socket, and the socket is
connected to a device controller. In a typical model for connecting the CPU, memory,
controllers, and I/O devices, the CPU and the device controllers all use a common bus for
communication.
Synchronous I/O − In this scheme CPU execution waits while I/O proceeds
Asynchronous I/O − I/O proceeds concurrently with CPU execution
The CPU must have a way to pass information to and from an I/O device. There are three
approaches available for communication between the CPU and a device:
➢ Special Instruction I/O
➢ Memory-mapped I/O
➢ Direct memory access (DMA)
When using memory-mapped I/O, the same address space is shared by memory and I/O devices.
Device registers or buffers are assigned certain main memory addresses, so the CPU can read
and write the device with ordinary load and store instructions, and blocks of data can be
transferred to or from memory without special I/O instructions.
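From the programmer's side, memory-mapped I/O looks like ordinary loads and stores to special addresses. The sketch below assumes a hypothetical output device whose data and status registers sit at the addresses shown; real addresses are platform-specific, and such code normally runs in a driver or on bare metal.

    #include <stdint.h>

    /* Hypothetical device registers mapped into the address space.
     * 'volatile' stops the compiler from caching or reordering accesses. */
    #define UART_DATA   ((volatile uint8_t *)0x10000000u)  /* assumed address    */
    #define UART_STATUS ((volatile uint8_t *)0x10000004u)  /* assumed address    */
    #define TX_READY    0x01u                              /* assumed status bit */

    /* Write one character to the device with ordinary load/store instructions:
     * no special I/O instructions are needed with memory-mapped I/O. */
    void uart_putc(char c) {
        while ((*UART_STATUS & TX_READY) == 0)
            ;                    /* busy-wait until the device is ready */
        *UART_DATA = (uint8_t)c; /* plain store reaches the device register */
    }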
Slow devices like keyboards will generate an interrupt to the main CPU after each byte is
transferred. If a fast device such as a disk generated an interrupt for each byte, the operating
system would spend most of its time handling these interrupts. So a typical computer uses direct
memory access (DMA) hardware to reduce this overhead.
Direct Memory Access (DMA) means the CPU grants an I/O module the authority to read from
or write to memory without CPU involvement. The DMA module itself controls the exchange of
data between main memory and the I/O device. The CPU is involved only at the beginning and
end of the transfer and is interrupted only after the entire block has been transferred.
Direct Memory Access needs a special piece of hardware called a DMA controller (DMAC) that
manages the data transfers and arbitrates access to the system bus. The controller is programmed
with source and destination pointers (where to read and write the data), a counter to track the
number of transferred bytes, and settings such as the I/O and memory types and whether to
interrupt the CPU.
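A driver might program such a controller roughly as sketched below. The register layout, base address, and control bits are hypothetical, but they mirror the description above: source and destination pointers, a byte counter, and settings including an interrupt enable.

    #include <stdint.h>

    /* Hypothetical memory-mapped DMA controller registers. */
    typedef struct {
        volatile uint32_t src;    /* source pointer (where to read)               */
        volatile uint32_t dst;    /* destination pointer (where to write)         */
        volatile uint32_t count;  /* number of bytes to transfer                  */
        volatile uint32_t ctrl;   /* settings: direction, interrupt enable, start */
    } dma_regs_t;

    #define DMA ((dma_regs_t *)0x20000000u)  /* assumed base address */
    #define DMA_START      0x1u              /* assumed control bits */
    #define DMA_IRQ_ENABLE 0x2u

    /* Program the controller and return; the CPU is free until the
     * "transfer complete" interrupt arrives for the whole block. */
    void dma_copy_to_device(uint32_t device_addr, uint32_t mem_addr, uint32_t nbytes) {
        DMA->src   = mem_addr;
        DMA->dst   = device_addr;
        DMA->count = nbytes;
        DMA->ctrl  = DMA_START | DMA_IRQ_ENABLE;
    }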
Input-Output Interface
The input-output interface is the mechanism that transfers information between internal storage
(memory) and external peripheral devices.
A peripheral device is one that provides input to or receives output from the computer; such
devices are also called input-output devices. For example, a keyboard and a mouse, which
provide input to the computer, are called input devices, while a monitor and a printer, which
receive output from the computer, are called output devices. Some peripheral devices, such as
external hard drives, can provide both input and output.
The input-output interface performs functions such as:
➢ It synchronizes the operating speed of the CPU with that of the input-output devices.
➢ It selects the input-output device appropriate to the interpretation of the input-output signal.
➢ It provides signals such as control and timing signals.
➢ It allows data buffering through the data bus.
➢ It performs error detection.
➢ It converts serial data into parallel data and vice versa (a small sketch follows this list).
➢ It also converts digital data into analog signals and vice versa.
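For the serial-to-parallel conversion mentioned in the list, the sketch below assembles eight serially received bits into one byte, assuming a hypothetical read_bit() source and a least-significant-bit-first line order.

    #include <stdint.h>

    /* Hypothetical: returns the next bit (0 or 1) arriving on a serial line. */
    extern int read_bit(void);

    /* Assemble 8 serially received bits into one parallel byte,
     * least-significant bit first (an assumption about the line order). */
    uint8_t serial_to_parallel(void) {
        uint8_t byte = 0;
        for (int i = 0; i < 8; i++)
            byte |= (uint8_t)(read_bit() & 1) << i;
        return byte;
    }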
RAID
RAID (redundant array of independent disks) is a way of storing the same data in different places
on multiple hard disks or solid-state drives (SSDs) to protect data in the case of a drive failure.
There are different RAID levels, however, and not all have the goal of providing redundancy.
Each drive's storage space is divided into units (stripes) ranging from a sector of 512 bytes up to
several megabytes. The stripes of all the disks are interleaved and addressed in order. Disk
mirroring and disk striping can also be combined in a RAID array.
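As a small illustration of striping, the sketch below shows how RAID 0 (pure striping, with no redundancy) could map a logical block number to a disk and a stripe, assuming blocks are interleaved across the disks in order.

    #include <stdio.h>

    /* RAID 0 striping: logical blocks are interleaved across the disks,
     * so block b lives on disk (b mod n) at stripe (b div n). */
    void locate_block(unsigned block, unsigned num_disks,
                      unsigned *disk, unsigned *stripe) {
        *disk   = block % num_disks;
        *stripe = block / num_disks;
    }

    int main(void) {
        unsigned disk, stripe;
        for (unsigned b = 0; b < 8; b++) {        /* 8 logical blocks, 4 disks */
            locate_block(b, 4, &disk, &stripe);
            printf("block %u -> disk %u, stripe %u\n", b, disk, stripe);
        }
        return 0;
    }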
Interrupt
An interrupt is a signal emitted by hardware or software when a process or an event needs
immediate attention. It alerts the processor to a high-priority process requiring interruption of the
current working process. In I/O devices, one of the bus control lines is dedicated to this purpose
and is called the interrupt request line; the routine the processor runs in response is the Interrupt
Service Routine (ISR).
When a device raises an interrupt while, say, instruction i is executing, the processor first
completes the execution of instruction i. Before loading the Program Counter (PC) with the
address of the first instruction of the ISR, it moves the address of the interrupted program's next
instruction to a temporary location. Therefore, after handling the interrupt, the processor can
continue with instruction i+1.
While the processor is handling an interrupt, it must inform the device that its request has been
recognized so that the device stops sending the interrupt request signal. The delay between the
time an interrupt is received and the start of the execution of the ISR is called interrupt latency;
saving the registers so that the interrupted process can be restored later adds to this delay.
Software Interrupts:
A software interrupt is an interrupt produced by software or the system rather than by hardware.
Software interrupts are also called traps or exceptions. They serve as a signal for the operating
system or a system service to carry out a certain function or respond to an error condition.
Software interrupts are created with a particular instruction known as an “interrupt instruction”.
When the interrupt instruction is executed, the processor stops what it is doing and switches to a
particular interrupt handler routine. The interrupt handler routine completes the required work or
handles any errors before handing control back to the interrupted application.
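As a user-space analogy, POSIX signals behave much like software interrupts: the operating system suspends normal execution, runs a registered handler, and then resumes the interrupted code. The sketch below illustrates only this trap-and-handle flow, not the kernel's actual mechanism.

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Set by the "handler routine" when the signal (our stand-in for a
     * software interrupt) is delivered; normal execution then resumes. */
    static volatile sig_atomic_t interrupted = 0;

    static void on_interrupt(int signo) {
        (void)signo;
        interrupted = 1;          /* do minimal work, just record the event */
    }

    int main(void) {
        signal(SIGINT, on_interrupt);   /* register the handler */
        while (!interrupted)
            pause();                    /* wait; Ctrl-C triggers the handler */
        printf("interrupt handled, resuming normal work\n");
        return 0;
    }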
Hardware Interrupts:
In a hardware interrupt, all the devices are connected to the Interrupt Request Line. A single
request line is used for all the n devices. To request an interrupt, a device closes its associated
switch. When a device requests an interrupt, the value of INTR is the logical OR of the requests
from individual devices.
Virtual Memory
In operating systems, virtual memory plays a vital role in managing the memory allotted to
different processes and in keeping their memory addresses efficiently isolated from one another.
The virtual address space acts as a ledger of all the virtual memory areas provided to the
different processes. It lets each process view its memory independently, which makes programs
more flexible and maintainable.
The virtual address space also enables dynamic memory allocation, since memory blocks can be
assigned to processes as they request them at run time.
A page table maintains the mapping of each virtual address to the corresponding physical
address; this mapping process is referred to as address translation.
The virtual address space also carries access permissions for specific virtual memory blocks,
specifying whether a region is read-only, read-write, or inaccessible. This helps keep memory
and data safe from accidental or malicious access.
Virtual address spaces keep processes independent of one another, since each running process
has its own non-interfering address space.
The same mechanism also enables memory sharing: specific virtual addresses in two or more
processes can be mapped to the same physical memory, so the processes can use the same
memory space and make more efficient use of it.
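A minimal sketch of that sharing idea using mmap: a shared anonymous mapping created before fork() is backed by the same physical pages in both parent and child, even though each process reaches it through its own virtual address space. (MAP_ANONYMOUS is a widely supported extension rather than strict POSIX.)

    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        /* One shared mapping: parent and child will see the same
         * physical memory through their own virtual addresses. */
        int *shared = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (shared == MAP_FAILED)
            return 1;

        *shared = 0;
        if (fork() == 0) {        /* child writes through its mapping */
            *shared = 42;
            return 0;
        }
        wait(NULL);               /* parent sees the child's write */
        printf("value written by child: %d\n", *shared);
        munmap(shared, sizeof(int));
        return 0;
    }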
Paging is a memory management scheme that eliminates the need for a contiguous allocation of
physical memory.
The process of retrieving processes in the form of pages from the secondary storage into the
main memory is known as paging.
The basic purpose of paging is to divide each process into pages. Additionally, the main memory
is divided into frames. This scheme permits the physical address space of a process to be
non-contiguous.
In paging, the physical memory is divided into fixed-size blocks called page frames, which are
the same size as the pages used by the process. The process’s logical address space is also
divided into fixed-size blocks called pages, which are the same size as the page frames. When a
process requests memory, the operating system allocates one or more page frames to the process
and maps the process’s logical pages to the physical page frames.
The mapping between logical pages and physical page frames is maintained by the page table,
which is used by the memory management unit to translate logical addresses into physical
addresses. The page table maps each logical page number to a physical page frame number.
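The translation just described can be sketched in a few lines, assuming a hypothetical 4 KB page size and a tiny single-level page table.

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SIZE 4096u                 /* assumed page/frame size */

    /* Tiny single-level page table: index = logical page number,
     * value = physical frame number it is mapped to. */
    static const uint32_t page_table[] = { 5, 2, 7, 0 };

    /* Split the logical address into page number and offset, look the
     * page up in the page table, and rebuild the physical address. */
    uint32_t translate(uint32_t logical) {
        uint32_t page   = logical / PAGE_SIZE;
        uint32_t offset = logical % PAGE_SIZE;
        uint32_t frame  = page_table[page];       /* no fault handling here */
        return frame * PAGE_SIZE + offset;
    }

    int main(void) {
        uint32_t la = 1 * PAGE_SIZE + 123;        /* page 1, offset 123 */
        printf("logical %u -> physical %u\n", la, translate(la));
        /* page 1 maps to frame 2, so the result is 2*4096 + 123 = 8315 */
        return 0;
    }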
CPU architecture
Parts of a CPU:
ALU - The arithmetic logic unit executes all calculations within the CPU
CU - control unit, coordinates how data moves around, decodes instructions
Registers - memory locations within the processor itself that work at very high speed; they store
instructions awaiting decode or execution
PC - program counter - stores address of the -> next <- instruction in RAM
MAR - memory address register - stores the address of the memory location currently being
accessed (an instruction being fetched or data being read or written)
MDR - memory data register - stores the data that is to be sent to or fetched from memory
CIR - current instruction register - stores actual instruction that is being decoded and executed
ACC - accumulator - stores result of calculations
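A toy simulation of how these registers cooperate during the fetch-decode-execute cycle is sketched below; the three-instruction machine code and its opcode*100+address encoding are invented purely for illustration.

    #include <stdio.h>

    /* Invented one-address instruction format: opcode * 100 + operand address. */
    enum { LOAD = 1, ADD = 2, HALT = 9 };

    int main(void) {
        int RAM[16] = { LOAD*100 + 10, ADD*100 + 11, HALT*100,
                        0, 0, 0, 0, 0, 0, 0, 7, 35 };
        int PC = 0, MAR, MDR, CIR, ACC = 0;

        for (;;) {
            MAR = PC;             /* address of the next instruction         */
            MDR = RAM[MAR];       /* fetch it from memory                    */
            CIR = MDR;            /* hold it while it is decoded             */
            PC++;                 /* point at the following instruction      */

            int opcode  = CIR / 100;     /* decode */
            int address = CIR % 100;

            if (opcode == LOAD)      { MAR = address; MDR = RAM[MAR]; ACC = MDR;  }
            else if (opcode == ADD)  { MAR = address; MDR = RAM[MAR]; ACC += MDR; }
            else /* HALT */          break;
        }
        printf("ACC = %d\n", ACC);   /* 7 + 35 = 42 */
        return 0;
    }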
Buses
address bus - carries the ADDRESS of the instruction or data
data bus - carries data between processor and the memory
control bus - sends control signals such as: memory read, memory write
Together, these buses may be referred to as the “system bus” or the “front-side bus”
Parallel Processing
Parallel processing can be described as a class of techniques which enables the system to achieve
simultaneous data-processing tasks to increase the computational speed of a computer system.
A parallel processing system can carry out simultaneous data-processing to achieve faster
execution time. For instance, while an instruction is being processed in the ALU component of
the CPU, the next instruction can be read from memory.
The primary purpose of parallel processing is to enhance the computer processing capability and
increase its throughput, i.e. the amount of processing that can be accomplished during a given
interval of time.
A parallel processing system can be achieved by having a multiplicity of functional units that
perform identical or different operations simultaneously. The data can be distributed among
various multiple functional units.
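A small sketch of the divide-and-run-concurrently idea using POSIX threads: an array sum is split into two halves, each computed by its own thread. Any real speedup assumes at least two cores, and the thread count here is arbitrary.

    #include <pthread.h>
    #include <stdio.h>

    #define N 1000000
    static long data[N];

    typedef struct { long start, end, sum; } chunk_t;

    /* Each thread sums its own half of the array (its subtask). */
    static void *partial_sum(void *arg) {
        chunk_t *c = arg;
        c->sum = 0;
        for (long i = c->start; i < c->end; i++)
            c->sum += data[i];
        return NULL;
    }

    int main(void) {
        for (long i = 0; i < N; i++)
            data[i] = 1;

        chunk_t chunks[2] = { { 0, N / 2, 0 }, { N / 2, N, 0 } };
        pthread_t t[2];
        for (int i = 0; i < 2; i++)
            pthread_create(&t[i], NULL, partial_sum, &chunks[i]);
        for (int i = 0; i < 2; i++)
            pthread_join(t[i], NULL);        /* wait for both subtasks */

        printf("total = %ld\n", chunks[0].sum + chunks[1].sum);  /* prints N */
        return 0;
    }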
RISC
RISC is a way to make hardware simpler.
Reduced Instruction Set Computer (RISC)
The main idea behind this is to simplify hardware by using an instruction set composed of a few
basic steps for loading, evaluating, and storing operations; for example, a load instruction loads
data from memory and a store instruction stores data back to memory.
Characteristics of RISC
➢ Simpler instructions, hence simpler instruction decoding.
➢ Instructions fit within one word.
➢ Instructions typically take a single clock cycle to execute.
➢ More general-purpose registers.
➢ Simple addressing modes.
➢ Fewer data types.
➢ Pipelining can be achieved easily.
Advantages of RISC
Simpler instructions: RISC processors use a smaller set of simple instructions, which makes
them easier to decode and execute quickly. This results in faster processing times.
Faster execution: Because RISC processors have a simpler instruction set, they can execute
instructions faster than CISC processors.
Lower power consumption: RISC processors consume less power than CISC processors, making
them ideal for portable devices.
Disadvantages of RISC
More instructions required: RISC processors require more instructions to perform complex tasks
than CISC processors.
Increased memory usage: RISC processors require more memory to store the additional
instructions needed to perform complex tasks.
Higher cost: Developing and manufacturing RISC processors can be more expensive than CISC
processors.
Advantages of CISC
➢ Reduced code size: CISC processors use complex instructions that can perform multiple
operations, reducing the amount of code needed to perform a task.
➢ More memory efficient: Because CISC instructions are more complex, they require fewer
instructions to perform complex tasks, which can result in more memory-efficient code.
➢ Widely used: CISC processors have been in use for a longer time than RISC processors,
so they have a larger user base and more available software.
Disadvantages of CISC
➢ Slower execution: CISC processors take longer to execute instructions because they have
more complex instructions and need more time to decode them.
➢ More complex design: CISC processors have more complex instruction sets, which
makes them more difficult to design and manufacture.
➢ Higher power consumption: CISC processors consume more power than RISC processors
because of their more complex instruction sets.
CU
A Central Processing Unit is the most important component of a computer system. A control unit
is a part of the CPU. A control unit controls the operations of all parts of the computer but it does
not carry out any data processing operations.
The Control Unit is the part of the computer’s central processing unit (CPU), which directs the
operation of the processor. It was included as part of the Von Neumann Architecture by John von
Neumann. It is the responsibility of the control unit to tell the computer’s memory,
arithmetic/logic unit, and input and output devices how to respond to the instructions that have
been sent to the processor.
It fetches internal instructions of the programs from the main memory to the processor
instruction register, and based on this register contents, the control unit generates a control signal
that supervises the execution of these instructions.
A control unit works by receiving input information which it converts into control signals, which
are then sent to the central processor. The computer’s processor then tells the attached hardware
what operations to perform. The functions that a control unit performs are dependent on the type
of CPU because the architecture of the CPU varies from manufacturer to manufacturer.