Coa Unit-3,4 Notes

UNIT-3

 Parallel processing
Parallel computing refers to the process of breaking down larger problems into smaller,
independent, often similar parts that can be executed simultaneously by multiple processors
communicating via shared memory.
Parallel processing systems are created to speed up the implementation of programs by
breaking the program into several fragments and processing these fragments together.
Flynn has classified the computer systems based on parallelism in the instructions and in the
data streams. These are:
1. Single instruction stream, single data stream (SISD): this is the traditional CPU architecture; at any one time only a single instruction is executed, operating on a single data item.
2. Single instruction stream, multiple data stream (SIMD).
3. Multiple instruction stream, single data stream (MISD).
4. Multiple instruction stream, multiple data stream (MIMD).
Flynn classified computer systems into these four types based on parallelism, but only two of them are relevant to parallel computers: SIMD and MIMD.
SIMD computers consist of n processing units receiving a single stream of instructions from a central control unit, with each processing unit operating on a different piece of data. Most SIMD computers operate synchronously using a single global clock. The block diagram of a SIMD computer is shown below:

MIMD computers consist of n processing units, each with its own stream of instructions and each operating on a different piece of data. MIMD is the most powerful computer organization and covers the full range of multiprocessor systems. The block diagram of a MIMD computer is shown below.
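The SIMD idea above — one instruction broadcast to n processing elements, each applying it to its own piece of data — can be sketched in Python. This is only an illustrative model: the function name `simd_execute` is invented here, and real SIMD hardware runs the processing elements in hardware lockstep rather than in a software loop.

```python
# Illustrative sketch of SIMD execution: a single instruction (an
# operation) is broadcast by a central control unit to n processing
# elements, each of which applies it to its own data item.

def simd_execute(instruction, data_items):
    """Apply one broadcast instruction to every data item."""
    return [instruction(x) for x in data_items]  # every PE does the same op

# A single "add 10" instruction applied across 4 data elements at once
result = simd_execute(lambda x: x + 10, [1, 2, 3, 4])
print(result)  # [11, 12, 13, 14]
```

In a MIMD machine, by contrast, each processing unit would run its own `instruction` on its own data stream.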
 Cache Coherence
In a shared-memory multiprocessor with a separate cache memory for each processor, it is possible to have many copies of any one instruction operand: one copy in the main memory and one in each cache memory. When one copy of an operand is changed, the other copies of the operand must be changed as well. Cache coherence is the discipline that ensures that changes in the values of shared operands are propagated throughout the system in a timely fashion.

There are three distinct levels of cache coherence:

1. Every write operation appears to occur instantaneously.

2. All processes see exactly the same sequence of changes of values for each separate
operand.

3. Different processes may see an operand assume different sequences of values.
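One common way to achieve the second level above — every process seeing the same sequence of values — is a write-invalidate scheme. The following toy Python model (plain dicts standing in for caches; not an implementation of any real protocol such as MESI) sketches the idea: a write removes every other cache's copy, so a later read must refill from main memory and can never see a stale value.

```python
# Toy write-invalidate coherence: on a write, all other caches' copies
# of the operand are invalidated, so stale values are never observed.

main_memory = {"x": 0}
caches = [{"x": 0}, {"x": 0}, {"x": 0}]   # one private cache per processor

def write(proc_id, var, value):
    caches[proc_id][var] = value          # update the writer's cache
    main_memory[var] = value              # write through to main memory
    for i, c in enumerate(caches):        # invalidate every other copy
        if i != proc_id and var in c:
            del c[var]

def read(proc_id, var):
    cache = caches[proc_id]
    if var not in cache:                  # miss: refill from main memory
        cache[var] = main_memory[var]
    return cache[var]

write(0, "x", 42)
print(read(1, "x"))  # 42 -- processor 1 sees the new value, not a stale 0
```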

 Vector Processing
Definition: A vector processor is a central processing unit that can execute an operation on a complete vector of input data with a single instruction. More specifically, it is a complete unit of hardware resources that processes a sequential set of similar data items in memory using a single instruction.
The figure below represents the typical diagram showing vector processing by a vector
computer:
The functional units of a vector computer are as follows:
i. IPU or instruction processing unit
ii. Vector register
iii. Scalar register
iv. Scalar processor
v. Vector instruction controller
vi. Vector access controller
vii. Vector processor
Both data and instructions are present in memory at the desired memory locations, so the instruction processing unit (IPU) fetches each instruction from memory. Once the instruction is fetched, the IPU determines whether it is scalar or vector in nature. If it is scalar, the instruction is transferred to the scalar register and scalar processing is performed.
When the instruction is vector in nature, it is fed to the vector instruction controller. The vector instruction controller first decodes the vector instruction and then determines the addresses of the vector operands present in memory.
Then it gives a signal to the vector access controller about the demand of the respective
operand. This vector access controller then fetches the desired operand from the memory.
Once the operand is fetched then it is provided to the instruction register so that it can be
processed at the vector processor.
At times when multiple vector instructions are present, then the vector instruction controller
provides the multiple vector instructions to the task system. And in case the task system
shows that the vector task is very long then the processor divides the task into subvectors.
These subvectors are fed to the vector processor that makes use of several pipelines in order
to execute the instruction over the operand fetched from the memory at the same time.
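The benefit described above — many similar operations issued as one vector instruction instead of a scalar loop — can be sketched as follows. This is an illustrative model only; the names `scalar_add` and `vector_add` are ours, not any hardware interface, and real vector hardware pipelines the element-wise operations itself.

```python
# A vector add executed conceptually "as one instruction" over whole
# operand vectors, contrasted with the scalar version that needs one
# instruction per element.

def scalar_add(a, b):
    result = []
    for i in range(len(a)):          # one scalar add issued per element
        result.append(a[i] + b[i])
    return result

def vector_add(a, b):
    # Conceptually a single vector instruction: the hardware streams
    # the element pairs through its arithmetic pipelines.
    return [x + y for x, y in zip(a, b)]

a, b = [1, 2, 3, 4], [10, 20, 30, 40]
assert scalar_add(a, b) == vector_add(a, b)   # same result, fewer issues
print(vector_add(a, b))  # [11, 22, 33, 44]
```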
Classification of Vector Processor

Register to Register Architecture


This architecture is widely used in vector computers. In it, operands and previous results are fetched indirectly from the main memory through the use of registers.

The several vector pipelines present in the vector computer help in retrieving the data from
the registers and also storing the results in the desired register.

Memory to Memory Architecture


In memory-to-memory architecture, the operands and the results are fetched directly from memory rather than through registers. Note, however, that the addresses of the data to be accessed must be present in the vector instruction itself.

This architecture enables fetching data of size 512 bits from memory to the pipeline. However, due to the high memory access time, the pipelines of such a vector computer require a longer start-up time, since more time is needed to initiate a vector instruction.

 Vector (Array) Processor and its Types

Array processors are also known as multiprocessors or vector processors. They perform
computations on large arrays of data. Thus, they are used to improve the performance of the
computer.

Types of Array Processors

There are basically two types of array processors:

1. Attached Array Processors

2. SIMD Array Processors

Attached Array Processors

An attached array processor is a processor attached to a general-purpose computer; its purpose is to enhance the performance of that computer in numerical computational tasks. It achieves high performance by means of parallel processing with multiple functional units.

SIMD Array Processors

SIMD is the organization of a single computer containing multiple processors operating in parallel. The processing units operate under the control of a common control unit, thus providing a single instruction stream and multiple data streams.

A general block diagram of an array processor is shown below. It contains a set of identical processing elements (PEs), each of which has a local memory M. Each processing element includes an ALU and registers. The master control unit controls all the operations of the processing elements; it also decodes the instructions and determines how each instruction is to be executed.

The main memory is used for storing the program, and the control unit is responsible for fetching the instructions. Vector instructions are sent to all PEs simultaneously and the results are returned to memory.
Why use the Array Processor

 Array processors increase the overall instruction processing speed.

 As most array processors operate asynchronously from the host CPU, they improve the overall capacity of the system.

 Array processors have their own local memory, providing extra memory for systems with low memory.

 Memory Interleaving
Abstraction is one of the most important aspects of computing and is a widely implemented practice in the computational field.
Memory interleaving is more or less an abstraction technique, though it differs somewhat from abstraction proper. It is a technique that divides memory into a number of modules such that successive words in the address space are placed in different modules.
Why do we use memory interleaving? [Advantages]:
Whenever the processor requests data from main memory, a block (chunk) of data is transferred to the cache and then to the processor. So whenever a cache miss occurs, the data must be fetched from main memory. But main memory is relatively slower than the cache, so interleaving is used to improve the effective access time of main memory.

Types of Interleaved Memory


There are two types of interleaved memory:

1. High-order interleaving: In high-order interleaving, the most significant bits of the memory address decide which memory bank (module) a particular location resides in, while the remaining least significant bits are sent as the address within each chip. Consecutive addresses therefore fall within the same module.

2. Low-order interleaving: The least significant bits select the memory bank (module) in low-order interleaving. Here, consecutive memory addresses lie in different memory modules, allowing their accesses to be overlapped so that memory appears faster than the module cycle time.
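Low-order interleaving with M modules can be sketched arithmetically: the module number comes from the low-order bits of the address and the word-within-module from the remaining bits. The module count below is an illustrative choice.

```python
# Low-order interleaving with M modules: the least significant bits of
# the address select the module, so consecutive addresses hit different
# modules and their accesses can overlap in time.

M = 4  # number of memory modules (a power of two in practice)

def interleave(address):
    module = address % M        # low-order bits -> which bank
    offset = address // M       # remaining bits -> word within the bank
    return module, offset

# Eight consecutive addresses cycle round-robin through the 4 modules
for addr in range(8):
    print(addr, interleave(addr))
# address 0 -> module 0, address 1 -> module 1, ..., address 4 -> module 0
```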
UNIT-4
INPUT-OUTPUT ORGANIZATION

 Introduction to Input-Output Interface


The input-output interface is the method by which information is transferred between the internal storage (i.e., memory) and external peripheral devices. A peripheral device is a device that provides input to or accepts output from the computer; such devices are also called input-output devices. For example, a keyboard and a mouse provide input to the computer and are called input devices, while a monitor and a printer receive output from the computer and are called output devices. Like external hard drives, some peripheral devices are able to provide both input and output.

Input-Output Interface

Peripheral Devices

Input or output devices that are connected to computer are called  peripheral devices. These
devices are designed to read information into or out of the memory unit upon command
from the CPU and are considered to be the part of computer system. These devices are also
called peripherals.

For example: Keyboards, display units and printers are common peripheral devices.

There are three types of peripherals:

1. Input peripherals: Allows user input, from the outside world to the computer.
Example: Keyboard, Mouse etc.

2. Output peripherals: Allows information output, from the computer to the outside
world. Example: Printer, Monitor etc.

3. Input-Output peripherals: Allows both input (from outside world to computer) as


well as, output (from computer to the outside world). Example: Touch screen etc.

 Modes of I/O Data Transfer

The binary information received from an external device is usually stored in the memory unit, and the information transferred from the CPU to an external device originates from the memory unit. The CPU merely processes the information; the source and destination are always the memory unit. Data transfer between the CPU and I/O devices may be done in different modes.

Data transfer between the central unit and I/O devices can be handled in generally three
types of modes which are given below:

1. Programmed I/O

2. Interrupt Initiated I/O

3. Direct Memory Access

1) Programmed I/O
The programmed I/O method controls the transfer of data between connected devices and the computer: each I/O device connected to the computer is continually polled for input. Once the CPU receives an input signal from a device, it carries out that request until the transfer is complete. Suppose, for example, that you want to print a document: when you select print on your computer, the request passes through the central processing unit (CPU), and the communication signal is acknowledged and sent out to the printer.
Advantages:
 Programmed I/O is simple to implement.
 It requires very little hardware support.
 CPU checks status bits periodically.
 
Disadvantages:
 The processor has to wait for a long time for the I/O module to be ready for either
transmission or reception of data.
 The performance of the entire system is severely degraded.
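The busy-wait polling that causes this degradation can be sketched as follows. This is a toy model: `FakeDevice` and its `ready`/`read_word` methods are invented for illustration, standing in for a real device's status and data registers.

```python
# Sketch of programmed I/O: the CPU busy-waits (polls) on a device
# status flag before every word transferred.

class FakeDevice:
    """Stand-in for an I/O device with a status bit and a data register."""
    def __init__(self, words):
        self._words = list(words)
    def ready(self):                 # the status bit the CPU polls
        return bool(self._words)
    def read_word(self):             # the data register
        return self._words.pop(0)

def programmed_io_read(device, count):
    buffer = []
    for _ in range(count):
        while not device.ready():    # CPU does nothing useful here --
            pass                     # this busy-wait is the drawback
        buffer.append(device.read_word())
    return buffer

dev = FakeDevice([7, 8, 9])
print(programmed_io_read(dev, 3))  # [7, 8, 9]
```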

2) Interrupt Initiated I/O


The interrupt-based I/O method controls the data transfer activity to and from connected I/O devices. It allows the CPU to continue processing other work; the CPU is interrupted only when it receives a signal from an I/O device. For example, if you strike a key on the keyboard, the interrupt mechanism sends a signal to the CPU that it needs to pause its current task and carry out the request generated by the keystroke.
Advantages:
 It is faster and more efficient than Programmed I/O.
 It requires very little hardware support.
 CPU does not check status bits periodically.
 
Disadvantages:
 It can be tricky to implement if using a low-level language.
 It can be difficult to get the various pieces to work well together.
 The hardware manufacturer / OS maker usually implements it, e.g., Microsoft.
3) Direct Memory Access (DMA) I/O
The name itself explains what the direct memory access I/O method does. It directly
transfers blocks of data between the memory and I/O devices without having to involve the
CPU. If the CPU was involved, it would slow down the computer. When an input signal is
received from an I/O device that requires access to memory, the DMA will receive the
necessary information required to make that transfer, allowing the CPU to continue with its
other tasks. For example, if you need to transfer pictures from a camera plugged into a USB
port on your computer, instead of the CPU processing this request, the signal will be sent to
the DMA, which will handle it.

DMA in computer architecture

The DMA controller provides an interface between the bus and the input-output devices. Although it transfers data without the intervention of the processor, it is controlled by the processor.

The DMA controller contains an address unit for generating addresses and selecting the I/O device for a transfer. It also contains a control unit and a data count for keeping track of the number of blocks transferred and indicating the direction of the transfer. When the transfer is complete, the DMA controller informs the processor by raising an interrupt.

 Privileged and Non-Privileged Instructions

In any Operating System, it is necessary to have a Dual Mode Operation to ensure the
protection and security of the System from unauthorized or errant users. This Dual Mode
separates the User Mode from the System Mode or Kernel Mode. 
 
What are Privileged Instructions? 
The instructions that can run only in kernel mode are called privileged instructions.
 If an attempt is made to execute a privileged instruction in user mode, the instruction is not executed; it is treated as an illegal instruction and trapped to the operating system by the hardware.
 It is the responsibility of the operating system to ensure that the timer is set to interrupt before control is transferred to any user application.
 The operating system uses privileged instructions to ensure proper operation.
 Examples of privileged instructions: I/O instructions, context switching, clearing memory, setting the CPU timer, etc.

What are Non-Privileged Instructions?  


 
The instructions that can run in user mode are called non-privileged instructions. The whole computer system is divided into two parts, hardware and software, and the software part communicates with the hardware part using the instruction set.
Examples of Non-privileged instructions
1. Generate trap instruction
2. Reading system time
3. Reading status of processor
4. Sending the output to the printer
5. Performing arithmetic operations

 Interrupt

An interrupt is a signal emitted by hardware or software when a process or an event needs immediate attention. It alerts the processor to a high-priority process requiring interruption of the current working process. For I/O devices, one of the bus control lines is dedicated to this purpose and is called the interrupt request line; the routine that the processor executes in response is called the Interrupt Service Routine (ISR).
Software interrupts
The interrupt signals generated by internal devices, and by software programs that need to access a system call, are software interrupts.
Software interrupts are divided into two types. They are as follows −
 Normal Interrupts − the interrupts caused deliberately by software instructions are called normal interrupts.
 Exception − an exception is an unplanned interruption that occurs while executing a program. For example, if while executing a program we divide a value by zero, an exception is raised.
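The divide-by-zero exception mentioned above can be demonstrated directly: in a high-level language such as Python, the trap surfaces as a `ZeroDivisionError` that a handler can catch instead of letting normal execution continue.

```python
# An exception is an unplanned interruption: dividing by zero transfers
# control to a handler instead of continuing the normal flow.

def safe_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:        # the "exception" case from the text
        return None                  # the handler decides what to do

print(safe_divide(10, 2))   # 5.0
print(safe_divide(10, 0))   # None -- handled instead of crashing
```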
MEMORY ORGANIZATION

 Memory Hierarchy
The computer memory hierarchy looks like a pyramid structure, which is used to describe the differences among memory types. It separates computer storage into levels based on the hierarchy.
Level 0: CPU registers
Level 1: Cache memory
Level 2: Main memory or primary memory
Level 3: Magnetic disks or secondary memory
Level 4: Optical disks or magnetic tapes, i.e., tertiary memory

In the memory hierarchy, cost and speed vary inversely with capacity: moving down the hierarchy, the devices become slower, larger, and cheaper per bit. The devices are arranged from fast to slow, from registers down to tertiary memory.
Let us discuss each level in detail:
Level-0 − Registers
The registers are present inside the CPU and therefore have the least access time. Registers are the most expensive and the smallest in size, generally only a few hundred bytes in total. They are implemented using flip-flops.
Level-1 − Cache
Cache memory is used to store the segments of a program that are frequently accessed by the
processor. It is expensive and smaller in size generally in Megabytes and is implemented by
using static RAM.
Level-2 − Primary or Main Memory
It directly communicates with the CPU and with auxiliary memory devices through an I/O
processor. Main memory is less expensive than cache memory and larger in size generally in
Gigabytes. This memory is implemented by using dynamic RAM.
Level-3 − Secondary storage
Secondary storage devices like Magnetic Disk are present at level 3. They are used as backup
storage. They are cheaper than main memory and larger in size generally in a few TB.
Level-4 − Tertiary storage
Tertiary storage devices like magnetic tape are present at level 4. They are used to store
removable files and are the cheapest and largest in size (1-20 TB).

 Main Memory

The memory unit that communicates directly with the CPU, auxiliary memory, and cache memory is called main memory. It is the central storage unit of the computer system: a relatively large and fast memory used to store programs and data during run-time operations.

The primary technology used for the main memory is based on semiconductor integrated
circuits. The integrated circuits for the main memory are classified into two major units.

1. RAM (Random Access Memory) integrated circuit chips


2. ROM (Read Only Memory) integrated circuit chips

RAM: Random Access Memory

1. DRAM: Dynamic RAM is made of capacitors and transistors and must be refreshed every few tens of milliseconds. It is slower and cheaper than SRAM.

2. SRAM: Static RAM has a six-transistor circuit in each cell and retains data until powered off.

3. NVRAM: Non-Volatile RAM retains its data even when turned off. Example: flash memory.

ROM (Read-Only Memory) is non-volatile and serves as more-or-less permanent storage for information. It also stores the bootstrap loader program, which loads and starts the operating system when the computer is turned on. PROM (Programmable ROM), EPROM (Erasable PROM), and EEPROM (Electrically Erasable PROM) are some commonly used ROMs.
 Auxiliary Memory

Auxiliary memory is the lowest-cost, highest-capacity, and slowest-access storage in a computer system. It is where programs and data are kept for long-term storage or when not in immediate use. The most common examples of auxiliary memory are magnetic tapes and magnetic disks.

Magnetic Disks

A magnetic disk is a type of memory constructed from a circular plate of metal or plastic coated with magnetizable material. Usually, both sides of the disk are used for read/write operations, and several disks may be stacked on one spindle with a read/write head available for each surface.

The following image shows the structural representation for a magnetic disk.

o The memory bits are stored in the magnetized surface in spots along the concentric
circles called tracks.
o The concentric circles (tracks) are commonly divided into sections called sectors.
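The surface/track/sector organization above determines a disk's capacity; a quick illustrative calculation follows. All the figures here are made up for the example, not those of any particular disk.

```python
# Capacity implied by the surface/track/sector geometry of a magnetic
# disk. Figures are illustrative only.

surfaces = 4              # two platters, both sides used
tracks_per_surface = 1000
sectors_per_track = 64
bytes_per_sector = 512

capacity_bytes = (surfaces * tracks_per_surface
                  * sectors_per_track * bytes_per_sector)
print(capacity_bytes // (1024 * 1024), "MiB")  # 125 MiB
```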

 Cache Memory

The data or contents of main memory that are used again and again by the CPU are stored in the cache memory so that they can be accessed in a shorter time.

Whenever the CPU needs to access memory, it first checks the cache memory. If the data is not found in the cache, the CPU moves on to main memory. It also transfers blocks of recently used data into the cache, deleting old data from the cache to accommodate the new.
Hit Ratio

The performance of cache memory is measured in terms of a quantity called the hit ratio. When the CPU refers to memory and finds the word in the cache, it is said to produce a hit. If the word is not found in the cache and must be fetched from main memory, it counts as a miss.

The ratio of the number of hits to the total CPU references to memory is called hit ratio.

Hit Ratio = Hit/(Hit + Miss)
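The formula translates directly into code:

```python
# Hit ratio = hits / (hits + misses), as in the formula above.

def hit_ratio(hits, misses):
    return hits / (hits + misses)

# e.g. 90 cache hits out of 100 total CPU references to memory
print(hit_ratio(90, 10))  # 0.9
```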

 Associative Memory

An associative memory can be considered a memory unit whose stored data can be identified for access by the content of the data itself rather than by an address or memory location.

Associative memory is often referred to as Content Addressable Memory (CAM).

When a write operation is performed on associative memory, no address or memory location is given for the word; the memory itself is capable of finding an empty, unused location in which to store the word.

On the other hand, when the word is to be read from an associative memory, the content of
the word, or part of the word, is specified. The words which match the specified content are
located by the memory and are marked for reading.
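The content-based lookup described above can be sketched as follows. This is illustrative only: a sequential scan stands in for the parallel compare hardware of a real CAM, and the prefix-match criterion is our own choice of "part of the word".

```python
# Sketch of content-addressable lookup: words are located by (part of)
# their content, not by an address. Real CAM hardware compares every
# stored word against the search argument in parallel; here we scan.

memory = ["apple", "apricot", "banana", "avocado"]

def cam_match(argument):
    """Return indices of all stored words whose content matches the argument."""
    return [i for i, word in enumerate(memory)
            if word.startswith(argument)]   # compare content, not address

print(cam_match("ap"))  # [0, 1] -- "apple" and "apricot" are marked
```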

Applications of Associative memory :-


1. It can be used in memory-allocation schemes.
2. It is widely used in database management systems, etc.

Advantages of Associative memory :-


1. It is used where the search time needs to be very short.
2. It is suitable for parallel searches.
3. It is often used to speed up databases.
4. It is used in the page tables of virtual memory and in neural networks.

Disadvantages of Associative memory :-


1. It is more expensive than RAM.
2. Each cell must have storage capability plus logic circuits for matching its content with an external argument.

 Cache Mapping:
The process/technique of bringing data from main memory blocks into cache blocks is known as cache mapping.
The mapping techniques can be classified as :
1. Direct Mapping
2. Associative
3. Set-Associative

 Associative Mapping
In associative mapping, both the address and the data of the memory word are stored.
The associative mapping method used by cache memory is very flexible as well as very fast.
This mapping method is also known as fully associative cache.

Advantages of associative mapping
 Associative mapping is fast.
 Associative mapping is easy to implement.
Disadvantages of associative mapping
 A cache memory implementing associative mapping is expensive, since it requires storing the full address along with the data.

 Direct Mapping
In a direct-mapped cache, instead of storing the complete address with the data, only part of the address bits (the tag) is stored along with the data.
New data can be stored only in the one cache location specified by the direct-mapping rule, so no replacement algorithm is needed.
 
Advantages of direct mapping
 Direct mapping is the simplest type of cache memory mapping.
 Only the tag field needs to be matched when searching for a word, which is why it is the fastest mapping.
 A direct-mapped cache is less expensive than an associatively mapped cache.
Disadvantages of direct mapping
 Performance suffers when two blocks that map to the same cache line are accessed alternately, since each access replaces the other's data-tag entry.
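The address split used by direct mapping can be sketched as follows. The field widths below are illustrative choices, not values fixed by the text: the low bits select the word within a block, the middle bits select the one cache line the block may occupy, and the remaining high bits form the tag that is stored and compared.

```python
# Direct mapping: a memory address is split into tag | line | offset.
# The line bits pick the single cache line the block may occupy; only
# the stored tag must be compared on a lookup.

OFFSET_BITS = 2   # 4 words per block (illustrative)
LINE_BITS = 3     # 8 cache lines (illustrative)

def split_address(addr):
    offset = addr & ((1 << OFFSET_BITS) - 1)
    line = (addr >> OFFSET_BITS) & ((1 << LINE_BITS) - 1)
    tag = addr >> (OFFSET_BITS + LINE_BITS)
    return tag, line, offset

# Address 0b10_110_01 -> tag 2, line 6, offset 1
print(split_address(0b1011001))  # (2, 6, 1)
```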

 Set-Associative Mapping
In a set-associative cache memory, two or more words can be stored under the same index address.
Here every data word is stored along with its tag. The number of tag-data words under one index is said to form a set.
 
Advantages of Set-Associative mapping
 Set-associative cache memory has the highest hit ratio of the three mapping techniques discussed above, so its performance is considerably better.
Disadvantages of Set-Associative mapping
 Set-associative cache memory is the most expensive; as the set size increases, the cost increases.

 Virtual memory

Virtual memory is a storage scheme that gives the user the illusion of having a very large main memory. This is done by treating a part of secondary memory as if it were main memory.

In this scheme, a user can load processes larger than the available main memory, under the illusion that enough memory is available to load the process.

Instead of loading one big process into main memory, the operating system loads parts of more than one process into main memory.

How Virtual Memory Works?

Virtual memory has become quite common in modern systems. In this scheme, whenever some pages need to be loaded into main memory for execution and memory is not available for all of them, then instead of stopping those pages from entering main memory, the OS searches for the areas of RAM that have been least recently used, or not referenced at all, and copies them to secondary memory to make space for the new pages in main memory.

Since this whole procedure happens automatically, it makes the computer appear to have unlimited RAM.
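The page-replacement behaviour described above can be sketched with an LRU policy (an assumption on our part — the text says only "least used in recent times"). The names `ram`, `disk`, and `touch` are illustrative, with an `OrderedDict` standing in for the set of RAM frames.

```python
# Sketch of demand paging: when a page must come in and no frame is
# free, the least recently used page is copied out to secondary
# storage ("disk") to make room.

from collections import OrderedDict

NUM_FRAMES = 3
ram = OrderedDict()      # page -> contents, least recently used first
disk = {}                # "secondary memory" for evicted pages

def touch(page):
    """Access a page, loading it (and possibly evicting) as needed."""
    if page in ram:
        ram.move_to_end(page)                    # mark as recently used
        return "hit"
    if len(ram) == NUM_FRAMES:                   # RAM full: evict LRU page
        victim, contents = ram.popitem(last=False)
        disk[victim] = contents                  # copy out to disk
    ram[page] = disk.pop(page, f"data-{page}")   # bring the page in
    return "fault"

for p in [1, 2, 3, 1, 4]:     # accessing page 4 evicts page 2 (the LRU)
    print(p, touch(p))
print(sorted(ram))            # [1, 3, 4]
```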

Advantages of Virtual Memory


1. The degree of multiprogramming is increased.
2. A user can run large applications with less physical RAM.
3. There is no need to buy additional RAM.

Disadvantages of Virtual Memory

1. The system becomes slower, since swapping takes time.
2. Switching between applications takes more time.
3. The user has less hard-disk space available for other use.
