Group 6 Task
Multiprocessor Systems
A multiprocessor is a computer system with two or more CPUs that share full access to a common RAM. The main objective of using a multiprocessor is to boost the system's execution speed; other objectives are fault tolerance and application matching.
There are two types of multiprocessors: the shared memory multiprocessor and the distributed memory multiprocessor. In a shared memory multiprocessor, all the CPUs share a common memory, whereas in a distributed memory multiprocessor, every CPU has its own private memory.
Multiprocessor systems offer:
• Enhanced performance
• Multiple applications
• Multitasking inside an application
• High throughput and responsiveness
• Hardware sharing among CPUs
Advantages:
Improved performance: Multiprocessor systems can execute tasks faster than single-processor systems,
as the workload can be distributed across multiple processors.
Better scalability: Multiprocessor systems can be scaled more easily than single-processor systems, as
additional processors can be added to the system to handle increased workloads.
Increased reliability: Multiprocessor systems can continue to operate even if one processor fails, as the
remaining processors can continue to execute tasks.
Reduced cost: Multiprocessor systems can be more cost-effective than building multiple single-
processor systems to handle the same workload.
Enhanced parallelism: Multiprocessor systems allow for greater parallelism, as different processors can
execute different tasks simultaneously.
Disadvantages:
Increased complexity: Multiprocessor systems are more complex than single-processor systems, and
they require additional hardware, software, and management resources.
Higher power consumption: Multiprocessor systems require more power to operate than single-
processor systems, which can increase the cost of operating and maintaining the system.
Difficult programming: Developing software that can effectively utilize multiple processors can be
challenging, and it requires specialized programming skills.
Limited performance gains: Not all applications can benefit from multiprocessor systems, and some
applications may only see limited performance gains when running on a multiprocessor system.
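The workload-distribution idea behind these advantages can be sketched in a few lines. This is a hedged illustration, not a real scheduler: the helper names `distributed_sum` and `partial_sum` are invented for the example, and in CPython true CPU parallelism would need processes rather than threads. The point is simply how a task is split across workers and the partial results combined.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    # Work unit that, on a real multiprocessor, could run on its own CPU.
    return sum(chunk)

def distributed_sum(data, workers=4):
    # Split the data into one chunk per worker, run the chunks
    # concurrently, then combine the partial results.
    size = (len(data) + workers - 1) // workers
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

print(distributed_sum(list(range(1, 101))))  # 5050
```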
Multicore architecture.
A processor that has more than one core is called a multicore processor, while one with a single core is called a unicore processor or uniprocessor. Nowadays, most systems have four cores (quad-core) or eight cores (octa-core).
These cores can individually read and execute program instructions, which makes the computer appear to have several processors, although in reality they are cores, not separate processors. Instructions can be calculations, data-transfer instructions, branch instructions, etc.
The processor can run instructions on separate cores at the same time, which increases the overall speed of program execution. Spreading the work across cores also means each core can run at a lower clock speed for the same throughput, which reduces the heat generated by the processor. Multicore systems support multithreading and parallel computing.
Multicore processors are widely used across many application domains, including general-purpose,
embedded, network, digital signal processing (DSP), and graphics (GPU). Efficient software algorithms should be used to exploit the cores and achieve higher performance. Software that can run in parallel is preferred, because parallel execution is exactly what multiple cores provide.
Advantages:
• These cores are usually integrated onto a single IC (integrated circuit) die, or onto multiple dies in a single chip package, which allows cache coherence to be maintained more efficiently.
• These systems are energy efficient, since they deliver higher performance at lower energy. A challenge here, however, is the additional overhead of writing parallel code.
• There is less traffic, since the cores are integrated into a single chip and signals travel shorter distances.
Disadvantages:
• A dual-core processor does not run at twice the speed of a single-core processor; in practice it is only about 60-80% faster.
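The 60-80% figure follows from Amdahl's law: the serial fraction of a program limits the speedup extra cores can give. A small worked sketch (the function name `amdahl_speedup` is ours, and the 80%-parallel figure is just an illustrative assumption):

```python
def amdahl_speedup(parallel_fraction, cores):
    # Amdahl's law: speedup = 1 / (serial + parallel/cores).
    # The serial fraction limits the overall speedup.
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)

# If 80% of a program can run in parallel, two cores give only
# about a 1.67x speedup, i.e. roughly 67% faster -- within the
# 60-80% range quoted above.
print(round(amdahl_speedup(0.8, 2), 2))  # 1.67
```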
Asymmetric Multiprocessing:
An asymmetric multiprocessing (AMP) system is a multiprocessor computer system in which not all of the multiple interconnected central processing units (CPUs) are treated equally. In asymmetric multiprocessing, only a master processor runs the tasks of the operating system.
The processors are in a master-slave relationship: one serves as the master or supervisor processor, while the others are treated as slave processors.
In this system, the master processor is responsible for assigning tasks to the slave processor.
For example, AMP can be used in assigning specific tasks to the CPU based on the priority and the
importance of task completion.
The disadvantage of this kind of multiprocessing system is the unequal load placed on the processors.
While the other processors might be idle, one CPU might have a huge job queue.
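The master-slave dispatch described above can be sketched as follows. This is a hypothetical illustration (the names `master_dispatch`, `tasks`, and `slaves` are invented for the example): the master pops the highest-priority task first and hands it to the least-loaded slave processor.

```python
import heapq

def master_dispatch(tasks, slaves):
    # The master processor keeps a priority queue of tasks
    # (higher number = more important) and assigns each task
    # to the slave processor with the shortest job queue.
    queue = [(-priority, name) for name, priority in tasks]
    heapq.heapify(queue)
    load = {s: [] for s in slaves}
    while queue:
        _, task = heapq.heappop(queue)          # highest priority first
        target = min(load, key=lambda s: len(load[s]))
        load[target].append(task)
    return load

# Three tasks with priorities 1, 5, 3 spread over two slaves:
print(master_dispatch([("io", 1), ("render", 5), ("gc", 3)],
                      ["s1", "s2"]))
# {'s1': ['render', 'io'], 's2': ['gc']}
```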
Symmetric Multiprocessing:
It involves a multiprocessor computer hardware and software architecture in which two or more identical processors are connected to a single, shared main memory and have full access to all input and output devices. In other words, symmetric multiprocessing is a type of multiprocessing in which each processor is self-scheduling.
For example, SMP can apply multiple processors to a single problem, an approach known as parallel programming.
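Self-scheduling can be pictured as a shared work queue from which every processor pulls its own next task. A hedged sketch, with threads standing in for processors (`smp_run` is an invented name and squaring is a stand-in for real work):

```python
import queue
import threading

def smp_run(tasks, processors=4):
    # In SMP each processor schedules itself: every worker pulls
    # the next task from the shared queue as soon as it is free.
    work = queue.Queue()
    for t in tasks:
        work.put(t)
    results = []
    lock = threading.Lock()

    def processor():
        while True:
            try:
                t = work.get_nowait()
            except queue.Empty:
                return                 # no tasks left, processor idles
            r = t * t                  # stand-in for real work
            with lock:
                results.append(r)

    workers = [threading.Thread(target=processor) for _ in range(processors)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return sorted(results)

print(smp_run([1, 2, 3, 4, 5]))  # [1, 4, 9, 16, 25]
```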
The two schemes can be compared point by point:
2. In asymmetric multiprocessing, tasks of the operating system are done by the master processor; in symmetric multiprocessing, they are done by the individual processors.
5. Asymmetric multiprocessing systems are cheaper; symmetric multiprocessing systems are costlier.
8. Asymmetric multiprocessing is simple, as only the master processor has access to the data, etc.; symmetric multiprocessing is complex, as synchronization of the processors is required in order to maintain the load balance.
In a multiprocessor system where many processors need a copy of the same memory block, maintaining consistency among these copies raises a problem referred to as the Cache Coherence Problem.
This occurs mainly due to the following causes:
• Sharing of writable data
• Process migration
• I/O activity
1. MSI Protocol:
This is a basic cache coherence protocol used in multiprocessor systems. The letters of the protocol name identify the possible states a cache block can be in. So, for MSI, each block can be in one of the following states:
• Modified –
The block has been modified in the cache, i.e., the data in the cache is inconsistent with the backing store (memory). A cache with a block in the "M" state therefore has the responsibility to write the block to the backing store when it is evicted.
• Shared –
This block is not modified and is present in at least one cache. The cache can evict the data
without writing it to backing store.
• Invalid –
This block is invalid and must be fetched from memory or from another cache if it is to be stored
in this cache.
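The MSI transitions above can be illustrated with a toy model. This is a deliberately simplified sketch (the function names are invented, and real protocols act on bus messages, not shared dictionaries): a write makes the writer's copy Modified and invalidates the others, and a later read by another cache forces a write-back and leaves both copies Shared.

```python
# Each cache tracks one block's state: "M", "S", or "I".
def msi_write(caches, writer):
    # A write gives the writer the only valid (Modified) copy
    # and invalidates every other cache's copy.
    for name in caches:
        caches[name] = "M" if name == writer else "I"

def msi_read(caches, reader):
    # A read of a block another cache holds Modified forces a
    # write-back; both copies then end up Shared.
    if caches[reader] == "I":
        for name, state in caches.items():
            if state == "M":
                caches[name] = "S"   # write back, downgrade to Shared
        caches[reader] = "S"

caches = {"cpu0": "I", "cpu1": "I"}
msi_write(caches, "cpu0")    # cpu0 -> M, cpu1 -> I
msi_read(caches, "cpu1")     # cpu0 writes back -> S, cpu1 -> S
print(caches)                # {'cpu0': 'S', 'cpu1': 'S'}
```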
2. MOSI Protocol:
This protocol is an extension of the MSI protocol. It adds the following state to MSI:
• Owned –
It indicates that the present processor owns this block and will service requests from other
processors for the block.
3. MESI Protocol –
It is the most widely used cache coherence protocol. Every cache line is marked with one of the following states:
• Modified –
This indicates that the cache line is present in the current cache only and is dirty, i.e., its value differs from that in main memory. The cache is required to write the data back to main memory at some point in the future, before permitting any other read of the (now stale) main memory state.
• Exclusive –
This indicates that the cache line is present in the current cache only and is clean, i.e., its value matches the main memory value.
• Shared –
It indicates that this cache line may be stored in other caches of the machine.
• Invalid –
It indicates that this cache line is invalid.
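One common way to express MESI is as a transition table keyed by (current state, event). The event names below are our own simplified labels, not an official vocabulary, and many real-world transitions are omitted; the sketch only shows how the four states relate.

```python
# Simplified MESI transitions, keyed by (current_state, event).
# "remote_*" events arrive via bus snooping from other caches.
MESI = {
    ("I", "local_read_exclusive"): "E",  # no other cache holds the line
    ("I", "local_read_shared"):    "S",  # another cache supplied the line
    ("I", "local_write"):          "M",
    ("E", "local_write"):          "M",  # silent upgrade, no bus traffic
    ("E", "remote_read"):          "S",
    ("S", "local_write"):          "M",  # invalidate broadcast on the bus
    ("S", "remote_write"):         "I",
    ("M", "remote_read"):          "S",  # write back, then share
    ("M", "remote_write"):         "I",  # write back, then invalidate
}

def next_state(state, event):
    # Events not listed leave the line's state unchanged.
    return MESI.get((state, event), state)

print(next_state("I", "local_read_exclusive"))  # E
print(next_state("E", "local_write"))           # M
print(next_state("M", "remote_read"))           # S
```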
4. MOESI Protocol:
This is a full cache coherence protocol that encompasses all of the possible states commonly used in
other protocols. Each cache line is in one of the following states:
• Modified –
A cache line in this state holds the most recent, correct copy of the data while the copy in the
main memory is incorrect and no other processor holds a copy.
• Owned –
A cache line in this state holds the most recent, correct copy of the data. It is similar to the Shared state in that other processors can hold a copy of the most recent, correct data; unlike the Shared state, however, the copy in main memory can be incorrect. Only one processor can hold the data in the Owned state; all other processors must hold the data in the Shared state.
• Exclusive –
A cache line in this state holds the most recent, correct copy of the data. The main memory copy is also the most recent, correct copy, and no other processor holds a copy of the data.
• Shared –
A cache line in this state holds the most recent, correct copy of the data. Other processors in the system may hold copies of the data in the Shared state as well. The main memory copy is also the most recent, correct copy of the data, provided no other processor holds it in the Owned state.
• Invalid –
A cache line in this state does not hold a valid copy of data. Valid copies of data can be either in
main memory or another processor cache.
Parallel processing techniques.
Parallel processing can be described as a class of techniques that enables a system to carry out simultaneous data-processing tasks in order to increase the computational speed of a computer system.
A parallel processing system can carry out simultaneous data-processing to achieve faster execution
time. For instance, while an instruction is being processed in the ALU component of the CPU, the next
instruction can be read from memory.
The primary purpose of parallel processing is to enhance the computer processing capability and
increase its throughput, i.e. the amount of processing that can be accomplished during a given interval
of time.
A parallel processing system can be achieved by having a multiplicity of functional units that perform
identical or different operations simultaneously. The data can be distributed among various multiple
functional units.
The following diagram shows one possible way of separating the execution unit into eight functional units operating in parallel. The operation performed in each functional unit is indicated in each block of the diagram:
o The adder and integer multiplier perform the arithmetic operations on integer numbers.
o The floating-point operations are separated into three circuits operating in parallel.
o The logic, shift, and increment operations can be performed concurrently on different data. All
units are independent of each other, so one number can be shifted while another number is
being incremented.
The main advantage of parallel processing is that it provides better utilization of system resources by increasing resource multiplicity, which improves overall system throughput.
Interrupt I/O is a process of data transfer in which an external device or peripheral informs the CPU that it is ready for communication and requests the attention of the CPU.
I/O Configuration
The terminals send and receive serial information. Each portion of serial data has eight bits of alphanumeric code, where the leftmost bit is always 0. The serial data from the keyboard is shifted into the input register INPR. The output register OUTR holds the serial data for the printer. These two registers communicate with the Accumulator (AC) in parallel and with the communication interface serially.
The Input/Output configuration is displayed in the figure. The transmitter interface gets serial data from
the keyboard and sends it to INPR. The receiver interface gets data from OUTR and transfers it to the
printer serially.
The input/output registers contain eight bits each. FGI is a 1-bit input flag, which is a control flip-flop. The flag bit is set to 1 when new data is available in the input device and is cleared to 0 when the data is accepted by the computer.
When a key is struck on the keyboard, the alphanumeric code for the key is shifted into INPR and the input flag FGI is set to 1. The data in INPR cannot be changed while the flag is set. The computer tests the flag bit; if it is 1, the data from INPR is transferred in parallel into AC, and FGI is cleared to 0.
The output register OUTR works like the input register INPR, but the flow of data through OUTR is in the opposite direction. Initially, the output flag FGO is set to 1. The computer tests the flag bit; if it is 1, the data from AC is transferred in parallel to OUTR, and FGO is cleared to 0. New data cannot be loaded into OUTR while FGO is 0, because that condition indicates that the output device is still in the process of printing the character.
Input Register
The INPR input register is an eight-bit register that holds alphanumeric input data. The 1-bit input flag FGI is a control flip-flop. When new data is available in the input device, the flag bit is set to 1; it is cleared to 0 when the data is accepted by the computer. The flag is needed to synchronize the difference in timing rates between the input device and the computer.
Output Register
The working of the output register OUTR is equivalent to that of the input register INPR, except that the direction of data flow is reversed.
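The FGI handshake described above can be modeled in a few lines. The class name `InputPort` is invented for this sketch; the behavior follows the text: the keyboard sets FGI when INPR holds a new character, and the CPU clears the flag after copying INPR into AC.

```python
class InputPort:
    # Toy model of the INPR register and its FGI flag.
    def __init__(self):
        self.INPR = 0
        self.FGI = 0          # 1 = new data available in INPR

    def key_pressed(self, code):
        # INPR must not be overwritten while the flag is set.
        if self.FGI == 0:
            self.INPR = code
            self.FGI = 1

    def cpu_poll(self):
        # The CPU tests FGI; on 1 it takes the byte into AC
        # and clears the flag.
        if self.FGI == 1:
            ac = self.INPR
            self.FGI = 0
            return ac
        return None           # nothing new to read

port = InputPort()
port.key_pressed(0x41)        # 'A' arrives from the keyboard
print(port.cpu_poll())        # 65
print(port.FGI)               # 0
```

The output flag FGO would be modeled the same way with the roles reversed: the CPU loads OUTR only when FGO is 1, and the printer sets FGO back after printing.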
A DMA controller is a hardware device that allows I/O devices to access memory directly, with minimal participation of the processor. The DMA controller needs the same kind of interface circuits as a conventional interface to communicate with the CPU and the I/O devices.
Direct Memory Access uses dedicated hardware, the DMA controller, to transfer data between I/O devices and main memory with very little interaction with the processor.
The DMA controller is a type of control unit that acts as an interface between the data bus and the I/O devices. As mentioned, the DMA controller transfers the data without the intervention of the processor; the processor only initiates and supervises the transfer. The DMA controller also contains an address unit, which generates the memory address and selects an I/O device for the transfer of data. The block diagram of the DMA controller is shown here. The main types of DMA are:
• Single-Ended DMA
• Dual-Ended DMA
• Arbitrated-Ended DMA
• Interleaved DMA
Single-Ended DMA: Single-ended DMA controllers operate by reading and writing from a single memory address. They are the simplest type of DMA controller.
Dual-Ended DMA: Dual-Ended DMA controllers can read and write from two memory addresses. Dual-
ended DMA is more advanced than single-ended DMA.
Arbitrated-Ended DMA: Arbitrated-Ended DMA works by reading and writing to several memory
addresses. It is more advanced than Dual-Ended DMA.
Interleaved DMA: Interleaved DMA controllers read from one memory address and write to another memory address.
Note: All registers in the DMA appear to the CPU as I/O interface registers. Therefore, the CPU can both
read and write into the DMA registers under program control via the data bus.
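The register-programming model in the note above can be sketched as a toy. `DMAController` and its method names are invented for this illustration: the CPU writes the address and word-count registers under program control, and the controller then moves the whole block without per-word CPU involvement.

```python
class DMAController:
    # Toy model: the CPU programs the registers, the controller
    # performs the block transfer on its own.
    def __init__(self, memory):
        self.memory = memory
        self.address = 0      # address register (written by the CPU)
        self.count = 0        # word-count register (written by the CPU)

    def program(self, address, count):
        # CPU writes the DMA registers under program control,
        # as if they were ordinary I/O interface registers.
        self.address = address
        self.count = count

    def transfer(self, device_buffer):
        # Burst transfer: copy `count` words straight into memory,
        # with no CPU involvement per word.
        for i in range(self.count):
            self.memory[self.address + i] = device_buffer[i]
        self.count = 0        # transfer complete

memory = [0] * 8
dma = DMAController(memory)
dma.program(address=2, count=3)
dma.transfer([10, 20, 30])
print(memory)  # [0, 0, 10, 20, 30, 0, 0, 0]
```

A real controller would also raise an interrupt when the word count reaches zero, so the CPU learns the block has arrived.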