# MCSE-103 Advanced Computer Architecture (June 2020)
#### Question 1a: Flynn's Classification of Computer Architectures
- **Flynn's Taxonomy**: Classifies architectures by the number of concurrent instruction streams and data streams.
- **SISD (Single Instruction, Single Data)**:
- **Definition**: A single instruction stream operates on a single data stream.
- **Architecture**:
- One control unit drives a single processing element over one data stream.
- **Examples**:
- Traditional uniprocessor machines.
- **Use Cases**: Suitable for general-purpose computing tasks where parallelism is not required.
- **Diagram**:
```
Control Unit -> Processing Element -> Data Stream
```
- **SIMD (Single Instruction, Multiple Data)**:
- **Definition**: One instruction operates on multiple data points simultaneously.
- **Architecture**:
- Single control unit broadcasts instructions to multiple processing elements.
- Each processing element executes the same instruction on different pieces of data.
- **Examples**:
- GPUs and vector processors.
- **Use Cases**: Ideal for tasks that can be parallelized across large data sets, such as image processing or scientific simulations.
- **Diagram**:
```
      Control Unit
     /     |      \
   PE1    PE2    PE3
    |      |      |
  Data1  Data2  Data3
```
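To make the SIMD idea concrete, here is a minimal C sketch using x86 SSE intrinsics (the function name `simd_add` is illustrative, not from the source): a single instruction, `_mm_add_ps`, adds four floats at once, which is exactly the "one instruction, multiple data" pattern described above.
```c
#include <xmmintrin.h>  /* x86 SSE intrinsics */

/* Add two float arrays four lanes at a time. One instruction
   (_mm_add_ps) operates on four data elements simultaneously.
   Assumes n is a multiple of 4 to keep the sketch short. */
void simd_add(const float *a, const float *b, float *c, int n) {
    for (int i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(&a[i]);          /* load 4 floats */
        __m128 vb = _mm_loadu_ps(&b[i]);
        _mm_storeu_ps(&c[i], _mm_add_ps(va, vb)); /* 4 additions in one instruction */
    }
}
```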
- **MISD (Multiple Instruction, Single Data)**:
- **Definition**: Multiple instructions operate on a single data stream.
- **Architecture**:
- Several processing elements, each with its own control unit, operate on the same data stream.
- **Examples**:
- Rare in practice; occasionally cited for fault-tolerant systems that run redundant computations on the same data.
- **Use Cases**: Potentially useful in scenarios requiring multiple types of analysis on the same data.
- **Diagram**:
```
CU1 -> PE1 \
CU2 -> PE2 --> Data Stream
CU3 -> PE3 /
```
- **MIMD (Multiple Instruction, Multiple Data)**:
- **Definition**: Multiple processors execute different instructions on different data points simultaneously.
- **Architecture**:
- Multiple autonomous processors, each with its own control unit.
- **Examples**:
- Multicore CPUs and multiprocessor clusters.
- **Diagram**:
```
CU1 -> PE1 -> Data Stream 1
CU2 -> PE2 -> Data Stream 2
CU3 -> PE3 -> Data Stream 3 ... CUn -> PEn -> Data Stream n
```
#### Question 1b: Need for Parallel Processing and Classification of Parallel Computing Structures
- **Performance**:
- Parallel processing can significantly increase computational speed by dividing tasks among multiple processors.
- Example: Weather simulations, where data can be processed in parallel to speed up forecasts.
- **Efficiency**:
- Spreading work across processors keeps hardware busy rather than idle.
- Example: Data centers distributing web requests across multiple servers.
- **Scalability**:
- Workloads can grow by adding processors rather than replacing the whole system.
- **Cost-Effectiveness**:
- Example: Financial modeling where faster computations can lead to timely decisions and cost savings.
- **Pipelined Processors**:
- **Definition**: Overlap instruction execution by dividing it into sequential stages.
- **Diagram**:
```
Fetch -> Decode -> Execute -> Memory Access -> Write-Back
```
- **Vector Processors**:
- **Definition**: Execute a single instruction over entire vectors of data elements.
- **Use Case**: Efficient for scientific computations involving large data sets.
- **Diagram**:
```
Vector Instruction -> [ V0 V1 V2 ... Vn ] -> pipelined functional unit
```
- **Array Processors**:
- **Definition**: Grid of processors performing the same instruction on different data points.
- **Diagram**:
```
PE - PE - PE
 |    |    |
PE - PE - PE
```
- **Multithreaded Processors**:
- **Definition**: Use multiple threads within a single processor to perform tasks concurrently.
- **Use Case**: Enhances performance for applications that can be parallelized at the thread level.
- **Diagram**:
```
Processor Core
 |- Thread 1
 |- Thread 2
```
- **Multiprocessors**:
- **Definition**: Systems with multiple processors working on different tasks.
- **Types**: Tightly coupled (shared memory) vs. loosely coupled (distributed memory).
- **Diagram**:
```
CPU1   CPU2   CPU3
   \     |     /
   Shared Memory
```
- **Definition**:
- Pipelining is a technique where multiple instruction phases are overlapped to improve processing efficiency.
- Each stage in the pipeline performs a part of an instruction, passing it to the next stage in a sequential manner.
- **Stages**:
- Fetch, Decode, Execute, Memory Access, Write-Back.
- **Example**:
- An instruction pipeline with five stages: Fetch, Decode, Execute, Memory Access, Write-Back.
- **Benefits**:
- **Increased Throughput**: Multiple instructions are processed simultaneously, increasing the overall processing speed.
- **Diagram**:
```
Time ->
I1: F -> D -> E -> M -> WB
I2:      F -> D -> E -> M -> WB
I3:           F -> D -> E -> M -> WB
```
- **Increased Throughput**:
- Multiple instructions are processed simultaneously, leading to a higher number of instructions executed per unit of time.
- Example: If each stage takes one clock cycle, a five-stage pipeline can complete five instructions every five cycles once the pipeline is full.
- Example: In a non-pipelined system, instructions would be executed sequentially, increasing wait time.
- **Resource Efficiency**:
- Example: Instead of having one instruction monopolize the processor, multiple instructions share resources, reducing idle time.
- **Illustration**:
- Diagram:
```
Non-Pipelined:
Time ->  1  2  3  4  5  6  7  8  9  10 ...
        [----- I1 -----][----- I2 -----] ...
Pipelined:
Time ->  1  2  3  4  5  6  7
Fetch  -> I1 I2 I3
Decode ->    I1 I2 I3
Execute->       I1 I2 I3
Mem    ->          I1 I2 I3
WB     ->             I1 I2 I3
```
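The staggered occupancy above can be generated mechanically. The short C program below is a supplementary sketch (not from the source) that prints which instruction occupies each stage in every cycle of an ideal five-stage pipeline.
```c
#include <stdio.h>

/* Ideal k-stage pipeline: instruction i (0-based) occupies stage s
   during cycle i + s, so n instructions finish in k + n - 1 cycles. */
int main(void) {
    const char *stage[] = {"Fetch  ", "Decode ", "Execute", "Mem    ", "WB     "};
    const int k = 5, n = 3;                 /* stages, instructions */
    for (int s = 0; s < k; s++) {
        printf("%s:", stage[s]);
        for (int c = 0; c < k + n - 1; c++) {
            int i = c - s;                  /* instruction in stage s at cycle c */
            if (i >= 0 && i < n) printf(" I%d", i + 1);
            else                 printf("   ");
        }
        printf("\n");
    }
    return 0;
}
```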
- **Speedup**:
- **Definition**: The ratio of the time taken to complete a task without pipelining to the time taken with pipelining.
- **Example**:
- If a task takes 100 time units without pipelining and 25 with pipelining, speedup = 100 / 25 = 4.
- **Throughput**:
- **Definition**: The number of instructions processed per unit time.
- **Example**:
- If a pipelined processor can complete 10 instructions in 10 cycles, its throughput is 1 instruction per cycle.
- This is compared to a non-pipelined processor where instructions complete sequentially, resulting in a lower throughput.
- **Efficiency**:
- **Definition**: The ratio of useful work done to the total work expended.
- **Example**:
- Once a five-stage pipeline is full, nearly every stage performs useful work in every cycle.
- This high efficiency is due to the overlap of instruction processing stages, minimizing idle time and maximizing use of resources.
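All three metrics follow from the ideal-pipeline cycle count k + n - 1. The sketch below computes them for a five-stage pipeline, assuming no stalls (a common textbook simplification, not stated in the source).
```c
#include <stdio.h>

/* Ideal-pipeline metrics, assuming no stalls:
   non-pipelined time = n*k cycles, pipelined time = k + n - 1 cycles. */
int main(void) {
    const int k = 5;    /* pipeline stages */
    const int n = 100;  /* instructions */
    double t_seq  = (double)n * k;
    double t_pipe = (double)(k + n - 1);
    double speedup    = t_seq / t_pipe;  /* approaches k as n grows */
    double throughput = n / t_pipe;      /* instructions per cycle */
    double efficiency = speedup / k;     /* fraction of the ideal k-fold speedup */
    printf("speedup = %.2f, throughput = %.2f IPC, efficiency = %.2f\n",
           speedup, throughput, efficiency);
    return 0;
}
```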
### Unit 1: Vector Processing and SIMD Array Processor
- **Definition**:
- Vector processing involves executing a single instruction on multiple data elements simultaneously.
- This contrasts with scalar processing, where each instruction operates on a single data element at a time.
- **Applications**:
- **Scientific Computing**: Vector processors excel in tasks such as linear algebra operations (matrix multiplications, vector additions).
- **Graphics Processing**: Used in rendering pipelines for transforming and shading vertices.
- **Signal Processing**: Efficient for processing large volumes of data in real-time applications (e.g., audio and video processing).
- **Benefits**:
- **Performance**: Handles large datasets efficiently by processing multiple data elements in parallel.
- **Speed**: Significantly faster than scalar processing for operations on large arrays or matrices.
- **Power Efficiency**: Achieves higher performance per watt compared to scalar processors due to parallelism.
- **Example**:
- Cray supercomputers historically used vector processing units for scientific simulations and modeling.
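As a concrete example of the loops vector hardware targets, here is SAXPY (y = a·x + y), a classic BLAS-style vector kernel, written as a plain C sketch: a vector processor (or an auto-vectorizing compiler) covers many iterations of this loop with each vector instruction.
```c
/* SAXPY (y = a*x + y), a classic BLAS-style vector kernel. On a vector
   processor, one vector instruction processes many elements of x and y
   at once instead of one scalar element per instruction. */
void saxpy(int n, float a, const float *x, float *y) {
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```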
- **Definition**:
- SIMD (Single Instruction, Multiple Data) array processors execute the same instruction on multiple data elements simultaneously.
- Arrays of processing elements (PEs) operate in parallel under the control of a central unit (CU).
- **Architecture**:
- **Control Unit (CU)**: Broadcasts each instruction to all processing elements.
- **Processing Elements (PEs)**: Execute the same instruction but on different data elements.
- **Applications**:
- **Graphics Processing Units (GPUs)**: Use SIMD architecture for parallel execution of shader programs.
- **Scientific Computing**: Accelerates simulations involving large-scale computations.
- **Machine Learning**: SIMD processors optimize parallel operations in neural network training.
- **Diagram**:
```
        CU
     /  |  \
   PE1 PE2 PE3
    |   |   |
   M1  M2  M3
```
- **Data Hazards**:
- **Definition**: Occur when instructions depend on the results of previous instructions.
- **Types**:
- **RAW (Read After Write)**: Reading a register before a previous instruction's write to it has completed (a true dependence).
- **WAR (Write After Read)**: Writing to a register before its previous value is read.
- **WAW (Write After Write)**: Writing to the same register multiple times before the previous write completes.
- **Resolution**:
- **Forwarding**: Passing data directly from one pipeline stage to another to avoid stalls.
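A minimal C-level illustration of a true (RAW) dependence, assuming the two statements compile to back-to-back instructions: the second cannot proceed until the first produces r, which is exactly the case forwarding handles by routing the ALU result straight to the next instruction's input.
```c
/* Read-after-write (RAW) hazard: the multiply needs r, which the
   preceding add has not yet written back. Forwarding passes the add's
   ALU result directly to the multiply, avoiding a pipeline stall. */
int raw_example(int a, int b, int c) {
    int r = a + b;  /* produces r */
    return r * c;   /* consumes r on the very next instruction */
}
```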
- **Control Hazards**:
- **Definition**: Arise due to conditional branches that affect program flow.
- **Resolution**:
- **Branch Prediction**: Speculating whether a branch will be taken or not before the actual decision.
- **Delayed Branching**: Filling the slot(s) after a branch with instructions that execute regardless of the branch outcome, hiding the branch delay.
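One common prediction scheme (not named in the notes, so treat this as a supplementary sketch) is the 2-bit saturating counter: two consecutive mispredictions are needed to flip the prediction, which tolerates the occasional odd iteration of a loop.
```c
/* 2-bit saturating-counter branch predictor (supplementary sketch). */
typedef enum {
    STRONG_NOT_TAKEN, WEAK_NOT_TAKEN, WEAK_TAKEN, STRONG_TAKEN
} PredictorState;

/* Predict "taken" in the two upper states. */
int predict_taken(PredictorState s) {
    return s >= WEAK_TAKEN;
}

/* Move one step toward the actual outcome, saturating at the ends. */
PredictorState update(PredictorState s, int taken) {
    if (taken  && s < STRONG_TAKEN)     return s + 1;
    if (!taken && s > STRONG_NOT_TAKEN) return s - 1;
    return s;
}
```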
#### Question 5b: Difference Between Multicomputer and Multiprocessor Systems
- **Multicomputers**:
- **Definition**: Comprise multiple independent computers connected via a network.
- **Characteristics**:
- Each node has its own private memory; nodes communicate by passing messages over the network.
- **Use Cases**: High availability, scalability, and fault tolerance in distributed computing environments.
- **Multiprocessors**:
- **Definition**: Consist of multiple processors sharing a common memory and operating system.
- **Characteristics**:
- Processors communicate through shared memory under a single operating system.
- **Use Cases**: High-performance computing, where shared memory access speeds up inter-process communication and data sharing.
- **Definition**:
- Multiprocessor systems feature multiple processors that share a common memory space and can execute tasks concurrently.
- **Architectural Models**:
- **Tightly Coupled**:
- Processors share a common memory and communicate through it.
- Suitable for applications requiring high-speed communication and synchronization.
- **Loosely Coupled**:
- Processors have their own local memory and communicate over an interconnection network.
- Offers scalability and fault tolerance but requires efficient message-passing protocols.
- **Use Cases**:
- Tightly coupled systems are ideal for real-time processing and high-performance computing (HPC).
- Loosely coupled systems excel in distributed computing environments where scalability and fault tolerance are critical.
- **Characteristics**:
- **Scalability**: Easily scalable by adding more processors and nodes to the network.
- Processors communicate within the system using shared buses or interconnection networks.
- Data exchange between processors involves message-passing protocols that manage communication overhead.
- **Definition**:
- Interconnection networks connect processors, memory, and I/O devices within a multiprocessor or multicomputer system.
- **Schemes**:
- **Bus-based**:
- All components share a common bus; simple and inexpensive, but the bus becomes a bottleneck as processors are added.
- **Crossbar Switch**:
- A switch at every processor-memory crosspoint allows many simultaneous, non-blocking connections at higher hardware cost (see the toy model below).
- **Multistage and Point-to-Point Networks**:
- Provide fault tolerance and scalability, common in supercomputers and HPC clusters.
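To illustrate why a crossbar is non-blocking, the toy model below (all names are illustrative) connects N inputs to N outputs through a permutation: every input gets its own dedicated path, so all N transfers can proceed in the same cycle, whereas a shared bus would serialize them.
```c
#include <stdio.h>
#define N 4

/* Toy crossbar model: a permutation maps each input port to a distinct
   output port, so all N transfers proceed simultaneously (non-blocking). */
int main(void) {
    int out_port[N] = {2, 0, 3, 1};  /* input i -> output out_port[i] */
    for (int i = 0; i < N; i++)
        printf("input %d -> output %d\n", i, out_port[i]);
    return 0;
}
```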
- **Definition**:
- Load balancing distributes tasks and computational load evenly across processors to optimize system performance.
- **Techniques**:
- **Static Load Balancing**: Assigns tasks to processors before execution (see the round-robin sketch after this list).
- Example: Round-robin scheduling or partitioning tasks based on computational complexity.
- **Dynamic Load Balancing**: Adjusts task assignment in real time based on current system load and performance metrics.
- Example: Task stealing, where idle processors take on tasks from overloaded processors.
- **Example**:
- Job scheduling algorithms dynamically allocate tasks to processors based on their current workload.
- Load balancing ensures efficient resource utilization and minimizes idle time in multiprocessor systems.
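A minimal sketch of static round-robin assignment (the layout is illustrative): task i simply goes to processor i mod P, which balances task counts when tasks have similar cost; varying costs are what motivate the dynamic schemes above.
```c
#include <stdio.h>

/* Static round-robin load balancing: task i -> processor i mod P.
   Works well when tasks have similar cost; dynamic schemes such as
   task stealing are needed when costs vary at run time. */
int main(void) {
    const int num_procs = 4, num_tasks = 10;
    for (int task = 0; task < num_tasks; task++)
        printf("task %d -> processor %d\n", task, task % num_procs);
    return 0;
}
```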
### Unit 3: Synchronization and Cache Coherence
- **Synchronization**:
- **Definition**:
- Synchronization coordinates concurrent processes so that they access shared resources in a consistent, orderly way.
- **Mechanisms**:
- **Mutual Exclusion**:
- Ensures that only one process at a time executes a critical section that accesses shared data (see the mutex sketch below).
- **Atomic Operations**:
- Guarantees that a sequence of operations is executed as a single unit without interruption.
- **Barrier Synchronization**:
- Ensures that all processes reach a specific point before continuing execution.
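As a concrete instance of mutual exclusion, here is a minimal POSIX-threads sketch: without the lock, the increments race and the final count is unpredictable; with it, each increment executes as a unit.
```c
#include <pthread.h>
#include <stdio.h>

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread increments the shared counter inside a critical section. */
static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);    /* enter critical section */
        counter++;                    /* protected shared update */
        pthread_mutex_unlock(&lock);  /* leave critical section */
    }
    return NULL;
}

int main(void) {
    pthread_t t[4];
    for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
    printf("counter = %ld\n", counter);  /* always 400000 with the lock */
    return 0;
}
```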
- **Cache Coherence**:
- **Definition**:
- Cache coherence ensures that multiple processors accessing shared data maintain consistency across their local caches.
- **Protocols**:
- **MESI Protocol**:
- Maintains cache coherence using four states: Modified, Exclusive, Shared, and Invalid.
- Ensures that only one cache has the right to modify a given block of data at a time.
- **MOESI Protocol**:
- Extends MESI with an Owned state, letting a cache supply modified data directly to other caches without first writing it back to main memory.
- **MESIF Protocol**:
- Extends MESI with a Forward state that designates which shared copy responds to requests, enabling quicker cache-to-cache transfers.
- **Implementation**:
- Hardware-based coherence protocols ensure consistent data across caches through snooping or directory-based approaches.
- Example: Intel processors use MESI-based protocols to maintain cache coherence efficiently.
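The MESI state machine can be summarized in a few transition functions. The sketch below is a deliberately simplified snooping-protocol model (real protocols also handle bus transactions, write-backs, and data supply), showing how one cache line reacts to local accesses and snooped events.
```c
/* Simplified MESI transitions for one cache line in one cache,
   driven by local accesses and snooped bus events. */
typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } MesiState;

/* Local write: invalidate other copies, take exclusive ownership. */
MesiState on_local_write(MesiState s) {
    (void)s;
    return MODIFIED;
}

/* Local read miss: Exclusive if no other cache holds the line,
   Shared otherwise. */
MesiState on_local_read_miss(int other_copies_exist) {
    return other_copies_exist ? SHARED : EXCLUSIVE;
}

/* Another cache reads this line: supply data if dirty, downgrade. */
MesiState on_snoop_read(MesiState s) {
    return (s == MODIFIED || s == EXCLUSIVE) ? SHARED : s;
}

/* Another cache writes this line: our copy becomes stale. */
MesiState on_snoop_write(MesiState s) {
    (void)s;
    return INVALID;
}
```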
### Conclusion
This detailed response covers various aspects of parallel computing, from Flynn's and Handler's classifications to pipelining, vector processing, hazards, multiprocessor architectures, interconnection networks, load balancing, synchronization, and cache coherence. Each section provides in-depth explanations, examples, and diagrams to illustrate key concepts in advanced computer architecture.