Multiprocessors I

Computer architecture

Uploaded by oliviaclark2905

Introduction

Multiprocessor System

Professor: Dr. Tran Ngoc Thinh


Group Members: Mr. Ibtasam Rehman
Mr. La Minh Tuan Kiet
Mr. Nguyen Thanh Loc

Content

● Introduction to Multiprocessors
● Historical Context
● Differences between MIMD and MISD
● Multiprocessor - 1
● Flynn Classification
● Memory Consistency
● Sequential Consistency
● Relaxed Memory Models
● Thread & Multithread
● Superpipeline & Superscalar
● Multithreading & Superscalar
● Conclusion

Introduction to Multiprocessors

Definition

● Multiprocessor systems: integrate multiple processors within a single computing device.


● These systems enable parallel execution of tasks across multiple processors.

Significance

● Revolutionized computing landscape.


● Enabled smaller, faster, and more powerful devices.
● Democratized access to computing power:
○ Increased affordability.
○ Widened accessibility to advanced computing capabilities.
Impact

● Facilitated advancements in various fields:


○ Artificial intelligence.
○ Data analytics.
○ Scientific research.

Symmetric Multiprocessing
● Processing is done by multiple processors that share a common OS and memory.
● The processors share the same input/output (I/O) bus or data path.
● This shared-memory architecture is fundamental to the operation of SMP systems and is typically implemented through shared physical memory and system-bus arbitration.

Working of a Multiprocessor

Flynn Classification

● Single Instruction, Single Data (SISD)
● Single Instruction, Multiple Data (SIMD)
● Multiple Instruction, Single Data (MISD)
● Multiple Instruction, Multiple Data (MIMD)

Difference

Classification   Instruction Stream   Data Stream   Example
SISD             One                  One           Traditional von Neumann architecture
SIMD             One                  Multiple      Vector processors such as GPUs, or SIMD extensions in CPUs
MISD             Multiple             One           Space shuttle flight control system
MIMD             Multiple             Multiple      Multi-core CPUs, distributed systems, clusters

Multiprocessor - 1
Synchronization
● A spin lock is a synchronization technique used in concurrent programming. When a thread wants to access a shared resource protected by a spin lock and finds it already locked, it does not block; instead, it repeatedly checks the lock in a loop (busy-waiting) until the lock becomes available.
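As a concrete illustration, here is a minimal spin lock sketch in C11 built on `atomic_flag`; the function names, thread count, and iteration count are illustrative, not from the slides:

```c
#include <stdatomic.h>
#include <pthread.h>

static atomic_flag lk = ATOMIC_FLAG_INIT;   // clear = unlocked
static long counter = 0;                    // shared resource the lock protects

static void spin_acquire(void) {
    // Busy-wait: keep testing-and-setting until we observe the flag clear.
    while (atomic_flag_test_and_set_explicit(&lk, memory_order_acquire))
        ;   // spin, exactly as described above
}

static void spin_release(void) {
    atomic_flag_clear_explicit(&lk, memory_order_release);
}

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        spin_acquire();
        counter++;          // critical section guarded by the spin lock
        spin_release();
    }
    return NULL;
}

long spinlock_demo(void) {
    pthread_t t[4];
    counter = 0;
    for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
    return counter;         // 4 * 100000 if the lock excludes correctly
}
```

Without the lock, the `counter++` increments from different threads would race and some updates would be lost; with it, all 400,000 increments survive.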

Multiprocessor - 1
Barrier
● A synchronization construct used in concurrent programming to ensure that a group of threads or
processes reach a designated point (the barrier) in their execution before any of them are allowed to
proceed further

Intuitive Model: Sequential Consistency (SC)

A multiprocessor is sequentially consistent if the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program.
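One way to make the definition concrete: under SC, every outcome must be producible by some interleaving of the processors' operations, each processor in program order. The sketch below (illustrative, not from the slides) enumerates every interleaving of P0: x = 1; r1 = y and P1: y = 1; r2 = x, and confirms that r1 == r2 == 0 can never occur under SC:

```c
static int saw_00;   // set if any SC interleaving yields r1 == r2 == 0

// i0, i1: next operation index for P0 and P1; x, y: memory; r1, r2: registers
static void run(int i0, int i1, int x, int y, int r1, int r2) {
    if (i0 == 2 && i1 == 2) {            // both programs finished
        if (r1 == 0 && r2 == 0) saw_00 = 1;
        return;
    }
    if (i0 < 2) {                        // P0 takes its next step...
        if (i0 == 0) run(1, i1, 1, y, r1, r2);   // P0 op 0: x = 1
        else         run(2, i1, x, y, y,  r2);   // P0 op 1: r1 = y
    }
    if (i1 < 2) {                        // ...or P1 takes its next step
        if (i1 == 0) run(i0, 1, x, 1, r1, r2);   // P1 op 0: y = 1
        else         run(i0, 2, x, y, r1, x);    // P1 op 1: r2 = x
    }
}

int sc_forbids_00(void) {
    saw_00 = 0;
    run(0, 0, 0, 0, -1, -1);             // x = y = 0 initially
    return !saw_00;     // 1 means no interleaving produced r1 == r2 == 0
}
```

Intuitively: for r1 to read 0, P0 must finish before P1's store to y, but then P1's load of x must see 1, so (0, 0) is impossible under SC.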

Relaxed Memory Models
Why do we need relaxed consistency?
- To keep hardware simple and performance high, relax the ordering requirements
- This is the motivation for relaxed memory models


Relaxed Consistency Models

- SC maintains all memory access orderings -> very strict
- Can we relax some or all of these orderings?
Local Ordering: No Relaxing (SC)
- All prior LOADs and STOREs must be performed before a LOAD is performed
- All prior LOADs and STOREs must be performed before a STORE is performed

SC: perform memory operations in program order
- No Out-of-Order (OoO) execution for memory operations
- Any miss will stall the memory operations behind it


Local Ordering: Relaxing W→R

- Initially proposed for processors with in-order pipelines -> allows post-retirement store buffers
- Later loads can bypass earlier stores to independent addresses
- TSO (Total Store Ordering) and Processor Consistency are two examples of memory models with this relaxation
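The classic store-buffering litmus test shows what this relaxation permits. The toy simulation below is a hand-written sketch, not real hardware: it models each core's pending store sitting in a one-entry store buffer while that core's load bypasses it, producing the r1 == r2 == 0 outcome that SC forbids but TSO allows:

```c
// Program:  P0: x = 1; r1 = y;      P1: y = 1; r2 = x;
// Under SC at least one of r1, r2 must be 1. With the W->R relaxation,
// each load can bypass its core's buffered store and both can read 0.
int sb_litmus_relaxed(int *r1_out, int *r2_out) {
    int x = 0, y = 0;           // shared memory, initially zero
    int buf_x = 1, buf_y = 1;   // each store parked in its core's store buffer
    // Both loads execute before either buffered store drains to memory:
    int r1 = y;                 // P0's load bypasses its pending store to x
    int r2 = x;                 // P1's load bypasses its pending store to y
    x = buf_x;                  // store buffers drain afterwards
    y = buf_y;
    *r1_out = r1;
    *r2_out = r2;
    return (r1 == 0 && r2 == 0);   // forbidden by SC, allowed by TSO
}
```

Real hardware would only sometimes exhibit this outcome; the sketch forces the one scheduling that the relaxation newly permits.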

Local Ordering: Relaxing W→W & R→RW
- In Processor Consistency and TSO, the W→W and R→RW orderings are still enforced: reads execute in the order issued, and writes complete in the order issued
- Relaxing these orderings allows independent writes, and operations after a read, to complete out of order


Relax Constraints on Memory Orders

Memory Model: Weak Ordering

- In a well-synchronized program, all reorderings inside a critical section should be allowed
- Data-race freedom ensures that no other thread can observe the order of execution
- Instructions used for synchronization are explicitly marked
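A C11 sketch of "marking" synchronization (the names are illustrative): the data write is an ordinary access that the hardware may reorder freely, while fences around the marked flag operation restore only the ordering that matters:

```c
#include <stdatomic.h>

static int payload;        // ordinary data: freely reorderable by hardware
static atomic_int flag;    // marked synchronization variable

void wo_producer(void) {
    payload = 42;                                  // data write...
    atomic_thread_fence(memory_order_release);     // ...ordered before the sync op
    atomic_store_explicit(&flag, 1, memory_order_relaxed);
}

int wo_consumer(void) {
    while (atomic_load_explicit(&flag, memory_order_relaxed) == 0)
        ;                                          // wait for the marked sync op
    atomic_thread_fence(memory_order_acquire);     // sync op ordered before data read
    return payload;                                // guaranteed to observe 42
}

int wo_demo(void) {        // single-threaded driver just to exercise the code
    wo_producer();
    return wo_consumer();
}
```

Only the fenced flag operations constrain ordering; everything between synchronization points may be reordered, which is exactly the weak-ordering bargain.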


Memory Model: Release Consistency


- Similar to Weak Ordering but distinguishes between:
- SYNCH op used to start a critical section (Acquire)
- SYNCH op used to end a critical section (Release)
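In C11 atomics this distinction maps directly onto `memory_order_acquire` and `memory_order_release`; a minimal sketch (function names are illustrative):

```c
#include <stdatomic.h>

static atomic_int rc_lock;   // 0 = free, 1 = held
static int shared_data;

void rc_acquire(void) {      // SYNCH op starting the critical section
    while (atomic_exchange_explicit(&rc_lock, 1, memory_order_acquire))
        ;   // acquire: later accesses may not move above this point
}

void rc_release(void) {      // SYNCH op ending the critical section
    atomic_store_explicit(&rc_lock, 0, memory_order_release);
        // release: earlier accesses may not move below this point
}

int rc_demo(void) {
    rc_acquire();
    shared_data = 7;         // critical-section accesses stay between the pair
    rc_release();
    return shared_data;
}
```

Because acquire only blocks downward motion and release only blocks upward motion, accesses outside the critical section may still slide into it, which is weaker (and cheaper) than ordering every access as Weak Ordering's undifferentiated SYNCH ops do.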

Instruction Level Parallelism - ILP
Definition
● Refers to executing multiple operations of a single process in parallel, within that process's single set of resources – address space, registers, identifiers, state, and program counter.
● Covers both the compiler techniques and the processor designs that execute operations, such as memory loads and stores, integer addition, and floating-point multiplication, in parallel to improve processor performance.

Example
1. y1 = x1*1010
2. y2 = x2*1100
3. z1 = y1+0010
4. z2 = y2+0101
5. t1 = t1+1
6. p = q*1000
7. clr = clr+0010
8. r = r+0001


Instruction Level Parallelism - ILP


Advantages of Instruction-Level Parallelism
● Improved Performance: ILP allows multiple instructions to execute simultaneously or out of order, leading to faster program execution and better system throughput.
● Efficient Resource Utilization: executing several instructions at once keeps functional units busy and reduces resource wastage.
● Reduced Impact of Instruction Dependencies: hardware techniques such as register renaming remove false dependencies that would otherwise limit the exploitable parallelism, reducing bottlenecks.
● Increased Throughput: more instructions complete per cycle, which benefits multithreaded applications and other parallel processing tasks.

Instruction Level Parallelism - ILP

Disadvantages of Instruction-Level Parallelism

● Increased Complexity: ILP requires additional hardware (e.g., dependency-checking and reordering logic), which increases processor complexity and cost.
● Instruction Overhead: the extra bookkeeping can slow down the execution of some instructions and reduce performance.
● Data Dependency: true data dependences limit the amount of instruction-level parallelism that can be exploited, lowering performance and throughput.
● Reduced Energy Efficiency: the additional hardware and overhead increase power consumption and can result in higher energy costs.


Thread Level Parallelism - TLP

Motivation:
● A single thread leaves a processor underutilized
● Doubling the processor area barely improves single-thread performance

-> Letting multiple threads share the same large processor reduces underutilization and allows efficient resource allocation.

Strategies for thread-level parallelism:
● Simultaneous Multi-Threading - SMT: multiple threads executed simultaneously on one core.
● Chip Multi-Processing - CMP: multiple cores on the same die.

Multithreading allows multiple threads to share the functional units of one processor via overlapping, by duplicating the independent state of each thread, e.g., a separate copy of the register file and a separate PC. Memory is shared through the virtual memory mechanisms, which already support multiple processes. Hardware support makes a thread switch much faster than a full process switch.

Thread Level Parallelism - TLP

Fine-Grained Multithreading:
● Switches between threads on each instruction, so the execution of multiple threads is interleaved
● Usually done in a round-robin fashion, skipping any stalled threads

Coarse-Grained Multithreading:
● Switches threads only on costly stalls, such as cache misses


Thread Level Parallelism - TLP

Multithreading vs Superscalar


