CSE211 Computer Architecture, Modules 18-21
Computer
Architecture
Modules 14 to 21
Multi-threading
• Multithreading allows multiple threads to execute simultaneously,
enhancing parallelism and resource utilization. It can be categorized into
fine-grain and coarse-grain multithreading.
• Simultaneous multithreading (SMT) enables issuing instructions from
different threads into various functional units at the same time,
maximizing the use of processor resources.
• SMT is a hardware technique that allows multiple threads to share the
execution resources of a single processor core. This is achieved by
interleaving the instruction execution of different threads.
• By allowing multiple threads to share the execution resources, SMT can
increase the utilization of the processor and improve overall
performance.
• While increasing the number of threads in SMT can enhance parallelism,
it is crucial to balance the number of threads with the architecture's
ability to manage resources effectively.
Parallelism vs
Synchronization
• Parallel programming allows multiple programs or threads to run
simultaneously, which is essential for improving performance in
modern computer architectures.
• Synchronization is crucial for coordinating communication
between concurrent processes, ensuring that shared resources
are accessed safely.
• The producer-consumer model illustrates how one entity produces
data while another consumes it, highlighting the need for
effective communication and resource management.
• Mutual exclusion ensures that only one processor accesses a
shared resource at a time, preventing conflicts and ensuring data
integrity. To implement it, strategies such as the following are used:
• Exclusive Access
• Lock Mechanisms
• Avoiding Race Conditions
• Synchronization
Producer consumer problem
• In a producer-consumer scenario, a producer generates
values while consumers read and process those values.
When there are two consumers, issues can arise if they
access shared data simultaneously.
• Sequential consistency ensures that operations appear to
occur in a specific order, preventing reordering of reads
and writes, which is beneficial for maintaining data
integrity.
• Producer:
⚬ Generates a data item.
⚬ Adds the item to the buffer.
⚬ If the buffer is full, the producer may be blocked until space becomes available.
• Consumer:
⚬ Checks if the buffer is empty.
⚬ If the buffer is not empty, removes an item from the buffer and processes it.
⚬ If the buffer is empty, the consumer may be blocked until a new item is added.
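The blocking behavior described above can be sketched in Python with a bounded queue, which blocks the producer when the buffer is full and the consumer when it is empty (a minimal single-producer, single-consumer illustration, not tied to any particular architecture):

```python
import queue
import threading

buf = queue.Queue(maxsize=4)  # bounded buffer shared by both threads
results = []

def producer(n):
    # Generates data items; put() blocks while the buffer is full.
    for i in range(n):
        buf.put(i)

def consumer(n):
    # get() blocks while the buffer is empty, waiting for the producer.
    for _ in range(n):
        results.append(buf.get())

p = threading.Thread(target=producer, args=(10,))
c = threading.Thread(target=consumer, args=(10,))
p.start(); c.start()
p.join(); c.join()
print(results)  # every produced item was consumed exactly once, in order
```

Because there is one producer and one consumer over a FIFO queue, items arrive in production order; with two consumers, the further issues discussed above arise.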
Mutual exclusion
Understanding Mutual Exclusion
• Mutual exclusion is essential for preventing multiple processes from
accessing shared resources simultaneously, which can lead to
inconsistencies.
• Atomic operations are crucial for implementing mutual exclusion,
allowing operations to be completed without interruption from other
processes.
Atomic Operations and Their Implementation
• The test and set operation is a fundamental atomic operation that
checks a memory address and modifies it atomically, ensuring that no
other operations interfere during this process.
• More advanced atomic operations, such as compare and swap, enhance
functionality by allowing conditional updates based on the current value
in memory.
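As an illustration, test-and-set and compare-and-swap can be modeled in Python (a sketch: real hardware executes each as a single indivisible instruction, so here an internal lock merely stands in for that atomicity; the `AtomicCell` class is hypothetical):

```python
import threading

class AtomicCell:
    """Models one memory word with atomic read-modify-write operations."""
    def __init__(self, value=0):
        self.value = value
        self._guard = threading.Lock()  # stands in for hardware atomicity

    def test_and_set(self):
        # Atomically read the old value and write 1.
        with self._guard:
            old = self.value
            self.value = 1
            return old

    def compare_and_swap(self, expected, new):
        # Atomically update only if the current value matches `expected`.
        with self._guard:
            if self.value == expected:
                self.value = new
                return True
            return False

# A spinlock acquire built from test-and-set:
flag = AtomicCell(0)
assert flag.test_and_set() == 0   # first caller acquires: old value was 0
assert flag.test_and_set() == 1   # later callers see it already taken
flag.value = 0                    # release

print(flag.compare_and_swap(0, 42))  # True: value was 0, now 42
print(flag.compare_and_swap(0, 7))   # False: value is 42, left unchanged
```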
Sequential consistency
• Sequential Consistency ensures that the execution sequence of
instructions from all processors appears as a valid interleaving of
their individual instruction orders.
• It is a strong model that guarantees that all processors see the
same order of operations, which is not typically implemented in
modern computers due to performance constraints.
Examples of Valid and Invalid Orders
• Valid sequentially consistent orders can include various
interleavings, such as executing instructions from different
processors in a way that respects their individual order.
• An invalid order occurs when the relative order of operations from
a single processor is violated, leading to inconsistencies in the
observed results.
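The notion of a valid order can be made concrete: a sequentially consistent global order is any merge of the per-processor instruction streams that preserves each stream's internal order. A small Python sketch (with hypothetical two-instruction streams) enumerates all of them:

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves the internal
    order of each, i.e., every sequentially consistent global order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

p1 = ["P1: st x", "P1: ld y"]   # processor 1's program order
p2 = ["P2: st y", "P2: ld x"]   # processor 2's program order
orders = list(interleavings(p1, p2))
print(len(orders))  # C(4,2) = 6 valid sequentially consistent orders
# An order placing "P1: ld y" before "P1: st x" would be invalid:
# it reverses P1's individual program order.
```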
Issues in Sequential
Consistency
• Performance Overhead
• Hardware Complexity
• Programming Complexity
• Practical Limitations
• Distributed Systems
True sequential consistency is challenging to achieve,
especially with caches, as data visibility between
processors becomes a concern.
Terminologies
• Definition of Race Conditions: A race condition occurs when two or more threads or
processes access shared data and try to change it at the same time. The final outcome
depends on the timing of their execution, which can lead to unpredictable results.
• Role of Sequential Consistency: Sequential consistency provides a model that
ensures all memory operations appear to occur in a specific order. This means that if a
program adheres to sequential consistency, the operations from different threads will be
interleaved in a way that respects the order of operations from each individual thread.
• Prevention of Race Conditions: By enforcing a sequentially consistent memory
model, the likelihood of race conditions is reduced. Since all threads see the same order
of operations, it becomes easier to reason about the state of shared data and avoid
conflicts.
• Simplified Reasoning: With sequential consistency, programmers can assume that
operations will execute in a predictable manner, making it easier to identify potential
race conditions and implement appropriate synchronization mechanisms.
• Weak Models and Race Conditions: In contrast, weaker memory models may allow for
out-of-order execution and different visibility of operations, increasing the risk of race
conditions. Programmers must be more cautious and implement additional
synchronization to ensure correctness.
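The lost-update pattern behind many race conditions can be shown deterministically by interleaving the read and write halves of two unsynchronized increments by hand (a sketch: in a real program the interleaving depends on thread timing):

```python
# Shared counter; each "thread" performs: tmp = counter; counter = tmp + 1
counter = 0

# Unlucky interleaving: both threads read before either writes.
tmp_a = counter      # thread A reads 0
tmp_b = counter      # thread B also reads 0
counter = tmp_a + 1  # thread A writes 1
counter = tmp_b + 1  # thread B overwrites with 1: A's update is lost

print(counter)  # 1, not the expected 2
```

Under sequential consistency this interleaving is still legal; eliminating the lost update requires making the read-modify-write atomic, e.g., with a lock.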
Locks
• Locks (mutexes) allow mutual exclusion,
ensuring that only one process can execute a
critical section of code at any given time.
• Mutual Exclusion: Locks ensure that only one
thread can access a critical section of code
at a time. This prevents race conditions
where multiple threads might try to read or
write shared data simultaneously.
• Synchronization: By locking a resource, a
thread can safely perform operations without
interference from other threads, ensuring
data integrity.
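A minimal Python sketch of lock-based mutual exclusion: several threads increment a shared counter, but the `with lock:` critical section admits one thread at a time, so no update is lost:

```python
import threading

counter = 0
lock = threading.Lock()

def worker(iterations):
    global counter
    for _ in range(iterations):
        with lock:           # critical section: one thread at a time
            counter += 1     # read-modify-write is now safe

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000: every increment took effect
```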
Semaphores
• Semaphores provide a more flexible
approach, allowing a specified number of
processes to enter a critical section
concurrently, which is useful in scenarios
with multiple resources.
• Semaphores
⚬ Controlled Access: Semaphores allow a
specified number of threads to access a
resource concurrently.
⚬ Flexibility: Unlike locks, which only allow
one thread at a time, semaphores can be
configured to permit a certain number of
threads (N) to enter a critical section.
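This N-at-a-time behavior can be sketched with Python's counting semaphore; here at most 2 of 6 threads hold the resource concurrently, and the code records the peak occupancy it observed (the bookkeeping variables are illustrative, not part of any semaphore API):

```python
import threading
import time

sem = threading.Semaphore(2)   # permit N = 2 threads inside at once
state_lock = threading.Lock()  # protects the bookkeeping counters
inside = 0
peak = 0

def use_resource():
    global inside, peak
    with sem:                  # blocks while 2 threads are already inside
        with state_lock:
            inside += 1
            peak = max(peak, inside)
        time.sleep(0.05)       # hold the resource briefly
        with state_lock:
            inside -= 1

threads = [threading.Thread(target=use_resource) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(peak)  # never exceeds 2, the semaphore's initial count
```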
Memory fences and models
Memory Fences and Their Importance
• Memory fences (or barriers) are introduced to ensure that
certain memory operations are completed before others
begin, helping to maintain order and consistency.
• Different types of memory fences exist, such as load memory
fences and directional memory fences, which provide varying
levels of control over memory operations.
Weak Memory Models
• Most modern processors implement weaker memory models
rather than strict sequential consistency, allowing for
performance optimizations through reordering.
• Examples of memory ordering models include total store
ordering, partial store ordering, and weak ordering, each with
specific rules about how loads and stores can be reordered.
Memory Bus
The memory bus is a type
of computer bus, usually
in the form of a set of
wires or conductors which
connects electrical
components and allows
transfers of data and
addresses from the main
memory to the central
processing unit (CPU) or a
memory controller.
Bus-based
multiprocessor
A bus-based multiprocessor system is a type
of parallel computing architecture where
multiple processors share a common bus to
communicate with each other and access
shared memory.
Key Components of a Bus-Based
Multiprocessor:
• Processors: Multiple processors, each with
its own registers and local cache.
• Shared Memory: A common memory area
accessible to all processors.