CSE211 Computer Architecture, Modules 18-21
Computer
Architecture
Modules 14 to 21
Multi-threading
• Multithreading allows multiple threads to execute simultaneously,
enhancing parallelism and resource utilization. It can be categorized into
fine-grain and coarse-grain multithreading.
• Simultaneous multithreading (SMT) enables issuing instructions from
different threads into various functional units at the same time,
maximizing the use of processor resources.
• SMT is a hardware technique that allows multiple threads to share the
execution resources of a single processor core. This is achieved by
interleaving the instruction execution of different threads.
• By allowing multiple threads to share the execution resources, SMT can
increase the utilization of the processor and improve overall
performance.
• While increasing the number of threads in SMT can enhance parallelism,
it is crucial to balance the number of threads with the architecture's
ability to manage resources effectively.
Parallelism vs
Synchronization
• Parallel programming allows multiple programs or threads to run
simultaneously, which is essential for improving performance in
modern computer architectures.
• Synchronization is crucial for coordinating communication
between concurrent processes, ensuring that shared resources
are accessed safely.
• The producer-consumer model illustrates how one entity produces
data while another consumes it, highlighting the need for
effective communication and resource management.
• Mutual exclusion ensures that only one processor accesses a
shared resource at a time, preventing conflicts and ensuring data
integrity. To implement it, strategies such as the following are used:
• Exclusive Access
• Lock Mechanisms
• Avoiding Race Conditions
• Synchronization
Producer consumer problem
• In a producer-consumer scenario, a producer generates
values while consumers read and process those values.
When there are two consumers, issues can arise if they
access shared data simultaneously.
• Sequential consistency ensures that operations appear to
occur in a specific order, preventing reordering of reads
and writes, which is beneficial for maintaining data
integrity.
• Producer:
⚬ Generates a data item.
⚬ Adds the item to the buffer.
⚬ If the buffer is full, the producer may be blocked until space becomes available.
• Consumer:
⚬ Checks if the buffer is empty.
⚬ If the buffer is not empty, removes an item from the buffer and processes it.
⚬ If the buffer is empty, the consumer may be blocked until a new item is added.
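The blocking behavior described above can be sketched in Python with a bounded queue, which blocks the producer when the buffer is full and the consumer when it is empty (a minimal single-producer, single-consumer illustration, not tied to any particular architecture):

```python
import queue
import threading

buf = queue.Queue(maxsize=4)  # bounded buffer shared by both threads
results = []

def producer(n):
    # Generates data items; put() blocks while the buffer is full.
    for i in range(n):
        buf.put(i)

def consumer(n):
    # get() blocks while the buffer is empty, waiting for the producer.
    for _ in range(n):
        results.append(buf.get())

p = threading.Thread(target=producer, args=(10,))
c = threading.Thread(target=consumer, args=(10,))
p.start(); c.start()
p.join(); c.join()
print(results)  # every produced item was consumed exactly once, in order
```

Because there is one producer and one consumer over a FIFO queue, items arrive in production order; with two consumers, the further issues discussed above arise.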
Mutual exclusion
Understanding Mutual Exclusion
• Mutual exclusion is essential for preventing multiple processes from
accessing shared resources simultaneously, which can lead to
inconsistencies.
• Atomic operations are crucial for implementing mutual exclusion,
allowing operations to be completed without interruption from other
processes.
Atomic Operations and Their Implementation
• The test and set operation is a fundamental atomic operation that
checks a memory address and modifies it atomically, ensuring that no
other operations interfere during this process.
• More advanced atomic operations, such as compare and swap, enhance
functionality by allowing conditional updates based on the current value
in memory.
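As an illustration, test-and-set and compare-and-swap can be modeled in Python (a sketch: real hardware executes each as a single indivisible instruction, so here an internal lock merely stands in for that atomicity; the `AtomicCell` class is hypothetical):

```python
import threading

class AtomicCell:
    """Models one memory word with atomic read-modify-write operations."""
    def __init__(self, value=0):
        self.value = value
        self._guard = threading.Lock()  # stands in for hardware atomicity

    def test_and_set(self):
        # Atomically read the old value and write 1.
        with self._guard:
            old = self.value
            self.value = 1
            return old

    def compare_and_swap(self, expected, new):
        # Atomically update only if the current value matches `expected`.
        with self._guard:
            if self.value == expected:
                self.value = new
                return True
            return False

# A spinlock acquire built from test-and-set:
flag = AtomicCell(0)
assert flag.test_and_set() == 0   # first caller acquires: old value was 0
assert flag.test_and_set() == 1   # later callers see it already taken
flag.value = 0                    # release

print(flag.compare_and_swap(0, 42))  # True: value was 0, now 42
print(flag.compare_and_swap(0, 7))   # False: value is 42, left unchanged
```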
Sequential consistency
• Sequential Consistency ensures that the execution sequence of
instructions from all processors appears as a valid interleaving of
their individual instruction orders.
• It is a strong model that guarantees that all processors see the
same order of operations, which is not typically implemented in
modern computers due to performance constraints.
Examples of Valid and Invalid Orders
• Valid sequentially consistent orders can include various
interleavings, such as executing instructions from different
processors in a way that respects their individual order.
• An invalid order occurs when the relative order of operations from
a single processor is violated, leading to inconsistencies in the
observed results.
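The notion of a valid order can be made concrete: a sequentially consistent global order is any merge of the per-processor instruction streams that preserves each stream's internal order. A small Python sketch (with hypothetical two-instruction streams) enumerates all of them:

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves the internal
    order of each, i.e., every sequentially consistent global order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

p1 = ["P1: st x", "P1: ld y"]   # processor 1's program order
p2 = ["P2: st y", "P2: ld x"]   # processor 2's program order
orders = list(interleavings(p1, p2))
print(len(orders))  # C(4,2) = 6 valid sequentially consistent orders
# An order placing "P1: ld y" before "P1: st x" would be invalid:
# it reverses P1's individual program order.
```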
Issues in Sequential
Consistency
• Performance Overhead
• Hardware Complexity
• Programming Complexity
• Practical Limitations
• Distributed Systems
True sequential consistency is challenging to achieve,
especially with caches, as data visibility between
processors becomes a concern.
Terminologies
• Definition of Race Conditions: A race condition occurs when two or more threads or
processes access shared data and try to change it at the same time. The final outcome
depends on the timing of their execution, which can lead to unpredictable results.
• Role of Sequential Consistency: Sequential consistency provides a model that
ensures all memory operations appear to occur in a specific order. This means that if a
program adheres to sequential consistency, the operations from different threads will be
interleaved in a way that respects the order of operations from each individual thread.
• Prevention of Race Conditions: By enforcing a sequentially consistent memory
model, the likelihood of race conditions is reduced. Since all threads see the same order
of operations, it becomes easier to reason about the state of shared data and avoid
conflicts.
• Simplified Reasoning: With sequential consistency, programmers can assume that
operations will execute in a predictable manner, making it easier to identify potential
race conditions and implement appropriate synchronization mechanisms.
• Weak Models and Race Conditions: In contrast, weaker memory models may allow for
out-of-order execution and different visibility of operations, increasing the risk of race
conditions. Programmers must be more cautious and implement additional
synchronization to ensure correctness.
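The lost-update pattern behind many race conditions can be shown deterministically by interleaving the read and write halves of two unsynchronized increments by hand (a sketch: in a real program the interleaving depends on thread timing):

```python
# Shared counter; each "thread" performs: tmp = counter; counter = tmp + 1
counter = 0

# Unlucky interleaving: both threads read before either writes.
tmp_a = counter      # thread A reads 0
tmp_b = counter      # thread B also reads 0
counter = tmp_a + 1  # thread A writes 1
counter = tmp_b + 1  # thread B overwrites with 1: A's update is lost

print(counter)  # 1, not the expected 2
```

Under sequential consistency this interleaving is still legal; eliminating the lost update requires making the read-modify-write atomic, e.g., with a lock.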
Locks
• Locks (mutexes) allow mutual exclusion,
ensuring that only one process can execute a
critical section of code at any given time.
• Mutual Exclusion: Locks ensure that only one
thread can access a critical section of code
at a time. This prevents race conditions
where multiple threads might try to read or
write shared data simultaneously.
• Synchronization: By locking a resource, a
thread can safely perform operations without
interference from other threads, ensuring
data integrity.
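A minimal Python sketch of lock-based mutual exclusion: several threads increment a shared counter, but the `with lock:` critical section admits one thread at a time, so no update is lost:

```python
import threading

counter = 0
lock = threading.Lock()

def worker(iterations):
    global counter
    for _ in range(iterations):
        with lock:           # critical section: one thread at a time
            counter += 1     # read-modify-write is now safe

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000: every increment took effect
```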
Semaphores
• Semaphores provide a more flexible
approach, allowing a specified number of
processes to enter a critical section
concurrently, which is useful in scenarios
with multiple resources.
• Semaphores
⚬ Controlled Access: Semaphores allow a
specified number of threads to access a
resource concurrently.
⚬ Flexibility: Unlike locks, which only allow
one thread at a time, semaphores can be
configured to permit a certain number of
threads (N) to enter a critical section.
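This N-at-a-time behavior can be sketched with Python's counting semaphore; here at most 2 of 6 threads hold the resource concurrently, and the code records the peak occupancy it observed (the bookkeeping variables are illustrative, not part of any semaphore API):

```python
import threading
import time

sem = threading.Semaphore(2)   # permit N = 2 threads inside at once
state_lock = threading.Lock()  # protects the bookkeeping counters
inside = 0
peak = 0

def use_resource():
    global inside, peak
    with sem:                  # blocks while 2 threads are already inside
        with state_lock:
            inside += 1
            peak = max(peak, inside)
        time.sleep(0.05)       # hold the resource briefly
        with state_lock:
            inside -= 1

threads = [threading.Thread(target=use_resource) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(peak)  # never exceeds 2, the semaphore's initial count
```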
Memory fences and models
Memory Fences and Their Importance
• Memory fences (or barriers) are introduced to ensure that
certain memory operations are completed before others
begin, helping to maintain order and consistency.
• Different types of memory fences exist, such as load memory
fences and directional memory fences, which provide varying
levels of control over memory operations.
Weak Memory Models
• Most modern processors implement weaker memory models
rather than strict sequential consistency, allowing for
performance optimizations through reordering.
• Examples of memory ordering models include total store
ordering, partial store ordering, and weak ordering, each with
specific rules about how loads and stores can be reordered.
Memory Bus
The memory bus is a type
of computer bus, usually
in the form of a set of
wires or conductors which
connects electrical
components and allows
transfers of data and
addresses from the main
memory to the central
processing unit (CPU) or a
memory controller.
Bus-based
multiprocessor
A bus-based multiprocessor system is a type
of parallel computing architecture where
multiple processors share a common bus to
communicate with each other and access
shared memory.
Key Components of a Bus-Based
Multiprocessor:
• Processors: Multiple processors, each with
its own registers and local cache.
• Shared Memory: A common memory area
accessible to all processors.