
MODULE 3

Mapping and Scheduling


Definition:
• Mapping: The process of assigning computational tasks or data
elements to specific processors or processing units within a parallel
computing system. Effective mapping aims to optimize performance
by balancing workload and minimizing communication overhead.
• Scheduling: The process of determining the order and timing of task
execution on processors. It ensures that tasks are executed in an
efficient sequence to meet performance and resource constraints.
Mapping:
1. Task Mapping:
• Definition: Assigns computational tasks to processors based on the
task’s requirements and the processor’s capabilities.
• Objective: Optimize performance by balancing the load across
processors and minimizing communication between tasks.
• Techniques:
• Static Mapping: Tasks are assigned to processors before
runtime based on predetermined criteria. Suitable for
predictable workloads.
• Dynamic Mapping: Tasks are assigned to processors during
runtime based on current load and system conditions. Allows
for adaptive load balancing.
2. Data Mapping:
• Definition: Assigns data elements to processors in such a way that
data access and communication are optimized.
• Objective: Enhance performance by improving data locality and
reducing data transfer overhead.
• Techniques:
• Block Mapping: Data is divided into blocks, each assigned to a
different processor. Useful for load balancing and simplifying
data management.
• Cyclic Mapping: Data elements are assigned to processors in
a cyclic fashion, ensuring an even distribution of data (both
distributions are sketched below).
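
Both distributions reduce to simple index arithmetic. A minimal
sketch in Python (the array size n, the processor count p, and the
function names are illustrative, not from the source):

# Sketch: block vs. cyclic mapping of n data elements to p processors.
def block_owner(i, n, p):
    # Block mapping: contiguous chunks of ceil(n/p) elements per processor.
    chunk = -(-n // p)          # ceiling division
    return i // chunk

def cyclic_owner(i, p):
    # Cyclic mapping: elements are dealt out round-robin, like cards.
    return i % p

n, p = 10, 3
print([block_owner(i, n, p) for i in range(n)])   # [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
print([cyclic_owner(i, p) for i in range(n)])     # [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]
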
Applications:
• Parallel Computing: Mapping and scheduling are crucial for
optimizing the performance of parallel algorithms and ensuring
efficient use of resources.
• Distributed Systems: Effective mapping and scheduling are
essential for managing tasks and data across distributed computing
environments.
Challenges:
• Load Balancing: Ensuring that all processors or nodes have an
equal amount of work to prevent idle times and performance
bottlenecks.
• Communication Overhead: Minimizing the overhead associated
with data transfer and synchronization between processors.
Scheduling:
1. Static Scheduling:
• Definition: Tasks are assigned to processors in a predetermined
manner before runtime. The scheduling decisions do not change
during the execution.
• Advantages:
• Simplicity: Easier to implement and manage due to fixed task
assignments.
• Predictability: Provides predictable performance if task
execution times are known.
• Challenges:
• Inflexibility: Does not adapt to changes in task execution times
or processor availability.
• Load Imbalance: Potential for uneven distribution of tasks if
not carefully managed.
2. Dynamic Scheduling:
• Definition: Tasks are assigned to processors during runtime based
on current system conditions, such as processor load and task
requirements.
• Advantages:
• Adaptability: Can adjust to changes in workload and system
conditions, leading to better load balancing and performance.
• Fault Tolerance: Can handle task and processor failures by
redistributing tasks dynamically.
• Challenges:
• Complexity: More complex to implement due to runtime
monitoring and task reassignment.
• Overhead: May introduce additional overhead in terms of
communication and decision-making (the contrast with static
assignment is sketched below).
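
A small Python sketch makes the contrast concrete. The task
execution times, the round-robin rule for the static case, and the
least-loaded-first heuristic for the dynamic case are illustrative
assumptions, not prescribed by the source:

import heapq

tasks = [9, 2, 2, 8, 3, 3, 7]   # hypothetical execution times
p = 3                           # number of processors

# Static scheduling: round-robin assignment fixed before runtime.
static_load = [0] * p
for i, t in enumerate(tasks):
    static_load[i % p] += t

# Dynamic scheduling: each task goes to the currently least-loaded processor.
heap = [(0, j) for j in range(p)]          # (load, processor id)
for t in tasks:
    load, j = heapq.heappop(heap)
    heapq.heappush(heap, (load + t, j))

print("static  makespan:", max(static_load))          # 24
print("dynamic makespan:", max(l for l, _ in heap))   # 15
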
Techniques:
• Preemptive Scheduling: Tasks can be paused and resumed based
on current conditions, allowing for better load balancing.
• Non-Preemptive Scheduling: Tasks are assigned to processors
without interruptions, suitable for systems with minimal context
switching.
Applications:
• Real-Time Systems: Ensures that tasks are executed within
specified deadlines by managing task scheduling effectively.
• High-Performance Computing: Optimizes the execution of parallel
algorithms and simulations by managing task execution efficiently.
Challenges:
• Resource Constraints: Managing resources effectively while
ensuring that tasks are completed on time.
• Synchronization: Coordinating tasks to avoid conflicts and ensure
correct execution.
Comparison:
• Mapping focuses on assigning tasks or data to processors to
optimize resource usage and communication.
• Scheduling focuses on determining the order and timing of task
execution to improve performance and meet deadlines.
Both mapping and scheduling are critical for optimizing the performance
of parallel and distributed computing systems, each addressing different
aspects of task management and resource utilization.
Mapping Data to Processors on Processor Arrays and
Multicomputers
1. Mapping Data to Processor Arrays
Processor Arrays:
• Definition: Processor arrays are a type of parallel architecture
where processors are arranged in a regular grid or array. Each
processor typically performs computations on a specific portion of
the data, with direct communication between neighboring
processors.
Data Mapping Techniques:
1. Spatial Mapping:
• Definition: Data elements are mapped to processors based on
their spatial location in the processor array.
• Technique: In a 2D processor array, for example, a 2D grid of
data is mapped to a 2D array of processors. Each processor
handles a specific region of the data grid.
• Applications: Useful for problems with regular, grid-like data
structures, such as image processing and matrix operations.
2. Row/Column Mapping:
• Definition: Data rows or columns are mapped to rows or
columns of processors in the array.
• Technique: Each row or column of data is assigned to a
corresponding row or column of processors, facilitating efficient
data access and parallel processing.
• Applications: Common in applications involving matrix
operations where data can be naturally divided into rows or
columns.
3. Block Mapping:
• Definition: Data is divided into blocks, and each block is
assigned to a processor.
• Technique: The data grid is partitioned into smaller blocks, with
each processor responsible for a specific block. This approach
can help balance the computational load and manage data
locality.
• Applications: Useful for problems where data can be
partitioned into independent blocks, such as in large-scale
simulations and computational fluid dynamics (see the sketch
after this list).
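
For a 2D grid, block mapping reduces to integer division on the row
and column indices. A minimal sketch in Python (the grid and array
dimensions are illustrative, and P is assumed to divide N evenly):

# Sketch: map an N x N data grid onto a P x P processor array by blocks.
N, P = 8, 2            # 8x8 data grid, 2x2 processor array
bs = N // P            # block size per processor dimension

def owner(i, j):
    # Processor (i // bs, j // bs) owns data element (i, j).
    return (i // bs, j // bs)

print(owner(0, 0))     # (0, 0): top-left block
print(owner(3, 5))     # (0, 1): top-right block
print(owner(7, 2))     # (1, 0): bottom-left block
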
Challenges:
• Load Balancing: Ensuring that each processor has an equal
amount of work to prevent bottlenecks.
• Communication Overhead: Managing communication between
processors, especially when data dependencies exist.
2. Mapping Data to Multicomputers
Multicomputers:
• Definition: Multicomputers consist of multiple independent
computers or nodes, each with its own memory. These nodes
communicate through a network.
Data Mapping Techniques:
1. Data Distribution:
• Definition: Data is partitioned and distributed across the nodes
of the multicomputer.
• Technique: Data can be partitioned into chunks or segments,
with each node responsible for a specific chunk. Nodes perform
computations on their assigned data and communicate with
other nodes as needed.
• Applications: Suitable for large-scale data processing tasks
and distributed databases where data can be naturally
partitioned.
2. Block Partitioning:
• Definition: Data is divided into blocks, and each block is
assigned to a different node.
• Technique: The data set is divided into blocks of roughly equal
size, and each node processes a specific block. This approach
helps in balancing the load and improving parallel efficiency.
• Applications: Common in parallel computing applications
where data can be evenly divided, such as in matrix
computations and distributed machine learning.
3. Striping:
• Definition: Data is divided into strips or segments, which are
distributed across multiple nodes.
• Technique: Data is split into overlapping or non-overlapping
strips, with each node responsible for processing its assigned
strips. This method helps in balancing the load and optimizing
data access patterns.
• Applications: Used in file systems and distributed storage
systems where data needs to be spread across multiple nodes
(see the sketch after this list).
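
A minimal sketch in Python of non-overlapping, round-robin striping
(the strip size and node count are illustrative assumptions):

# Sketch: stripe a data set across nodes in fixed-size strips, as in
# striped file systems.
def stripe(data, nodes, strip_size):
    parts = [[] for _ in range(nodes)]
    for s in range(0, len(data), strip_size):
        node = (s // strip_size) % nodes      # strips dealt out round-robin
        parts[node].extend(data[s:s + strip_size])
    return parts

print(stripe(list(range(12)), nodes=3, strip_size=2))
# [[0, 1, 6, 7], [2, 3, 8, 9], [4, 5, 10, 11]]
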
Challenges:
• Data Communication: Efficiently managing communication
between nodes, especially when frequent data exchanges are
required.
• Scalability: Ensuring that the system scales effectively with
increasing numbers of nodes and data volume.
Comparison:
• Processor Arrays:
• Topology: Fixed grid-based arrangement with direct
communication between neighboring processors.
• Mapping: Often uses spatial, row/column, or block mapping.
• Best for: Regular, grid-like data structures with predictable
access patterns.
• Multicomputers:
• Topology: Network-based arrangement with independent
nodes communicating over a network.
• Mapping: Typically uses data distribution, block partitioning, or
striping.
• Best for: Large-scale data processing and distributed
applications with flexible data partitioning needs.

Dynamic Load Balancing on Multicomputers


Definition:
• Dynamic Load Balancing refers to the process of distributing
computational tasks or workload dynamically across multiple
processors or nodes in a multicomputer system to ensure that each
node is utilized efficiently and that no single node becomes a
bottleneck.
Characteristics:
• Real-Time Adjustments: Load balancing decisions are made in
real-time based on current system conditions and workloads. This
allows the system to adapt to varying demands and resource
availability.
• Task Redistribution: Tasks or data can be dynamically reassigned
from overloaded nodes to underutilized nodes to balance the load
and improve overall system performance.
Techniques:
1. Centralized Approach:
• Definition: A central controller or coordinator monitors the load
on each node and makes decisions about task redistribution.
• Advantages: Simplifies the load balancing process and
ensures a global view of the system’s load.
• Challenges: The central controller can become a bottleneck
and may not scale well with a large number of nodes.
2. Distributed Approach:
• Definition: Each node or processor independently monitors its
own load and communicates with other nodes to share load
information and coordinate task redistribution.
• Advantages: Reduces the risk of a single point of failure and
can scale better with a larger number of nodes.
• Challenges: Requires efficient communication protocols and
algorithms to ensure accurate load distribution.
3. Hybrid Approach:
• Definition: Combines elements of both centralized and
distributed approaches. A central coordinator may be used for
high-level management, while nodes also make local load
balancing decisions.
• Advantages: Balances the benefits of centralized oversight
with the scalability of distributed approaches.
• Challenges: Complexity in implementation and coordination
between central and local mechanisms.
Algorithms:
• Task Stealing: Nodes with lighter workloads can "steal" tasks from
nodes that are heavily loaded. This approach dynamically
redistributes tasks based on current load (see the sketch after
this list).
• Work Redistribution: Tasks are periodically reassigned based on
observed load imbalances. This can be done using algorithms that
measure load and redistribute work accordingly.
• Load Estimation: Nodes estimate their load and communicate with
other nodes to exchange load information. Based on this
information, tasks are reassigned to balance the load.
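
A minimal, idealized sketch of task stealing in Python. A real
runtime would run the workers concurrently; the queue contents,
victim-selection rule, and node count here are illustrative:

import random
from collections import deque

# Sketch: each node keeps a deque of tasks; an idle node steals from
# the back of a randomly chosen victim's deque.
random.seed(1)
queues = [deque(range(12)), deque(), deque()]   # node 0 starts overloaded

def step(node):
    q = queues[node]
    if q:
        return q.popleft()                      # run own work first
    victims = [v for v in range(len(queues)) if v != node and queues[v]]
    if victims:
        return queues[random.choice(victims)].pop()   # steal from the back
    return None

done = [0, 0, 0]
while any(queues):
    for node in range(3):
        if step(node) is not None:
            done[node] += 1
print(done)   # the work ends up spread across all three nodes
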
Applications:
• Parallel Computing: Used in parallel computing environments to
optimize the execution of parallel algorithms by ensuring that
computational resources are used efficiently.
• Distributed Databases: Helps in managing query loads and data
distribution across multiple nodes in a distributed database system.
• High-Performance Computing: Enhances performance in high-
performance computing tasks by ensuring balanced utilization of
compute resources.
Advantages:
• Improved Utilization: Ensures that all nodes are utilized effectively,
reducing idle times and improving overall system throughput.
• Adaptability: Can adapt to changes in workload and resource
availability, leading to better performance and efficiency.
• Fault Tolerance: Helps mitigate the impact of node failures by
redistributing tasks from failed or underperforming nodes to others.
Challenges:
• Overhead: Dynamic load balancing introduces overhead in terms of
communication and computation for monitoring and redistributing
tasks.
• Complexity: Implementing effective dynamic load balancing
algorithms can be complex, requiring careful design and tuning.
• Scalability: Ensuring that load balancing remains effective as the
number of nodes increases can be challenging.
Comparison with Static Load Balancing:
• Dynamic Load Balancing: Adapts to real-time changes in workload
and system conditions, offering greater flexibility and
responsiveness.
• Static Load Balancing: Involves fixed task assignments and does
not adjust to changing conditions, which can lead to inefficiencies if
workload distribution is not optimal.

Static Scheduling on UMA Multiprocessors


Definition:
• Static Scheduling refers to the process of assigning tasks to
processors in a multiprocessor system before runtime, based on
predetermined criteria. In UMA multiprocessors, where all
processors have equal access to a shared memory, static
scheduling involves assigning tasks and managing their execution
without dynamic adjustments during runtime.
UMA Multiprocessors:
• Uniform Memory Access (UMA): A multiprocessor architecture
where all processors share a single, uniform memory space. Each
processor has equal access time to any location in the memory.
Characteristics:
• Predefined Task Assignment: Tasks are assigned to processors
before execution begins. This assignment does not change during
the execution of the tasks.
• Equal Memory Access: All processors have the same access time
to the shared memory, which simplifies memory access and
coordination compared to Non-Uniform Memory Access (NUMA)
systems.
Static Scheduling Techniques:
1. List Scheduling:
• Definition: Tasks are ordered in a list based on their priority or
dependencies, and each task is assigned to a processor in a
fixed sequence (a sketch follows this list).
• Advantages: Simple to implement and manage. Suitable for
problems with predictable task execution times and
dependencies.
• Challenges: Does not adapt to runtime variations in task
execution times or processor availability.
2. Block Scheduling:
• Definition: Tasks are grouped into blocks and assigned to
processors in such a way that each processor handles a block
of tasks.
• Advantages: Can simplify task management and reduce
overhead in environments with predictable workloads.
• Challenges: May lead to load imbalances if the task blocks are
not evenly distributed in terms of computational demands.
3. Cyclic Scheduling:
• Definition: Tasks are assigned to processors in a cyclic
fashion, ensuring that each processor gets a fair share of tasks
over time.
• Advantages: Provides an even distribution of tasks across
processors.
• Challenges: May not be optimal if tasks have varying execution
times or if there are dependencies between tasks.
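
A minimal sketch in Python of list scheduling with the
longest-processing-time-first priority rule (one common choice; the
rule and the task times are illustrative assumptions). All
assignments are computed before runtime from known execution times:

import heapq

def list_schedule(times, p):
    heap = [(0, j) for j in range(p)]       # (finish time, processor id)
    schedule = {j: [] for j in range(p)}
    # Priority order: longest task first; each task goes to the processor
    # that becomes free earliest. The schedule is fixed before execution.
    for task in sorted(range(len(times)), key=lambda t: -times[t]):
        load, j = heapq.heappop(heap)
        schedule[j].append(task)
        heapq.heappush(heap, (load + times[task], j))
    return schedule, max(load for load, _ in heap)

times = [4, 7, 2, 9, 3, 5]                  # hypothetical execution times
schedule, makespan = list_schedule(times, p=2)
print(schedule)    # {0: [3, 0, 2], 1: [1, 5, 4]}
print(makespan)    # 15
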
Applications:
• Real-Time Systems: Ensures that tasks are completed within their
deadlines by assigning tasks based on known execution times and
priorities.
• Embedded Systems: Used in embedded systems where the
workload is predictable and task execution can be planned in
advance.
• Parallel Computing: Suitable for applications with fixed task
structures where runtime variations are minimal.
Advantages:
• Simplicity: Easier to implement and manage compared to dynamic
scheduling approaches.
• Predictability: Provides predictable performance if the task
execution times and processor capabilities are well understood.
• Reduced Overhead: No need for runtime monitoring and task
reassignment, reducing the system's overhead.
Challenges:
• Inflexibility: Cannot adapt to changes in task execution times or
processor availability during runtime. This can lead to inefficiencies if
tasks or processors deviate from expected behavior.
• Load Imbalance: May result in uneven utilization of processors if
tasks are not perfectly balanced, leading to some processors being
overburdened while others are underutilized.
• Dependency Management: Handling task dependencies can be
complex, as tasks must be assigned in a way that respects their
dependencies and execution order.
Comparison with Dynamic Scheduling:
• Static Scheduling: Involves predetermined task assignments and is
simpler to implement but lacks adaptability.
• Dynamic Scheduling: Adapts to runtime conditions and task
variations, potentially offering better load balancing but with
increased complexity and overhead.

Deadlock
Definition:
• Deadlock is a situation in a multiprogramming or multi-threading
environment where a set of processes or threads become stuck in a
state of indefinite waiting because each is waiting for a resource that
another process holds. As a result, none of the processes can
proceed, leading to a standstill.
Conditions for Deadlock: For a deadlock to occur, the following four
conditions must be present simultaneously (known as the Coffman
conditions):
1. Mutual Exclusion:
• Definition: At least one resource must be held in a non-
shareable mode. That is, only one process can use the
resource at any given time.
• Example: A printer can only be used by one process at a time.
2. Hold and Wait:
• Definition: A process holding at least one resource is waiting to
acquire additional resources currently held by other processes.
• Example: A process that has a printer but is waiting for access
to a scanner held by another process.
3. No Preemption:
• Definition: Resources cannot be forcibly taken from a process
holding them; they must be released voluntarily.
• Example: A resource such as memory cannot be forcibly
reclaimed from a process; the process must release it when it
finishes using it.
4. Circular Wait:
• Definition: There exists a set of processes such that each
process is waiting for a resource held by the next process in the
set, forming a circular chain.
• Example: Process A waits for a resource held by Process B,
Process B waits for a resource held by Process C, and Process
C waits for a resource held by Process A (a code sketch of this
situation follows).
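
All four conditions arise together in the classic two-lock example,
sketched below in Python. Each thread holds one lock (mutual
exclusion, hold and wait), locks cannot be revoked (no preemption),
and each waits for the other's lock (circular wait). Running it may
genuinely hang, which is the point:

import threading
import time

lock_a, lock_b = threading.Lock(), threading.Lock()

def worker(first, second, name):
    with first:                  # hold one resource ...
        time.sleep(0.1)          # ... pause so the other thread takes its lock
        with second:             # ... then wait for the other resource
            print(name, "finished")

# t1 acquires A then B; t2 acquires B then A: a circular wait.
t1 = threading.Thread(target=worker, args=(lock_a, lock_b, "t1"))
t2 = threading.Thread(target=worker, args=(lock_b, lock_a, "t2"))
t1.start(); t2.start()
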
Detection of Deadlock:
• Deadlock Detection Algorithm:
• Definition: Algorithms are used to detect the presence of
deadlock in a system. One common method is to use a
Resource Allocation Graph where nodes represent processes
and resources, and edges represent requests and allocations.
• Technique: A cycle in the resource allocation graph indicates
the presence of a deadlock (see the sketch below).
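
With single-instance resources, the resource allocation graph
reduces to a wait-for graph over processes, and detection becomes a
depth-first search for a cycle. A minimal sketch in Python (the
graph below is illustrative):

# Sketch: deadlock detection via cycle search in a wait-for graph.
# An edge "A" -> "B" means process A waits for a resource held by B.
def has_cycle(graph):
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}
    def visit(node):
        color[node] = GRAY                  # on the current DFS path
        for nxt in graph.get(node, []):
            if color.get(nxt, WHITE) == GRAY:        # back edge: cycle found
                return True
            if color.get(nxt, WHITE) == WHITE and visit(nxt):
                return True
        color[node] = BLACK                 # fully explored, no cycle through it
        return False
    return any(color[n] == WHITE and visit(n) for n in graph)

wait_for = {"A": ["B"], "B": ["C"], "C": ["A"]}   # A -> B -> C -> A
print(has_cycle(wait_for))                        # True: deadlock detected
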
Prevention of Deadlock:
• Eliminate One Condition: Prevent deadlock by ensuring that at
least one of the Coffman conditions cannot occur.
• Mutual Exclusion: Make resources shareable if possible,
though this might not always be feasible.
• Hold and Wait: Require processes to request all resources at
once, which might increase resource consumption but prevents
hold and wait.
• No Preemption: Allow resources to be forcibly taken from
processes, which might involve rolling back processes to a safe
state.
• Circular Wait: Impose an ordering on resource requests and
require processes to request resources in a specific sequence
(sketched below).
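
Breaking circular wait with a global resource ordering takes only a
few lines of Python (the lock ids and helper names are
illustrative): every process acquires locks in ascending id order,
so no cycle of waits can form.

import threading

locks = {i: threading.Lock() for i in range(3)}

def acquire_in_order(needed):
    # Always acquire in ascending id order; release in reverse order.
    order = sorted(needed)
    for i in order:
        locks[i].acquire()
    return order

def release(held):
    for i in reversed(held):
        locks[i].release()

held = acquire_in_order({2, 0})   # acquired as 0 then 2, never 2 then 0
release(held)
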
Avoidance of Deadlock:
• Banker’s Algorithm:
• Definition: A resource allocation algorithm that determines
whether resource requests can be safely granted without
leading to a deadlock.
• Technique: It uses information about available resources,
maximum claims, and current allocations to ensure that
resource allocation will not leave the system in an unsafe state
(a sketch of the safety check follows).
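
The core of the algorithm is the safety check: a state is safe if
some ordering lets every process finish with the resources that are
or will become available. A minimal sketch in Python, using a
textbook-style example state (the matrices are illustrative;
need[i][j] is the maximum claim minus the current allocation):

def is_safe(available, need, alloc):
    work = list(available)
    finished = [False] * len(need)
    progress = True
    while progress:
        progress = False
        for i, done in enumerate(finished):
            if not done and all(n <= w for n, w in zip(need[i], work)):
                # Process i can run to completion with what is on hand;
                # reclaim its allocation and mark it finished.
                work = [w + a for w, a in zip(work, alloc[i])]
                finished[i] = True
                progress = True
    return all(finished)

available = [3, 3, 2]
alloc = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
need  = [[7, 4, 3], [1, 2, 2], [6, 0, 0], [0, 1, 1], [4, 3, 1]]
print(is_safe(available, need, alloc))   # True: a safe sequence exists
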
Recovery from Deadlock:
• Process Termination: Kill one or more processes involved in the
deadlock to break the cycle.
• Technique: Choose processes to terminate based on criteria
such as priority or the amount of work lost.
• Resource Preemption: Take resources away from some processes
and allocate them to others to resolve the deadlock.
• Technique: Rollback processes to a safe state and reallocate
resources.
Applications:
• Database Systems: Managing concurrent transactions and
ensuring that database locks do not lead to deadlock.
• Operating Systems: Handling resource allocation and process
scheduling to avoid deadlock situations.
Challenges:
• Complexity: Implementing deadlock prevention, avoidance, or
recovery mechanisms can be complex and may impact system
performance.
• Overhead: The overhead associated with deadlock detection and
resolution can affect system efficiency and resource utilization.
Comparison:
• Detection vs. Prevention: Detection involves identifying deadlocks
after they occur and taking corrective actions, while prevention aims
to avoid the conditions that lead to deadlock in the first place.
• Avoidance vs. Recovery: Avoidance focuses on dynamically
managing resources to ensure safe allocation, while recovery deals
with handling deadlocks after they have occurred.
