Concurrency in Computing
1. Asymmetric Multiprocessing: Uses different processing units specialized for distinct tasks.
For example, CPUs handle general tasks while GPUs process graphical data.
2. Symmetric Multiprocessing (SMP): Multiple identical processors share the load equally,
improving task efficiency.
3. NUMA: Nonuniform Memory Access is a system where memory access time depends on the
memory location relative to the processor.
4. Clustered Multiprocessing: A system where multiple computers work together as a single
virtual computer, distributing tasks.
5. Shared Memory Access: In SMP systems, processors share a common memory, which allows
fast data exchange between them.
6. Asymmetric Multiprocessing Example: GPUs and CPUs in a machine use this approach,
where CPUs handle general-purpose tasks and GPUs manage specific functions like
rendering.
7. SMP Evolution: Modern multicore systems are an evolution of symmetric multiprocessing
where each core acts as an independent processing unit.
8. Task Distribution: In SMP, tasks are dynamically distributed to available processors to
balance load.
9. Memory Architecture: SMP benefits from a shared memory architecture, while asymmetric
systems may rely on distributed memory.
10. Parallelism in SMP: SMP enables efficient parallelism by letting multiple processors work on
tasks simultaneously.
Multicore Systems
1. Multicore Processor: A single processor with multiple cores that work independently,
sharing a common memory pool.
2. L1 Cache: Each core in a multicore processor has its own L1 cache, providing fast access to
frequently used data.
3. L2 Cache: This cache is shared among all cores in the processor, acting as a bridge to the
slower main memory.
4. Shared Bus: Cores are connected to each other and the memory through a shared
communication bus.
5. Core Independence: Each core can run separate instructions, enabling true parallel
processing within the same processor.
6. Parallel Workloads: Multicore systems are ideal for workloads that can be split into
independent tasks.
7. Energy Efficiency: Adding cores is more energy-efficient than raising the clock speed of a
single core to reach the same performance.
8. Task Execution: Tasks are distributed among cores to maximize performance and minimize
execution time.
9. Multithreading in Multicore: Cores in a multicore processor can handle multiple threads
simultaneously.
10. Frequency Scaling Limitation: Due to power and heat constraints, processor frequency cannot
be increased indefinitely, making multicore systems essential for further performance gains.
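To make the parallel-workload and task-distribution points concrete, here is a minimal C# sketch that spreads an element-wise computation across the available cores with Parallel.For; the array size and the per-element work are arbitrary choices made for illustration.

// Minimal C# sketch: an independent, element-wise workload split across cores.
using System;
using System.Threading.Tasks;

class MulticoreDemo
{
    static void Main()
    {
        int cores = Environment.ProcessorCount;   // number of logical cores available
        Console.WriteLine($"Logical cores: {cores}");

        double[] data = new double[1_000_000];
        for (int i = 0; i < data.Length; i++) data[i] = i;

        // Each iteration is independent, so the runtime can schedule
        // chunks of the loop on different cores in parallel.
        Parallel.For(0, data.Length, i =>
        {
            data[i] = Math.Sqrt(data[i]) * 2.0;
        });

        Console.WriteLine($"First element: {data[0]}, last element: {data[^1]}");
    }
}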
Process vs Thread
1. Process Definition: A process is an instance of a running program that has its own memory
space.
2. Thread Definition: A thread is a smaller unit of execution within a process, capable of
running independently but sharing memory with other threads.
3. Multiple Processes: A system can run multiple processes concurrently, each isolated from
others.
4. Memory Isolation: Processes have isolated memory spaces, preventing interference between
them.
5. Thread Communication: Threads within the same process can communicate directly as they
share the same memory space.
6. Task Independence: Threads allow tasks to be broken into smaller units that can be executed
independently.
7. Process Overhead: Creating and managing processes is resource-intensive compared to
threads.
8. Thread Efficiency: Threads are more efficient for tasks requiring frequent communication or
data sharing.
9. Process Synchronization: Processes need explicit mechanisms like inter-process
communication (IPC) for synchronization, while threads use synchronization primitives.
10. Multithreading Benefits: Multithreading enhances application performance, particularly in
systems with multiple cores.
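A small C# sketch of the distinction: two threads created inside one process share a variable directly, while a separately launched process gets its own isolated memory space. The child command used here (dotnet --info) is only an example of starting another process.

// Minimal C# sketch of the process/thread distinction.
using System;
using System.Diagnostics;
using System.Threading;

class ProcessVsThreadDemo
{
    static int sharedCounter = 0;                       // visible to every thread in this process
    static readonly object counterLock = new object();

    static void Main()
    {
        Thread t1 = new Thread(Increment);
        Thread t2 = new Thread(Increment);
        t1.Start();
        t2.Start();
        t1.Join();                                      // wait for both threads before reading the result
        t2.Join();
        Console.WriteLine($"Counter after two threads: {sharedCounter}");

        // A separate process has its own address space and cannot see sharedCounter.
        var child = Process.Start(new ProcessStartInfo("dotnet", "--info") { UseShellExecute = false });
        child?.WaitForExit();
    }

    static void Increment()
    {
        for (int i = 0; i < 100_000; i++)
        {
            lock (counterLock) { sharedCounter++; }     // synchronized access to shared memory
        }
    }
}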
1. Multitasking: The ability of an operating system to execute multiple processes at the same
time.
2. Time Slicing: In multitasking, the operating system allocates small time slices for each
process, giving the illusion of concurrent execution.
3. Thread Scheduling: The operating system schedules threads, allowing multiple threads to run
concurrently on a single processor.
4. Context Switching: When switching between processes, the operating system must save and
load the process state, which incurs overhead.
5. Thread vs Process Context Switching: Thread context switching is faster than process
switching, as threads share memory and resources.
6. Multithreading Support: Modern operating systems and programming languages natively
support multithreading through APIs.
7. Concurrency Illusion: In single-core systems, multithreading gives the illusion of parallelism
by rapidly switching between threads.
8. Parallelism in Multicore Systems: True parallelism is achieved in multicore systems, where
each core can execute a separate thread.
9. Thread Synchronization: Threads need to be synchronized to avoid conflicts when accessing
shared resources.
10. System Responsiveness: Multithreading improves system responsiveness, especially in GUI
applications where background tasks run without freezing the interface.
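The responsiveness point can be sketched in a few lines of C#: a long-running job (simulated here with Sleep) runs on a background thread while the main thread keeps doing its own work.

// Minimal C# sketch: background work keeps the main thread responsive.
using System;
using System.Threading;

class ResponsivenessDemo
{
    static void Main()
    {
        Thread worker = new Thread(() =>
        {
            Thread.Sleep(2000);                      // stands in for a long I/O or computation
            Console.WriteLine("Background work finished.");
        });
        worker.IsBackground = true;                  // does not keep the process alive on its own
        worker.Start();

        // The main thread keeps doing its own work while the background thread runs.
        for (int i = 0; i < 5; i++)
        {
            Console.WriteLine($"Main thread still responsive ({i})...");
            Thread.Sleep(300);
        }
        worker.Join();                               // wait for the background job before exiting
    }
}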
1. Implicit Threading: Threads are managed automatically by the system or APIs, often used in
virtual machine-based environments like Java or .NET.
2. API Control: APIs handle internal threading for specific operations like GUI rendering or
garbage collection.
3. Developer Involvement: Implicit threading requires minimal involvement from developers,
making it easier to use for certain applications.
4. Explicit Threading: Developers explicitly create and manage threads in the program,
allowing fine-grained control over thread execution.
5. Explicit Parallelism: Developers introduce explicit threading to achieve parallelism and
optimize performance for computation-heavy tasks.
6. Thread Pooling: In explicit threading, developers can use thread pools to manage a set of
reusable threads for efficient execution.
7. Concurrency in Explicit Threading: Developers can design programs with concurrent
execution paths using explicit threads.
8. I/O Handling in Explicit Threading: Long I/O operations can be handled in background
threads, keeping the main program responsive.
9. Performance Optimization: Explicit threading allows developers to fine-tune thread
behavior and synchronization for better performance.
10. API Availability: Most modern programming languages provide APIs for both implicit and
explicit threading.
1. Thread Creation: A thread starts in the Unstarted state when it is created but not yet
scheduled for execution.
2. Starting a Thread: When the Start() method is called, the thread transitions from the
Unstarted to the Started state.
3. Staging In: If files or resources are needed for the thread's execution, it enters the StagingIn
state, where data is uploaded to the node.
4. Queued State: After staging, the thread moves to the Queued state, waiting for a free node or
processor to execute the task.
5. Running State: Once a node becomes available, the thread enters the Running state, where
its execution takes place.
6. Failure Handling: If an error occurs during file uploading or execution, the thread transitions
to the Failed state.
7. Completion: After successfully completing its task, the thread transitions to the Completed
state.
8. Staging Out: If output files need to be retrieved, the thread enters the StagingOut state,
where data is transferred back to the main node.
9. Abort State: If the thread is explicitly terminated, it enters the Aborted state, which is a final
state.
10. Middleware Scheduling: Aneka threads are managed by middleware, which handles
scheduling, file transfer, and state transitions automatically.
Thread Synchronization
1. Thread Synchronization Need: Threads often access shared resources, and synchronization
ensures that only one thread modifies a resource at a time.
2. Join Operation: Aneka supports the join operation, which allows one thread to wait for
another thread to complete before proceeding.
3. Locks and Semaphores: In traditional systems, threads use synchronization primitives like
locks and semaphores to coordinate access to shared data.
4. Reader-Writer Locks: These allow multiple threads to read a shared resource simultaneously
while ensuring exclusive access for writing.
5. No Shared Memory in Distributed Systems: In a distributed environment like Aneka,
threads don't share memory, so traditional shared-memory synchronization mechanisms are not applicable.
6. Message Passing: Instead of sharing memory, threads in a distributed system communicate by
passing messages or data between nodes.
7. Distributed Deadlocks: Synchronization in distributed systems can lead to deadlocks, where
threads wait indefinitely for resources.
8. Minimal Synchronization in Aneka: Aneka minimizes the use of synchronization beyond
the join operation to avoid complex locking strategies and potential deadlocks.
9. Synchronization Overhead: Excessive synchronization can lead to performance issues due to
waiting, so it’s used sparingly.
10. Thread Independence: In distributed environments, threads are designed to be as
independent as possible to reduce the need for synchronization.
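A minimal C# sketch of the primitives mentioned above: a lock for exclusive updates, a ReaderWriterLockSlim for concurrent readers with exclusive writers, and Join() to wait for another thread to complete.

// Minimal C# sketch of lock, reader-writer lock, and join.
using System;
using System.Threading;

class SyncDemo
{
    static int balance = 0;
    static readonly object balanceLock = new object();
    static readonly ReaderWriterLockSlim rwLock = new ReaderWriterLockSlim();

    static void Main()
    {
        Thread writer = new Thread(() =>
        {
            for (int i = 0; i < 1000; i++)
            {
                lock (balanceLock) { balance++; }    // only one thread at a time modifies balance
            }
        });

        Thread reader = new Thread(() =>
        {
            rwLock.EnterReadLock();                  // many readers may hold this simultaneously
            Console.WriteLine($"Observed balance: {balance}");
            rwLock.ExitReadLock();
        });

        writer.Start();
        reader.Start();
        writer.Join();                               // join: wait for the writer before proceeding
        reader.Join();
        Console.WriteLine($"Final balance: {balance}");
    }
}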
Aneka Threads vs. .NET Threads
1. Interface Compatibility: Aneka threads maintain almost the same interface as .NET
System.Threading.Thread, ensuring easy porting of local multithreaded applications.
2. Basic Operations: The core thread operations, such as Start() and Abort(), are directly
mapped to Aneka threads for consistency.
3. Limited Suspension: The Suspend() and Resume() operations, which abruptly interrupt
thread execution, are not supported in Aneka as they are deprecated practices.
4. No Sleep Support: Aneka threads do not support the Sleep() operation, which pauses thread
execution for a specified time, as it leads to inefficient resource use in distributed systems.
5. Synchronization Challenges: While local threads have advanced synchronization features
like locks and semaphores, Aneka threads minimize synchronization to avoid distributed
deadlocks.
6. Join Operation: The Join() method, which allows a thread to wait for another to complete, is
supported for basic synchronization in distributed threads.
7. Distributed Execution Context: Unlike local threads that share memory, Aneka threads
execute in separate processes on different machines, requiring type serialization.
8. No Thread Priorities: Aneka does not support thread priorities, meaning all threads are
treated equally in terms of scheduling, regardless of their priority in local applications.
9. State Management: Aneka threads have a different life cycle compared to common threads,
as their execution is managed by the middleware.
10. Serialization Requirement: Since Aneka threads run in a distributed environment, objects
and data need to be serialized and transmitted between nodes.
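For contrast, the short C# sketch below uses the local System.Threading.Thread features discussed above: Start() and Join() carry over to Aneka threads, while Sleep(), priorities, and Suspend()/Resume() do not.

// Minimal C# sketch of local thread features and which ones carry over to Aneka.
using System;
using System.Threading;

class LocalThreadFeatures
{
    static void Main()
    {
        Thread t = new Thread(() =>
        {
            Thread.Sleep(100);                        // supported locally, not on Aneka threads
            Console.WriteLine("Worker done.");
        });

        t.Priority = ThreadPriority.AboveNormal;      // local priority hint; Aneka ignores priorities
        t.Start();                                    // maps directly onto the Aneka thread interface
        t.Join();                                     // also supported by Aneka threads
        // t.Suspend() / t.Resume(): obsolete in .NET and not offered by Aneka threads.
    }
}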
1. Unstarted State: Aneka threads begin in the Unstarted state after they are created but before
execution starts.
2. Staging In: If files or data need to be uploaded to a node before execution, the thread enters
the StagingIn state.
3. Queued State: Once the thread is ready, it enters the Queued state, waiting for a node to
become available for execution.
4. Running State: The thread enters the Running state when it is actively executing on a remote
node.
5. Staging Out: After execution, if output files need to be retrieved, the thread enters the
StagingOut state while files are transferred back.
6. Completed State: When the thread successfully finishes its task and all data is retrieved, it
transitions to the Completed state.
7. Failed State: If an error occurs during execution, the thread enters the Failed state, ending its
life cycle.
8. Aborted State: If the thread is explicitly terminated by the developer, it transitions to the
Aborted state.
9. Middleware Control: Aneka middleware controls many state transitions, handling aspects
like file uploads, node availability, and thread execution.
10. Execution Failures: If a thread fails to execute due to missing reservation credentials or other
issues, it enters a Rejected state.
Domain Decomposition
1. Definition: Domain decomposition involves splitting the data into smaller chunks that can be
processed independently in parallel.
2. Embarrassingly Parallel Problems: These problems are ideal for domain decomposition, as
tasks don’t need to communicate, leading to easy parallelization.
3. Independent Data Units: In domain decomposition, each task operates on a different subset
of the data, making them independent of each other.
4. Master-Slave Model: A common approach in domain decomposition where a master thread
assigns tasks to multiple slave threads, which perform the actual computation.
5. Task Creation: The master thread is responsible for dividing the data and creating tasks for
each subset to be processed by slave threads.
6. Repetitive Computation: Problems with repetitive computations on different datasets are
good candidates for domain decomposition.
7. Simple Coordination: Minimal coordination is required between tasks in domain
decomposition, reducing the complexity of parallelization.
8. Result Collection: After each thread completes its computation, the master thread collects the
results and composes the final output.
9. Thread Pooling: To optimize resource use, thread pooling can be applied, limiting the
number of threads while reusing existing threads for new tasks.
10. Optimizing Thread Use: Thread pooling helps reduce overhead by avoiding the creation and
destruction of threads for each task.
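A minimal C# sketch of domain decomposition in the master-slave style described above: the main (master) thread splits an array into chunks, each worker thread sums one chunk independently, and the master joins the workers and composes the partial results. Using one worker per core is an arbitrary choice; a thread pool could be used instead.

// Minimal C# sketch of domain decomposition with a master thread and workers.
using System;
using System.Threading;

class DomainDecomposition
{
    static void Main()
    {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.Length; i++) data[i] = 1;

        int workers = Environment.ProcessorCount;     // one worker per core, an arbitrary choice
        long[] partial = new long[workers];
        Thread[] threads = new Thread[workers];
        int chunk = data.Length / workers;

        for (int w = 0; w < workers; w++)
        {
            int id = w;                               // capture the loop variable for the closure
            int start = id * chunk;
            int end = (id == workers - 1) ? data.Length : start + chunk;
            threads[id] = new Thread(() =>
            {
                long sum = 0;
                for (int i = start; i < end; i++) sum += data[i];
                partial[id] = sum;                    // each worker writes only its own slot
            });
            threads[id].Start();
        }

        long total = 0;
        for (int w = 0; w < workers; w++)
        {
            threads[w].Join();                        // master waits for each worker
            total += partial[w];                      // result collection and composition
        }
        Console.WriteLine($"Total: {total}");
    }
}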
1. Master-Slave Pattern: In this model, the master thread is responsible for decomposing the
problem into smaller tasks and distributing them to slave threads.
2. Master Responsibilities: The master thread manages task decomposition, assigns tasks to
slave threads, and collects the results.
3. Slave Threads: Each slave thread performs the computation on a portion of the data, working
independently of other slave threads.
4. Task Assignment: The master thread dynamically assigns tasks to slave threads, ensuring
load balancing across the system.
5. Independent Execution: Slave threads execute their assigned tasks independently without
needing to communicate with each other.
6. Result Composition: After all slave threads complete their tasks, the master thread gathers
their outputs and combines them into a final result.
7. High Efficiency: The master-slave model is highly efficient for embarrassingly parallel
problems where tasks are independent.
8. Error Handling: The master thread can also handle error detection and recovery, reassigning
failed tasks to other available threads.
9. Scalability: The model scales well as more slave threads (or distributed nodes) can be added
to handle more tasks in parallel.
10. Centralized Control: The master thread retains centralized control over task management,
while slave threads focus solely on computation.
Aneka Threads and Distributed Execution
1. Distributed Execution: Aneka threads are distributed across multiple machines, with each
machine executing a portion of the application.
2. Independence from Parent Process: Unlike traditional threads that run within the same
process, Aneka threads are independent processes that execute on separate nodes.
3. Thread Proxy: Local objects act as proxies for remote threads, allowing developers to control
and monitor distributed threads as if they were local.
4. Network-Based Execution: Distributed threads require network communication to transfer
data between nodes, adding complexity to thread management.
5. Data Transfer: Before execution, data and code need to be serialized and transferred to the
remote node where the thread will run.
6. Output Collection: After execution, results are collected from the remote node and sent back
to the originating process or thread.
7. Remote Synchronization: Although threads are distributed, basic synchronization can still be
applied, such as waiting for all threads to complete before proceeding.
8. Independent Memory Space: Distributed threads run in separate memory spaces, eliminating
the need for shared memory but requiring communication for data exchange.
9. Performance Optimization: By distributing threads across multiple machines, applications
can significantly reduce execution time and increase throughput.
10. Fault Tolerance: Aneka provides mechanisms for handling failures in distributed threads,
allowing for task reassignment or thread recovery.
1. Aneka Thread Abstraction: Aneka provides a programming model that abstracts the
complexities of distributed multithreading, allowing developers to focus on task design.
2. Thread Creation: Threads are created and controlled by the application developer, but their
execution is managed by the Aneka middleware.
3. Seamless Transition: The transition from local multithreaded applications to distributed ones
is seamless, as developers can reuse existing thread code.
4. Thread Scheduling: Aneka is responsible for scheduling thread execution across distributed
nodes, ensuring efficient resource use and load balancing.
5. Thread Proxies: Developers control threads via local proxies, which act as interfaces for
remote thread execution and communication.
6. Embarrassingly Parallel Support: The Aneka thread programming model is ideal for
embarrassingly parallel applications, where tasks can run independently.
7. Local vs Distributed Threads: Aneka threads mimic the behavior of local threads, but they
are executed in a distributed environment, providing scalability.
8. Distributed Infrastructure: The thread programming model allows applications to leverage
cloud resources, distributing threads across multiple virtual machines.
9. Parallelism Control: Developers can specify the degree of parallelism by controlling the
number of threads and how they are distributed across nodes.
10. Transparent Distribution: Aneka abstracts the complexities of distributed thread execution,
allowing developers to work with threads as if they were local.
Functional Decomposition
1. Distinct Operations: Functional decomposition is used when the problem involves several
different types of operations, each performed by separate threads.
2. Data-Parallelism vs Functional-Parallelism: Functional decomposition focuses on
parallelism in operations rather than parallelism in data processing.
3. Pipeline Processing: A common use case is pipeline processing, where data passes through
different stages, each handled by a separate function or thread.
4. Example – Image Processing: In image processing, one thread could handle image filtering,
while another thread manages color correction, and yet another performs scaling.
5. Task Isolation: Each thread performs its operation on the entire dataset, but operations are
isolated, avoiding dependencies between threads.
6. Synchronization After Tasks: Once all functional threads complete, the results are
synchronized to produce the final output, such as a fully processed image.
7. Composition Phase: After independent computations are done, their results are combined,
forming the final outcome of the entire process.
8. Efficiency Gains: In scenarios where different functions can be computed independently,
functional decomposition allows significant performance gains.
9. Limited Parallelism: Functional decomposition typically produces fewer parallel tasks
compared to domain decomposition, as it’s constrained by the number of distinct functions.
10. Thread Assignment: Each distinct operation is assigned to a separate thread or set of threads,
which helps optimize the overall task flow.
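A minimal C# sketch of functional decomposition: three threads apply three different operations (minimum, maximum, average) to the same dataset, and the results are composed after a join; the dataset itself is arbitrary.

// Minimal C# sketch of functional decomposition: distinct operations per thread.
using System;
using System.Linq;
using System.Threading;

class FunctionalDecomposition
{
    static void Main()
    {
        double[] data = Enumerable.Range(1, 100_000).Select(i => (double)i).ToArray();
        double min = 0, max = 0, avg = 0;

        Thread tMin = new Thread(() => min = data.Min());     // one distinct operation per thread
        Thread tMax = new Thread(() => max = data.Max());
        Thread tAvg = new Thread(() => avg = data.Average());

        tMin.Start(); tMax.Start(); tAvg.Start();
        tMin.Join(); tMax.Join(); tAvg.Join();                 // synchronize before composing

        Console.WriteLine($"min={min}, max={max}, avg={avg}"); // composition phase
    }
}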
1. Distributed Execution: Aneka threads are designed for distributed environments where each
thread runs on a different node, unlike common threads that run within the same process.
2. Execution Context: Aneka threads do not share memory and resources directly with other
threads, unlike traditional threads that share the same memory space.
3. Thread Life Cycle: Aneka threads have a more complex life cycle due to their distributed
nature, including stages like file staging and queuing on remote nodes.
4. Middleware Scheduling: In Aneka, thread execution is managed by the middleware, which
schedules threads based on node availability and resource constraints.
5. Limited Synchronization: Traditional threads use synchronization mechanisms like locks
and semaphores, but Aneka minimizes these to avoid distributed deadlocks.
6. Serialization Requirement: Since Aneka threads operate in different memory spaces, they
require serialization of data and code for transmission between nodes.
7. Join Operation: Both common and Aneka threads support the join operation, allowing one
thread to wait for another thread to finish before proceeding.
8. Thread Priorities: While common threads can have different priority levels, Aneka threads
do not support priorities, treating all threads equally in scheduling.
9. State Management: In traditional systems, thread state transitions are controlled by the
developer. In Aneka, state transitions are managed by the middleware.
10. Network Dependency: Aneka threads rely on network communication for data transfer and
execution, adding latency compared to local threads that execute within the same memory
space.
Type Serialization
1. Serialization Requirement: In distributed systems, data and code must be serialized into a
format that can be transferred over the network between nodes.
2. Shared Memory Limitation: Unlike traditional threads, Aneka threads do not share memory,
making serialization essential for data exchange.
3. Object Serialization: Objects that need to be processed by remote threads are serialized into
byte streams before being sent to remote nodes for execution.
4. State Reconstruction: When a serialized object reaches the remote node, its state is
reconstructed so that the thread can execute it as if it were local.
5. Delegates and Methods: When a thread references an instance method, the state of the
enclosing object must also be serialized for proper execution on the remote node.
6. Execution Context Transfer: The entire execution context, including the code and data, must
be serialized and transferred for remote thread execution.
7. Performance Impact: Serialization introduces overhead due to the time it takes to convert
objects to byte streams and send them across the network.
8. Deserialization: After data is transferred, it must be deserialized on the remote node, which
can also add to the overall execution time.
9. Type Serialization in .NET: The .NET framework provides built-in support for serialization,
making it easier to implement in distributed systems like Aneka.
10. Challenges with Complex Objects: Complex objects with deep hierarchies or references to
other objects can complicate serialization, requiring careful handling in distributed
applications.
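A minimal C# sketch of the serialize/transfer/deserialize round-trip that distributed threads depend on. JSON via System.Text.Json is used here only to keep the example self-contained; Aneka itself relies on .NET's serialization support, as noted in point 9.

// Minimal C# sketch of a serialization round-trip for a unit of work.
using System;
using System.Text.Json;

public class WorkItem
{
    public int Id { get; set; }
    public double[] Input { get; set; } = Array.Empty<double>();
}

class SerializationDemo
{
    static void Main()
    {
        var item = new WorkItem { Id = 42, Input = new[] { 1.0, 2.0, 3.0 } };

        // Serialize: convert the object's state into text/bytes that can cross the network.
        string payload = JsonSerializer.Serialize(item);
        Console.WriteLine($"Serialized: {payload}");

        // Deserialize: reconstruct the object's state on the receiving node.
        var restored = JsonSerializer.Deserialize<WorkItem>(payload);
        Console.WriteLine($"Restored Id={restored.Id}, first input={restored.Input[0]}");
    }
}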
AnekaApplication Class: Entry point for distributed applications using the thread model.
Components: Comprises threads for computation and a ThreadManager for execution.
Generics Support: Allows flexibility for various programming models.
Parallel Execution: Tasks run as threads on separate nodes for performance.
Cloud/Grid Integration: Designed for efficient resource use in distributed systems.
Task Scheduling: ThreadManager schedules based on resource availability.
Embarrassingly Parallel Problems: Well-suited for independent tasks.
Result Aggregation: Combines outputs from completed threads.
Fault Tolerance: Reassigns tasks if threads or nodes fail.
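An illustrative sketch (not verified Aneka API) of how the pieces named above are typically wired together; the namespaces, constructor signatures, configuration file name, and shutdown call are assumptions made for this example.

// Illustrative sketch only: wiring up a thread-model application with the
// classes named above. API details are assumptions, not verified Aneka code.
using Aneka;                      // assumed namespaces for the Aneka SDK
using Aneka.Threading;

class AnekaThreadSketch
{
    static void Main()
    {
        // Load middleware endpoint/credentials from a configuration file (file name assumed).
        Configuration conf = Configuration.GetConfiguration("conf.xml");

        // The application object is the entry point; the generics select the thread model.
        var app = new AnekaApplication<AnekaThread, ThreadManager>(conf);

        // Wrap the work to be executed remotely in an Aneka thread.
        AnekaThread worker = new AnekaThread(DoWork, app);
        worker.Start();          // submitted to the middleware for scheduling
        worker.Join();           // wait for remote completion, as with local threads

        app.StopExecution();     // assumed shutdown call; releases middleware resources
    }

    static void DoWork()
    {
        // Computation that will run on a remote node; its state must be serializable.
    }
}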
Data Processing Pipelines: Ideal for applications that process large datasets in parallel, such
as batch processing jobs in data analytics.
Scientific Simulations: Suitable for running multiple simulations simultaneously, reducing
time-to-results in fields like meteorology or physics.
Financial Modeling: Enables running various financial models in parallel, enhancing the
speed of risk assessments and investment strategies.
Image Processing: Can efficiently handle tasks like rendering or image analysis across
distributed nodes.
Machine Learning: Facilitates training multiple models or hyperparameter tuning by
distributing tasks.
Network Latency: Communication overhead due to network latency can affect overall
performance.
State Management: Keeping track of the state across distributed nodes can be complex,
requiring careful design.
Debugging Difficulties: Debugging distributed applications can be challenging, as traditional
debugging tools may not apply.
Resource Fragmentation: Threads may experience resource fragmentation, where available
resources are spread thinly across multiple nodes.
Consistency Issues: Ensuring data consistency in a distributed environment can be complex
and requires effective strategies.
Conclusion
Aneka's Flexibility: The thread programming model in Aneka offers significant advantages
for developers looking to harness the power of distributed computing.
Scalability and Performance: By effectively managing threads across multiple nodes, Aneka
provides scalability and performance benefits for a wide range of applications.
Emphasis on Simplicity: The model’s design focuses on simplifying the development of
distributed multithreaded applications while abstracting away many complexities.
Ongoing Evolution: As distributed computing continues to evolve, so will the tools and
methodologies available to developers, making frameworks like Aneka increasingly relevant.