PDC Assignment
Name : Muhammad Sarmad Iqbal
Reg : COSC-222102008
Class : BS-DASC-5A
cannot afford to wait for responses, such as in event-driven architectures or real-time
systems. For example, a web server handling multiple client requests might use
asynchronous communication to ensure that the server doesn't become idle while
waiting for slow database queries or remote API calls to complete. This way, the
server keeps handling new requests instead of blocking on slow responses.
Trade-offs Involved:
3. Resource Utilization:
Asynchronous systems use resources more efficiently because threads are not blocked while waiting for responses, whereas synchronous systems tie up a thread for the duration of each call.
4. Failure Handling:
Synchronous systems can more easily propagate failures up the call stack, while asynchronous systems require extra mechanisms (timeouts, retries, dead-letter queues) to detect and handle failures.
Example:
Consider an e-commerce front-end that depends on back-end services like authentication, payments, and recommendations. If the payment service is slow, asynchronous communication allows the front-end service to handle other operations (such as user interface updates or authentication) while waiting for the payment confirmation. In contrast, with synchronous communication, the front-end would block until the payment service responds, leading to poor user experience.
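A minimal C++ sketch of this idea, using std::async with a hypothetical callPaymentService() stub (the function name and the simulated delay are assumptions for illustration), so the front-end thread stays free while the payment call is in flight:

#include <chrono>
#include <future>
#include <iostream>
#include <string>
#include <thread>

// Hypothetical stand-in for a slow remote payment call.
std::string callPaymentService() {
    std::this_thread::sleep_for(std::chrono::seconds(2)); // simulate network latency
    return "payment confirmed";
}

int main() {
    // Launch the payment request asynchronously instead of blocking on it.
    std::future<std::string> payment = std::async(std::launch::async, callPaymentService);

    // The front-end thread stays free for other work (UI updates, auth, etc.).
    std::cout << "updating user interface while payment is pending...\n";

    // Collect the result only when it is actually needed.
    std::cout << payment.get() << "\n";
}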
Q#2 How can concurrency control mechanisms improve the reliability of a distributed system in the event of faults? Provide examples where fault tolerance and concurrency control work together.
Concurrency control mechanisms enhance the reliability of distributed systems during faults in several ways:
1. Distributed Transactions:
They ensure that multiple operations on shared data occur in a consistent, serializable order, so a failure mid-transaction does not leave inconsistent state across replicas.
2. Microservices Architecture:
Concurrency control prevents conflicting updates to shared state (e.g., two services issuing simultaneous writes to the same record), while fault tolerance keeps the service available even when individual nodes fail.
Together, concurrency control ensures correct, conflict-free operations, while fault tolerance keeps the system running through failures; combined, they address both correctness and availability challenges. A small sketch of one such mechanism follows.
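As an illustration, optimistic concurrency control can be sketched with an atomic compare-and-swap: a writer commits only if the version it read is still current, otherwise it retries. This is a minimal single-process analogy (the version counter stands in for what a distributed datastore would maintain), not a full distributed protocol:

#include <atomic>
#include <cstdio>

// Version counter guarding a record; in a real system this lives in the datastore.
std::atomic<int> version{0};
int shared_value = 0; // the "record" (simplified for the sketch)

// Optimistic update: commit only if nobody else committed since we read `seen`.
bool try_update(int seen, int new_value) {
    int expected = seen;
    if (version.compare_exchange_strong(expected, seen + 1)) {
        shared_value = new_value; // we won the version race
        return true;
    }
    return false; // conflict detected: caller re-reads and retries
}

int main() {
    int seen = version.load();
    if (!try_update(seen, 42))
        std::puts("conflict: retry with a fresh read");
}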
Several challenges arise when migrating CPU-based applications to GPU programming models:
5. Framework Support: Some CPU frameworks lack GPU support. Solution: Use GPU-enabled alternatives or frameworks that provide built-in GPU backends.
6. Debugging: GPU debugging is harder. Solution: Use tools like NVIDIA Nsight.
In summary, these challenges can be overcome with training, using optimized libraries, migrating incrementally, and validating performance at each step.
In a heterogeneous environment where systems have varying capacities, load balancing aims to distribute work in proportion to each node's capability. Key considerations include:
1. System Capacities:
Different nodes have varying CPU, memory, disk, and GPU capacities. Load
balancing must allocate more tasks to powerful machines and fewer to less
capable ones.
Solution: Use weighted load balancing algorithms, assigning more weight to more powerful machines so they receive proportionally more tasks (see the sketch at the end of this answer).
2. Task Characteristics:
Tasks differ in their CPU, memory, and I/O demands.
Solution: Classify tasks based on their resource needs and match them to
systems best suited to handle those needs (e.g., CPU-bound tasks go to CPU-
rich nodes).
3. Dynamic Workloads:
Load varies over time, so a fixed assignment can leave some nodes underutilized or overloaded.
Solution: Monitor node utilization continuously and reassign tasks as conditions change.
4. Network Conditions:
Network speed affects how quickly tasks are distributed and how fast results are returned.
Solution: Account for link bandwidth when placing tasks and avoid exceeding network capacity.
5. Fault Tolerance:
Some nodes may fail, or tasks may need to be migrated if a machine
becomes unavailable.
Solution: Detect failed nodes (e.g., via heartbeats) and reassign their tasks to healthy machines.
Static Load Balancing: Tasks are assigned once at the start based on system capacity and do not change at runtime. Dynamic Load Balancing: Tasks are reassigned during execution as node load and availability change, which suits heterogeneous environments better.
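A minimal sketch of the weighted approach from point 1, assuming C++ and illustrative node names and weights (a real balancer would also track live utilization):

#include <iostream>
#include <string>
#include <vector>

// A node's relative capacity; the weights here are illustrative.
struct Node { std::string name; int weight; };

int main() {
    std::vector<Node> nodes = {{"big-server", 4}, {"mid-server", 2}, {"small-vm", 1}};

    // Expand nodes by weight so powerful machines appear more often,
    // then deal tasks round-robin over the expanded list.
    std::vector<const Node*> slots;
    for (const auto& n : nodes)
        for (int i = 0; i < n.weight; ++i) slots.push_back(&n);

    for (int task = 0; task < 7; ++task)
        std::cout << "task " << task << " -> " << slots[task % slots.size()]->name << "\n";
}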
Q#5 Imagine you are optimizing a parallel program for performance. How would you
address the issues related to memory consistency and memory hierarchy to ensure your program scales efficiently?
When optimizing a parallel program for performance, addressing memory consistency and
memory hierarchy is crucial for ensuring scalability and efficiency. Here’s how to approach
these issues:
1. Memory Consistency:
In a parallel program, multiple threads read and write shared data concurrently, and ensuring consistent views of memory across all threads is critical to avoid race conditions and stale reads.
Solution:
Synchronization Primitives: Use locks, barriers, or mutexes to
coordinate access to shared data, ensuring changes by one thread are visible
to others.
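A small sketch of this visibility point, assuming C++ and std::mutex: the lock both serializes the writes and guarantees that one thread's update is visible to the next thread that takes the lock:

#include <iostream>
#include <mutex>
#include <thread>

std::mutex m;
long counter = 0; // shared data protected by m

void add(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::lock_guard<std::mutex> lock(m); // acquire: see prior writes
        ++counter;                           // release at scope exit: publish ours
    }
}

int main() {
    std::thread t1(add, 100000), t2(add, 100000);
    t1.join(); t2.join();
    std::cout << counter << "\n"; // always 200000 with the lock in place
}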
2. Memory Hierarchy:
Modern systems have a memory hierarchy (registers, caches, main memory), each level with different access speeds. Efficient use of this hierarchy is essential for performance.
Solution:
Cache Coherency: For systems with multiple caches, ensure data is kept coherent across cores when shared data is modified. False sharing (where multiple threads write to different data in the same cache line) should be eliminated, since it forces needless coherence traffic and increases memory latency.
NUMA Awareness: In systems with Non-Uniform Memory Access (NUMA), allocate data in memory local to the threads that use it to minimize slow remote accesses.
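A minimal illustration of avoiding false sharing, assuming C++: alignas pads each thread's counter onto its own cache line (64 bytes is an assumption; the true line size is hardware-dependent):

#include <iostream>
#include <thread>

// Each counter gets its own 64-byte cache line, so the two threads'
// writes do not ping-pong a shared line between cores.
struct alignas(64) PaddedCounter { long value = 0; };

PaddedCounter counters[2];

void work(int id) {
    for (long i = 0; i < 10000000; ++i)
        counters[id].value++; // no lock needed: each thread owns its slot
}

int main() {
    std::thread t0(work, 0), t1(work, 1);
    t0.join(); t1.join();
    std::cout << counters[0].value + counters[1].value << "\n";
}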
3. Parallelism Strategies:
Load Balancing: Distribute work evenly across threads to prevent some threads from idling while others are overloaded.
Granularity: Use appropriate task granularity (not too fine or too coarse) to keep scheduling overhead low while still exposing enough parallelism.
By enforcing memory consistency through synchronization, optimizing for the memory hierarchy by improving data locality, reducing cache misses, and being NUMA-aware, your parallel program can scale efficiently with more threads and larger datasets.
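To make the granularity and load-balancing points concrete, here is a small sketch assuming C++ threads: the input is split into one contiguous chunk per hardware thread, coarse enough to amortize thread startup yet fine enough to use every core:

#include <algorithm>
#include <iostream>
#include <numeric>
#include <thread>
#include <vector>

int main() {
    std::vector<int> data(1000000, 1);
    unsigned n = std::max(1u, std::thread::hardware_concurrency());
    std::vector<long> partial(n, 0);
    std::vector<std::thread> pool;

    // One contiguous chunk per thread: coarse-grained tasks with good
    // locality and negligible scheduling overhead.
    size_t chunk = data.size() / n;
    for (unsigned t = 0; t < n; ++t) {
        size_t begin = t * chunk;
        size_t end = (t == n - 1) ? data.size() : begin + chunk;
        pool.emplace_back([&, t, begin, end] {
            partial[t] = std::accumulate(data.begin() + begin, data.begin() + end, 0L);
        });
    }
    for (auto& th : pool) th.join();
    std::cout << std::accumulate(partial.begin(), partial.end(), 0L) << "\n";
}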
Q#6 Compare and contrast the Message Passing Interface (MPI) with SIMD and MIMD
architectures.
1. MPI (Message Passing Interface):
MPI is a standard library interface for programming distributed-memory parallel systems.
Execution Model: Processes run independently, each with its own memory
space. Communication between them is explicit and handled through message passing.
Scalability: MPI scales well across many nodes in distributed systems (e.g., HPC
clusters).
Use Case: Ideal for applications that require communication between nodes with
separate memory, such as scientific simulations, weather forecasting, and large-scale
computations.
2. SIMD (Single Instruction, Multiple Data):
Execution Model: One instruction stream operates on many data elements in lockstep (e.g., vector units, GPUs).
Scalability: Very efficient for regular, data-parallel workloads (e.g., matrix multiplications, image processing), but less flexible for irregular data or control flow.
Use Case: Well-suited for tasks like graphics processing, machine learning, and other data-parallel computations.
3. MIMD (Multiple Instruction, Multiple Data):
Execution Model: Independent processors execute different instructions on different data, as in multicore CPUs and distributed systems.
Use Case: Suited for multiprocessors and distributed systems for tasks with varied operations (e.g., databases, large simulations).
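A minimal MPI sketch in C++ illustrating the explicit message passing described above (two ranks, one integer sent from rank 0 to rank 1); compile with an MPI compiler wrapper and run with at least two processes:

#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int payload = 42;
        // Explicit, blocking send to rank 1: processes share no memory.
        MPI_Send(&payload, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int payload = 0;
        MPI_Recv(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::cout << "rank 1 received " << payload << "\n";
    }

    MPI_Finalize();
}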
Q#7 A multithreaded program you are developing suffers from race conditions due to unsynchronized access to shared data. What mechanisms would you use to ensure thread safety?
To avoid race conditions in your multithreaded program and ensure thread safety, you can use the following mechanisms:
1. Mutexes:
What: A mutex allows only one thread to access a critical section of code at a time.
How: Surround the critical section (where shared data is accessed or modified) with
lock() and unlock() operations. This ensures that only one thread can execute the critical section at any given moment.
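A short sketch, assuming C++ and std::lock_guard (which performs the lock()/unlock() pair automatically via RAII), guarding a shared container:

#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

std::mutex vec_mutex;
std::vector<int> results; // shared data

void produce(int id) {
    // Critical section: only one thread mutates the vector at a time.
    std::lock_guard<std::mutex> lock(vec_mutex); // lock() here, unlock() at scope end
    results.push_back(id);
}

int main() {
    std::thread a(produce, 1), b(produce, 2);
    a.join(); b.join();
    std::cout << results.size() << " items\n"; // always 2
}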
2. Semaphores:
What: A semaphore maintains a counter that controls how many threads can access a critical section concurrently. Unlike a mutex (which
allows only one thread), a semaphore can allow multiple threads up to a defined limit.
How: Use binary semaphores (similar to mutexes) or counting semaphores to control how many threads may enter the critical section at once.
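A brief sketch, assuming C++20's std::counting_semaphore, limiting a hypothetical resource to three concurrent users:

#include <iostream>
#include <semaphore>
#include <thread>
#include <vector>

// At most 3 threads may hold the "resource" at once.
std::counting_semaphore<3> pool(3);

void use_resource(int id) {
    pool.acquire(); // blocks if 3 threads are already inside
    std::cout << "thread using resource\n"; // output order varies between runs
    pool.release(); // free a slot for a waiting thread
}

int main() {
    std::vector<std::thread> ts;
    for (int i = 0; i < 6; ++i) ts.emplace_back(use_resource, i);
    for (auto& t : ts) t.join();
}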
3. Condition Variables:
What: Condition variables allow threads to wait for certain conditions to be met before proceeding, without wasting CPU in a busy-wait loop.
How: Threads can wait for a condition to become true (e.g., data is ready), and
another thread can signal that condition when it’s safe to proceed.
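A compact sketch, assuming C++ std::condition_variable: a consumer waits until a producer flags that data is ready:

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
bool data_ready = false; // the condition, protected by m

void consumer() {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [] { return data_ready; }); // sleeps until signaled and true
    std::cout << "data consumed\n";
}

void producer() {
    { std::lock_guard<std::mutex> lock(m); data_ready = true; }
    cv.notify_one(); // wake the waiting consumer
}

int main() {
    std::thread c(consumer), p(producer);
    c.join(); p.join();
}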
4. Atomic Operations:
What: Atomic operations update a variable as a single indivisible step, so no thread can observe a half-finished modification.
How: Use atomic types like std::atomic<int> to avoid the need for explicit locks on simple shared counters and flags.
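A minimal sketch of a lock-free counter using the std::atomic type mentioned above:

#include <atomic>
#include <iostream>
#include <thread>

std::atomic<int> hits{0}; // safe to update from many threads without a lock

void record(int n) {
    for (int i = 0; i < n; ++i)
        hits.fetch_add(1, std::memory_order_relaxed); // indivisible increment
}

int main() {
    std::thread a(record, 100000), b(record, 100000);
    a.join(); b.join();
    std::cout << hits.load() << "\n"; // always 200000
}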
5. Read-Write Locks:
What: A read-write lock allows multiple threads to read shared data concurrently but ensures exclusive access when writing. This is useful when reads greatly outnumber writes.
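A short sketch using C++17's std::shared_mutex: readers take a shared lock, the writer an exclusive one:

#include <iostream>
#include <shared_mutex>
#include <thread>

std::shared_mutex rw;
int config = 0; // read often, written rarely

int read_config() {
    std::shared_lock<std::shared_mutex> lock(rw); // many readers may hold this at once
    return config;
}

void write_config(int v) {
    std::unique_lock<std::shared_mutex> lock(rw); // exclusive: blocks all readers
    config = v;
}

int main() {
    std::thread w(write_config, 7);
    std::thread r([] { std::cout << read_config() << "\n"; }); // prints 0 or 7
    w.join(); r.join();
}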
Q#8 Explain how parallel I/O operations can enhance the performance of data-intensive applications.
Parallel I/O allows multiple read and write operations to proceed simultaneously, reducing idle time for CPUs and improving overall throughput. It overlaps computation with I/O, reducing the time spent waiting on slower storage devices (e.g., hard drives or network-based storage). It also spreads the I/O load across multiple processes, threads, or nodes, ensuring performance scales as data size grows. Techniques for exploiting this include:
1. Asynchronous I/O: Use non-blocking I/O libraries (e.g., asyncio in Python) to allow the application to continue processing while waiting for I/O
tasks to complete.
2. Dedicated I/O Threads: Use background threads or thread pools for I/O tasks, ensuring that I/O-bound operations don't hold up the main application.
3. Batching: Instead of issuing many small reads and writes, batch them into larger operations to reduce the overhead associated with each I/O call.
4. Caching: Keep frequently accessed data in memory using caching layers like Redis. This reduces the need for repeated disk access and improves
read/write performance.
5. Data Compression: Compress data before writing to storage to reduce the size of
I/O operations. While compression adds some CPU overhead, it reduces the total amount of data moved to and from storage.
6. Optimized File Formats: For large-scale data processing, use efficient file
formats like Parquet or ORC, which are designed for faster read/write operations and efficient columnar access.
7. Memory-Mapped Files: Map files directly into the process's virtual memory, allowing fast access without standard I/O system calls. This is particularly
useful for handling large files efficiently (e.g., using Python’s mmap module).
8. Storage Optimization: Opt for faster storage solutions like SSDs over traditional
HDDs to reduce latency. Also, ensure data is evenly distributed across multiple disks or nodes so requests can be served in parallel (a small sketch follows).
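To illustrate the core idea, a minimal sketch assuming C++ threads and a hypothetical data.bin file of known size: each thread opens its own stream and reads its own slice concurrently:

#include <fstream>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

// Read one slice of the file; each thread has its own stream,
// so the reads can proceed in parallel.
void read_chunk(const std::string& path, std::streamoff offset,
                size_t len, std::vector<char>& out) {
    std::ifstream in(path, std::ios::binary);
    in.seekg(offset);
    out.resize(len);
    in.read(out.data(), static_cast<std::streamsize>(len));
}

int main() {
    const std::string path = "data.bin"; // hypothetical input file
    const size_t file_size = 1 << 20;    // assumed known (e.g., via std::filesystem)
    const unsigned n = 4;
    std::vector<std::vector<char>> chunks(n);
    std::vector<std::thread> pool;

    size_t chunk = file_size / n;
    for (unsigned t = 0; t < n; ++t)
        pool.emplace_back(read_chunk, path, std::streamoff(t * chunk),
                          (t == n - 1) ? file_size - t * chunk : chunk,
                          std::ref(chunks[t]));
    for (auto& th : pool) th.join();
    std::cout << "read " << n << " chunks in parallel\n";
}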
Q#9 Your team is developing a cloud-based system that must scale efficiently as the
workload increases. How would you design the scheduling and scalability mechanisms?
2. Auto-scaling Mechanisms
Cloud platforms (e.g., AWS, GCP, Azure) provide auto-scaling features to adjust the number of running instances automatically, using metrics such as CPU utilization or request rate to trigger scaling.
3. Load Balancing
Distribute incoming requests across instances, and use geographically distributed data centers to serve users from the closest location, reducing latency and balancing load globally.
4. Priority Scheduling
Schedule the most important work first. For example, higher-priority tasks (e.g., user-facing services) should receive resources before background jobs in the cluster.
Task Queues: Implement task queues to handle tasks that are not time-sensitive.
Tools like RabbitMQ or Amazon SQS can queue tasks when the system is under heavy load and drain them as capacity frees up (see the scheduling sketch at the end of this answer).
5. Monitoring and Metrics
Continuously track resource utilization (CPU, memory, network usage, etc.) and application performance. Use monitoring tools to feed these metrics into scaling decisions in real-time.
6. Predictive Scaling
Analyze historical data to forecast future traffic and pre-scale resources before a traffic spike occurs.
7. Redundancy and High Availability
Deploy redundant storage, networking, and compute resources. This ensures high availability and failover capability if components go down.
8. Caching Strategies
Use in-memory caches (e.g., Redis) to store frequently accessed data. This reduces the number of database calls and improves response times. Additionally, use content delivery networks (CDNs) to cache static assets close to the user, reducing the load on the main infrastructure.
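A tiny sketch of the priority-scheduling idea from point 4, assuming C++ and std::priority_queue with made-up task names; a real system would use a broker like RabbitMQ or SQS:

#include <iostream>
#include <queue>
#include <string>

struct Task {
    int priority;    // higher value = more urgent
    std::string name;
};

// Order the queue so the highest-priority task is served first.
bool operator<(const Task& a, const Task& b) { return a.priority < b.priority; }

int main() {
    std::priority_queue<Task> queue;
    queue.push({1, "nightly-report"}); // background job
    queue.push({9, "user-checkout"});  // user-facing, urgent
    queue.push({5, "cache-refresh"});

    while (!queue.empty()) {
        std::cout << "running: " << queue.top().name << "\n";
        queue.pop(); // user-checkout runs first, nightly-report last
    }
}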
Q#10 Compare the use of tools like OpenMP, Hadoop, and Amazon AWS in a
distributed computing environment. Which tool would you recommend for data-parallel
applications, and why?
2. Hadoop
Hadoop is a solid choice because it is designed specifically for processing large datasets in
parallel across multiple nodes. Its MapReduce model is well-suited for splitting large datasets into chunks, processing them in parallel, and aggregating the results, which makes it a strong fit for data-parallel applications.
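As a toy illustration of the MapReduce pattern Hadoop implements (single-process and in-memory here; real Hadoop distributes these phases across nodes), a word-count sketch in C++:

#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

int main() {
    std::vector<std::string> documents = {"the quick fox", "the lazy dog"};

    // Map phase: emit (word, 1) pairs from each input split.
    std::vector<std::pair<std::string, int>> pairs;
    for (const auto& doc : documents) {
        std::istringstream words(doc);
        std::string w;
        while (words >> w) pairs.push_back({w, 1});
    }

    // Shuffle + reduce phase: group by key and sum the counts.
    std::map<std::string, int> counts;
    for (const auto& [word, one] : pairs) counts[word] += one;

    for (const auto& [word, n] : counts)
        std::cout << word << ": " << n << "\n"; // e.g., "the: 2"
}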
References:
https://www.quora.com/What-are-the-advantages-of-Hadoop-over-openMP
https://www.techtarget.com/searchapparchitecture/tip/Synchronous-vs-asynchronous-communication-The-differences
https://medium.com/@roopa.kushtagi/concurrency-control-mechanisms-in-distributed-systems-4c7e510b2427
https://horasis.org/the-accelerated-computing-tipping-point-how-gpus-are-transforming-our-digital-landscape/
https://www.sciencedirect.com/science/article/pii/S0167739X24000207
https://www.researchgate.net/publication/221201675_Improving_Parallel_IO_Performance_with_Data_Layout_Awareness
https://stackoverflow.com/questions/8340614/java-avoid-race-condition-without-synchronized-lock
https://arshitkumar-96339.medium.com/message-passing-interface-mpi-88ca9bb14fd8
https://www.ibm.com/docs/en/iis/11.3?topic=considerations-optimizing-parallelism