Concurrency Control in Parallel
and Distributed Computing
Detailed Overview with Algorithms, Code, and Flowcharts

Introduction to Concurrency Control
• Concurrency: execution of multiple operations at the same time.
• Issues: conflicts, race conditions, and deadlocks.
• Importance: ensuring correctness in systems with parallel or distributed processes.

Need for Concurrency Control
• Avoiding race conditions: prevent multiple processes from reading/writing shared data simultaneously.
• Handling deadlocks: ensure processes don't block each other indefinitely.
• Ensuring data consistency and integrity.

Parallel vs Distributed Computing
• Parallel computing: multiple processes share the same memory.
• Distributed computing: processes communicate across different machines.
• Both require proper concurrency control for consistency.

Challenges in Parallel Computing
• Shared memory access.
• Contention for resources.
• Need for efficient synchronization mechanisms such as locks and semaphores.

Challenges in Distributed Computing
• Network delays and failures.
• No shared memory – data consistency must be maintained across systems.
• Examples: commit protocols, distributed locks.

Concurrency Control Basics
• Techniques to manage simultaneous access to resources.
• Primary methods: locking, timestamp ordering, and optimistic concurrency control (OCC).
• Ensures serializability and consistency, and prevents conflicts.

Key Problems Addressed
• Race conditions.
• Deadlocks.
• Starvation.
• Data inconsistency and anomalies.

Lock-Based Mechanisms
• Exclusive locks (write locks) and shared locks (read locks).
• Two-Phase Locking (2PL) ensures serializability.
• Drawbacks: deadlocks and the overhead of managing locks.

Two-Phase Locking (2PL)
• Growing phase: locks are acquired; no lock is released.
• Shrinking phase: locks are released; no new locks are acquired.
• Guarantees serializability but can lead to deadlocks.

Deadlock Detection and Prevention
• Wait-Die and Wound-Wait schemes to prevent deadlocks.
• Timeout-based schemes for deadlock detection.
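The Wait-Die and Wound-Wait rules can be sketched as small decision functions. This is a minimal sketch, assuming a smaller timestamp means an older transaction; the function names are illustrative, not from any library.

```python
def wait_die(requester_ts, holder_ts):
    """Wait-Die: an older requester (smaller timestamp) may wait for
    a younger holder; a younger requester is aborted ('dies') and is
    later restarted with its original timestamp."""
    if requester_ts < holder_ts:
        return "wait"   # requester is older: allowed to wait
    return "die"        # requester is younger: abort it

def wound_wait(requester_ts, holder_ts):
    """Wound-Wait: an older requester preempts ('wounds') the younger
    holder; a younger requester simply waits."""
    if requester_ts < holder_ts:
        return "wound"  # older requester aborts the younger holder
    return "wait"       # younger requester waits

# An old transaction (ts=1) blocked by a young one (ts=5), and vice versa:
print(wait_die(1, 5), wait_die(5, 1))      # wait die
print(wound_wait(1, 5), wound_wait(5, 1))  # wound wait
```

Both schemes are deadlock-free because the timestamp order imposes a fixed priority: no cycle of transactions can all be waiting on each other.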
• Example: Banker's Algorithm.

Optimistic Concurrency Control (OCC)
• Assumes low contention for resources.
• Transactions execute without locks and validate for conflicts at commit.
• If a conflict occurs, roll back and retry.

OCC Phases
• Read phase: execute without locks.
• Validation phase: check whether conflicts exist with other transactions.
• Write phase: if validation passes, commit the transaction.

Advantages and Disadvantages of OCC
• Advantage: less overhead due to fewer locks.
• Disadvantage: rollbacks can be costly if contention is high.

Concurrency in Distributed Systems
• No shared memory – state must be synchronized across systems.
• Requires coordination between distributed processes for consistency.
• Examples: Two-Phase Commit (2PC) and Three-Phase Commit (3PC) protocols.

Two-Phase Commit (2PC)
• Phase 1 (Prepare): the coordinator asks all participants to prepare.
• Phase 2 (Commit): if all participants agree, the coordinator sends commit; otherwise, abort.
• Guarantees atomicity but can block in case of failures.

Three-Phase Commit (3PC)
• Phase 1 (Prepare): the coordinator asks participants to prepare.
• Phase 2 (Pre-Commit): if all agree, a pre-commit is sent.
• Phase 3 (Commit/Abort): final commit if everyone agrees.
• Avoids the blocking issues of 2PC.

Concurrency Control Algorithms
• Lock-based protocols (2PL, deadlock detection).
• Timestamp ordering protocols.
• Optimistic Concurrency Control (OCC).

Timestamp Ordering Protocol
• Each transaction is assigned a timestamp.
• Ensures that conflicting operations are executed in the order of their timestamps.
• Read and write operations are checked against timestamps.

Multiversion Concurrency Control (MVCC)
• Maintains multiple versions of the data to allow concurrent reads and writes.
• Readers get the latest committed version.
• Writers work on a new version without blocking readers.

Flowchart: Two-Phase Locking (2PL)
• Diagram showing the growing and shrinking phases of 2PL.
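The growing/shrinking discipline of 2PL can be sketched in a few lines of Python. This is a minimal single-process sketch; `TwoPhaseTransaction` is an illustrative name for this overview, not a standard library class.

```python
import threading

class TwoPhaseTransaction:
    """Minimal 2PL sketch: once any lock is released (the shrinking
    phase begins), acquiring a new lock is forbidden."""

    def __init__(self):
        self.held = []          # locks currently held
        self.shrinking = False  # True once the first lock is released

    def acquire(self, lock):
        # Growing phase only: acquiring after a release violates 2PL.
        if self.shrinking:
            raise RuntimeError("2PL violation: acquire after release")
        lock.acquire()
        self.held.append(lock)

    def release_all(self):
        # Entering the shrinking phase: release everything, acquire nothing.
        self.shrinking = True
        for lock in reversed(self.held):
            lock.release()
        self.held.clear()

# Usage: acquire all needed locks, do the work, then release.
a, b = threading.Lock(), threading.Lock()
txn = TwoPhaseTransaction()
txn.acquire(a)
txn.acquire(b)
# ... critical section using both resources ...
txn.release_all()
```

Enforcing "no acquire after the first release" is exactly what makes the schedule serializable, at the cost of holding locks longer and risking deadlock.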
• Shows how locks are acquired and released in phases.

Flowchart: Two-Phase Commit (2PC)
• Diagram representing the communication between the coordinator and participants in 2PC.
• Shows the preparation, commit, and abort phases.

Code Example: Lock-Based Mechanism (Python)

    import threading

    lock = threading.Lock()

    def critical_section():
        # Each thread enters the critical section under the lock.
        with lock:
            ...  # access shared data here

    t1 = threading.Thread(target=critical_section)  # Thread 1
    t2 = threading.Thread(target=critical_section)  # Thread 2
    t1.start(); t2.start()
    t1.join(); t2.join()

Code Example: Timestamp Ordering

    # Pseudocode: timestamp-ordering check for a read by transaction T
    # on item X. The read is rejected when a younger transaction has
    # already written X.
    if TS(T) < write_TS(X):
        abort(T)                            # restart T with a new timestamp
    else:
        execute_read(T, X)
        read_TS(X) = max(read_TS(X), TS(T))

Code Example: OCC (Java)

    public boolean validate(Transaction txn) {
        for (Transaction t : activeTransactions) {
            if (conflict(txn, t)) {
                return false; // Conflict detected
            }
        }
        return true; // No conflicts
    }

Conclusion
• Concurrency control is crucial for maintaining data consistency in both parallel and distributed systems.
• Techniques like locking, OCC, and timestamp ordering ensure serializability and atomicity.
• Efficient algorithms prevent deadlocks and minimize performance overhead.
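The code-example slides above do not cover MVCC, so here is a supplementary minimal single-process sketch in Python. `MVCCStore` and its method names are illustrative for this overview, not from any library.

```python
import threading

class MVCCStore:
    """Minimal MVCC sketch: each write appends a new version; a reader
    sees the newest version committed at or before its snapshot, so
    writers never block readers."""

    def __init__(self):
        self.versions = {}   # key -> list of (commit_ts, value), oldest first
        self.ts = 0          # monotonically increasing commit clock
        self.lock = threading.Lock()

    def snapshot(self):
        return self.ts       # a reader remembers the clock when it starts

    def read(self, key, snap_ts):
        # Newest value committed no later than snap_ts, else None.
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= snap_ts:
                return value
        return None

    def write(self, key, value):
        with self.lock:      # only writers are serialized
            self.ts += 1
            self.versions.setdefault(key, []).append((self.ts, value))
            return self.ts

store = MVCCStore()
store.write("x", 1)
snap = store.snapshot()       # reader takes a snapshot
store.write("x", 2)           # a later write does not block the reader
print(store.read("x", snap))  # prints 1: the snapshot still sees the old value
```

Keeping old versions around is what lets reads proceed lock-free; production systems add garbage collection of versions no active snapshot can see.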