Concurrency in Distributed Systems
• Part 1: Threads
• Traditional process
– One thread of control through a large, potentially sparse address space
– Address space may be shared with other processes (shared memory)
– A collection of system resources (files, semaphores)
• Thread (lightweight process)
– A flow of control through an address space
– Each address space can have multiple concurrent control flows
– Each thread has access to entire address space
– Potentially parallel execution, minimal state (low overhead)
– May need synchronization to control access to shared variables (see the sketch below)
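As a concrete illustration of the last point (a minimal sketch, not from the lecture): two Python threads share the process's variables, so an unguarded read-modify-write on a shared counter can lose updates, and a lock restores correctness.

import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with lock:          # without the lock, concurrent updates can interleave and be lost
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)              # 200000 with the lock; possibly less without it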
Threads Example
Multi-threaded version (see the sketch below):
https://www.pythontutorial.net/advanced-python/python-threading/
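Along the lines of the linked tutorial's multi-threaded version, a minimal sketch (assumed, not the tutorial's exact code): two threads run a blocking task concurrently instead of back to back.

import threading
import time

def task(name):
    print(f'{name} starting')
    time.sleep(1)                    # stands in for real blocking work
    print(f'{name} done')

t1 = threading.Thread(target=task, args=('thread-1',))
t2 = threading.Thread(target=task, args=('thread-2',))
t1.start(); t2.start()               # both tasks run concurrently
t1.join(); t2.join()                 # total time is ~1s instead of ~2s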
Why Threads?
• Single-threaded process: blocking system calls, no concurrency/parallelism
• Finite-state machine [event-based]: non-blocking calls with concurrency
• Multi-threaded process: blocking system calls with parallelism
• Threads retain the idea of sequential processes with blocking system calls, and yet achieve parallelism
• Software engineering perspective
– Applications are easier to structure as a collection of threads
• Each thread performs one of several [mostly independent] tasks
Sequential Server
• Simplest model: single process, single thread
– Process incoming requests sequentially, one at a time (see the sketch below)
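A minimal sketch of this model, assuming a simple TCP echo service on an arbitrary example port: the single loop handles each client to completion before accepting the next, so one slow request delays all the others.

import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(('localhost', 9000))        # arbitrary example port
srv.listen()

while True:
    conn, addr = srv.accept()        # block until the next client connects
    with conn:
        data = conn.recv(1024)       # handle this request fully...
        conn.sendall(data)           # ...before accepting another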
Multi-threaded Server
• Use threads for concurrent processing
• Simple model: thread per request
– For each new request: start new thread, process request, kill thread
while (1) {
    req = waitForRequest();   // wait until the next request arrives in the queue
    thread = createThread();  // start a new thread
    thread.process(req);      // assign the request to it; thread exits when done
}
• Advantage: Newly arriving requests don’t need to wait
– Assigned to a thread for concurrent processing
• Disadvantage: frequent creation and deletion of threads adds overhead
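The same echo service rewritten thread-per-request, as a runnable counterpart to the pseudocode above (a sketch; the port and buffer size are arbitrary):

import socket
import threading

def handle(conn):
    with conn:
        conn.sendall(conn.recv(1024))   # process the request

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(('localhost', 9000))
srv.listen()

while True:
    conn, _ = srv.accept()              # next request
    threading.Thread(target=handle, args=(conn,)).start()
    # one new thread per request; it exits ("is killed") when handle() returns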
Server with Thread Pool
• Use Thread Pool
– Pre-spawn a pool of threads
– One thread is dispatcher, others are worker threads
– For each incoming request, find an idle worker thread and assign the request to it
createThreadPool(N);          // pre-spawn a pool of N worker threads
while (1) {
    req = waitForRequest();
    thread = getIdleThreadFromPool();
    thread.process(req);
}
• Advantage: Avoids thread creation overhead for each request
• Disadvantages:
– What happens when >N requests arrive at the same time?
– How to choose the correct pool size N?
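A sketch of the pool variant using Python's concurrent.futures (the pool size N=8 is an arbitrary choice; strictly, Python's executor creates its workers lazily up to max_workers rather than pre-spawning them, but the structure is the same). Note that submit() queues requests that arrive while all N workers are busy, which is one common answer to the first question above.

import socket
from concurrent.futures import ThreadPoolExecutor

def handle(conn):
    with conn:
        conn.sendall(conn.recv(1024))

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(('localhost', 9000))
srv.listen()

with ThreadPoolExecutor(max_workers=8) as pool:   # pool of N=8 worker threads
    while True:
        conn, _ = srv.accept()
        pool.submit(handle, conn)   # runs on an idle worker; queues if all N are busy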
Async Event Loop Model
• Async event loop servers: a single thread must process multiple requests
– Use non-blocking (asynchronous) calls
– Asynchronous (aka event-based) programming
– Provides concurrency similar to synchronous multi-threading, but with a single thread
• https://python.plainenglish.io/build-your-own-event-loop-from-scratch-in-python-da77ef1e3c39
• https://docs.python.org/3.9/library/asyncio-task.html
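A minimal asyncio sketch of the same echo service, using the standard asyncio.start_server API (the port is arbitrary): one thread, where each await yields control to the event loop so other requests can make progress.

import asyncio

async def handle(reader, writer):
    data = await reader.read(1024)   # non-blocking: yields to the event loop
    writer.write(data)
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def main():
    server = await asyncio.start_server(handle, 'localhost', 9000)
    async with server:
        await server.serve_forever()

asyncio.run(main())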
Process Pool Servers
• Multi-process server
– Use a separate process to handle each request
– Process Pool: dispatcher process and worker processes
– Assign each incoming request to an idle process
• Apache web server supports process pools
• Dynamic Process Pools: vary pool size based on workload
• Advantages
– A worker process crash affects only its request, not the whole application
– Address space isolation across workers
• Disadvantages
– Process switching is more heavyweight than thread switching
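A sketch of the pool structure using Python's concurrent.futures (the requests and pool size here are made up for illustration): each request is handled in a separate worker process, giving the address-space isolation noted above.

from concurrent.futures import ProcessPoolExecutor

def work(request: bytes) -> bytes:
    # stand-in for real request processing; runs in its own address space
    return request.upper()

if __name__ == '__main__':                             # required for multiprocessing start-up
    requests = [b'get /a', b'get /b', b'get /c']       # illustrative requests
    with ProcessPoolExecutor(max_workers=4) as pool:   # dispatcher + 4 worker processes
        for result in pool.map(work, requests):
            print(result)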
Server Architecture
• Sequential
– Serve one request at a time
– Can service multiple requests by employing events and asynchronous communication
• Concurrent
– Server spawns a process or thread to service each request
– Can also use a pre-spawned pool of threads/processes (Apache)
• Thus servers could be
– Purely sequential, event-based, thread-based, or process-based
• Discussion: which architecture is most efficient?
User-level Threads
• Key issues:
– A blocking system call by one thread can block the entire process
– The kernel schedules without knowledge of user-level threads
Scheduler Activation
• User-level threads: scheduling happens at both the user and kernel levels
– A blocking system call by a user thread blocks the whole process
– The kernel may context-switch a thread during an important task (e.g., while it holds a lock)
• Need a mechanism for passing information back and forth between the two levels
• Scheduler activation: an OS mechanism for supporting user-level threads
– Notifies the user-level library of kernel events
– Provides data structures for saving thread context
• Kernel makes upcalls: CPU is available, I/O is done, etc.
• Library informs the kernel: threads created/deleted
– N:M mapping: N user-level threads onto M kernel entities
• Goal: the performance of user-level threads with the behavior of kernel threads
Light-weight Processes
Process Scheduling