Threadingand Multiprocessing
Threadingand Multiprocessing
Ayushman Choudhary
Principle Engineer at VVDN Technologies
[email protected]
Agenda
Introduction to thread and process
Advantages of threading over process
Issues with threading
Python threading
Multiprocessing
Introduction
Thread : is a thread of execution in a program.
Process : is an instance of a computer program that is
being executed.
Thread share the memory and state of the parent, process
share nothing.
Process use inter-process communication to
communicate, thread do not.
Thread
A thread is a light-weight process (it’s a sequence of control
flow).
Except that it exists entirely inside a process and shares
resources.
A single process may have multiple threads of execution.
Useful when an application wants to perform many
concurrent tasks on shared data.
Think about a browser (loading pages,animations, etc.)
Process
A running program is called a "process"
Each process has memory, list of open files, stack, program counter, etc...
Normally, a process execute statements in a single sequence of control flow.
Process creation with fork(),system(), popen(), etc. These commands create
an entirely new process.
Child process runs independently of the parent.
Has own set of resources.
There is minimal sharing of information between parent and child.
Think about using the Unix shell.
Advantages of Threading
Multithreaded programs can run faster on computer systems with multiple CPUs,
because theses threads can be truly concurrent.
A program can remain responsive to input. This is true both on single and on
multiple CPUs.
Allows to do something else while one thread is waiting for an I/O task
(disk,network) to complete.
Some programs are easy to express using concurrency which leads to elegant
solutions, easier to maintain and debug.
Threads of a process can share the memory of global variables. If a global
variable is changed in one thread, this change is valid for all threads. A thread can
have local variables.
Thread Issues
Scheduling - To execute a threaded program, must rapidly switch between
threads.
This can be done by the user process (userlevel threads).
Can be done by the kernel (kernel-level threads).
Resource Sharing - Since threads share memory and other resources, must be very
careful.
Operation performed in one thread could cause problems in another.
Synchronization - Threads often need to coordinate actions.
Can get "race conditions" (outcome dependent on order of thread execution)
Often need to use locking primitives (mutual exclusion locks, semaphores, etc...)
Python Threading
Covered in seperate slide
Multiprocessing
Remember :
– Processes share nothing.
– Processes communicate over interprocess communication
channel .
This was not an issue with Threading module.
Python developers had to find a way for processes to
communicate and share date. Otherwise, The module will not
be as efficient as it is.
Exchange Object Between
Process
Two communication channel
a.) Queues
b.) Pipes
Exchange Object between
Processes
Queues :
– Returns a process shared queue.
– Any pickle-able object can pass through it.
– Thread and process safe.
Pipes :
– Returns a pair of connection objects connect by a pipe.
– Every object has send/recv methods that are used in the
communication between processes.
Sharing state between processes
Shared memory :
– Python provide two ways for the data to be stored in a
shared memory map:
• Value :– The return value is a synchronizedwrapper for the
object.
• Array :– The return value is a synchronized wrapper for the
array.