Concurrent Programming

The document discusses concurrent and parallel programming in Python. It covers: 1) The differences between threads and processes, with threads sharing memory and being lighter weight but more difficult to program. 2) How the Global Interpreter Lock (GIL) in Python means only one thread can run at a time, but concurrency is still useful for I/O-bound tasks. 3) Examples of using Python's threading module to create threads that sleep for varying times to demonstrate threading concepts.

http://iedf.in/index.php/learn/courses/82-intermediate-advanced-python

Concurrent Programming

INTRODUCTION
In this section, we'll talk about multithreading as well as multiprocessing. Threads are lightweight when
compared to processes. All threads of a process share the same memory space. They are faster to create and
destroy. Threads of a process can communicate more easily via queues and events. Processes use separate
memory spaces and they communicate using system-level interprocess communication mechanisms.
Switching from one process to another is an expensive affair.

Despite these advantages of threads, multithreaded applications are more difficult to develop because all
threads share the same memory space: one thread can corrupt another's work. Threads have to coordinate
access to shared resources, via semaphores for example. It's for this reason that Python's default
implementation, CPython, does not support threads that execute at the same time. It implements
something called the Global Interpreter Lock (GIL), which basically means that CPython allows only one
thread to run at a time even if the system has multiple CPUs. It still supports multithreading at the
"application level". When multiple threads can run at the same time, it's called parallel computing. This is
possible only if the system has multiple CPUs. When multiple threads exist at the same time but take
turns to run, it's called concurrent computing. This raises the question, "What use is concurrency when
you have only one CPU?"
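To make the shared-memory hazard concrete, here is a minimal sketch (our own, not from the course material): four threads bump a shared counter, and a threading.Lock makes each read-modify-write atomic so the updates cannot corrupt one another.

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    """Add 1 to the shared counter n times, holding the lock for each update."""
    global counter
    for _ in range(n):
        with lock:   # without this, the read-modify-write could interleave
            counter += 1

threads = [threading.Thread(target=increment, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000: every one of the 4 x 10000 updates was protected
```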

Concurrency is still useful for tasks that are not compute intensive and involve a lot of IO. For example, a
thread may be waiting for disk access or for a response from a web server. During this wait period, another thread
is given the chance to execute. If the program used only a single thread, other tasks would get held up because
of the wait. In fact, the CPU itself may be given to another ready-to-run process on the system. With
Python's multithreading, when the current application-level thread enters a wait, it releases the GIL and
another ready-to-run application-level thread gets to execute.
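As a quick illustration of this point (our own sketch, with time.sleep() standing in for real IO such as a disk read or HTTP request), four waits launched in separate threads overlap: the total elapsed time is close to the longest single wait, not the sum of all four.

```python
import threading
import time

def wait_for_io(secs):
    time.sleep(secs)   # stand-in for a blocking read or network request

delays = [0.2, 0.2, 0.2, 0.2]

start = time.monotonic()
threads = [threading.Thread(target=wait_for_io, args=(d,)) for d in delays]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

# The waits overlap: total time is close to the longest single wait (~0.2 s),
# not the sum (~0.8 s), even though the GIL lets only one thread *run* at a time.
print("elapsed: {:.2f} secs".format(elapsed))
```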

We will look at the following Python modules:

 Concurrent execution: threading, multiprocessing, sched, queue
 Make system calls: subprocess

BASICS OF PYTHON'S THREADING MODULE


In Python 2, there's the thread module, which in Python 3 has been renamed to _thread. The recommended
way to write multithreaded applications is to use the threading module, which provides a higher-level
interface.

We start with a simple example where the function sleeper() is the entry point for each child thread created by
the main thread. Arguments can be passed to this function when the thread is created. The function starts
executing when the start() method is called on the thread object. The threads do nothing more than sleep for
predetermined periods of time. The prints help us understand when the threads start and exit the system.
What's interesting in this example is that the main thread exits after it has created all the child threads. Since
these child threads are non-daemon, the program continues to run until all threads finish. If the child threads are
created as daemons, you will notice that the program exits as soon as the main thread finishes; daemon
threads will be forcefully terminated.

import threading
import datetime
import time

def strtimenow():
    """Get current time as a string in hh:mm:ss format."""
    return datetime.datetime.strftime(datetime.datetime.now(), "%H:%M:%S")

def threadids():
    """Get tuple of parent thread ID and current thread ID."""
    return (threading.main_thread().ident, threading.current_thread().ident)

def sleeper(sleepfor=5):
    print("[{:s}] [{}] [{:s}] Entering function: sleeping for {:d} secs. Main thread alive: {}".format(
        strtimenow(), threadids(), threading.current_thread().name, sleepfor,
        threading.main_thread().is_alive()))
    time.sleep(sleepfor)
    print("[{:s}] [{}] [{:s}] Exiting function. Main thread alive: {}".format(
        strtimenow(), threadids(), threading.current_thread().name,
        threading.main_thread().is_alive()))

if __name__ == '__main__':
    numThreads = 4
    threadnames = ['AB', 'MN', 'XY', 'IJ']
    sleeptimes = [4, 6, 3, 5]

    for name, numsecs in zip(threadnames, sleeptimes):
        t = threading.Thread(name=name, target=sleeper, args=(numsecs,), daemon=False)
        t.start()

In the preceding example, we were forced to obtain the thread's context within the function. In the next
example, we take an object-oriented approach. The thread's sleep time is specified at the time of creation via
the constructor. This is saved within the thread object and used by the run() method. This method is
automatically called when the thread object's start() is called. It will be obvious that the PIDs remain the
same for all threads because they are all part of the same process. The main thread that created the child
threads will wait for children to finish using the join() method.

import threading
import datetime
import time
import os

def strtimenow():
    """Get current time as a string in hh:mm:ss format."""
    return datetime.datetime.strftime(datetime.datetime.now(), "%H:%M:%S")

def pids():
    """Get tuple of parent PID and current PID."""
    return (os.getppid(), os.getpid())

class SleeperThread(threading.Thread):
    def __init__(self, sleepfor=5):
        self.sleepfor = sleepfor
        super().__init__()

    def run(self):
        print("[{:s}] [{}] Entering thread: ID {:d}".format(strtimenow(), pids(), self.ident))
        print("[{:s}] [{}] Sleeping thread {:d} for {:d} secs".format(
            strtimenow(), pids(), self.ident, self.sleepfor))
        time.sleep(self.sleepfor)
        print("[{:s}] [{}] Exiting thread: ID {:d}".format(strtimenow(), pids(), self.ident))

if __name__ == '__main__':
    numThreads = 4
    sleeptimes = [4, 6, 3, 5]

    # Create and start the threads
    print("[{:s}] [{}] Starting {:d} child threads ...".format(strtimenow(), pids(), numThreads))
    childthrds = []
    for sleepfor in sleeptimes:
        childthrds.append(SleeperThread(sleepfor))
        childthrds[-1].start()

    print("[{:s}] [{}] Waiting for child threads ...".format(strtimenow(), pids()))
    for child in childthrds:
        child.join()
        print("[{:s}] [{}] Child thread {:d} joined.".format(strtimenow(), pids(), child.ident))

In the preceding example, the main thread does not reap the child threads as soon as they are done. Reaping
is done in a linear order since join() is a blocking call. We improve this in the next example by using a
timeout. We also enhance the constructor so that extra arguments such as "name" can be passed to the thread
object.

import threading
import datetime
import time
import os

def strtimenow():
    """Get current time as a string in hh:mm:ss format."""
    return datetime.datetime.strftime(datetime.datetime.now(), "%H:%M:%S")

def pids():
    """Get tuple of parent PID and current PID."""
    return (os.getppid(), os.getpid())

class SleeperThread(threading.Thread):
    def __init__(self, sleepfor=5, *args, **kwargs):
        self.parent_ident = threading.current_thread().ident
        self.sleepfor = sleepfor
        super().__init__(*args, **kwargs)

    def run(self):
        print("[{:s}] [{}] [{}] [{}] Entering thread.".format(
            strtimenow(), pids(), self.threadids(), self.name))
        print("[{:s}] [{}] [{}] [{}] Sleeping thread for {:d} secs".format(
            strtimenow(), pids(), self.threadids(), self.name, self.sleepfor))
        time.sleep(self.sleepfor)
        print("[{:s}] [{}] [{}] [{}] Exiting thread.".format(
            strtimenow(), pids(), self.threadids(), self.name))

    def threadids(self):
        """Get tuple of parent thread ID and current thread ID."""
        return (self.parent_ident, self.ident)

if __name__ == '__main__':
    numThreads = 4
    threadnames = ['AB', 'MN', 'XY', 'IJ']
    sleeptimes = [4, 6, 3, 5]
    threadid = threading.current_thread().ident

    # Create and start the threads
    print("[{:s}] [{}] [{:d}] Starting {:d} child threads ...".format(
        strtimenow(), pids(), threadid, numThreads))
    childthrds = []
    for name, sleepfor in zip(threadnames, sleeptimes):
        childthrds.append(SleeperThread(sleepfor, name=name))
        childthrds[-1].start()

    # Waiting for children can also be done with threading.active_count() or threading.enumerate()
    print("[{:s}] [{}] [{:d}] Waiting for child threads ...".format(strtimenow(), pids(), threadid))
    while any(childthrds):
        for i, child in enumerate(childthrds):
            if child and child.is_alive():
                child.join(timeout=0.5)
            if child and not child.is_alive():
                print("[{:s}] [{}] [{}] Child thread {:d} joined.".format(
                    strtimenow(), pids(), threadid, child.ident))
                childthrds[i] = None

All the above examples show threads that go to sleep, which is why every thread gets a chance to run. What
happens if we have a thread that does only computations and doesn't sleep or wait for IO? Will it hog the CPU
and lock out other threads in the Python process? The following example shows that this is not the case.
The Python interpreter does not do any thread scheduling, but it does preempt the active thread at regular
intervals so that another thread that's ready to run can acquire the GIL. Thus, the interpreter facilitates time
slicing of threads, but it's the OS that does the actual scheduling. In Python 2, the switching was not based on
time but on a bytecode count, with a default of 100, which could be checked with sys.getcheckinterval(). In
Python 3, switching is time based: the default switch interval is 5 milliseconds, which can be checked
with sys.getswitchinterval().
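In Python 3, the relevant knob can be inspected and tuned with sys.getswitchinterval() and sys.setswitchinterval(); a small illustration (our own, not from the original text):

```python
import sys

# How long a thread may hold the GIL before being asked to release it
# (Python 3; the default is typically 0.005 seconds).
interval = sys.getswitchinterval()
print(interval)

# It can be tuned: a larger value means fewer forced switches.
sys.setswitchinterval(0.01)
print(sys.getswitchinterval())
sys.setswitchinterval(interval)  # restore the original value
```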

import threading
import datetime
import time
import math

def strtimenow():
    """Get current time as a string in hh:mm:ss format."""
    return datetime.datetime.strftime(datetime.datetime.now(), "%H:%M:%S")

def threadids():
    """Get tuple of parent thread ID and current thread ID."""
    return (threading.main_thread().ident, threading.current_thread().ident)

def hogger(n):
    return sum([math.cos(x) for x in range(n)])

def threadentry(func, n):
    print("[{:s}] [{}] [{:s}] Entering function.".format(
        strtimenow(), threadids(), threading.current_thread().name))
    result = func(n)
    print("[{:s}] [{}] [{:s}] Exiting function. Result: {}".format(
        strtimenow(), threadids(), threading.current_thread().name, result))

if __name__ == '__main__':
    numThreads = 4
    threadnames = ['AB', 'MN', 'XY', 'IJ']
    sleeptimes = [4, 6, 3, 5]

    t = threading.Thread(name='Hogger', target=threadentry, args=(hogger, 100000000), daemon=False)
    t.start()

    time.sleep(1)
    for name, numsecs in zip(threadnames, sleeptimes):
        t = threading.Thread(name=name, target=threadentry, args=(time.sleep, numsecs), daemon=False)
        t.start()

SEMAPHORES, QUEUES AND EVENTS


Threads often need to share common resources. What happens if two threads try to access the same variable
at the same time? What if one thread is trying to modify an object while another attempts to read it? Another
scenario is when we wish to limit the number of active threads. For example, a process may have 8 threads but
only 4 are allowed to access a specific shared resource at the same time. It's for these reasons that semaphores
were invented. In the code below, we return to our old example of function-based threading.
The methods acquire() and release() do the job of synchronization. The former will block if the
semaphore has already been acquired by n other threads, where n is an argument passed to the semaphore's
constructor.

import threading
import datetime
import time

def strtimenow():
    """Get current time as a string in hh:mm:ss format."""
    return datetime.datetime.strftime(datetime.datetime.now(), "%H:%M:%S")

def threadids():
    """Get tuple of parent thread ID and current thread ID."""
    return (threading.main_thread().ident, threading.current_thread().ident)

def sleeper(sem, sleepfor=5):
    print("[{:s}] [{}] [{:s}] Entering function: Acquiring semaphore ...".format(
        strtimenow(), threadids(), threading.current_thread().name))
    sem.acquire()
    print("[{:s}] [{}] [{:s}] Entering function: sleeping for {:d} secs. Main thread alive: {}".format(
        strtimenow(), threadids(), threading.current_thread().name, sleepfor,
        threading.main_thread().is_alive()))
    time.sleep(sleepfor)
    print("[{:s}] [{}] [{:s}] Exiting function. Main thread alive: {}".format(
        strtimenow(), threadids(), threading.current_thread().name,
        threading.main_thread().is_alive()))
    sem.release()

if __name__ == '__main__':
    numThreads = 4
    threadnames = ['AB', 'MN', 'XY', 'IJ']
    sleeptimes = [4, 6, 3, 5]
    sem = threading.Semaphore(2)

    for name, numsecs in zip(threadnames, sleeptimes):
        t = threading.Thread(name=name, target=sleeper, args=(sem, numsecs), daemon=False)
        t.start()

Here are a few interesting changes you can do to the example above:

 Create the semaphore with its default value and observe how the threads run: sem = threading.Semaphore()

 In the sleeper() function, release the semaphore twice. Python will not complain but this should be treated
as a bug.
 Repeat the above but change the semaphore to a bounded semaphore: sem = threading.BoundedSemaphore(2)
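The second and third bullets can be sketched in isolation (our own illustration): a plain Semaphore silently accepts an extra release(), while a BoundedSemaphore raises ValueError when its value would exceed the initial one.

```python
import threading

# A plain Semaphore silently accepts an extra release() ...
sem = threading.Semaphore(2)
sem.acquire()
sem.release()
sem.release()          # logically a bug, but no error is raised

# ... whereas a BoundedSemaphore refuses to grow past its initial value.
bsem = threading.BoundedSemaphore(2)
bsem.acquire()
bsem.release()
try:
    bsem.release()     # one release too many
    over_released = True
except ValueError:
    over_released = False

print(over_released)   # False: the extra release raised ValueError
```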

In fact, semaphores can be used as context managers. This means that the following is a
better way to write the sleeper() function:

def sleeper(sem, sleepfor=5):
    print("[{:s}] [{}] [{:s}] Entering function: Acquiring semaphore ...".format(
        strtimenow(), threadids(), threading.current_thread().name))
    with sem:
        print("[{:s}] [{}] [{:s}] Entering function: sleeping for {:d} secs. Main thread alive: {}".format(
            strtimenow(), threadids(), threading.current_thread().name, sleepfor,
            threading.main_thread().is_alive()))
        time.sleep(sleepfor)
        print("[{:s}] [{}] [{:s}] Exiting function. Main thread alive: {}".format(
            strtimenow(), threadids(), threading.current_thread().name,
            threading.main_thread().is_alive()))

Often threads need to communicate with one another, and queues make this easier. Some threads may
write to the queue while others read from it; queues manage the locking automatically.
In the example below, the main thread creates a queue and adds a number of URLs to it. Each URL
points to an image on the internet. The main thread then starts child threads, which remove items
from the queue and process them. When a thread sees that the queue is empty, it terminates. The
main thread waits on the queue to ensure that all items in the queue have been processed.

import threading
import queue
import requests
import os
import shutil
import datetime

class HttpImgThread(threading.Thread):
    def __init__(self, urlqueue, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.__urlqueue = urlqueue

    def run(self):
        print("[{:s}] Starting ...".format(self.name))
        while True:
            try:
                url = self.__urlqueue.get_nowait()
                if url:
                    self.__download_img(url)
                self.__urlqueue.task_done()
            except queue.Empty:
                print("[{:s}] Exiting.".format(self.name))
                return

    def __download_img(self, url):
        """Fetches raw contents given the image URL and saves it."""
        print("[{:s}] Fetching {}".format(self.name, url))
        rsp = requests.get(url, headers=self.__make_http_header(), stream=True)
        imgname = os.path.basename(url)
        with open(imgname, 'wb') as ofile:
            print("[{:s}] Saving to {}".format(self.name, imgname))
            shutil.copyfileobj(rsp.raw, ofile)

    def __make_http_header(self):
        return {
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language': 'en-us,en;q=0.5',
            'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:28.0) Gecko/20100101 Firefox/28.0',
        }

if __name__ == '__main__':
    start_ts = datetime.datetime.now()

    # Read URLs from file
    with open('images.txt', 'r') as f:
        urls = f.readlines()

    # Initialize the queue
    urlqueue = queue.Queue()
    _ = [urlqueue.put(url.strip()) for url in urls]

    # Start the threads
    numthreads = 4
    threads = []
    for i in range(numthreads):
        newThrd = HttpImgThread(urlqueue, name="Thread {:d}".format(i))
        threads.append(newThrd)
        newThrd.start()

    # Wait on the queue
    urlqueue.join()

    end_ts = datetime.datetime.now()
    duration = (end_ts - start_ts).total_seconds()
    print("Total execution time: {:.6f} secs.".format(duration))

Here are a few interesting changes you can do to the example above:

 Use a LIFO queue and see how the order of processing changes: queue.LifoQueue()
 Use a priority queue and see how the order of processing changes: queue.PriorityQueue(). For this to work,
items have to be added to the queue as tuples: (priority_num, url). Smaller numbers have higher priorities.
 Create the queue with a limit: queue.Queue(4). You will see that the main thread blocks indefinitely.
 Create the queue, start the threads and then add URLs to the queue. You will see that some child threads will
exit immediately. Results will also vary depending on the number of URLs.
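The priority-queue bullet can be sketched on its own (our own example, with hypothetical file names): items go in as (priority, payload) tuples and come out in ascending priority order, regardless of insertion order.

```python
import queue

# Items are (priority, payload) tuples; lower numbers come out first.
pq = queue.PriorityQueue()
for item in [(3, 'c.png'), (1, 'a.png'), (2, 'b.png')]:
    pq.put(item)

order = []
while not pq.empty():
    order.append(pq.get())

print(order)  # [(1, 'a.png'), (2, 'b.png'), (3, 'c.png')]
```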

The threads in the previous example can be said to be autonomous. They start working on items in the queue
and terminate when the queue is empty. This is possible because we know in advance what should go into
the queue. What happens if the URLs to be processed are determined dynamically after the threads have
already started? In that case, the above code will not work because threads will start and terminate as soon as
they see an empty queue. What about when the queue has a limit and the main thread attempts to put more
items into the queue? We can solve these scenarios using events.

import threading
import queue
import requests
import os
import shutil
import datetime

class HttpImgThread(threading.Thread):
    def __init__(self, urlqueue, done_event, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.__urlqueue = urlqueue
        self.__done_event = done_event

    def run(self):
        print("[{:s}] Starting ...".format(self.name))
        while not self.__done_event.wait(timeout=0.5):
            try:
                url = self.__urlqueue.get(timeout=1)
                if url:
                    self.__download_img(url)
                self.__urlqueue.task_done()
            except queue.Empty:
                continue
        print("[{:s}] Exiting.".format(self.name))

    def __download_img(self, url):
        """Fetches raw contents given the image URL and saves it."""
        print("[{:s}] Fetching {}".format(self.name, url))
        rsp = requests.get(url, headers=self.__make_http_header(), stream=True)
        imgname = os.path.basename(url)
        with open(imgname, 'wb') as ofile:
            print("[{:s}] Saving to {}".format(self.name, imgname))
            shutil.copyfileobj(rsp.raw, ofile)

    def __make_http_header(self):
        return {
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language': 'en-us,en;q=0.5',
            'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:28.0) Gecko/20100101 Firefox/28.0',
        }

if __name__ == '__main__':
    start_ts = datetime.datetime.now()

    # Read URLs from file
    with open('images.txt', 'r') as f:
        urls = f.readlines()

    # Create the queue and an event object
    urlqueue = queue.Queue()
    done_event = threading.Event()

    # Start the threads
    numthreads = 4
    threads = []
    for i in range(numthreads):
        newThrd = HttpImgThread(urlqueue, done_event, name="Thread {:d}".format(i))
        threads.append(newThrd)
        newThrd.start()

    # Add items to the queue
    _ = [urlqueue.put(url.strip()) for url in urls]

    # Wait on the queue
    urlqueue.join()
    done_event.set()  # signal that there is nothing more to do

    end_ts = datetime.datetime.now()
    duration = (end_ts - start_ts).total_seconds()
    print("Total execution time: {:.6f} secs.".format(duration))

TRUE MULTITHREADING
While true multithreading in the sense of parallel computing is not possible with the default Python
implementation, there are other implementations that allow it. Jython (Java) and IronPython (C#) are
alternatives that don't have the GIL. Unfortunately, Jython is available only for Python 2. IronPython
supports Python 3, but community interest seems to be limited and IronPython3 appears to still be in
development; going by its GitHub account, no formal releases have been made.

Building IronPython3 on Windows 10 is quite easy. Two things are needed to build it: .NET Framework,
version 3.5 SP1 and above; Microsoft Visual Studio. Using the Developer Command Prompt for VS,
navigate to the folder where IronPython has been unzipped. Type make to build. Add IronPython3 base path
+ '\bin\Debug' to your environment PATH variable. IronPython3 can then be launched by typing "ipy".
However, it's not usable! Trying to import many modules will fail!

This basically means that true multithreading is possible only with Jython or IronPython, and for Python 2. If
you really want to use all your CPU cores, the next best option (perhaps a better option) is to use multiple
processes.

PYTHON'S MULTIPROCESSING MODULE


Creating multiple processes is as simple as the following code:

import os, time
from multiprocessing import Pool

def sqr(x):
    return os.getppid(), os.getpid(), x*x

if __name__ == "__main__":
    with Pool(5) as p:
        time.sleep(1)
        print(p.map(sqr, range(3)))

Try a couple of interesting changes to the above code:

 Remove the sleep in the main thread. Are you getting different child PIDs? Explain.
 Without the sleep, increase the job queue to range(20). What happens to the child PIDs? Explain.

The Pool class creates a pool of processes and distributes the input data among the available processes in the
pool. If we wish to have greater control over individual processes, the following example shows how to
do just that. It also shows how processes can communicate using the multiprocessing.Queue class.

import time
import multiprocessing

def child(p2c, c2p):
    while True:
        msg = p2c.get()
        if msg != "Here's some money. Enjoy yourself!":
            reply = "I'm bored, Papa!"
            print("Child:", reply)
            c2p.put(reply)
        else:
            # Child wants to hear only one thing!
            reply = "That's what I want to hear!"
            print("Child:", reply)
            c2p.put(reply)
            break

if __name__ == '__main__':
    p2c, c2p = multiprocessing.Queue(), multiprocessing.Queue()  # two unidirectional queues
    p = multiprocessing.Process(target=child, args=(p2c, c2p))
    p.start()
    print("Current children:", multiprocessing.active_children())

    # Parent doesn't listen to child!
    # Just loops through some standard statements
    msgs = [
        "How are you, my child?",
        "What's the matter?",
        "Don't you have exams?",
        "Here's some money. Enjoy yourself!",
    ]

    for msg in msgs:
        time.sleep(1)
        print("Papa:", msg)
        p2c.put(msg)
        p.join(timeout=2)  # join() always returns None; check exitcode instead
        if p.is_alive():
            continue
        elif p.exitcode:
            print("Child exited unexpectedly. Exit code: {:d}".format(p.exitcode))
            break
        else:
            print("Child exited normally.")
            break

    # It's okay to wait on a process that has already exited
    p.join()

    # Queue status
    print("Parent to child: {} unread messages.".format(p2c.qsize()))
    print("Child to parent: {} unread messages.".format(c2p.qsize()))

In the previous example we used two unidirectional queues. The alternative is to use
multiprocessing.Pipe(), which encapsulates a pair of connections. By default, the connections are
bidirectional. The connections can be thought of as the ends of the pipe joining the two processes. In fact, a
pipe can connect only two endpoints, while queues can have multiple producers and consumers. We rewrite the
above example in terms of a pipe.

import time
import multiprocessing

def child(pend, cend):
    while True:
        msg = cend.recv()
        if msg != "Here's some money. Enjoy yourself!":
            reply = "I'm bored, Papa!"
            print("Child:", reply)
            cend.send(reply)
        else:
            # Child wants to hear only one thing!
            reply = "That's what I want to hear!"
            print("Child:", reply)
            cend.send(reply)
            break

if __name__ == '__main__':
    pend, cend = multiprocessing.Pipe()  # parent and child ends
    p = multiprocessing.Process(target=child, args=(pend, cend))
    p.start()
    print("Current children:", multiprocessing.active_children())

    # Parent doesn't listen to child!
    # Just loops through some standard statements
    msgs = [
        "How are you, my child?",
        "What's the matter?",
        "Don't you have exams?",
        "Here's some money. Enjoy yourself!",
    ]

    for msg in msgs:
        time.sleep(1)
        print("Papa:", msg)
        pend.send(msg)
        p.join(timeout=2)  # join() always returns None; check exitcode instead
        if p.is_alive():
            continue
        elif p.exitcode:
            print("Child exited unexpectedly. Exit code: {:d}".format(p.exitcode))
            break
        else:
            print("Child exited normally.")
            break

    # It's okay to wait on a process that has already exited
    p.join()

SCHEDULING EVENTS
An event in this context is anything of importance that happens within the system (not to be confused
with threading.Event). The scheduler schedules events to happen at a time in the future. When that time
arrives, an associated function is called to process the event; we may call this function an event handler. In
the field of simulation, discrete event scheduling is perhaps the perfect example of where the sched module
becomes useful. For example, if one wishes to simulate customers arriving at a post office that has three
queues, the arrivals are all events and can be managed by this module. The following example illustrates
how to use some functions of the module:

import sched, time

def event_handler(name="TASK 0"):
    print("[{}] Inside handler. I am {}.".format(time.time(), name))

# Create scheduler
s = sched.scheduler(time.time, time.sleep)

now = time.time()
print("Current time: {}".format(now))

# Schedule event in relative time (now + 8 secs) with priority 1
s.enter(8, 1, event_handler)  # use default name

# Schedule events in absolute time with differing priorities
future = int(now) + 4
s.enterabs(future, 2, event_handler, argument=("TASK 2",))  # positional arg
s.enterabs(future, 1, event_handler, kwargs={'name': "TASK 1"})  # keyword arg

# See what's queued up
print("Queued events: {}".format(s.queue))
print("Is the queue empty: {}".format(s.empty()))

# Start execution
s.run()  # will block till all events have occurred
print("Is the queue empty: {}".format(s.empty()))

In the above example, what would happen if you try to schedule an event in the past? What would happen if
an event handler takes so long that the processing of the next event doesn't happen at the scheduled time?
These are scenarios you can try out to understand the working of the scheduler better.
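As a small sketch of the first question (our own code): scheduling an event at an absolute time in the past does not raise an error; run() simply fires it immediately, before any future events.

```python
import sched, time

log = []

def handler(name):
    # Record which event fired and when
    log.append((name, time.time()))

s = sched.scheduler(time.time, time.sleep)

# An absolute time 5 seconds in the past: run() does not fail, it fires
# the event immediately because its deadline has already passed.
s.enterabs(time.time() - 5, 1, handler, argument=("past",))
s.enter(0.2, 1, handler, argument=("future",))

start = time.time()
s.run()

print(log)  # the "past" event fires first, essentially at once
```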

MAKING SYSTEM CALLS USING SUBPROCESS


If you need to call built-in shell commands or other programs from within a Python program,
then subprocess is the module to look at. A separate process will be spawned by the OS and that process may
in turn start its own child processes to complete the task. The following code shows some example usage of
the module's functions:

import subprocess
import sys
import os

# Windows dir command is built into the shell
# Output comes to stdout
subprocess.run(["dir", "/w", "*.py"], shell=True)
print("="*80)

# Connect to output and error streams that we can read
res = subprocess.run(["type", sys.argv[0], "DummyXyz"], shell=True,
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
print("Call args:", res.args)
print("Return code:", res.returncode)
print("Result: {:s}".format(res.stdout.decode()))  # bytes -> str
print("Errors: {}".format(res.stderr.decode()))  # bytes -> str
print("="*80)

# Start another script and read the results
# This is not a built-in shell command: hence the shell arg is not needed
with open('another.py', 'w') as f:
    f.write('print("This is from another script!")')
res = subprocess.run(["python", "another.py"], stdout=subprocess.PIPE)
print("Result: {:s}".format(res.stdout.decode()))  # bytes -> str
os.remove('another.py')
print("="*80)

# Start a non-default console such as Windows PowerShell
# Get first 10 lines of this script: print with line numbers
cmd = "powershell Get-Content {} -TotalCount 10".format(sys.argv[0])
res = subprocess.run(cmd.split(), stdout=subprocess.PIPE)
for i, line in enumerate(res.stdout.decode().strip().split("\r\n")):
    print("{:2d} {}".format(i+1, line))

We should note that the subprocess.run() function sits on top of the lower-level subprocess.Popen interface. By
design, subprocess.run() is blocking. If you require non-blocking execution, then subprocess.Popen() can be
used directly. The following shows how to use it:

import subprocess
import time
import psutil

# This must run on a folder with lots of content: many GB
cmd = "dir /w /S .."

# You can use your Task Manager (Windows) or System Monitor (Ubuntu)
# to monitor CPU load

# Recursively shows contents of the parent folder
# Terminate the child process before it has finished
# Process(es) started by the child to run the command will remain
# On some systems (Windows 10) the command will run in the child process
print("First implementation ...")
time.sleep(2)
child = subprocess.Popen(cmd, shell=True)
time.sleep(1)
child.terminate()
child.wait()

# One way to terminate the entire process tree of a child
# Force the command to execute in the child process
# Note that exec is not available on Windows
# Error in initiating the child process will not terminate this process
print("Second implementation: using exec ...")
time.sleep(2)
child = subprocess.Popen("exec " + cmd, shell=True)
time.sleep(1)
child.terminate()
child.wait()

# Another way to terminate the child's process tree
# Source: http://stackoverflow.com/questions/4789837/how-to-terminate-a-python-subprocess-launched-with-shell-true
def killall(pid):
    process = psutil.Process(pid)
    for child in process.children(recursive=True):
        child.kill()
    process.kill()

print("Third implementation: using psutil ...")
time.sleep(2)
child = subprocess.Popen(cmd, shell=True)
try:
    child.wait(timeout=1)
except subprocess.TimeoutExpired:
    killall(child.pid)
