Unit 3


Python Programming

UNIT – 3
Week - 7

1. Working with Modules


sys.path is a variable in the sys module that holds the list of directories that
the interpreter searches for a required module.

When a module is imported within a Python file, the interpreter first searches for the specified
module among its built-in modules. If it is not found there, the interpreter looks through the
list of directories defined by sys.path.

Note: sys.path is an ordinary list and can be manipulated.
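
Because sys.path is an ordinary list, a directory can be added to it at run time. The following is a minimal sketch rather than part of the original notes; the module name demo_mod and its GREETING attribute are hypothetical and exist only for illustration.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway directory holding a tiny module.
# "demo_mod" and "GREETING" are made-up names for this sketch.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_mod.py"), "w") as f:
    f.write("GREETING = 'hello from demo_mod'\n")

# Appending the directory makes the interpreter search it on import.
sys.path.append(module_dir)
importlib.invalidate_caches()  # make sure the newly written file is noticed

import demo_mod
print(demo_mod.GREETING)
```

Appending places the new directory last in the search order; inserting it at the front of the list instead would give it priority over the existing entries.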

Example 1: Listing out all the paths


Python3

import sys
print(sys.path)

Output:

Example 2: Truncating the value of sys.path


• Python3
import sys

# Removing all search-path entries
sys.path = []

# Importing pandas after emptying sys.path
import pandas

Output:
ModuleNotFoundError: No module named 'pandas'

sys.modules is a dictionary that maps the names of the modules already imported in the current
interpreter session to their module objects.
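
As a small sketch of how an import shows up in sys.modules (the choice of the json module here is arbitrary and only for illustration):

```python
import sys

# Importing a module adds an entry for it to sys.modules.
import json
print('json' in sys.modules)

# The stored value is the module object itself.
print(sys.modules['json'] is json)
```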

Example:
• Python3

import sys
print(sys.modules)

Output:

2. Reference Count

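The sys.getrefcount() function returns the reference count of a given object. Python uses this
count for memory management: when it drops to 0, the memory held by the object is freed. The
value returned is generally one higher than you might expect, because it includes the temporary
reference created by passing the object as an argument to getrefcount().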

Example:
Python3

import sys

a = 'Geeks'

print(sys.getrefcount(a))

Output

3. More Functions in Python sys

Function                    Description

sys.setrecursionlimit()     Sets the maximum depth of the Python interpreter
                            stack to the required limit.

sys.getrecursionlimit()     Returns the current recursion limit, that is, the
                            maximum depth of the Python interpreter stack.

sys.settrace()              Registers a trace function with the Python
                            interpreter; used for implementing debuggers,
                            profilers, and coverage tools. The setting is
                            thread-specific; for threads, register the trace
                            function using threading.settrace().

sys.setswitchinterval()     Sets the interpreter’s thread switch interval
                            (in seconds).

sys.maxsize                 An attribute (not a function) holding the largest
                            value a variable of type Py_ssize_t can store.

sys.maxint                  Python 2 only: the highest value that can be
                            represented by a plain integer (like INT_MAX).
                            It was removed in Python 3.

sys.getdefaultencoding()    Returns the current default string encoding used
                            by the Unicode implementation.
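
A few of these can be tried together. The following is a minimal sketch, assuming Python 3; the increase of 500 is an arbitrary value chosen for illustration.

```python
import sys

old_limit = sys.getrecursionlimit()
sys.setrecursionlimit(old_limit + 500)   # allow deeper recursion
new_limit = sys.getrecursionlimit()
sys.setrecursionlimit(old_limit)         # restore the original limit

print(old_limit, new_limit)
print(sys.maxsize)                # an attribute, read without parentheses
print(sys.getdefaultencoding())
```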

4. Directory traversal tool


The os.walk() function of the os module can be used to list all the directories in a tree. It
generates the file names in a directory tree by walking the tree either top-down or bottom-up.
For each directory in the tree rooted at directory top (including top itself), it yields a
3-tuple (dirpath, dirnames, filenames).

• dirpath: A string that is the path to the directory.
• dirnames: A list of the names of the subdirectories in dirpath.
• filenames: A list of the names of the non-directory files in dirpath.

Syntax: os.walk(top, topdown=True, onerror=None, followlinks=False)

Parameters:

top: Starting directory for os.walk().

topdown: If this optional argument is True then the directories are scanned from top-down
otherwise from bottom-up. This is True by default.

onerror: An optional function that is called with an OSError instance when an error occurs.

followlinks: This visits directories pointed to by symlinks, if set to True.

Return Type: For each directory in the tree rooted at directory top (including top itself), it
yields a 3-tuple (dirpath, dirnames, filenames).
We want to list all the subdirectories and files inside the directory Test. Below is the
implementation.

# Python program to list out
# all the sub-directories and files

import os

# List to store an entry for each directory
L = []

# Traversing through Test
for root, dirs, files in os.walk('Test'):
    # Adding this directory's (root, dirs, files) tuple to the list
    L.append((root, dirs, files))

print("List of all sub-directories and files:")
for i in L:
    print(i)

Output:

List of all sub-directories and files:


('Test', ['B', 'C', 'D', 'A'], [])

('Test/B', [], [])

('Test/C', [], ['test2.txt'])

('Test/D', ['E'], [])

('Test/D/E', [], [])

('Test/A', ['A2', 'A1'], [])

('Test/A/A2', [], [])

('Test/A/A1', [], ['test1.txt'])

The above code can be shortened using List Comprehension which is a more Pythonic way.
Below is the implementation.

# Python program to list out
# all the sub-directories and files

import os

# List comprehension to collect an entry
# for each directory into the list
L = [(root, dirs, files) for root, dirs, files in os.walk('Test')]

print("List of all sub-directories and files:")
for i in L:
    print(i)

Output:

List of all sub-directories and files:

('Test', ['B', 'C', 'D', 'A'], [])

('Test/B', [], [])

('Test/C', [], ['test2.txt'])


('Test/D', ['E'], [])

('Test/D/E', [], [])

('Test/A', ['A2', 'A1'], [])

('Test/A/A2', [], [])

('Test/A/A1', [], ['test1.txt'])
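
The topdown parameter described above can be explored with a sketch like the following. It is illustrative only: it builds its own temporary tree instead of using the Test directory, and the names A, A1, and B as well as the helper visit_order are hypothetical.

```python
import os
import tempfile

# Build a small throwaway tree: top/A/A1 and top/B (hypothetical names).
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, 'A', 'A1'))
os.makedirs(os.path.join(top, 'B'))

def visit_order(topdown):
    "return directory paths, relative to top, in the order os.walk yields them"
    order = []
    for dirpath, dirnames, filenames in os.walk(top, topdown=topdown):
        dirnames.sort()   # in top-down mode this fixes the visiting order
        order.append(os.path.relpath(dirpath, top))
    return order

top_down = visit_order(True)
bottom_up = visit_order(False)
print(top_down)     # the starting directory '.' comes first
print(bottom_up)    # the starting directory '.' comes last
```

In top-down mode the dirnames list may be modified in place to reorder or prune the walk; in bottom-up mode such changes have no effect on the traversal, because subdirectories are visited before their parent is yielded.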

5. Parallel System Tools

➢ Most computers spend a lot of time doing nothing. If you start a system monitor tool and
watch the CPU utilization, it’s rare to see one hit 100 percent, even when you are running
multiple programs.
➢ There are just too many delays built into software: disk accesses, network traffic, database
queries, waiting for users to click a button, and so on.
➢ In fact, the majority of a modern CPU’s capacity is often spent in an idle state; faster chips
help speed up performance demand peaks, but much of their power can go largely unused.
➢ Early on in computing, programmers realized that they could tap into such unused
processing power by running more than one program at the same time.
➢ By dividing the CPU’s attention among a set of tasks, its capacity need not go to waste
while any given task is waiting for an external event to occur.
➢ The technique is usually called parallel processing (and sometimes “multiprocessing” or
even “multitasking”) because many tasks seem to be performed at once, overlapping and
parallel in time.
➢ It’s at the heart of modern operating systems, and it gave rise to the notion of multiple-
active-window computer interfaces we’ve all come to take for granted.
➢ Even within a single program, dividing processing into tasks that run in parallel can make
the overall system faster, at least as measured by the clock on your wall.
➢ Just as important is that modern software systems are expected to be responsive to users
regardless of the amount of work they must perform behind the scenes.
➢ It’s usually unacceptable for a program to stall while busy carrying out a request. Consider
an email-browser user interface, for example; when asked to fetch email from a server, the
program must download text from a server over a network.
➢ If you have enough email or a slow enough Internet link, that step alone can take minutes
to finish. But while the download task proceeds, the program as a whole shouldn’t stall—
it still must respond to screen redraws, mouse clicks, and so on.
➢ Parallel processing comes to the rescue here, too. By performing such long-running tasks
in parallel with the rest of the program, the system at large can remain responsive no matter
how busy some of its parts may be.
➢ Moreover, the parallel processing model is a natural fit for structuring such programs and
others; some tasks are more easily conceptualized and coded as components running as
independent, parallel entities.
➢ There are two fundamental ways to get tasks running at the same time in Python—process
forks and spawned threads. Functionally, both rely on underlying operating system
services to run bits of Python code in parallel.
➢ Procedurally, they are very different in terms of interface, portability, and communication.
For instance, at this writing direct process forks are not supported on Windows under
standard Python (though they are under Cygwin Python on Windows).
➢ By contrast, Python’s thread support works on all major platforms.
➢ Moreover, the os.spawn family of calls provides additional ways to launch programs in a
platform-neutral way that is similar to forks, and the os.popen and os.system calls
and the subprocess module can be used to spawn programs portably with shell commands.
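
As one hedged illustration of the portable route mentioned in the last point, the subprocess module can start another program and collect its output. This sketch is not from the original notes; it launches a second Python interpreter, and the message printed by the child is arbitrary.

```python
import subprocess
import sys

# Run a second Python interpreter as a child process and capture its output.
# sys.executable is the path of the interpreter running this script.
result = subprocess.run(
    [sys.executable, '-c', 'print("hello from child program")'],
    capture_output=True,
    text=True,
)
print(result.returncode)
print(result.stdout.strip())
```

Unlike os.fork, this approach does not depend on Unix-style process copying, so the same code is expected to run on Windows as well.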

6. Forking Processes

➢ Forked processes are a traditional way to structure parallel tasks, and they are a
fundamental part of the Unix tool set.
➢ Forking is a straightforward way to start an independent program, whether it is different
from the calling program or not.
➢ Forking is based on the notion of copying programs: when a program calls the fork routine,
the operating system makes a new copy of that program and its process in memory and
starts running that copy in parallel with the original.
➢ Some systems don’t really copy the original program (it’s an expensive operation), but the
new copy works as if it were a literal copy.
➢ After a fork operation, the original copy of the program is called the parent process, and
the copy created by os.fork is called the child process.
➢ In general, parents can make any number of children, and children can create child
processes of their own; all forked processes run independently and in parallel under the
operating system’s control, and children may continue to run after their parent exits.
➢ This is probably simpler in practice than in theory, though.
➢ The Python script in Example 5-1 forks new child processes until you type the letter q at
the console.

Example 5-1. PP4E\System\Processes\fork1.py


"forks child processes until you type 'q'"
import os

def child():
    print('Hello from child', os.getpid())
    os._exit(0)                                  # else goes back to parent loop

def parent():
    while True:
        newpid = os.fork()
        if newpid == 0:
            child()
        else:
            print('Hello from parent', os.getpid(), newpid)
        if input() == 'q': break

parent()
• Python’s process forking tools, available in the os module, are simply thin wrappers over
standard forking calls in the system library also used by C language programs.
• To start a new, parallel process, call the os.fork built-in function.
• Because this function generates a copy of the calling program, it returns a different value
in each copy: zero in the child process and the process ID of the new child in the parent.
• Programs generally test this result to begin different processing in the child only; this script,
for instance, runs the child function in child processes only.
• Because forking is ingrained in the Unix programming model, this script works well on
Unix, Linux, and modern Macs. Unfortunately, this script won’t work on the standard
version of Python for Windows today, because fork is too much at odds with the Windows
model.
• Python scripts can always spawn threads on Windows, and the multiprocessing module
described later in this chapter provides an alternative for running processes portably, which
can obviate the need for process forks on Windows in contexts that conform to its
constraints (albeit at some potential cost in low-level control).
• The script in Example 5-1 does work on Windows, however, if you use the Python shipped
with the Cygwin system (or build one of your own from source-code with Cygwin’s
libraries). Cygwin is a free, open source system that provides full Unix-like functionality
for Windows (and is described further in More on Cygwin Python for Windows).
• You can fork with Python on Windows under Cygwin, even though its behavior is not
exactly the same as true Unix forks. Because it’s close enough for this book’s examples,
though, let’s use it to run our script live:

[C:\...\PP4E\System\Processes]$ python fork1.py


Hello from parent 7296 7920
Hello from child 7920

Hello from parent 7296 3988


Hello from child 3988

Hello from parent 7296 6796


Hello from child 6796
q

These messages represent three forked child processes; the unique identifiers of all the
processes involved are fetched and displayed with the os.getpid call. A subtle point:
the child process function is also careful to exit explicitly with an os._exit call. We’ll
discuss this call in more detail later in this chapter, but if it’s not made, the child process
would live on after the child function returns (remember, it’s just a copy of the original
process).
The net effect is that the child would go back to the loop in parent and start forking children
of its own (i.e., the parent would have grandchildren). If you delete the exit call and rerun,
you’ll likely have to type more than one q to stop, because multiple processes are running
in the parent function.
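
The return-value convention of os.fork and the explicit os._exit call can also be sketched without console input. This is a minimal, Unix-only illustration; the helper name fork_once and the exit code 7 are hypothetical, and os.waitpid and os.WEXITSTATUS are used here only so that the parent can collect the child's exit status.

```python
import os

def fork_once(exit_code):
    "fork one child that exits at once; return (pid, waited pid, exit status)"
    pid = os.fork()
    if pid == 0:
        # In the child: fork returned 0. Exit explicitly so the child
        # does not fall through into the parent's code below.
        os._exit(exit_code)
    # In the parent: fork returned the new child's process ID.
    waited_pid, status = os.waitpid(pid, 0)   # wait for the child to finish
    return pid, waited_pid, os.WEXITSTATUS(status)

child_pid, waited_pid, code = fork_once(7)
print(child_pid == waited_pid, code)
```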
In Example 5-1, each process exits very soon after it starts, so there’s little overlap in time.
Let’s do something slightly more sophisticated to better illustrate multiple forked processes
running in parallel. Example 5-2 starts up 5 copies of itself, each copy counting up to 5
with a one-second delay between iterations.
The time.sleep standard library call simply pauses the calling process for a number of
seconds (you can pass a floating-point value to pause for fractions of seconds).

Example 5-2. PP4E\System\Processes\fork-count.py


"""
fork basics: start 5 copies of this program running in parallel with
the original; each copy counts up to 5 on the same stdout stream--forks
copy process memory, including file descriptors; fork doesn't currently
work on Windows without Cygwin: use os.spawnv or multiprocessing on
Windows instead; spawnv is roughly like a fork+exec combination;
"""
import os, time

def counter(count):                                    # run in new process
    for i in range(count):
        time.sleep(1)                                  # simulate real work
        print('[%s] => %s' % (os.getpid(), i))

for i in range(5):
    pid = os.fork()
    if pid != 0:
        print('Process %d spawned' % pid)              # in parent: continue
    else:
        counter(5)                                     # else in child/new process
        os._exit(0)                                    # run function and exit

print('Main process exiting.')                         # parent need not wait

When run, this script starts 5 processes immediately and exits. All 5 forked processes check in
with their first count display one second later and every second thereafter. Notice that child
processes continue to run, even if the parent process that created them terminates:
[C:\...\PP4E\System\Processes]$ python fork-count.py
Process 4556 spawned
Process 3724 spawned
Process 6360 spawned
Process 6476 spawned
Process 6684 spawned
Main process exiting.
[4556] => 0
[3724] => 0
[6360] => 0
[6476] => 0
[6684] => 0
[4556] => 1
[3724] => 1
[6360] => 1
[6476] => 1
[6684] => 1
[4556] => 2
[3724] => 2
[6360] => 2
[6476] => 2
[6684] => 2
• The output of all of these processes shows up on the same screen, because all of them share
the standard output stream (and a system prompt may show up along the way, too).
• Technically, a forked process gets a copy of the original process’s global memory,
including open file descriptors.
• Because of that, global objects like files start out with the same values in a child process,
so all the processes here are tied to the same single stream.
• But it’s important to remember that global memory is copied, not shared; if a child process
changes a global object, it changes only its own copy. (As we’ll see, this works differently
in threads, the topic of the next section.)
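
The "copied, not shared" behavior can be checked with a short sketch. It is Unix-only and illustrative: the state dictionary and the helper fork_and_modify are hypothetical, and a pipe (os.pipe) is used only so the child can report the value it sees back to the parent.

```python
import os

state = {'count': 0}          # global object, copied into the child by fork

def fork_and_modify():
    "return (value seen by parent, value seen by child) after child changes it"
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: change its own copy and report the new value via the pipe.
        os.close(read_fd)
        state['count'] += 100
        os.write(write_fd, str(state['count']).encode())
        os._exit(0)
    # Parent: read the child's report, then check its own, unchanged copy.
    os.close(write_fd)
    child_value = int(os.read(read_fd, 32).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return state['count'], child_value

parent_value, child_value = fork_and_modify()
print(parent_value, child_value)
```

The child sees its own updated copy, while the parent's copy keeps its original value.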

7. The Fork/Exec Combination

In Examples 5-1 and 5-2, child processes simply ran a function within the Python program
and then exited. On Unix-like platforms, forks are often the basis of starting independently
running programs that are completely different from the program that performed
the fork call.
For instance, Example 5-3 forks new processes until we type q again, but child processes
run a brand-new program instead of calling a function in the same file.
Example 5-3. PP4E\System\Processes\fork-exec.py
"starts programs until you type 'q'"
import os

parm = 0
while True:
    parm += 1
    pid = os.fork()
    if pid == 0:                                             # copy process
        os.execlp('python', 'python', 'child.py', str(parm)) # overlay program
        assert False, 'error starting program'               # shouldn't return
    else:
        print('Child is', pid)
        if input() == 'q': break

• If you’ve done much Unix development, the fork/exec combination will probably look
familiar. The main thing to notice is the os.execlp call in this code. In a nutshell, this call
replaces (overlays) the program running in the current process with a brand new program.
• Because of that, the combination of os.fork and os.execlp means start a new process and
run a new program in that process—in other words, launch a new program in parallel with
the original program.
os.exec call formats
• The arguments to os.execlp specify the program to be run by giving command-line
arguments used to start the program (i.e., what Python scripts know as sys.argv).
• If successful, the new program begins running and the call to os.execlp itself never returns
(since the original program has been replaced, there’s really nothing to return to).
• If the call does return, an error has occurred, so we code an assert after it that will always
raise an exception if reached.
• There are a handful of os.exec variants in the Python standard library; some allow us to
configure environment variables for the new program, pass command-line arguments in
different forms, and so on.
• All are available on both Unix and Windows, and they replace the calling program (i.e.,
the Python interpreter). exec comes in eight flavors, which can be a bit confusing unless
you generalize:
os.execv(program, commandlinesequence)
• The basic “v” exec form is passed an executable program’s name, along with a list or tuple
of command-line argument strings used to run the executable (that is, the words you would
normally type in a shell to start a program).
os.execl(program, cmdarg1, cmdarg2,... cmdargN)
• The basic “l” exec form is passed an executable’s name, followed by one or more
command-line arguments passed as individual function arguments. This is the same
as os.execv(program, (cmdarg1, cmdarg2,...)).

os.execlp
os.execvp

Adding the letter p to the execv and execl names means that Python will locate the executable’s
directory using your system search-path setting (i.e., PATH).

os.execle
os.execve

Adding a letter e to the execv and execl names means an extra, last argument is a dictionary
containing shell environment variables to send to the program.

os.execvpe
os.execlpe

Adding the letters p and e to the basic exec names means to use the search path and to accept a
shell environment settings dictionary.
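
As a hedged sketch of the "e" variants, the following forks a child and overlays it with a new interpreter through os.execve, passing an environment dictionary as the last argument. It is Unix-only; the variable name DEMO_VALUE and the helper run_with_env are hypothetical, and the child simply exits with the number it finds in its environment so the parent can observe it.

```python
import os
import sys

def run_with_env(value):
    "fork, then exec a new interpreter whose environment holds DEMO_VALUE"
    pid = os.fork()
    if pid == 0:
        try:
            # "v": arguments as a list; "e": last argument is the environment.
            env = dict(os.environ, DEMO_VALUE=str(value))
            code = 'import os, sys; sys.exit(int(os.environ["DEMO_VALUE"]))'
            os.execve(sys.executable, [sys.executable, '-c', code], env)
        finally:
            os._exit(127)      # reached only if the exec call failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

print(run_with_env(5))
```

The full path in sys.executable is used here, so no "p" search-path lookup is needed.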

• So when the script in Example 5-3 calls os.execlp, individually passed parameters specify
a command line for the program to be run on, and the word python maps to an executable
file according to the underlying system search-path setting environment variable (PATH).
• It’s as if we were running a command of the form python child.py 1 in a shell, but with a
different command-line argument on the end each time.
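
The mapping from individually passed exec arguments to the new program's sys.argv can be sketched in a self-contained way. This Unix-only example is an assumption-laden variation, not the book's script: instead of child.py it runs an inline program with -c, uses os.execl with the full interpreter path from sys.executable, and has the child exit with the number it received so the parent can check it. The helper name launch is hypothetical.

```python
import os
import sys

def launch(arg):
    "fork a child, overlay it with a new Python program, return its exit status"
    pid = os.fork()
    if pid == 0:
        try:
            # Equivalent to typing:  <python> -c <code> <arg>  in a shell.
            # The inline program exits with the number it finds in sys.argv[1].
            code = 'import sys; sys.exit(int(sys.argv[1]))'
            os.execl(sys.executable, sys.executable, '-c', code, str(arg))
        finally:
            os._exit(127)      # reached only if the exec call failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

statuses = [launch(n) for n in (1, 2, 3)]
print(statuses)
```

Each call launches a new program with a different trailing command-line argument, much as the loop in Example 5-3 does with its parm counter.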
Spawned child program

Just as when typed at a shell, the string of arguments passed to os.execlp by the fork-exec script
in Example 5-3 starts another Python program file, as shown in Example 5-4.
Example 5-4. PP4E\System\Processes\child.py
import os, sys
print('Hello from child', os.getpid(), sys.argv[1])
• Here is this code in action on Linux. It doesn’t look much different from the
original fork1.py, but it’s really running a new program in each forked process.
• More observant readers may notice that the child process ID displayed is the same in the
parent program and the launched child.py program; os.execlp simply overlays a program
in the same process:
[C:\...\PP4E\System\Processes]$ python fork-exec.py
Child is 4556
Hello from child 4556 1

Child is 5920
Hello from child 5920 2

Child is 316
Hello from child 316 3
q
