Unit 3
Unit 3
UNIT – 3
Week - 7
When a module is imported within a Python file, the interpreter first searches for the specified
module among its built-in modules. If not found it looks through the list of directories defined
by sys.path.
importsys
print(sys.path)
Output:
• Python3
import sys
Output:
ModuleNotFoundError: No module named 'pandas'
sys.modules return the name of the Python modules that the current shell has imported.
Example:
• Python3
import sys
print(sys.modules)
Output:
2. Reference Count
The sys.getrefcount() method is used to get the reference count for any given object. This value
is used by Python as when this value becomes 0, the memory for that particular value is deleted.
Example:
Python3
importsys
a ='Geeks'
print(sys.getrefcount(a))
Output
Function Description
Parameters:
topdown: If this optional argument is True then the directories are scanned from top-down
otherwise from bottom-up. This is True by default.
Return Type: For each directory in the tree rooted at directory top (including top itself), it
yields a 3-tuple (dirpath, dirnames, filenames).
We want to list out all the subdirectories and file inside the directory Tree. Below is the
implementation.
import os
Output:
The above code can be shortened using List Comprehension which is a more Pythonic way.
Below is the implementation.
import os
Output:
➢ Most computers spend a lot of time doing nothing. If you start a system monitor tool and
watch the CPU utilization, it’s rare to see one hit 100 percent, even when you are running
multiple programs.
➢ There are just too many delays built into software: disk accesses, network traffic, database
queries, waiting for users to click a button, and so on.
➢ In fact, the majority of a modern CPU’s capacity is often spent in an idle state; faster chips
help speed up performance demand peaks, but much of their power can go largely unused.
➢ Early on in computing, programmers realized that they could tap into such unused
processing power by running more than one program at the same time.
➢ By dividing the CPU’s attention among a set of tasks, its capacity need not go to waste
while any given task is waiting for an external event to occur.
➢ The technique is usually called parallel processing (and sometimes “multiprocessing” or
even “multitasking”) because many tasks seem to be performed at once, overlapping and
parallel in time.
➢ It’s at the heart of modern operating systems, and it gave rise to the notion of multiple-
active-window computer interfaces we’ve all come to take for granted.
➢ Even within a single program, dividing processing into tasks that run in parallel can make
the overall system faster, at least as measured by the clock on your wall.
➢ Just as important is that modern software systems are expected to be responsive to users
regardless of the amount of work they must perform behind the scenes.
➢ It’s usually unacceptable for a program to stall while busy carrying out a request. Consider
an email-browser user interface, for example; when asked to fetch email from a server, the
program must download text from a server over a network.
➢ If you have enough email or a slow enough Internet link, that step alone can take minutes
to finish. But while the download task proceeds, the program as a whole shouldn’t stall—
it still must respond to screen redraws, mouse clicks, and so on.
➢ Parallel processing comes to the rescue here, too. By performing such long-running tasks
in parallel with the rest of the program, the system at large can remain responsive no matter
how busy some of its parts may be.
➢ Moreover, the parallel processing model is a natural fit for structuring such programs and
others; some tasks are more easily conceptualized and coded as components running as
independent, parallel entities.
➢ There are two fundamental ways to get tasks running at the same time in Python—process
forks and spawned threads. Functionally, both rely on underlying operating system
services to run bits of Python code in parallel.
➢ Procedurally, they are very different in terms of interface, portability, and communication.
For instance, at this writing direct process forks are not supported on Windows under
standard Python (though they are under Cygwin Python on Windows).
➢ By contrast, Python’s thread support works on all major platforms.
➢ Moreover, the os.spawn family of calls provides additional ways to launch programs in a
platform-neutral way that is similar to forks, and the os.popen and os.system calls
and subprocess module.
6. Forking Processes
➢ Forked processes are a traditional way to structure parallel tasks, and they are a
fundamental part of the Unix tool set.
➢ Forking is a straightforward way to start an independent program, whether it is different
from the calling program or not.
➢ Forking is based on the notion of copying programs: when a program calls the fork routine,
the operating system makes a new copy of that program and its process in memory and
starts running that copy in parallel with the original.
➢ Some systems don’t really copy the original program (it’s an expensive operation), but the
new copy works as if it were a literal copy.
➢ After a fork operation, the original copy of the program is called the parent process, and
the copy created by os.fork is called the child process.
➢ In general, parents can make any number of children, and children can create child
processes of their own; all forked processes run independently and in parallel under the
operating system’s control, and children may continue to run after their parent exits.
➢ This is probably simpler in practice than in theory, though.
➢ The Python script in Example 5-1 forks new child processes until you type the letter q at
the console.
These messages represent three forked child processes; the unique identifiers of all the
processes involved are fetched and displayed with the os.getpid call. A subtle point:
the child process function is also careful to exit explicitly with an os._exit call. We’ll
discuss this call in more detail later in this chapter, but if it’s not made, the child process
would live on after the child function returns (remember, it’s just a copy of the original
process).
The net effect is that the child would go back to the loop in parent and start forking children
of its own (i.e., the parent would have grandchildren). If you delete the exit call and rerun,
you’ll likely have to type more than one q to stop, because multiple processes are running
in the parent function.
In Example 5-1, each process exits very soon after it starts, so there’s little overlap in time.
Let’s do something slightly more sophisticated to better illustrate multiple forked processes
running in parallel. Example 5-2 starts up 5 copies of itself, each copy counting up to 5
with a one-second delay between iterations.
The time.sleep standard library call simply pauses the calling process for a number of
seconds (you can pass a floating-point value to pause for fractions of seconds).
for i in range(5):
pid = os.fork()
if pid != 0:
print('Process %d spawned' % pid) # in parent: continue
else:
counter(5) # else in child/new process
os._exit(0) # run function and exit
print('Main process exiting.') # parent need not wait
When run, this script starts 5 processes immediately and exits. All 5 forked processes check in
with their first count display one second later and every second thereafter. Notice that child
processes continue to run, even if the parent process that created them terminates:
[C:\...\PP4E\System\Processes]$ python fork-count.py
Process 4556 spawned
Process 3724 spawned
Process 6360 spawned
Process 6476 spawned
Process 6684 spawned
Main process exiting.
[4556] => 0
[3724] => 0
[6360] => 0
[6476] => 0
[6684] => 0
[4556] => 1
[3724] => 1
[6360] => 1
[6476] => 1
[6684] => 1
[4556] => 2
[3724] => 2
[6360] => 2
[6476] => 2
[6684] => 2
• The output of all of these processes shows up on the same screen, because all of them share
the standard output stream (and a system prompt may show up along the way, too).
• Technically, a forked process gets a copy of the original process’s global memory,
including open file descriptors.
• Because of that, global objects like files start out with the same values in a child process,
so all the processes here are tied to the same single stream.
• But it’s important to remember that global memory is copied, not shared; if a child process
changes a global object, it changes only its own copy. (As we’ll see, this works differently
in threads, the topic of the next section.)
In Examples 5-1 and 5-2, child processes simply ran a function within the Python program
and then exited. On Unix-like platforms, forks are often the basis of starting independently
running programs that are completely different from the program that performed
the fork call.
For instance, Example 5-3 forks new processes until we type q again, but child processes
run a brand-new program instead of calling a function in the same file.
Example 5-3. PP4E\System\Processes\fork-exec.py
"starts programs until you type 'q'"
import os
parm = 0
while True:
parm += 1
pid = os.fork()
if pid == 0: # copy process
os.execlp('python', 'python', 'child.py', str(parm)) # overlay program
assert False, 'error starting program' # shouldn't return
else:
print('Child is', pid)
if input() == 'q': break
• If you’ve done much Unix development, the fork/exec combination will probably look
familiar. The main thing to notice is the os.execlp call in this code. In a nutshell, this call
replaces (overlays) the program running in the current process with a brand new program.
• Because of that, the combination of os.fork and os.execlp means start a new process and
run a new program in that process—in other words, launch a new program in parallel with
the original program.
• os.exec call formats
• The arguments to os.execlp specify the program to be run by giving command-line
arguments used to start the program (i.e., what Python scripts know as sys.argv).
• If successful, the new program begins running and the call to os.execlp itself never returns
(since the original program has been replaced, there’s really nothing to return to).
• If the call does return, an error has occurred, so we code an assert after it that will always
raise an exception if reached.
• There are a handful of os.exec variants in the Python standard library; some allow us to
configure environment variables for the new program, pass command-line arguments in
different forms, and so on.
• All are available on both Unix and Windows, and they replace the calling program (i.e.,
the Python interpreter). exec comes in eight flavors, which can be a bit confusing unless
you generalize:
os.execv(program, commandlinesequence)
• The basic “v” exec form is passed an executable program’s name, along with a list or tuple
of command-line argument strings used to run the executable (that is, the words you would
normally type in a shell to start a program).
os.execl(program, cmdarg1, cmdarg2,... cmdargN)
• The basic “l” exec form is passed an executable’s name, followed by one or more
command-line arguments passed as individual function arguments. This is the same
as os.execv(program, (cmdarg1, cmdarg2,...)).
os.execlp
os.execvp
Adding the letter p to the execv and execl names means that Python will locate the executable’s
directory using your system search-path setting (i.e., PATH).
os.execle
os.execve
Adding a letter e to the execv and execl names means an extra, last argument is a dictionary
containing shell environment variables to send to the program.
os.execvpe
os.execlpe
Adding the letters p and e to the basic exec names means to use the search path and to accept a
shell environment settings dictionary.
• So when the script in Example 5-3 calls os.execlp, individually passed parameters specify
a command line for the program to be run on, and the word python maps to an executable
file according to the underlying system search-path setting environment variable (PATH).
• It’s as if we were running a command of the form python child.py 1 in a shell, but with a
different command-line argument on the end each time.
• Spawned child program
•
Just as when typed at a shell, the string of arguments passed to os.execlp by the fork-exec script
in Example 5-3 starts another Python program file, as shown in Example 5-4.
Example 5-4. PP4E\System\Processes\child.py
import os, sys
print('Hello from child', os.getpid(), sys.argv[1])
• Here is this code in action on Linux. It doesn’t look much different from the
original fork1.py, but it’s really running a new program in each forked process.
• More observant readers may notice that the child process ID displayed is the same in the
parent program and the launched child.py program; os.execlp simply overlays a program
in the same process:
[C:\...\PP4E\System\Processes]$ python fork-exec.py
Child is 4556
Hello from child 4556 1
Child is 5920
Hello from child 5920 2
Child is 316
Hello from child 316 3
q