Module 4
Organizing Files:
• The shutil Module, Walking a Directory Tree, Compressing Files with
the zipfile Module
• Project: Renaming Files with American-Style Dates to European-Style
Dates
• Project: Backing Up a Folder into a ZIP File
Debugging:
• Raising Exceptions, Getting the Traceback as a String, Assertions,
Logging, IDLE’s Debugger.
The shutil Module:
• The shutil (or shell utilities) module has functions to copy, move, rename, and delete
files in Python programs.
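• After import shutil, calling shutil.copy(source, destination) copies the file at the path source to the folder at the path destination. If destination is a filename, it is used as the new name of the copied file.
• This function returns a string of the path of the copied file.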
>>> import shutil, os
>>> os.chdir('C:\\')
>>> shutil.copy('C:\\spam.txt', 'C:\\delicious')
'C:\\delicious\\spam.txt'
>>> shutil.copy('eggs.txt', 'C:\\delicious\\eggs2.txt')
'C:\\delicious\\eggs2.txt'
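A minimal sketch of both forms of shutil.copy(), run inside a temporary folder so that it works on any operating system. The file and folder names are illustrative only, not part of the example above.

```python
import os
import shutil
import tempfile

# Work inside a throwaway folder so nothing on the real disk is touched.
with tempfile.TemporaryDirectory() as base:
    source = os.path.join(base, 'spam.txt')        # hypothetical file
    dest_folder = os.path.join(base, 'delicious')  # hypothetical folder
    os.mkdir(dest_folder)
    with open(source, 'w') as f:
        f.write('Hello, world!')

    # Destination is a folder: the copy keeps the original filename.
    copied = shutil.copy(source, dest_folder)

    # Destination includes a filename: the copy is saved under that name.
    renamed = shutil.copy(source, os.path.join(dest_folder, 'spam2.txt'))

    folder_contents = sorted(os.listdir(dest_folder))

print(os.path.basename(copied))   # spam.txt
print(os.path.basename(renamed))  # spam2.txt
print(folder_contents)            # ['spam.txt', 'spam2.txt']
```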
shutil.copytree(source, destination):
Copies the folder at the path source, along with all of its files and subfolders, to the folder at the path destination, and returns a string of the path of the copied folder.
>>> os.chdir('C:\\')
>>> shutil.copytree('C:\\bacon', 'C:\\bacon_backup')
'C:\\bacon_backup'
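The same idea can be tried safely in a temporary folder. This is only a sketch; the nested folder and file names are made up for illustration.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as base:
    original = os.path.join(base, 'bacon')
    os.makedirs(os.path.join(original, 'nested'))
    with open(os.path.join(original, 'nested', 'note.txt'), 'w') as f:
        f.write('backup me')

    # The destination folder must not exist yet; copytree() creates it.
    backup = shutil.copytree(original, os.path.join(base, 'bacon_backup'))

    # The nested file is present in the copy, and the original is untouched.
    with open(os.path.join(backup, 'nested', 'note.txt')) as f:
        backup_text = f.read()
    original_still_exists = os.path.isdir(original)

print(os.path.basename(backup), backup_text, original_still_exists)
```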
Moving and Renaming Files and Folders:
shutil.move(source, destination)
will move the file or folder at the path source to the path destination and will return a string of the
absolute path of the new location.
• If destination points to a folder, the source file gets moved into destination and keeps its current
filename.
• The folders that make up the destination must already exist, or else Python will raise an exception.
• shutil.rmtree(path) will remove the folder at path, and all files and folders it contains will also be permanently deleted.
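The behaviour of shutil.move() and shutil.rmtree() described above can be sketched as follows. Everything happens in a temporary folder, and the file and folder names are hypothetical.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as base:
    eggs = os.path.join(base, 'eggs')
    os.mkdir(eggs)
    bacon = os.path.join(base, 'bacon.txt')
    with open(bacon, 'w') as f:
        f.write('sizzle')

    # Destination is an existing folder: the file moves in, name unchanged.
    moved = shutil.move(bacon, eggs)

    # Destination names a file: the file is moved and renamed in one step.
    renamed = shutil.move(moved, os.path.join(eggs, 'new_bacon.txt'))

    # A folder in the destination path does not exist: an exception is raised.
    try:
        shutil.move(renamed, os.path.join(base, 'no_such_folder', 'ham.txt'))
        move_failed = False
    except OSError:
        move_failed = True

    # rmtree() deletes the folder and everything inside it, permanently.
    shutil.rmtree(eggs)
    eggs_exists = os.path.exists(eggs)

print(os.path.basename(moved), os.path.basename(renamed), move_failed, eggs_exists)
```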
• The following Python program was intended to delete files that have the .txt file extension, but a typo ('.rxt' instead of '.txt') causes it to delete .rxt files instead:
import os
for filename in os.listdir():
    if filename.endswith('.rxt'):
        os.unlink(filename)
• If you had any important files ending with .rxt, they would have been accidentally and permanently deleted. Instead, you should first run the program with the os.unlink() call commented out and a print() call added, so that it only shows which files would be deleted:
import os
for filename in os.listdir():
    if filename.endswith('.rxt'):
        #os.unlink(filename)
        print(filename)
Safe Deletes with the send2trash Module:
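• The third-party send2trash module’s send2trash() function sends files and folders to the computer’s trash or recycle bin instead of permanently deleting them.
• Note: The send2trash() function can only send files to the recycle bin; it cannot pull files out of it.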
Walking a Directory Tree:
• Say you want to rename every file in some folder and also every file in every subfolder of that folder. To do this, you need to walk through the directory tree, touching each file as you go.
import os
for folderName, subfolders, filenames in os.walk('C:\\delicious'):
    print('The current folder is ' + folderName)
    for subfolder in subfolders:
        print('SUBFOLDER OF ' + folderName + ': ' + subfolder)
    for filename in filenames:
        print('FILE INSIDE ' + folderName + ': ' + filename)
    print('')
• The os.walk() function is passed a single string value: the path of a folder.
You can use os.walk() in a for loop statement to walk a directory tree, much
like how you can use the range() function to walk over a range of numbers.
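• Unlike range(), the os.walk() function will return three values on each iteration through the loop:
1. A string of the current folder’s name
2. A list of strings of the folders in the current folder
3. A list of strings of the files in the current folder
• Here, the current folder means the folder for the current iteration of the for loop; os.walk() does not change the program’s current working directory.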
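To see the three values concretely, the sketch below builds a tiny folder tree in a temporary location and records what os.walk() yields for each folder. The folder and file names are illustrative assumptions.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a small tree: root/spam.txt and root/cats/zophie.jpg
    os.makedirs(os.path.join(root, 'cats'))
    for path in ('spam.txt', os.path.join('cats', 'zophie.jpg')):
        with open(os.path.join(root, path), 'w') as f:
            f.write('')

    visited = []
    for folder_name, subfolders, filenames in os.walk(root):
        # Store the folder relative to root so the result is easy to read.
        rel = os.path.relpath(folder_name, root)
        visited.append((rel, sorted(subfolders), sorted(filenames)))

for entry in visited:
    print(entry)
# ('.', ['cats'], ['spam.txt'])
# ('cats', [], ['zophie.jpg'])
```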
Compressing Files with the zipfile Module:
• Python programs can both create and open (or extract) ZIP files using
functions in the zipfile module.
>>> import zipfile, os
>>> os.chdir('C:\\') # move to the folder with example.zip
>>> exampleZip = zipfile.ZipFile('example.zip')
>>> exampleZip.namelist()
['spam.txt', 'cats/', 'cats/catnames.txt', 'cats/zophie.jpg']
>>> spamInfo = exampleZip.getinfo('spam.txt')
>>> spamInfo.file_size
13908
>>> spamInfo.compress_size
3828
>>> 'Compressed file is %sx smaller!' % (round(spamInfo.file_size / spamInfo.compress_size, 2))
'Compressed file is 3.63x smaller!'
>>> exampleZip.close()
Extracting from ZIP Files:
• The extractall() method for ZipFile objects extracts all the files and folders from a
ZIP file into the current working directory.
• If the current working directory is C:\, calling exampleZip.extractall() extracts the contents of example.zip to C:\. Optionally, you can pass a folder name to extractall() to have it extract the files into a folder other than the current working directory.
• If the folder passed to the extractall() method does not exist, it will be created.
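A minimal sketch of extractall() with a folder argument. Because no example.zip is assumed to exist on disk, the sketch first creates a small archive in a temporary folder; the archive contents and the name of the target folder are made up.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as base:
    # Create a small ZIP file to extract from.
    zip_path = os.path.join(base, 'example.zip')
    with zipfile.ZipFile(zip_path, 'w') as z:
        z.writestr('spam.txt', 'spam spam spam')
        z.writestr('cats/catnames.txt', 'Zophie')

    target = os.path.join(base, 'extracted')  # this folder does not exist yet
    with zipfile.ZipFile(zip_path) as z:
        z.extractall(target)                  # the folder is created as needed

    # Collect the extracted files, relative to the target folder.
    extracted_files = []
    for folder, _subfolders, files in os.walk(target):
        for name in files:
            full_path = os.path.join(folder, name)
            extracted_files.append(os.path.relpath(full_path, target))
    extracted_files.sort()

print(extracted_files)
```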
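The extract() method for ZipFile objects will extract a single file from the ZIP file.
>>> exampleZip.extract('spam.txt')
'C:\\spam.txt'
• Optionally, you can pass a second argument to extract() to extract the file into a folder other than the current working directory. If this second argument is a folder that doesn’t yet exist, Python will create the folder.
• The value that extract() returns is the path to which the file was extracted (an absolute path when extracting into the current working directory).
>>> exampleZip.close()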
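The two-argument form of extract() might be used as in the sketch below. The archive is created on the fly in a temporary folder, and the folder names are hypothetical.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as base:
    zip_path = os.path.join(base, 'example.zip')
    with zipfile.ZipFile(zip_path, 'w') as z:
        z.writestr('spam.txt', 'spam')
        z.writestr('eggs.txt', 'eggs')

    # None of these folders exist yet; extract() creates them.
    folder = os.path.join(base, 'some', 'new', 'folders')
    with zipfile.ZipFile(zip_path) as z:
        extracted_path = z.extract('spam.txt', folder)

    # Only the requested file was extracted, not the whole archive.
    folder_contents = os.listdir(folder)

print(os.path.basename(extracted_path), folder_contents)
```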
Creating and Adding to ZIP Files:
• To create your own compressed ZIP files, you must open the ZipFile object in write mode by passing 'w' as the second argument.
• When you pass a path to the write() method of a ZipFile object, Python will compress the file at that path and add it into the ZIP file.
• The write() method’s first argument is a string of the filename to add. The compress_type keyword argument tells the computer what algorithm it should use to compress the files; you can simply set this value to zipfile.ZIP_DEFLATED. (This specifies the deflate compression algorithm, which works well on most types of data.)
>>> import zipfile
>>> newZip = zipfile.ZipFile('new.zip', 'w')
>>> newZip.write('spam.txt', compress_type=zipfile.ZIP_DEFLATED)
>>> newZip.close()
• This code will create a new ZIP file named new.zip that has the compressed contents of spam.txt.
• Just as with writing to files, write mode will erase all existing contents of a ZIP file.
• To simply add files to an existing ZIP file, pass 'a' as the second argument to zipfile.ZipFile() to open the ZIP file in append mode.
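The difference between write mode and append mode can be checked with a short sketch. It works in a temporary folder, and the three text files are created only for the demonstration.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as base:
    # Create three small files to add to the archive.
    paths = {}
    for name in ('spam.txt', 'eggs.txt', 'bacon.txt'):
        paths[name] = os.path.join(base, name)
        with open(paths[name], 'w') as f:
            f.write(name * 100)

    zip_path = os.path.join(base, 'new.zip')

    with zipfile.ZipFile(zip_path, 'w') as z:
        z.write(paths['spam.txt'], arcname='spam.txt',
                compress_type=zipfile.ZIP_DEFLATED)

    # Opening in 'w' mode again erases the archive's previous contents.
    with zipfile.ZipFile(zip_path, 'w') as z:
        z.write(paths['eggs.txt'], arcname='eggs.txt',
                compress_type=zipfile.ZIP_DEFLATED)
    with zipfile.ZipFile(zip_path) as z:
        after_rewrite = z.namelist()

    # Opening in 'a' mode keeps the existing contents and adds to them.
    with zipfile.ZipFile(zip_path, 'a') as z:
        z.write(paths['bacon.txt'], arcname='bacon.txt',
                compress_type=zipfile.ZIP_DEFLATED)
    with zipfile.ZipFile(zip_path) as z:
        after_append = z.namelist()

print(after_rewrite)  # ['eggs.txt']
print(after_append)   # ['eggs.txt', 'bacon.txt']
```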
Debugging
Raising Exceptions:
Python raises an exception whenever it tries to execute invalid code. You can also raise your own exceptions in your code with a raise statement.
• The boxPrint() function below takes a character, a width, and a height, and uses the character to make a little picture of a box with that width and height. This box shape is printed to the console.
• Say we want the character to be a single character, and the width and height to be greater than 2. We add if statements to raise exceptions if these requirements aren’t satisfied.
• This program uses the except Exception as err form of the except statement. If an Exception object is raised by boxPrint(), this except statement will store it in a variable named err.
def boxPrint(symbol, width, height):
    if len(symbol) != 1:
        raise Exception('Symbol must be a single character string.')
    if width <= 2:
        raise Exception('Width must be greater than 2.')
    if height <= 2:
        raise Exception('Height must be greater than 2.')
    print(symbol * width)
    for i in range(height - 2):
        print(symbol + (' ' * (width - 2)) + symbol)
    print(symbol * width)

for sym, w, h in (('*', 4, 4), ('O', 20, 5), ('x', 1, 3), ('ZZ', 3, 3)):
    try:
        boxPrint(sym, w, h)
    except Exception as err:
        print('An exception happened: ' + str(err))
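When you run boxPrint.py, the output will look like this:
****
*  *
*  *
****
OOOOOOOOOOOOOOOOOOOO
O                  O
O                  O
O                  O
OOOOOOOOOOOOOOOOOOOO
An exception happened: Width must be greater than 2.
An exception happened: Symbol must be a single character string.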
Getting the Traceback as a String:
• When Python encounters an error, it produces a treasure trove of error
information called the traceback.
• The traceback includes the error message, the line number of the line that
caused the error, and the sequence of the function calls that led to the
error.
• This sequence of calls is called the call stack.
• For example, save the following program as errorExample.py:
def spam():
    bacon()

def bacon():
    raise Exception('This is the error message.')

spam()
When you run errorExample.py, the output will look like this:
Traceback (most recent call last):
  File "errorExample.py", line 7, in <module>
    spam()
  File "errorExample.py", line 2, in spam
    bacon()
  File "errorExample.py", line 5, in bacon
    raise Exception('This is the error message.')
Exception: This is the error message.
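The traceback is displayed by Python whenever a raised exception goes unhandled, but you can also obtain it as a string by calling traceback.format_exc(). For example, instead of crashing your program right when an exception occurs, you can write the traceback information to a log file and keep your program running. You can look at the log file later, when you are ready to debug the program.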
import traceback
try:
    raise Exception('This is the error message.')
except Exception:
    errorFile = open('errorInfo.txt', 'w')
    errorFile.write(traceback.format_exc())  # write the stack trace as a string
    errorFile.close()
    print('The traceback info was written to errorInfo.txt.')
Output:
116
The traceback info was written to errorInfo.txt.
• When this code is entered in the interactive shell, the 116 is the return value from the write() method, since 116 characters were written to the file.
• The traceback text was written to errorInfo.txt.
Assertions:
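• An assertion is a sanity check to make sure your code isn’t doing something obviously wrong. These sanity checks are performed by assert statements. If the sanity check fails, an AssertionError exception is raised.
• An assert statement consists of the assert keyword, a condition, a comma, and a string to display when the condition is False.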
>>> podBayDoorStatus = 'open'
>>> assert podBayDoorStatus == 'open', 'The pod bay doors need to be "open".'
>>> podBayDoorStatus = 'I\'m sorry, Dave. I\'m afraid I can\'t do that.'
>>> assert podBayDoorStatus == 'open', 'The pod bay doors need to be "open".'
Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    assert podBayDoorStatus == 'open', 'The pod bay doors need to be "open".'
AssertionError: The pod bay doors need to be "open".
Using an Assertion in a Traffic Light Simulation
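• The stoplights at an intersection are represented by a dictionary with one key for each direction of traffic. The values at these keys will be one of the strings 'green', 'yellow', or 'red'.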
• Two intersection dictionaries will be used: one for Market Street and 2nd Street, and one for Mission Street and 16th Street. To start the project, you want to write a switchLights() function, which will take an intersection dictionary as an argument and switch the lights.
• At first, you might think that switchLights() should simply switch each light
to the next color in the sequence: Any 'green' values should change to
'yellow', 'yellow' values should change to 'red', and 'red' values should change
to 'green'.
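• Switching every light this way is a mistake: an intersection with one green light and one red light would end up with a yellow light and a green light, so neither direction would be stopped.
• But if, while writing switchLights(), you had added an assertion to check that at least one of the lights is always red, you might have included the following at the bottom of the function:
• assert 'red' in stoplight.values(), 'Neither light is red! ' + str(stoplight)
• With this assertion in place, your program would crash with an AssertionError.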
• The important line here is the AssertionError. While your program crashing
is not ideal, it immediately points out that a sanity check failed:
• Neither direction of traffic has a red light, meaning that traffic could be going
both ways. By failing fast early in the program’s execution, you can save
yourself a lot of future debugging effort.
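The traffic light discussion can be sketched as follows. The dictionary keys 'ns' and 'ew' (north-south and east-west) and the variable name market_2nd are assumptions made for this illustration; only switchLights() and the assert statement come from the text above.

```python
def switchLights(stoplight):
    """Advance every light to the next color (the naive approach)."""
    for key in stoplight.keys():
        if stoplight[key] == 'green':
            stoplight[key] = 'yellow'
        elif stoplight[key] == 'yellow':
            stoplight[key] = 'red'
        elif stoplight[key] == 'red':
            stoplight[key] = 'green'
    # Sanity check: at least one direction of traffic must be stopped.
    assert 'red' in stoplight.values(), 'Neither light is red! ' + str(stoplight)


# Hypothetical intersection: north-south is green, east-west is red.
market_2nd = {'ns': 'green', 'ew': 'red'}

try:
    switchLights(market_2nd)
    error_message = None
except AssertionError as err:
    # Caught here only so the message can be shown; normally the
    # program would stop with a traceback at this point.
    error_message = str(err)

print(error_message)
# Neither light is red! {'ns': 'yellow', 'ew': 'green'}
```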
Disabling Assertions:
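• Assertions can be disabled by passing the -O option when running Python. This is good for when you have finished writing and testing your program and don’t want it to be slowed down by performing sanity checks.
• Assertions are for development, not the final product. By the time the program is handed off to someone else to run, it should be free of bugs and not require the sanity checks.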
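A small sketch of the effect, assuming assertions are switched off with Python’s -O command-line option. It runs the same two-line snippet in a child interpreter with and without the option; the snippet itself is made up for the demonstration.

```python
import subprocess
import sys

# A tiny program whose assertion always fails.
snippet = "assert False, 'sanity check failed'\nprint('finished')"

# Normal run: the assertion fails and the program stops with an error.
normal = subprocess.run([sys.executable, '-c', snippet],
                        capture_output=True, text=True)

# With -O the assert statement is skipped, so print() is reached.
optimized = subprocess.run([sys.executable, '-O', '-c', snippet],
                           capture_output=True, text=True)

print('AssertionError' in normal.stderr)  # True
print(optimized.stdout.strip())           # finished
```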
Logging:
• Logging helps you understand what’s happening in your program and in what order it’s happening.
• Python’s logging module makes it easy to create a record of custom messages that you write.
• These log messages will describe when the program execution has reached the logging function call and list any variables you have specified at that point in time.
import logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
logging.debug('Start of program')

def factorial(n):
    logging.debug('Start of factorial(%s)' % (n))
    total = 1
    for i in range(n + 1):
        total *= i
        logging.debug('i is ' + str(i) + ', total is ' + str(total))
    logging.debug('End of factorial(%s)' % (n))
    return total

print(factorial(5))
logging.debug('End of program')
• We use the logging.debug() function to print log information. Each debug() call prints a line of information in the format we specified in basicConfig(), including the message we passed to debug().
• The factorial() function is returning 0 as the factorial of 5, which isn’t right. The for loop should be multiplying the value in total by the numbers from 1 to 5, but range(n + 1) starts at 0, so total is multiplied by 0 on the first iteration.
• Change the for i in range(n + 1): line to for i in range(1, n + 1):, and run the program again. The factorial(5) call now returns 120.
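For reference, a sketch of the corrected function. It passes the values to logging.debug() as arguments instead of building the strings by hand, which is a stylistic choice rather than part of the fix.

```python
import logging

logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s - %(levelname)s - %(message)s')


def factorial(n):
    logging.debug('Start of factorial(%s)', n)
    total = 1
    for i in range(1, n + 1):  # start at 1 so total is never multiplied by 0
        total *= i
        logging.debug('i is %s, total is %s', i, total)
    logging.debug('End of factorial(%s)', n)
    return total


result = factorial(5)
print(result)  # 120
```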
Don’t Debug with print():
Typing
import logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
is somewhat unwieldy.
You may want to use print() calls instead, but resist the temptation. Once you are done debugging, you would have to remove the print() calls from your code for each log message by hand. You might even accidentally remove some print() calls that were being used for nonlog messages.
Logging Levels:
Logging levels provide a way to categorize your log messages by importance.
There are five logging levels, listed here from least to most important, each with its own logging function: DEBUG (logging.debug()), INFO (logging.info()), WARNING (logging.warning()), ERROR (logging.error()), and CRITICAL (logging.critical()).
• Passing logging.DEBUG to the basicConfig() function’s level keyword argument will show messages from all the logging levels (DEBUG being the lowest level).
• But after developing your program some more, you may be interested only in errors. In that case, you can set basicConfig()’s level argument to logging.ERROR.
• This will show only ERROR and CRITICAL messages and skip the DEBUG, INFO, and WARNING messages.
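The filtering can be observed with a short sketch. The log output is sent to an in-memory buffer so it can be inspected, and force=True (available in Python 3.8 and later) is passed so the configuration applies even if logging was set up earlier. Both choices, and the message texts, are assumptions made for the demonstration.

```python
import io
import logging

log_output = io.StringIO()
logging.basicConfig(stream=log_output, level=logging.ERROR,
                    format='%(levelname)s - %(message)s', force=True)

# One message at each of the five levels.
logging.debug('Some debugging details.')
logging.info('The logging module is working.')
logging.warning('An error message is about to be logged.')
logging.error('An error has occurred.')
logging.critical('The program is unable to recover!')

shown = log_output.getvalue().splitlines()
print(shown)
# ['ERROR - An error has occurred.', 'CRITICAL - The program is unable to recover!']
```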
Disabling Logging:
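• After you have debugged your program, you probably don’t want all these log messages cluttering the screen.
• The logging.disable() function disables them so that you don’t have to remove all the logging calls by hand: pass it a logging level, and it will suppress all log messages at that level or lower.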
>>> import logging
>>> logging.disable(logging.CRITICAL)
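A sketch of logging.disable() in a complete program. As an assumption for the demonstration, output goes to an in-memory buffer (with force=True, Python 3.8 and later) so the effect can be checked, and logging is re-enabled at the end with logging.disable(logging.NOTSET).

```python
import io
import logging

log_output = io.StringIO()
logging.basicConfig(stream=log_output, level=logging.DEBUG,
                    format='%(levelname)s - %(message)s', force=True)

logging.critical('Logged before disable().')

logging.disable(logging.CRITICAL)  # suppress CRITICAL and everything below it
logging.critical('Suppressed.')
logging.error('Also suppressed.')

logging.disable(logging.NOTSET)    # turn logging back on
logging.info('Logged again.')

shown = log_output.getvalue().splitlines()
print(shown)
# ['CRITICAL - Logged before disable().', 'INFO - Logged again.']
```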
Logging to a File:
import logging
logging.basicConfig(filename='myProgramLog.txt', level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
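A minimal sketch of logging to a file and reading the result back. The log file is placed in a temporary folder so nothing is left behind, and force=True (Python 3.8 and later) is an assumption that makes the configuration apply even if logging was already set up.

```python
import logging
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    log_path = os.path.join(folder, 'myProgramLog.txt')
    logging.basicConfig(filename=log_path, level=logging.DEBUG,
                        format='%(asctime)s - %(levelname)s - %(message)s',
                        force=True)

    # Nothing appears on the screen; the messages go to the file instead.
    logging.debug('Start of program')
    logging.debug('End of program')
    logging.shutdown()  # flush and close the log file before reading it

    with open(log_path) as f:
        log_lines = f.read().splitlines()

for line in log_lines:
    print(line)
```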
IDLE’s Debugger:
• The debugger is a feature of IDLE that allows you to execute your program one line at a time.
• The debugger will run a single line of code and then wait for you to tell it to continue. By running your program “under the debugger” like this, you can take as much time as you want to examine the values in the variables at any given point during the program’s lifetime.