Module 4

Uploaded by

Chinmay S
Copyright
© © All Rights Reserved

Module-4 ( 8 Hrs)

Organizing Files:
• The shutil Module, Walking a Directory Tree, Compressing Files with
the zipfile Module
• Project: Renaming Files with American-Style Dates to European-Style
Dates
• Project: Backing Up a Folder into a ZIP File

Debugging:
• Raising Exceptions, Getting the Traceback as a String, Assertions,
Logging, IDLE's Debugger.

Al Sweigart, "Automate the Boring Stuff with Python", 1st Edition, No Starch Press, 2015,
Chapters 9-10.

Organizing Files
Consider tasks such as these:
• Making copies of all PDF files (and only the PDF files) in every
subfolder of a folder.
• Removing the leading zeros in the filenames for every file in a
folder of hundreds of files named spam001.txt, spam002.txt,
spam003.txt, and so on.
• Compressing the contents of several folders into one ZIP file
(which could be a simple backup system).
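
As an illustration of the second task, here is a minimal sketch that strips leading zeros from numbered filenames. The helper name strip_leading_zeros and the filename pattern are assumptions made for this example, and the sketch works inside a throwaway temporary folder rather than on real files.

```python
import os
import re
import tempfile

# Assumed filename shape: a non-digit prefix, one or more leading zeros,
# the remaining digits, and an extension (for example 'spam001.txt').
PATTERN = re.compile(r'^(\D*)0+(\d+)(\.\w+)$')

def strip_leading_zeros(filename):
    """Return filename without leading zeros in its number (hypothetical helper)."""
    match = PATTERN.match(filename)
    if match is None:
        return filename  # no leading zeros, nothing to change
    prefix, number, extension = match.groups()
    return prefix + number + extension

with tempfile.TemporaryDirectory() as folder:
    # Create a few empty sample files.
    for name in ('spam001.txt', 'spam002.txt', 'spam010.txt'):
        open(os.path.join(folder, name), 'w').close()

    # Rename every file whose name changes.
    for name in os.listdir(folder):
        new_name = strip_leading_zeros(name)
        if new_name != name:
            os.rename(os.path.join(folder, name), os.path.join(folder, new_name))

    renamed = sorted(os.listdir(folder))

print(renamed)
```

On real folders it is worth printing the planned renames first and performing them only after checking the list.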

The shutil Module:

• The shutil (or shell utilities) module has functions to copy, move, rename, and delete
files in Python programs.

• import shutil

• Copying Files and Folders:


• The shutil module provides functions for copying files, as well as entire folders.

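shutil.copy(source, destination)
• This function copies the file at the path source to the folder at the path destination.
• If destination is a filename, it is used as the new name of the copied file.
• The function returns a string of the path of the copied file.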
>>> import shutil, os
>>> os.chdir('C:\\')
>>> shutil.copy('C:\\spam.txt', 'C:\\delicious')
'C:\\delicious\\spam.txt'
>>> shutil.copy('eggs.txt', 'C:\\delicious\\eggs2.txt')
'C:\\delicious\\eggs2.txt'

shutil.copytree(source, destination):

will copy the folder at the path source, along with all of its files and subfolders,
to the folder at the path destination.

• The source and destination parameters are both strings.


• The function returns a string of the path of the copied folder.

>>> os.chdir('C:\\')
>>> shutil.copytree('C:\\bacon', 'C:\\bacon_backup')
'C:\\bacon_backup'
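
The examples above use Windows paths. A platform-independent sketch of the same two calls is given here; it builds its files in a temporary folder, and the folder and file names are only illustrative.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a small source tree: <root>/bacon/spam.txt
    source_folder = os.path.join(root, 'bacon')
    os.mkdir(source_folder)
    source_file = os.path.join(source_folder, 'spam.txt')
    with open(source_file, 'w') as f:
        f.write('Hello')

    # copy(): the destination is a filename, so the copy gets that name.
    copied_file = shutil.copy(source_file, os.path.join(root, 'spam2.txt'))

    # copytree(): by default the destination folder must not exist yet.
    copied_folder = shutil.copytree(source_folder,
                                    os.path.join(root, 'bacon_backup'))

    copied_name = os.path.basename(copied_file)
    backup_contents = os.listdir(copied_folder)

print(copied_name, backup_contents)
```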

Moving and Renaming Files and Folders:
shutil.move(source, destination)
will move the file or folder at the path source to the path destination and will return a string of the
absolute path of the new location.

• If destination points to a folder, the source file gets moved into destination and keeps its current
filename.

>>> import shutil


>>> shutil.move('C:\\bacon.txt', 'C:\\eggs')
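'C:\\eggs\\bacon.txt'
• Note: If a bacon.txt file already exists in C:\eggs, current Python versions raise a
shutil.Error ("Destination path ... already exists") instead of overwriting it. When the
destination names a file rather than a folder, an existing file may be overwritten.
• The destination path can also specify a filename.
>>> shutil.move('C:\\bacon.txt', 'C:\\eggs\\new_bacon.txt')
'C:\\eggs\\new_bacon.txt'
• If there is no eggs folder, then move() will rename bacon.txt to a file named eggs
(a file without the .txt extension).
>>> shutil.move('C:\\bacon.txt', 'C:\\eggs')
'C:\\eggs'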
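
One way to guard against these pitfalls is a small wrapper around shutil.move(). The sketch below is only an illustration: move_into_folder is a hypothetical helper, not part of shutil. It insists that the destination folder exists and checks for a name clash before moving.

```python
import os
import shutil
import tempfile

def move_into_folder(source, folder):
    """Move source into an existing folder without overwriting anything."""
    if not os.path.isdir(folder):
        raise Exception('Destination folder does not exist: ' + folder)
    target = os.path.join(folder, os.path.basename(source))
    if os.path.exists(target):
        raise Exception('Refusing to overwrite: ' + target)
    return shutil.move(source, target)

with tempfile.TemporaryDirectory() as root:
    bacon = os.path.join(root, 'bacon.txt')
    open(bacon, 'w').close()
    eggs = os.path.join(root, 'eggs')

    # First attempt: the eggs folder does not exist yet, so nothing is moved.
    try:
        move_into_folder(bacon, eggs)
    except Exception as err:
        first_attempt = str(err)

    # Second attempt: create the folder, then move the file into it.
    os.mkdir(eggs)
    new_path = move_into_folder(bacon, eggs)
    moved_ok = os.path.exists(new_path) and not os.path.exists(bacon)

print(first_attempt)
print(moved_ok)
```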

• The folders that make up the destination must already exist, or else Python will
throw an exception.

>>> shutil.move('spam.txt', 'c:\\does_not_exist\\eggs\\ham')
Traceback (most recent call last):
  File "C:\Python34\lib\shutil.py", line 521, in move
    os.rename(src, real_dst)
FileNotFoundError: [WinError 3] The system cannot find the path
specified: 'spam.txt' -> 'c:\\does_not_exist\\eggs\\ham'

• Permanently Deleting Files and Folders:


• Delete a single file or a single empty folder with functions in the os
module, whereas to delete a folder and all of its contents, use the shutil
module.
• os.unlink(path) will delete the file at path.
• os.rmdir(path) will delete the folder at path.
This folder must be empty of any files or folders.

• shutil.rmtree(path) will remove the folder at path, and all files and
folders it contains will also be deleted.
• Consider a Python program that was intended to delete files that have the .txt file
extension but has a typo ('.rxt' instead of '.txt') that causes it to delete .rxt files
instead:
import os
for filename in os.listdir():
    if filename.endswith('.rxt'):
        os.unlink(filename)

• If the programmer had any important files ending with .rxt, they would have
been accidentally and permanently deleted. Instead, the program should first have
been run like this, with the os.unlink() call commented out and a print() call added:

import os
for filename in os.listdir():
    if filename.endswith('.rxt'):
        #os.unlink(filename)
        print(filename)
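
The same "print first, delete later" habit can be built into a function. This is a minimal sketch, assuming a hypothetical helper delete_by_extension with a dry_run flag that defaults to the safe behaviour; it runs in a temporary folder.

```python
import os
import tempfile

def delete_by_extension(folder, extension, dry_run=True):
    """Return the matching filenames; delete them only if dry_run is False."""
    matches = [name for name in sorted(os.listdir(folder))
               if name.endswith(extension)]
    for name in matches:
        if dry_run:
            print('Would delete: ' + name)
        else:
            os.unlink(os.path.join(folder, name))
    return matches

with tempfile.TemporaryDirectory() as folder:
    for name in ('notes.txt', 'todo.txt', 'report.rxt'):
        open(os.path.join(folder, name), 'w').close()

    preview = delete_by_extension(folder, '.txt')        # nothing is deleted
    after_preview = sorted(os.listdir(folder))

    delete_by_extension(folder, '.txt', dry_run=False)   # really deletes
    after_delete = os.listdir(folder)

print(after_delete)
```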

Safe Deletes with the send2trash Module:

• Since Python's built-in shutil.rmtree() function irreversibly deletes files and
folders, it can be dangerous to use.
• A much better way to delete files and folders is with the third-party send2trash
module.
• Install this module by running pip install send2trash
• Using send2trash is much safer than Python’s regular delete functions, because it
will send folders and files to your computer’s trash or recycle bin instead of
permanently deleting them.

>>> import send2trash
>>> baconFile = open('bacon.txt', 'a') # creates the file
>>> baconFile.write('Bacon is not a vegetable.')
25
>>> baconFile.close()
>>> send2trash.send2trash('bacon.txt')

• Note: The send2trash() function can only send files to the recycle bin; it cannot pull files
out of it.

Walking a Directory Tree:
• Suppose you want to rename every file in some folder and also every file in every
subfolder of that folder.

• That is, you want to walk through the directory tree, touching each file as you go.

• Let's look at the C:\delicious folder with its contents, shown in Figure 9-1.

import os
for folderName, subfolders, filenames in os.walk('C:\\delicious'):
    print('The current folder is ' + folderName)
    for subfolder in subfolders:
        print('SUBFOLDER OF ' + folderName + ': ' + subfolder)
    for filename in filenames:
        print('FILE INSIDE ' + folderName + ': '+ filename)
    print('')
• The os.walk() function is passed a single string value: the path of a folder.
You can use os.walk() in a for loop statement to walk a directory tree, much
like how you can use the range() function to walk over a range of numbers.

• Unlike range(), the os.walk() function will return three values on each
iteration through the loop:

1. A string of the current folder’s name
2. A list of strings of the folders in the current folder
3. A list of strings of the files in the current folder

When the programmer runs this program, it will output the following:


The current folder is C:\delicious
SUBFOLDER OF C:\delicious: cats
SUBFOLDER OF C:\delicious: walnut
FILE INSIDE C:\delicious: spam.txt
The current folder is C:\delicious\cats
FILE INSIDE C:\delicious\cats: catnames.txt
FILE INSIDE C:\delicious\cats: zophie.jpg
The current folder is C:\delicious\walnut
SUBFOLDER OF C:\delicious\walnut: waffles
The current folder is C:\delicious\walnut\waffles
FILE INSIDE C:\delicious\walnut\waffles: butter.txt
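
In practice the loop body usually does more than print. The following sketch uses os.walk() to collect every file with a given extension in a whole tree. The helper find_files is a hypothetical name, and the folder layout is rebuilt in a temporary folder so that the example runs anywhere.

```python
import os
import tempfile

def find_files(root, extension):
    """Return paths (relative to root) of all files under root ending with extension."""
    found = []
    for folder_name, subfolders, filenames in os.walk(root):
        for filename in filenames:
            if filename.lower().endswith(extension):
                full_path = os.path.join(folder_name, filename)
                found.append(os.path.relpath(full_path, root))
    return sorted(found)

with tempfile.TemporaryDirectory() as root:
    # Recreate a layout similar to the delicious folder.
    os.makedirs(os.path.join(root, 'cats'))
    os.makedirs(os.path.join(root, 'walnut', 'waffles'))
    for parts in (('spam.txt',),
                  ('cats', 'catnames.txt'),
                  ('cats', 'zophie.jpg'),
                  ('walnut', 'waffles', 'butter.txt')):
        open(os.path.join(root, *parts), 'w').close()

    text_files = find_files(root, '.txt')

print(text_files)
```

The same loop could call shutil.copy() on each match to implement the "copy all PDF files" task mentioned at the start of the module.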

Compressing Files with the zipfile Module:
• Python programs can both create and open (or extract) ZIP files using
functions in the zipfile module.

Reading ZIP Files:


• To read the contents of a ZIP file, first create a ZipFile object (note
the capital letters Z and F).
• To create a ZipFile object, call the zipfile.ZipFile() function, passing
it a string of the .zip file’s filename.
• Note that zipfile is the name of the Python module, and ZipFile() is
the name of the function.

>>> import zipfile, os
>>> os.chdir('C:\\') # move to the folder with example.zip
>>> exampleZip = zipfile.ZipFile('example.zip')
>>> exampleZip.namelist()
['spam.txt', 'cats/', 'cats/catnames.txt', 'cats/zophie.jpg']
>>> spamInfo = exampleZip.getinfo('spam.txt')
>>> spamInfo.file_size
13908
>>> spamInfo.compress_size
3828
>>> 'Compressed file is %sx smaller!' % (round(spamInfo.file_size / spamInfo.compress_size, 2))
'Compressed file is 3.63x smaller!'
>>> exampleZip.close()

• Extracting from ZIP Files:

• The extractall() method for ZipFile objects extracts all the files and folders from a
ZIP file into the current working directory.

>>> import zipfile, os
>>> os.chdir('C:\\') # move to the folder with example.zip
>>> exampleZip = zipfile.ZipFile('example.zip')
>>> exampleZip.extractall()
>>> exampleZip.close()

• After running this code, the contents of example.zip will be extracted to C:\. Optionally,
you can pass a folder name to extractall() to have it extract the files into a folder other
than the current working directory.
• If the folder passed to the extractall() method does not exist, it will be created.

• The extract() method for ZipFile objects will extract a single file
from the ZIP file.

>>> exampleZip.extract('spam.txt')
'C:\\spam.txt'
>>> exampleZip.extract('spam.txt', 'C:\\some\\new\\folders')
'C:\\some\\new\\folders\\spam.txt'

• The optional second argument to extract() is the folder to extract into. If this
second argument is a folder that doesn't yet exist, Python will create the folder.

• The value that extract() returns is the absolute path to which the file was
extracted.
>>> exampleZip.close()
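
The reading and extracting calls can also be written with a with statement, which closes the ZipFile automatically. The sketch below is an illustration only: it first builds a small archive in a temporary folder (creating archives is covered in the next part of this module), and the member names and contents are made up for the example.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny archive so there is something to read back.
    zip_path = os.path.join(root, 'example.zip')
    with zipfile.ZipFile(zip_path, 'w') as archive:
        archive.writestr('spam.txt', 'spam ' * 1000)
        archive.writestr('cats/catnames.txt', 'Zophie\n')

    with zipfile.ZipFile(zip_path) as archive:   # closed automatically
        names = archive.namelist()
        spam_size = archive.getinfo('spam.txt').file_size

        # Extract one member into a folder that does not exist yet.
        extracted = archive.extract('cats/catnames.txt',
                                    os.path.join(root, 'out'))
        extracted_exists = os.path.isfile(extracted)

print(names, spam_size, extracted_exists)
```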

Creating and Adding to ZIP Files:

• To create your own compressed ZIP files, you must open the ZipFile object in write
mode by passing 'w' as the second argument.

• When you pass a path to the write() method of a ZipFile object, Python will
compress the file at that path and add it into the ZIP file.

• The write() method’s first argument is a string of the filename to add.

• The second argument is the compression type parameter, which tells the
computer what algorithm it should use to compress the files; always just set
this value to zipfile.ZIP_DEFLATED. (This specifies the deflate
compression algorithm, which works well on all types of data.)

>>> import zipfile
>>> newZip = zipfile.ZipFile('new.zip', 'w')
>>> newZip.write('spam.txt', compress_type=zipfile.ZIP_DEFLATED)
>>> newZip.close()

• This code will create a new ZIP file named new.zip that has the compressed
contents of spam.txt.

• Keep in mind that, just as with writing to files, write mode will erase all existing
contents of a ZIP file.

• To simply add files to an existing ZIP file, pass 'a' as the second argument
to zipfile.ZipFile() to open the ZIP file in append mode.
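
The difference between write mode and append mode can be sketched as follows. The file names and contents are only illustrative, and everything happens in a temporary folder.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as root:
    # Two small, highly repetitive sample files.
    for name in ('spam.txt', 'eggs.txt'):
        with open(os.path.join(root, name), 'w') as f:
            f.write(name * 100)
    zip_path = os.path.join(root, 'new.zip')

    # 'w' creates the archive (or erases an existing one).
    with zipfile.ZipFile(zip_path, 'w') as new_zip:
        new_zip.write(os.path.join(root, 'spam.txt'), arcname='spam.txt',
                      compress_type=zipfile.ZIP_DEFLATED)

    # 'a' keeps what is already in the archive and adds to it.
    with zipfile.ZipFile(zip_path, 'a') as new_zip:
        new_zip.write(os.path.join(root, 'eggs.txt'), arcname='eggs.txt',
                      compress_type=zipfile.ZIP_DEFLATED)

    with zipfile.ZipFile(zip_path) as new_zip:
        members = new_zip.namelist()
        info = new_zip.getinfo('spam.txt')
        is_smaller = info.compress_size < info.file_size

print(members, is_smaller)
```

The arcname argument sets the name stored inside the archive, so the temporary folder's path is not recorded in the ZIP file.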

Debugging
Raising Exceptions:
• Python raises an exception whenever it tries to execute invalid code.

• A programmer can also raise their own exceptions in code.

• Raising an exception is a way of saying, “Stop running the code in this function
and move the program execution to the except statement.”
• Exceptions are raised with a raise statement. In code, a raise statement consists of
the following:
• The raise keyword
• A call to the Exception() function
• A string with a helpful error message passed to the Exception() function

>>> raise Exception('This is the error message.')
Traceback (most recent call last):
  File "<pyshell#191>", line 1, in <module>
    raise Exception('This is the error message.')
Exception: This is the error message.

• Here we’ve defined a boxPrint() function that takes a character, a width, and
a height, and uses the character to make a little picture of a box with that
width and height. This box shape is printed to the console.

• Say we want the character to be a single character, and the width and height
to be greater than 2. We add if statements to raise exceptions if these
requirements aren’t satisfied.

• This program uses the except Exception as err form of the except statement.
If an Exception object is raised by boxPrint(), this except statement will
store it in a variable named err.

• The Exception object can then be converted to a string by passing it to str()
to produce a user-friendly error message.

def boxPrint(symbol, width, height):
    if len(symbol) != 1:
        raise Exception('Symbol must be a single character string.')
    if width <= 2:
        raise Exception('Width must be greater than 2.')
    if height <= 2:
        raise Exception('Height must be greater than 2.')
    print(symbol * width)
    for i in range(height - 2):
        print(symbol + (' ' * (width - 2)) + symbol)
    print(symbol * width)

for sym, w, h in (('*', 4, 4), ('O', 20, 5), ('x', 1, 3), ('ZZ', 3, 3)):
    try:
        boxPrint(sym, w, h)
    except Exception as err:
        print('An exception happened: ' + str(err))

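When boxPrint.py is run, the output will look like this:

****
*  *
*  *
****
OOOOOOOOOOOOOOOOOOOO
O                  O
O                  O
O                  O
OOOOOOOOOOOOOOOOOOOO
An exception happened: Width must be greater than 2.
An exception happened: Symbol must be a single character string.
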
Getting the Traceback as a String:
• When Python encounters an error, it produces a treasure trove of error
information called the traceback.
• The traceback includes the error message, the line number of the line that
caused the error, and the sequence of the function calls that led to the
error.
• This sequence of calls is called the call stack. For example, consider this
program, saved as errorExample.py:

1. def spam():
2.     bacon()
3.
4. def bacon():
5.     raise Exception('This is the error message.')
6.
7. spam()

Running the same code in a notebook cell produces this traceback:

Exception                                 Traceback (most recent call last)
Input In [8], in <cell line: 7>()
      4 def bacon():
      5     raise Exception('This is the error message.')
----> 7 spam()

Input In [8], in spam()
      1 def spam():
----> 2     bacon()

Input In [8], in bacon()
      4 def bacon():
----> 5     raise Exception('This is the error message.')

Exception: This is the error message.

When errorExample.py is run, the output will look like this:
Traceback (most recent call last):
  File "errorExample.py", line 7, in <module>
    spam()
  File "errorExample.py", line 2, in spam
    bacon()
  File "errorExample.py", line 5, in bacon
    raise Exception('This is the error message.')
Exception: This is the error message.

From the traceback, you can see that the error happened on line 5, in the bacon() function.
This particular call to bacon() came from line 2, in the spam() function, which in turn was
called on line 7.
In programs where functions can be called from multiple places, the call stack can help to
determine which call led to the error.

Python displays the traceback whenever a raised exception goes unhandled, but the traceback
can also be obtained as a string by calling traceback.format_exc(). For example, instead of
crashing the program right when an exception occurs, the programmer can write the traceback
information to a log file and keep the program running. The programmer can look at the log
file later, when ready to debug the program.

import traceback
try:
    raise Exception('This is the error message.')
except:
    errorFile = open('errorInfo.txt', 'w')
    errorFile.write(traceback.format_exc())  # write the traceback text to the file
    errorFile.close()
    print('The traceback info was written to errorInfo.txt.')

Output (when entered in the interactive shell):
116
The traceback info was written to errorInfo.txt.

• The 116 is the return value from the write() method, since 116 characters
were written to the file. The interactive shell displays this value; a program run
from a file would not print it.
• The traceback text was written to errorInfo.txt.
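
To show the "keep the program running" idea, here is a minimal sketch in which one item in a loop fails, its traceback is appended to a log file, and the loop carries on. The function risky() and its failure condition are invented for the illustration, and the log file lives in a temporary folder.

```python
import os
import tempfile
import traceback

def risky(n):
    """Hypothetical task that fails for one particular input."""
    if n == 2:
        raise Exception('Cannot process item 2.')
    return n * 10

results = []
with tempfile.TemporaryDirectory() as folder:
    log_path = os.path.join(folder, 'errorInfo.txt')
    for item in (1, 2, 3):
        try:
            results.append(risky(item))
        except Exception:
            # Append the traceback text and continue with the next item.
            with open(log_path, 'a') as error_file:
                error_file.write(traceback.format_exc())

    with open(log_path) as error_file:
        logged = error_file.read()

print(results)
```

Catching Exception rather than using a bare except leaves KeyboardInterrupt and SystemExit alone, which is usually what is wanted in a long-running loop.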

Assertions:
• An assertion is a sanity check to make sure code isn’t doing
something obviously wrong. These sanity checks are performed by
assert statements.

• If the sanity check fails, then an AssertionError exception is raised. In
code, an assert statement consists of the following:

• The assert keyword
• A condition (that is, an expression that evaluates to True or False)
• A comma
• A string to display when the condition is False

Assertion

The error message string after the comma is optional; this example leaves it out.

## Assertion without error message
def avg(marks):
    assert len(marks)!=0
    return sum(marks)/len(marks)

mark1=[11,22,33]
print("Average of Mark1:",avg(mark1))

mark2=[]
print("Average of Mark2:",avg(mark2))

Output:

Average of Mark1: 22.0
------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Input In [10], in <cell line: 10>()
      7 print("Average of Mark1:",avg(mark1))
      9 mark2=[]
---> 10 print("Average of Mark2:",avg(mark2))

Input In [10], in avg(marks)
      2 def avg(marks):
----> 3     assert len(marks)!=0
      4     return sum(marks)/len(marks)

AssertionError:

>>> podBayDoorStatus = 'open'
>>> assert podBayDoorStatus == 'open', 'The pod bay doors need to be "open".'
>>> podBayDoorStatus = 'I\'m sorry, Dave. I\'m afraid I can\'t do that.'
>>> assert podBayDoorStatus == 'open', 'The pod bay doors need to be "open".'
Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    assert podBayDoorStatus == 'open', 'The pod bay doors need to be "open".'
AssertionError: The pod bay doors need to be "open".

Using an Assertion in a Traffic Light Simulation

• Suppose you are building a traffic light simulation program. The data structure
representing the stoplights at an intersection is a dictionary with
keys 'ns' and 'ew', for the stoplights facing north-south and
east-west, respectively.

• The values at these keys will be one of the strings 'green', 'yellow',
or 'red'.

• To start the project, write a switchLights() function, which will take
an intersection dictionary as an argument and switch the lights.

• Two such dictionaries will be used, one for the intersection of Market Street and 2nd
Street and one for the intersection of Mission Street and 16th Street.

• At first, you might think that switchLights() should simply switch each light
to the next color in the sequence: Any 'green' values should change to
'yellow', 'yellow' values should change to 'red', and 'red' values should change
to 'green'.

• This rule has a flaw: if one light is green and the other is red, switching them
leaves one yellow and the other green, so neither direction of traffic is stopped.
• If, while writing switchLights(), the programmer had added an assertion to check
that at least one of the lights is always red, the following line might have been included
at the bottom of the function:
assert 'red' in stoplight.values(), 'Neither light is red! ' + str(stoplight)
• With this assertion in place, the program would crash with this error message:

Traceback (most recent call last):
  File "carSim.py", line 14, in <module>
    switchLights(market_2nd)
  File "carSim.py", line 13, in switchLights
    assert 'red' in stoplight.values(), 'Neither light is red! ' + str(stoplight)
AssertionError: Neither light is red! {'ns': 'yellow', 'ew': 'green'}

• The important line here is the AssertionError. While your program crashing
is not ideal, it immediately points out that a sanity check failed:

• Neither direction of traffic has a red light, meaning that traffic could be going
both ways. By failing fast early in the program’s execution, you can save
yourself a lot of future debugging effort.
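
A minimal sketch of such a switchLights() function with the assertion at the bottom is given here. The starting values of market_2nd are an assumption, chosen to be consistent with the error message shown above, and the try/except is added only so the message can be inspected without crashing the example.

```python
def switchLights(stoplight):
    # Naive rule: every light simply advances to the next color.
    for key in stoplight.keys():
        if stoplight[key] == 'green':
            stoplight[key] = 'yellow'
        elif stoplight[key] == 'yellow':
            stoplight[key] = 'red'
        elif stoplight[key] == 'red':
            stoplight[key] = 'green'
    # Sanity check: at least one direction must always be stopped.
    assert 'red' in stoplight.values(), 'Neither light is red! ' + str(stoplight)

# Assumed starting state: north-south is green, east-west is red.
market_2nd = {'ns': 'green', 'ew': 'red'}

try:
    switchLights(market_2nd)
except AssertionError as err:
    message = str(err)

print(message)
```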

Disabling Assertions:

• Assertions can be disabled by passing the -O option when running Python.

• This is good for when you have finished writing and testing your program
and don’t want it to be slowed down by performing sanity checks.

• Assertions are for development, not the final product. The finished program should
be free of bugs and should not require the sanity checks.

Logging:
• Logging helps you understand what is happening in a program and in what order it is happening.

• Python's logging module makes it easy to create a record of custom messages that you write.

• These log messages describe when the program execution has reached the logging function call
and list any variables you have specified at that point in time.

Using the logging Module:

• To enable the logging module to display log messages on the screen as the program runs, copy the
following to the top of the program (but under the #! python shebang line):

import logging

logging.basicConfig(level=logging.DEBUG, format=' %(asctime)s - %(levelname)s - %(message)s')

import logging
logging.basicConfig(level=logging.DEBUG, format=' %(asctime)s - %(levelname)s - %(message)s')

logging.debug('Start of program')

def factorial(n):
    logging.debug('Start of factorial(%s)' % (n))
    total = 1
    for i in range(n + 1):
        total *= i
        logging.debug('i is ' + str(i) + ', total is ' + str(total))
    logging.debug('End of factorial(%s)' % (n))
    return total

print(factorial(5))
logging.debug('End of program')

• We use the logging.debug() function to print log information. This debug()
function will call basicConfig(), and a line of information will be printed.

• The output of this program looks like this:

2015-05-23 16:20:12,664 - DEBUG - Start of program


2015-05-23 16:20:12,664 - DEBUG - Start of factorial(5)
2015-05-23 16:20:12,665 - DEBUG - i is 0, total is 0
2015-05-23 16:20:12,668 - DEBUG - i is 1, total is 0
2015-05-23 16:20:12,670 - DEBUG - i is 2, total is 0
2015-05-23 16:20:12,673 - DEBUG - i is 3, total is 0
2015-05-23 16:20:12,675 - DEBUG - i is 4, total is 0
2015-05-23 16:20:12,678 - DEBUG - i is 5, total is 0
2015-05-23 16:20:12,680 - DEBUG - End of factorial(5)
0
2015-05-23 16:20:12,684 - DEBUG - End of program

• The factorial() function is returning 0 as the factorial of 5, which isn’t right. The for
loop should be multiplying the value in total by the numbers from 1 to 5.

• Change the for i in range(n + 1): line to for i in range(1, n + 1):, and run the
program again. The output will look like this:

2015-05-23 17:13:40,650 - DEBUG - Start of program


2015-05-23 17:13:40,651 - DEBUG - Start of factorial(5)
2015-05-23 17:13:40,651 - DEBUG - i is 1, total is 1
2015-05-23 17:13:40,654 - DEBUG - i is 2, total is 2
2015-05-23 17:13:40,656 - DEBUG - i is 3, total is 6
2015-05-23 17:13:40,659 - DEBUG - i is 4, total is 24
2015-05-23 17:13:40,661 - DEBUG - i is 5, total is 120
2015-05-23 17:13:40,661 - DEBUG - End of factorial(5)
120
2015-05-23 17:13:40,666 - DEBUG - End of program

Don’t Debug with print():
Typing

import logging
logging.basicConfig(level=logging.DEBUG, format= '%(asctime)s - %(levelname)s - %(message)s')

is somewhat unwieldy.

You may want to use print() calls instead, but don't give in to this temptation.
Once you are done debugging, you will end up spending a lot of time removing
print() calls from the code for each log message.

You might even accidentally remove some print() calls that were being used for
nonlog messages.

Logging Levels:
Logging levels provide a way to categorize your log messages by importance.
There are five logging levels, described in the table below from least to most important.
Messages can be logged at each level using a different logging function.

| Level | Logging Function | Description |
| --- | --- | --- |
| DEBUG | logging.debug() | The lowest level. Used for small details. Usually you care about these messages only when diagnosing problems. |
| INFO | logging.info() | Used to record information on general events in the program or to confirm that things are working at their point in the program. |
| WARNING | logging.warning() | Used to indicate a potential problem that doesn't prevent the program from working but might do so in the future. |
| ERROR | logging.error() | Used to record an error that caused the program to fail to do something. |
| CRITICAL | logging.critical() | The highest level. Used to indicate a fatal error that has caused or is about to cause the program to stop running entirely. |

• The benefit of logging levels is that you can change what priority of logging
message you want to see.

• Passing logging.DEBUG to the basicConfig() function's level keyword argument
will show messages from all the logging levels (DEBUG being the lowest level).

• But after developing the program some more, you may be interested only in errors.

• In that case, you can set basicConfig()'s level argument to logging.ERROR.

• This will show only ERROR and CRITICAL messages and skip the
DEBUG, INFO, and WARNING messages.
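
The filtering effect of a level can be checked with a small sketch. To keep the example self-contained it does not call basicConfig(); instead it attaches a handler to a separately named logger and captures the output in memory. The logger name levels_demo and the message texts are assumptions for this illustration, and setLevel(logging.ERROR) plays the role of basicConfig(level=logging.ERROR).

```python
import io
import logging

stream = io.StringIO()                       # captures log output in memory
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s - %(message)s'))

logger = logging.getLogger('levels_demo')    # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False                     # keep output away from the root logger
logger.setLevel(logging.ERROR)               # only ERROR and CRITICAL get through

logger.debug('Some debugging details.')
logger.info('The logging module is working.')
logger.warning('An error message is about to be logged.')
logger.error('An error has occurred.')
logger.critical('The program is unable to recover!')

shown = stream.getvalue().splitlines()
print(shown)
```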

Disabling Logging:
• After you have debugged the program, you probably don't want all these log
messages cluttering the screen.

• The logging.disable() function disables these so that you don't have
to go into the program and remove all the logging calls by hand.

• Pass logging.disable() a logging level, and it will suppress all
log messages at that level or lower.

• To disable logging entirely, just add
logging.disable(logging.CRITICAL) to the program.

>>> import logging

>>> logging.basicConfig(level=logging.INFO, format=' %(asctime)s - %(levelname)s - %(message)s')

>>> logging.critical('Critical error! Critical error!')

2015-05-22 11:10:48,054 - CRITICAL - Critical error! Critical error!

>>> logging.disable(logging.CRITICAL)

>>> logging.critical('Critical error! Critical error!')

>>> logging.error('Error! Error!')
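
The behaviour of logging.disable() can also be verified in a script. This sketch captures messages in memory through a separately named logger (disable_demo is an assumed name) and uses logging.disable(logging.NOTSET) to switch logging back on afterwards.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s - %(message)s'))

logger = logging.getLogger('disable_demo')   # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False
logger.setLevel(logging.DEBUG)

logger.error('Shown: logging is enabled.')

logging.disable(logging.CRITICAL)            # suppress CRITICAL and everything below
logger.critical('Hidden: logging is disabled.')

logging.disable(logging.NOTSET)              # undo, so later code can log again
logger.info('Shown again.')

lines = stream.getvalue().splitlines()
print(lines)
```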

Logging to a File:

• Instead of displaying the log messages to the screen, you can
write them to a text file.

• The logging.basicConfig() function takes a filename keyword
argument.

import logging
logging.basicConfig(filename='myProgramLog.txt', level=logging.DEBUG,
                    format=' %(asctime)s - %(levelname)s - %(message)s')
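
A runnable sketch of logging to a file and reading the result back follows. It writes into a temporary folder, leaves out the timestamp so the output is predictable, and passes force=True (available from Python 3.8) so that the call takes effect even if logging was configured earlier; these choices are assumptions for the illustration.

```python
import logging
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    log_path = os.path.join(folder, 'myProgramLog.txt')

    # force=True (Python 3.8+) replaces any handlers configured earlier.
    logging.basicConfig(filename=log_path, level=logging.DEBUG,
                        format='%(levelname)s - %(message)s', force=True)

    logging.debug('Start of program')
    logging.error('Something went wrong.')

    # Close the file handler so the log can be read and the folder removed.
    for handler in logging.root.handlers[:]:
        logging.root.removeHandler(handler)
        handler.close()

    with open(log_path) as log_file:
        log_lines = log_file.read().splitlines()

print(log_lines)
```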
IDLE's Debugger:

• The debugger is a feature of IDLE that allows you to execute a program one line at
a time.

• The debugger will run a single line of code and then wait for you to
tell it to continue. By running the program "under the debugger" like this,
you can take as much time as you want to examine the values in the
variables at any given point during the program's lifetime.

• This is a valuable tool for tracking down bugs.
