Module4 Organising Files and debugging

Module-4 covers organizing files using the shutil and zipfile modules, including methods for copying, moving, and compressing files. It also addresses debugging techniques such as raising exceptions, using assertions, and logging. The module includes practical projects for renaming files and backing up folders, along with detailed code examples for each topic.

Module-4: Organizing Files and Debugging

Prepared by: Dr. B Latha Shankar, Associate Professor, IEM Department, SIT Tumakuru

Syllabus:
Organizing Files: The shutil Module, Walking a Directory Tree, Compressing Files with the zipfile Module, Project: Renaming Files with American-Style Dates to European-Style Dates, Project: Backing Up a Folder into a ZIP File.
Debugging: Raising Exceptions, Getting the Traceback as a String, Assertions, Logging, IDLE’s Debugger.

Textbook 1: Chapters 9-10 (08 hrs.)

Organizing Files
The shutil Module:
 The shutil (shell utilities) module helps automate copying, moving, renaming, and deleting files and directories in Python.
 To use the shutil functions, first import shutil.
 shutil methods:
1. shutil.copy(source, destination):
 Copies the file at the source path to the folder at the destination path.
 Both source and destination can be strings or Path objects.
 If destination is a filename, it is used as the new name of the copied file.
 The function returns the path of the copied file, as a string or a Path object.
 How shutil.copy() works:
>>> import shutil, os  # imports the shutil and os modules
>>> from pathlib import Path  # imports the Path class from pathlib
>>> p = Path.home()  # assigns the Path object of the home directory to p
>>> shutil.copy(p / 'spam.txt', p / 'some_folder')  # copies 'spam.txt' into the folder 'some_folder'
O/P: 'C:\\Users\\Al\\some_folder\\spam.txt'
# returns the path of the newly copied file, here as a string
>>> shutil.copy(p / 'eggs.txt', p / 'some_folder/eggs2.txt')  # 'eggs2.txt' is the new name of the copied file
O/P: WindowsPath('C:/Users/Al/some_folder/eggs2.txt')
# returns the path of the newly copied file 'eggs2.txt', here as a Path object
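The two uses of shutil.copy() described above can be tried safely in a throwaway folder. The following is a minimal sketch, not part of the original notes: the function name copy_demo and the use of tempfile.TemporaryDirectory() are assumptions made so that the example runs on any machine without touching real files.

```python
import shutil
import tempfile
from pathlib import Path

def copy_demo():
    """Copy one file twice: once into a folder, once under a new name."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        source = base / 'spam.txt'
        source.write_text('Hello, world!')
        target_folder = base / 'some_folder'
        target_folder.mkdir()

        # Destination is a folder: the copy keeps its original name.
        first = shutil.copy(source, target_folder)
        # Destination is a filename: the copy is saved under that new name.
        second = shutil.copy(source, target_folder / 'spam2.txt')

        return Path(first).name, Path(second).name, Path(second).read_text()

print(copy_demo())  # ('spam.txt', 'spam2.txt', 'Hello, world!')
```

The temporary folder and everything in it are deleted automatically when the with block ends.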

2. shutil.copytree(source, destination):
 Copies the folder at the source path, along with all of its files and subfolders, to the folder at the destination path.
 The function returns the path of the copied folder.
 How shutil.copytree() works:
>>> import shutil, os  # imports the shutil and os modules
>>> from pathlib import Path  # imports the Path class from pathlib
>>> p = Path.home()  # assigns the Path object of the home directory to p
>>> shutil.copytree(p / 'spam', p / 'spam_backup')  # copies the folder 'spam' to the new folder 'spam_backup'
O/P: WindowsPath('C:/Users/Al/spam_backup')
# returns the path of the newly created folder 'spam_backup', here as a Path object

3. shutil.move(source, destination):
 Moves the file or folder at the source path to the destination path.
 Returns a string of the absolute path of the new location.
 How shutil.move() works:
>>> import shutil  # imports the shutil module
>>> shutil.move('C:\\bacon.txt', 'C:\\eggs')  # moves the file 'bacon.txt' into the folder 'C:\eggs'
O/P: 'C:\\eggs\\bacon.txt'
# returns the absolute path of the new location of 'bacon.txt' as a string
 Two possibilities:
o 1. C:\eggs (destination) is a folder: C:\bacon.txt is moved into the folder C:\eggs.
o 2. C:\eggs (destination) is a file: C:\bacon.txt overwrites C:\eggs, and the original contents of C:\eggs are erased.
 The folders that make up the destination must already exist; otherwise Python raises an exception:
>>> shutil.move('spam.txt', 'c:\\does_not_exist\\eggs\\ham')  # destination folders do not exist
O/P: Traceback (most recent call last):
--snip--
FileNotFoundError: [Errno 2] No such file or directory: 'c:\\does_not_exist\\eggs\\ham'
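One way to avoid the FileNotFoundError shown above is to create the destination folders before moving. The sketch below is only an illustration: the helper name move_into_new_folder is hypothetical, and it relies on Path.mkdir(parents=True, exist_ok=True), which is not covered in these notes, to create any missing folders.

```python
import shutil
import tempfile
from pathlib import Path

def move_into_new_folder(source, destination_folder):
    """Create destination_folder (and any parents) if needed, then move source into it."""
    destination_folder = Path(destination_folder)
    destination_folder.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(source), str(destination_folder)))

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    bacon = base / 'bacon.txt'
    bacon.write_text('Bacon is not a vegetable.')
    new_location = move_into_new_folder(bacon, base / 'does_not_exist' / 'eggs')
    # The file now exists at the new location and no longer at the old one.
    moved_ok = new_location.exists() and not bacon.exists()
    print(new_location.name, moved_ok)  # bacon.txt True
```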

4. os.unlink(path): Permanently deletes the file at path.
5. os.rmdir(path): Deletes the folder at path. The folder must be empty of any files or folders.
6. shutil.rmtree(path): Removes the folder at path, along with all the files and folders it contains.
(The three functions above delete files and folders irreversibly, so they can be dangerous to use.)
7. send2trash.send2trash(path): Sends files or folders to the computer’s recycle bin instead of permanently deleting them, so they can be restored later. send2trash is a third-party module and must be installed before use.
>>> import send2trash  # imports the send2trash module
>>> baconFile = open('bacon.txt', 'a')  # opens 'bacon.txt' in append mode
>>> baconFile.write('Bacon is not a vegetable.')  # writes to bacon.txt
25
>>> baconFile.close()  # closes bacon.txt
>>> send2trash.send2trash('bacon.txt')  # sends bacon.txt to the recycle bin
# Unlike permanent deletion, this does not free up disk space until the recycle bin is emptied.
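Because os.unlink() cannot be undone, it can help to do a "dry run" first that only prints what would be deleted. The sketch below shows one possible way to do this; the function name delete_by_suffix, its dry_run parameter, and the sample file names are all hypothetical, and the work happens in a temporary folder.

```python
import os
import tempfile
from pathlib import Path

def delete_by_suffix(folder, suffix, dry_run=True):
    """Return the files in folder ending with suffix; delete them only if dry_run is False."""
    matches = sorted(p for p in Path(folder).iterdir()
                     if p.is_file() and p.suffix == suffix)
    for path in matches:
        if dry_run:
            print('Would delete', path.name)
        else:
            os.unlink(path)
    return [p.name for p in matches]

with tempfile.TemporaryDirectory() as tmp:
    for name in ('a.rxt', 'b.rxt', 'keep.txt'):
        (Path(tmp) / name).write_text('x')
    listed = delete_by_suffix(tmp, '.rxt')            # dry run: nothing is deleted
    after_dry_run = sorted(os.listdir(tmp))
    delete_by_suffix(tmp, '.rxt', dry_run=False)      # real deletion
    after_delete = sorted(os.listdir(tmp))

print(after_dry_run)  # ['a.rxt', 'b.rxt', 'keep.txt']
print(after_delete)   # ['keep.txt']
```

Only after checking the dry-run listing would one call the function again with dry_run=False.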

8. Walking a Directory Tree:


 Use: when you want to walk through a directory tree, touching each file as you go.
 Ex: renaming every file in a folder and also every file in every subfolder of that folder.
 Pass a single string value, the path of a folder, to the os.walk() function.
 Use os.walk() in a for loop to walk a directory tree, much as the range() function is used to walk over a range of numbers.
 Unlike range(), os.walk() returns three values on each iteration of the loop:
o A string of the current folder’s name
o A list of strings of the folders in the current folder
o A list of strings of the files in the current folder
 The current folder is the folder for the current iteration of the for loop.

import os
for folderName, subfolders, filenames in os.walk('C:\\delicious'):
    print('The current folder is ' + folderName)
    for subfolder in subfolders:
        print('SUBFOLDER OF ' + folderName + ': ' + subfolder)
    for filename in filenames:
        print('FILE INSIDE ' + folderName + ': ' + filename)
    print('')

O/P:
The current folder is C:\delicious
SUBFOLDER OF C:\delicious: cats
SUBFOLDER OF C:\delicious: walnut
FILE INSIDE C:\delicious: spam.txt

The current folder is C:\delicious\cats
FILE INSIDE C:\delicious\cats: catnames.txt
FILE INSIDE C:\delicious\cats: zophie.jpg

The current folder is C:\delicious\walnut
SUBFOLDER OF C:\delicious\walnut: waffles

The current folder is C:\delicious\walnut\waffles
FILE INSIDE C:\delicious\walnut\waffles: butter.txt
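The same loop can do something with each file instead of printing it. The sketch below collects every .txt file in a tree; the function name find_files is hypothetical, and the folder layout is rebuilt in a temporary directory so that it mirrors the C:\delicious example without needing that folder to exist.

```python
import os
import tempfile
from pathlib import Path

def find_files(root, extension):
    """Return paths (relative to root, '/'-separated) of files ending in extension."""
    found = []
    for folder_name, subfolders, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(extension):
                full_path = Path(folder_name) / filename
                found.append(full_path.relative_to(root).as_posix())
    return sorted(found)

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'cats').mkdir()
    (root / 'walnut' / 'waffles').mkdir(parents=True)
    for relative in ('spam.txt', 'cats/catnames.txt', 'cats/zophie.jpg',
                     'walnut/waffles/butter.txt'):
        (root / relative).write_text('')
    txt_files = find_files(root, '.txt')

print(txt_files)  # ['cats/catnames.txt', 'spam.txt', 'walnut/waffles/butter.txt']
```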

Compressing Files with the zipfile Module:


 What is a ZIP file?
 The ZIP file format is used for lossless data compression.
 A lossless compression algorithm allows the original data to be perfectly reconstructed from the compressed data.
 A ZIP file is an ideal way to make large files smaller and keep related files together.
 Why do we need ZIP files?
 To reduce storage requirements.
 To improve transfer speed over the internet.
 Since a ZIP file can contain multiple files and subfolders, it is a handy way to package several files into one.
 To open and read the contents of a ZIP file:
1. First create a ZipFile object (note the capital letters Z and F). Calling zipfile.ZipFile() returns a ZipFile object, similar to how the open() function returns a File object.
2. Note that zipfile is the name of the Python module, and ZipFile() is the name of the function that creates the object.
3. In the program presented below:
 *1: Opens the existing ZIP file ‘example.zip’ in the home folder ‘p’ and assigns the resulting ZipFile object to the variable ‘exampleZip’.
 *2: The ‘namelist()’ method of the ZipFile object returns all the files and folders contained in the ZIP file as a list of strings.
 *3: The ‘getinfo()’ method returns a ‘ZipInfo’ object for one particular file in the archive. While a ‘ZipFile’ object represents an entire archive file, a ‘ZipInfo’ object holds useful information about a single file in that archive. ZipInfo objects have attributes such as file_size and compress_size.
 *4.  Calculates how efficiently example.zip is compressed by dividing the original file size by
the compressed file size and prints this information.
>>> import zipfile, os Contents of
>>> from pathlib import Path example.zip
>>> p = Path.home() #Assign path of home folder to variable p
*1. >>> exampleZip = zipfile.ZipFile(p / 'example.zip')
*2. >>> exampleZip.namelist()
O/P:['spam.txt','cats/','cats/catnames.txt', 'cats/zophie.jpg']
*3. >>> spamInfo = exampleZip.getinfo('spam.txt')
>>> spamInfo.file_size # returns original file size in bytes
O/P: 13908
>>> spamInfo.compress_size # returns compressed file size in bytes
O/P: 3828
*4. >>> f'Compressed file is {round(spamInfo.file_size /
spamInfo.compress_size, 2)}x smaller!'
O/P:'Compressed file is 3.63x smaller!'
>>> exampleZip.close() # closes exampleZip

Reading ZIP Files:


 1. extractall() method: extracts all files and folders from a ZIP file into the current working directory (CWD):
>>> import zipfile, os
>>> from pathlib import Path
>>> p = Path.home()
>>> exampleZip = zipfile.ZipFile(p / 'example.zip')  # opens the existing ZIP file 'example.zip' in folder p and assigns the ZipFile object to exampleZip
>>> exampleZip.extractall()  # extracts all files and folders from example.zip into the CWD
>>> exampleZip.close()  # closes exampleZip

 After running this code, the contents of ‘example.zip’ are extracted to the CWD (C:\ in this example).
 To extract the files into a folder other than the CWD, pass the folder name to extractall(). With exampleZip still open, the following call extracts the files into ‘C:\delicious’, creating the folder if it does not exist:
>>> exampleZip.extractall('C:\\delicious')  # extracts from example.zip into the 'C:\delicious' folder

 2. extract() method: extracts a single file from the ZIP file:
>>> exampleZip.extract('spam.txt')  # extracts the single file 'spam.txt' to the CWD, 'C:\'
O/P: 'C:\\spam.txt'
>>> exampleZip.extract('spam.txt', 'C:\\some\\new\\folders')  # extracts 'spam.txt' to the specified folder instead of the CWD; if the folder does not exist, Python creates it
O/P: 'C:\\some\\new\\folders\\spam.txt'
>>> exampleZip.close()

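The reading methods above can be combined into one self-contained round trip. The following is a sketch under some assumptions: the sample archive is built with the writestr() method (a real ZipFile method that these notes do not cover), the folder names 'single' and 'everything' are hypothetical, and the with statement is used so that each ZipFile is closed automatically instead of calling close().

```python
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    archive_path = base / 'example.zip'

    # Build a small archive to read back (writestr() adds a member from a string).
    with zipfile.ZipFile(archive_path, 'w') as archive:
        archive.writestr('spam.txt', 'Spam! ' * 100)
        archive.writestr('cats/catnames.txt', 'Zophie\nPooka\n')

    # Reopen it in the default read mode.
    with zipfile.ZipFile(archive_path) as archive:
        members = archive.namelist()
        archive.extract('spam.txt', base / 'single')    # one member only
        archive.extractall(base / 'everything')         # all members

    single_ok = (base / 'single' / 'spam.txt').exists()
    cat_names = (base / 'everything' / 'cats' / 'catnames.txt').read_text()

print(members)    # ['spam.txt', 'cats/catnames.txt']
print(single_ok)  # True
```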
Creating and Adding to ZIP Files:


 To create your own compressed ZIP files, open the ZipFile object in write mode by passing 'w' as the second argument:
>>> import zipfile
>>> newZip = zipfile.ZipFile('new.zip', 'w')  # opens 'new.zip' in write mode and assigns the ZipFile object to newZip
>>> newZip.write('spam.txt', compress_type=zipfile.ZIP_DEFLATED)
>>> newZip.close()

 When you pass a path to the ‘write()’ method of a ZipFile object, Python compresses the file at that path and adds it to the ZIP file.
 The ‘write()’ method’s first argument is a string of the filename to add.
 The second argument is the compression type parameter, which tells the computer which algorithm to use to compress the files; you can simply set this value to ‘zipfile.ZIP_DEFLATED’.
 This code creates a new ZIP file named ‘new.zip’ that holds the compressed contents of ‘spam.txt’.
 Write mode erases all existing contents of a ZIP file.
 To add files to an existing ZIP file instead, pass 'a' as the second argument to ‘zipfile.ZipFile()’ to open the ZIP file in append mode.
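The difference between write mode and append mode can be checked directly. The sketch below is an illustration, not the textbook's code: it works in a temporary folder, and it passes the optional arcname argument of write() (not covered in these notes) so that only the bare filename is stored in the archive.

```python
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    spam = base / 'spam.txt'
    eggs = base / 'eggs.txt'
    spam.write_text('spam ' * 1000)
    eggs.write_text('eggs ' * 1000)
    zip_path = base / 'new.zip'

    # 'w' creates the archive (or wipes an existing one).
    with zipfile.ZipFile(zip_path, 'w') as new_zip:
        new_zip.write(spam, arcname='spam.txt', compress_type=zipfile.ZIP_DEFLATED)

    # 'a' keeps what is already there and adds to it.
    with zipfile.ZipFile(zip_path, 'a') as new_zip:
        new_zip.write(eggs, arcname='eggs.txt', compress_type=zipfile.ZIP_DEFLATED)

    with zipfile.ZipFile(zip_path) as new_zip:
        names_after_append = new_zip.namelist()
        info = new_zip.getinfo('spam.txt')
        shrank = info.compress_size < info.file_size

    # Reopening in 'w' mode discards the earlier members.
    with zipfile.ZipFile(zip_path, 'w') as new_zip:
        new_zip.write(eggs, arcname='eggs.txt', compress_type=zipfile.ZIP_DEFLATED)
    with zipfile.ZipFile(zip_path) as new_zip:
        names_after_rewrite = new_zip.namelist()

print(names_after_append)   # ['spam.txt', 'eggs.txt']
print(names_after_rewrite)  # ['eggs.txt']
```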

Debugging
Raising Exceptions:
 Even if a statement in Python is syntactically correct, it may cause an error when an attempt is made to execute it. Errors detected during execution are called exceptions.
 Python raises an exception whenever it tries to execute invalid code.
 Exceptions are handled with try and except statements so that the program can recover from exceptions that were anticipated.

How to raise user-defined exceptions in code?


 Exceptions are raised with a raise statement. A raise statement consists of :
 The raise keyword
 A call to the Exception() function
 A string with a helpful error message passed to the Exception() function
Ex:
>>> raise Exception('This is the error message.')
Traceback (most recent call last):
File "<pyshell#191>", line 1, in <module>
raise Exception('This is the error message.')
Exception: This is the error message.

 If there are no try and except statements covering the raise statement that raised the exception, the program simply crashes and displays the exception’s error message.
 Often the raise statement is inside a function, and the try and except statements are in the code that calls the function.
Ex: A boxPrint() function is defined that takes a character, a width, and a height, and uses the character to make a little picture of a box with that width and height.
def boxPrint(symbol, width, height):  # defines boxPrint(), which takes 3 arguments
    if len(symbol) != 1:
        raise Exception('Symbol must be a single character string.')  # ➊
    if width <= 2:
        raise Exception('Width must be greater than 2.')  # ➋
    if height <= 2:
        raise Exception('Height must be greater than 2.')  # ➌
    print(symbol * width)
    for i in range(height - 2):
        print(symbol + (' ' * (width - 2)) + symbol)
    print(symbol * width)

for sym, w, h in (('*', 4, 4), ('O', 20, 5), ('x', 1, 3), ('ZZ', 3, 3)):
    try:
        boxPrint(sym, w, h)  # calls boxPrint() with the arguments sym, w, h
    except Exception as err:
        print('An exception happened: ' + str(err))  # ➍

Explanation:

 This code uses a function named ‘boxPrint()’ to print a box shape on the screen.
 The requirements are that the symbol must be a single character and that the width and height must both be greater than 2.
 Statements are included to raise exceptions if these requirements are not satisfied.
 When boxPrint() is called with various arguments, the try/except handles the invalid ones.
 If an Exception object is raised by boxPrint() ➊ ➋ ➌, the except statement stores it in a variable named err.
 We can then convert the Exception object to a string by passing it to str() to produce a user-friendly error message ➍.

Output:
****
*  *
*  *
****
OOOOOOOOOOOOOOOOOOOO
O                  O
O                  O
O                  O
OOOOOOOOOOOOOOOOOOOO
An exception happened: Width must be greater than 2.
An exception happened: Symbol must be a single character string.
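The pattern of raising inside a function and handling in the caller is easier to test when the function returns its result instead of printing it. The sketch below is a variation on boxPrint(), not the textbook's version: the names box_lines and try_box are hypothetical.

```python
def box_lines(symbol, width, height):
    """Return the box as a list of strings instead of printing it."""
    if len(symbol) != 1:
        raise Exception('Symbol must be a single character string.')
    if width <= 2:
        raise Exception('Width must be greater than 2.')
    if height <= 2:
        raise Exception('Height must be greater than 2.')
    middle = symbol + ' ' * (width - 2) + symbol
    return [symbol * width] + [middle] * (height - 2) + [symbol * width]

def try_box(symbol, width, height):
    """Call box_lines() and turn any raised exception into a message string."""
    try:
        return '\n'.join(box_lines(symbol, width, height))
    except Exception as err:
        return 'An exception happened: ' + str(err)

print(try_box('*', 4, 4))
print(try_box('x', 1, 3))   # An exception happened: Width must be greater than 2.
```

The caller decides what to do with the error message, while box_lines() only reports that something is wrong.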

Getting the Traceback as a String


 When Python encounters an error, it produces error information called the ‘traceback’.
 Components of the ‘traceback’: the error message, the line number of the line that caused the error, and the sequence of function calls that led to the error. This sequence of calls is called the call stack.
Ex:
Line
1. def spam():
2.     bacon()
3.
4. def bacon():
5.     raise Exception('This is the error message.')
6.
7. spam()

Output:
Traceback (most recent call last):
  File "errorExample.py", line 7, in <module>
    spam()
  File "errorExample.py", line 2, in spam
    bacon()
  File "errorExample.py", line 5, in bacon
    raise Exception('This is the error message.')
Exception: This is the error message.

 From the traceback, you can see that the error happened on line 5, in the bacon() function.
 This particular call to bacon() came from line 2, in the spam() function, which in turn was called on line 7.

 In programs where functions can be called from multiple places, the call stack helps determine which call led to the error.
 Python displays the traceback whenever a raised exception goes unhandled. You can also obtain it as a string by calling traceback.format_exc().
 To use this function, import Python’s traceback module.
 For example, instead of crashing the program right when an exception occurs, you can write the traceback information to a text file and keep the program running.
 You can look at the text file later, when you are ready to debug the program.
Ex:
>>> import traceback
>>> try:
...     raise Exception('This is the error message.')
... except:
...     errorFile = open('errorInfo.txt', 'w')
...     errorFile.write(traceback.format_exc())
...     errorFile.close()
...     print('The traceback info was written to errorInfo.txt.')

Output:
111  # write() returns 111, since 111 characters were written to the file
The traceback info was written to errorInfo.txt.

 The traceback text written to errorInfo.txt is:
Traceback (most recent call last):
  File "<pyshell#28>", line 2, in <module>
Exception: This is the error message.
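Since traceback.format_exc() returns an ordinary string, it can also be stored in a variable and inspected. The sketch below reuses the spam() and bacon() functions from the earlier example; the helper name run_and_capture is hypothetical.

```python
import traceback

def spam():
    bacon()

def bacon():
    raise Exception('This is the error message.')

def run_and_capture():
    """Run spam(); return the traceback text if it fails, else an empty string."""
    try:
        spam()
    except Exception:
        return traceback.format_exc()
    return ''

report = run_and_capture()
print(report)                 # the full traceback text, but the program keeps running
print('Program continues.')
```

The exact layout of the traceback text can differ slightly between Python versions, but it always starts with the "Traceback (most recent call last):" line and ends with the exception's message.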

Assertions:
 An assertion is a sanity check to make sure code isn’t doing something obviously wrong.
 If the sanity check fails, then an AssertionError exception is raised.
 An assert statement consists of the following:
 The assert keyword
 A condition (an expression that evaluates to True or False)
 A comma
 A string to display when the condition is False
 An assert statement says, “I assert that the condition holds true, and if not, there is a bug
somewhere, so immediately stop the program.”
Ex:
>>> ages = [26, 57, 92, 54, 22, 15, 17, 80, 47, 73]
>>> ages.sort()
>>> ages
O/P: [15, 17, 22, 26, 47, 54, 57, 73, 80, 92]
>>> assert ages[0] <= ages[-1] # Assert that the first age is <= the last age.

 The assert statement here asserts that the first item in ages should be less than or equal to the last one.
 Because the expression ages[0] <= ages[-1] evaluates to True, the assert statement does nothing.
 Now suppose a bug is present in the code: say we accidentally called the reverse() method instead of the sort() method.
 Then the assert statement raises an AssertionError:
>>> ages = [26, 57, 92, 54, 22, 15, 17, 80, 47, 73]
>>> ages.reverse()
>>> ages
O/P: [73, 47, 80, 17, 15, 22, 54, 92, 57, 26]
>>> assert ages[0] <= ages[-1]  # Assert that the first age is <= the last age.
O/P: Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError
 Assertions, advantages: By “failing fast” with an assert statement, you shorten the time between the original cause of a bug and the moment you first notice it. This reduces the amount of code you have to check before finding the bug’s cause.
 Code should not handle assert statements with try and except; if an assert fails, the program should crash.
 Assertions are for programmer errors, not user errors.
 Assertions should only fail while the program is under development; a user should never see an assertion error in a finished program.
 For errors that the program encounters as a normal part of its operation (Ex: a file not found or the user entering invalid data), raise an exception instead of detecting it with an assert statement.
 Do not use assert statements in place of raising exceptions, because users can choose to turn off assertions (for example, by running Python with the -O option).
 Assertions also are not a replacement for comprehensive testing:
 Ex.: suppose the previous ages list had been set to [10, 3, 2, 1, 20]:
 assert ages[0] <= ages[-1]
 The assertion passes because 10 <= 20, even though the list is unsorted, so the assertion does not catch this case.
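The weakness of the first-versus-last check can be seen by comparing it with a check of every neighbouring pair. This is a minimal sketch; the helper name is_sorted is hypothetical and is only one of several ways to write such a check.

```python
def is_sorted(values):
    """Return True if every item is <= the item after it."""
    return all(a <= b for a, b in zip(values, values[1:]))

ages = [10, 3, 2, 1, 20]
weak_check = ages[0] <= ages[-1]      # True: first and last happen to be in order
strong_check = is_sorted(ages)        # False: the list is not actually sorted

ages.sort()
assert is_sorted(ages), 'ages should be sorted: ' + str(ages)
print(weak_check, strong_check, ages)  # True False [1, 2, 3, 10, 20]
```

Even the stronger assertion is still only a sanity check during development; it does not replace testing the program with a variety of inputs.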

Project: Using an Assertion in a Traffic Light Simulation


# Program to build a traffic light simulation using an assertion
 'ns' stands for north-south and 'ew' for east-west.
market_2nd = {'ns': 'green', 'ew': 'red'}
mission_16th = {'ns': 'red', 'ew': 'green'}
 These two variables are for the intersections of Market Street and 2nd Street, and Mission Street and 16th Street.
 To start with, write a switchLights() function, which takes an intersection dictionary as an argument and switches the lights.
 The data structure representing the stoplights at an intersection is a dictionary with keys 'ns' and 'ew', for the stoplights facing north-south and east-west, respectively.
 The values at these keys will be one of the strings 'green', 'yellow', or 'red'.
 switchLights() should simply switch each light to the next color in the sequence: any 'green' values should change to 'yellow', 'yellow' values should change to 'red', and 'red' values should change to 'green'.
Ex:
def switchLights(stoplight):
    for key in stoplight.keys():
        if stoplight[key] == 'green':
            stoplight[key] = 'yellow'
        elif stoplight[key] == 'yellow':
            stoplight[key] = 'red'
        elif stoplight[key] == 'red':
            stoplight[key] = 'green'

switchLights(market_2nd)

 Suppose the rest of the simulation code, thousands of lines long, is then written without noticing a bug in this function.
 When you finally run the simulation, the program doesn’t crash, but your virtual cars do!
 Since you have already written the rest of the program, you have no idea where the bug could be.
 It could take hours to trace the bug back to the switchLights() function.
 Adding an assertion at the end of switchLights() to check that at least one of the lights is always red would save a lot of future debugging effort:
assert 'red' in stoplight.values(), 'Neither light is red! ' + str(stoplight)

 With this assertion in place, the program would crash with this error message:
Traceback (most recent call last):
  File "carSim.py", line 14, in <module>
    switchLights(market_2nd)
  File "carSim.py", line 13, in switchLights
    assert 'red' in stoplight.values(), 'Neither light is red! ' + str(stoplight)
➊ AssertionError: Neither light is red! {'ns': 'yellow', 'ew': 'green'}
 The program immediately points out that a sanity check failed, printing the AssertionError ➊.
 Neither direction of traffic has a red light, meaning that traffic could be going both ways.
 By failing fast, early in the program’s execution, you save yourself a lot of future debugging effort.
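The function and its assertion can be put together as follows. This is only a demonstration sketch: the function is renamed switch_lights here, and the AssertionError is caught solely so that the message can be shown without stopping the script. In a real program the assertion should be left uncaught so that the program crashes, as advised earlier.

```python
def switch_lights(stoplight):
    """Advance each light to the next color, then sanity-check the result."""
    for key in stoplight.keys():
        if stoplight[key] == 'green':
            stoplight[key] = 'yellow'
        elif stoplight[key] == 'yellow':
            stoplight[key] = 'red'
        elif stoplight[key] == 'red':
            stoplight[key] = 'green'
    # At least one direction must always be red.
    assert 'red' in stoplight.values(), 'Neither light is red! ' + str(stoplight)

market_2nd = {'ns': 'green', 'ew': 'red'}
try:
    switch_lights(market_2nd)
    message = ''
except AssertionError as err:   # caught ONLY so this demo can display the message
    message = str(err)

print(message)  # Neither light is red! {'ns': 'yellow', 'ew': 'green'}
```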

Logging
 Logging is a way to track the events that occur while a program runs. It is essential for debugging a program during development.
 It is a great way to understand what is happening in the program and in what order it is happening.
 Without logging, finding the source of a problem in code can be extremely time consuming.
 Putting a print() call in code to output a variable’s value while the program is running is a simple form of logging.
 Python’s logging module records when program execution reaches a logging function call and lists any variables you specify at that point in time.
import logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')

 When Python logs an event, it creates a LogRecord object that holds information about that event.
 The logging module’s basicConfig() function lets you specify which details of the LogRecord object you want to see and how you want those details displayed.

Program: A function to calculate the factorial of a number using Logging:


import logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
logging.debug('Start of program')  # logging.debug() prints log information
# in the format specified in basicConfig()

def factorial(n):
    logging.debug('Start of factorial(%s)' % (n))
    total = 1
    for i in range(n + 1):
        total *= i
        logging.debug('i is ' + str(i) + ', total is ' + str(total))
    logging.debug('End of factorial(%s)' % (n))  # passes arguments into the log message
    return total

print(factorial(5))
logging.debug('End of program')

 Use the logging.debug() function when log information needs to be printed.
 Each call to debug() prints one line of information.
 This information is in the format specified in basicConfig() and includes the message passed to debug().

Output:
2019-05-23 16:20:12,664 - DEBUG - Start of program
2019-05-23 16:20:12,664 - DEBUG - Start of factorial(5)
2019-05-23 16:20:12,665 - DEBUG - i is 0, total is 0
2019-05-23 16:20:12,668 - DEBUG - i is 1, total is 0
2019-05-23 16:20:12,670 - DEBUG - i is 2, total is 0
2019-05-23 16:20:12,673 - DEBUG - i is 3, total is 0
2019-05-23 16:20:12,675 - DEBUG - i is 4, total is 0
2019-05-23 16:20:12,678 - DEBUG - i is 5, total is 0
2019-05-23 16:20:12,680 - DEBUG - End of factorial(5)
0
2019-05-23 16:20:12,684 - DEBUG - End of program

 The factorial() function is returning 0 as factorial of 5, which isn’t right.


 The log messages displayed by logging.debug() show that the i variable starts at 0 instead of 1. Since zero times anything is zero, every later iteration also has the wrong value for total.
 Logging messages help you figure out when things started to go wrong.
 To fix the bug, change the line ‘for i in range(n + 1):’ to ‘for i in range(1, n + 1):’.

Output:
2019-05-23 17:13:40,650 - DEBUG - Start of program
2019-05-23 17:13:40,651 - DEBUG - Start of factorial(5)
2019-05-23 17:13:40,651 - DEBUG - i is 1, total is 1
2019-05-23 17:13:40,654 - DEBUG - i is 2, total is 2
2019-05-23 17:13:40,656 - DEBUG - i is 3, total is 6
2019-05-23 17:13:40,659 - DEBUG - i is 4, total is 24
2019-05-23 17:13:40,661 - DEBUG - i is 5, total is 120
2019-05-23 17:13:40,661 - DEBUG - End of factorial(5)
120
2019-05-23 17:13:40,666 - DEBUG - End of program

 The factorial(5) call correctly returns 120.


 The log messages showed what was going on inside the loop, which led straight to the bug.

Don’t Debug with the print() Function


 To use Python’s logging facility, you need to import the logging module and add the call:
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
 This may tempt you to use the much simpler print() calls instead.
 However, once debugging is done, you will spend a lot of time removing the print() call for each log message from the code.
 You might also accidentally remove some print() calls that were being used for non-log messages.
 The nice thing about log messages is that a program can have as many of them as needed, and they can all be disabled later by adding a single logging.disable(logging.CRITICAL) call.
 Unlike with print(), it is easy to switch between showing and hiding log messages.
 Log messages are intended for the programmer, not the user.
 For messages that the user will want to see (Ex: "File not found" or "Invalid input, please enter a number"), use a print() call.

Logging Levels:
 Logging levels categorize log messages by importance. There are five logging levels, listed here from least to most important:
Logging Levels in Python
Level       Logging function      Description
DEBUG       logging.debug()       The lowest level. Used for testing.
INFO        logging.info()        The information level, to confirm that things are working at their point in the program.
WARNING     logging.warning()     The warning level, to indicate something could go wrong.
ERROR       logging.error()       The error level, to indicate something has gone wrong.
CRITICAL    logging.critical()    The highest level, where something has gone wrong and might stop the program.
 Your logging message is passed as a string to these logging functions:
>>> import logging
>>> logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
>>> logging.debug('Some debugging details.')
2019-05-18 19:04:26,901 - DEBUG - Some debugging details.
>>> logging.info('The logging module is working.')
2019-05-18 19:04:35,569 - INFO - The logging module is working.
>>> logging.warning('An error message is about to be logged.')
2019-05-18 19:04:56,843 - WARNING - An error message is about to be logged.
>>> logging.error('An error has occurred.')
2019-05-18 19:05:07,737 - ERROR - An error has occurred.
>>> logging.critical('The program is unable to recover!')
2019-05-18 19:05:45,794 - CRITICAL - The program is unable to recover!

 The benefit of logging levels is that you can change which priority of logging message you want to see.
 Passing logging.DEBUG to the basicConfig() function’s level keyword argument shows messages from all the logging levels (DEBUG being the lowest level).
 If you are interested only in errors, set basicConfig()’s level argument to logging.ERROR.
 This shows only ERROR and CRITICAL messages and skips the DEBUG, INFO, and WARNING messages.

>>> import logging
>>> logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
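The effect of the level setting can be observed by logging one message at each level and seeing which ones get through. The sketch below is an assumption-laden illustration: instead of basicConfig(), it uses a separately named logger with a StreamHandler writing to an in-memory io.StringIO buffer (all real parts of the standard library, but not covered in these notes), so that the result can be captured and the demo can be run repeatedly. The logger name 'levels_demo' and the function name collect_messages are hypothetical.

```python
import io
import logging

def collect_messages(level):
    """Log one message at each level and return the lines that pass the threshold."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter('%(levelname)s - %(message)s'))
    logger = logging.getLogger('levels_demo')   # hypothetical logger name
    logger.setLevel(level)
    logger.propagate = False                    # keep demo output away from the root logger
    logger.addHandler(handler)
    try:
        logger.debug('Some debugging details.')
        logger.info('The logging module is working.')
        logger.warning('An error message is about to be logged.')
        logger.error('An error has occurred.')
        logger.critical('The program is unable to recover!')
    finally:
        logger.removeHandler(handler)
    return stream.getvalue().splitlines()

print(collect_messages(logging.ERROR))
# ['ERROR - An error has occurred.', 'CRITICAL - The program is unable to recover!']
```

Calling collect_messages(logging.DEBUG) instead returns all five lines.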
>>> logging.critical('Critical error! Critical error!')
2019-05-22 11:10:48,054 - CRITICAL - Critical error! Critical error!
>>> logging.disable(logging.CRITICAL)
>>> logging.critical('Critical error! Critical error!')
>>> logging.error('Error! Error!')


 After you have debugged your program, you probably don’t want all these log messages cluttering the screen.
 logging.disable() suppresses all log messages at the level passed to it and below. Since it disables all messages after it, add the call near the ‘import logging’ line in the program.
 This way, you can comment out or uncomment that call to enable or disable logging messages as needed.
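The on/off behaviour of logging.disable() can be sketched as follows. As an assumption for the sake of a repeatable demo, the messages are sent to an in-memory buffer through a named logger (hypothetically called 'disable_demo') rather than to the screen, and logging.disable(logging.NOTSET) is used to undo the earlier disable call.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger('disable_demo')   # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s - %(message)s'))
logger.addHandler(handler)

logger.critical('Shown: logging is enabled.')
logging.disable(logging.CRITICAL)     # suppress CRITICAL and everything below it
logger.critical('Hidden: logging is disabled.')
logger.error('Hidden as well.')
logging.disable(logging.NOTSET)       # undo the disable call
logger.error('Shown again.')

logger.removeHandler(handler)
captured = stream.getvalue().splitlines()
print(captured)
# ['CRITICAL - Shown: logging is enabled.', 'ERROR - Shown again.']
```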

Logging to a File:
 Instead of displaying the log messages on the screen, you can write them to a text file.
 The ‘logging.basicConfig()’ function takes a filename keyword argument:
import logging
logging.basicConfig(filename='myProgramLog.txt', level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
 The log messages will be saved to myProgramLog.txt.
 Advantage: while logging messages are helpful, they can clutter the screen and make it hard to read the program’s output.
 Writing the logging messages to a file keeps the screen clear and stores the messages so that you can read them in any text editor after running the program.
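A self-contained way to try file logging is sketched below. Several details are assumptions made for the demo: the log file is placed in a temporary folder, the timestamp is left out of the format so the output is predictable, and force=True (available in Python 3.8 and later) is passed so that basicConfig() takes effect even if logging was configured earlier in the same session.

```python
import logging
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / 'myProgramLog.txt'
    logging.basicConfig(filename=str(log_path), level=logging.DEBUG,
                        format='%(levelname)s - %(message)s', force=True)
    logging.debug('Start of program')
    logging.info('This line goes to the file, not the screen.')

    # Detach and close the file handler so the file can be read and removed.
    root_logger = logging.getLogger()
    for handler in list(root_logger.handlers):
        root_logger.removeHandler(handler)
        handler.close()

    log_lines = log_path.read_text().splitlines()

print(log_lines)
# ['DEBUG - Start of program', 'INFO - This line goes to the file, not the screen.']
```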

Mu’s Debugger
 The debugger is a valuable tool for tracking down bugs.
 It is a feature of the Mu editor (and of other editors) that lets you execute a program one line at a time; after each line, it waits for you to tell it to continue.
 By running your program “under the debugger” like this, you can take as much time as you want
to examine values in variables at any given point during the program’s lifetime.
 To run a program under Mu’s debugger, click the Debug button in the top row of buttons, next to
the Run button.
 Along with the usual output pane at the bottom, the Debug Inspector pane will open along the
right side of the window.
 This pane lists the current value of variables in your program.
 Debugging mode also adds the following new buttons to the top of the editor: Continue, Step
Over, Step In, and Step Out. The usual Stop button is also available.
 Continue: Continue button will cause the program to execute normally until it terminates or
reaches a breakpoint. If you are done debugging and want the program to continue normally,
click the Continue button.
 Step In : Step In button will cause the debugger to execute the next line of code and then pause
again. If the next line of code is a function call, the debugger will “step into” that function and
jump to the first line of code of that function.
 Step Over: This button will execute the next line of code, similar to Step In button. If the next line
of code is a function call, the Step Over button will “step over” the code in the function. The
function’s code will be executed at full speed, and the debugger will pause as soon as the function
call returns. Ex: if the next line of code calls a spam() function but you don’t really care about code
inside this function, you can click Step Over to execute the code in the function at normal speed,
and then pause when the function returns.
 Step Out: This button causes the debugger to execute lines of code at full speed until it returns from the current function. If you have stepped into a function call with the Step In button and now simply want to keep executing instructions until you get back out, click the Step Out button to “step out” of the current function call.


 Stop: If you want to stop debugging entirely and not bother to continue executing the rest of the
program, click this button. The Stop button will immediately terminate the program.

Debugging a Number Adding Program


 Open a new file editor tab and save it as buggyAddingProgram.py.
 Enter the following code and run it first without the debugger enabled:
print('Enter the first number to add:')
first = input()
print('Enter the second number to add:')
second = input()
print('Enter the third number to add:')
third = input()
print('The sum is ' + first + second + third)
Output:
Enter the first number to add:
5
Enter the second number to add:
3
Enter the third number to add:
42
The sum is 5342

 The program hasn’t crashed, but the sum is wrong.


 Run the program again, under the debugger.
 When you click the Debug button, the program pauses on line 1, which is the line of code it is
about to execute.
 Click the Step Over button once to execute the first print() call.
 Use Step Over instead of Step In here, since you don’t want to step into the code for the
print() function.
 The debugger moves on to line 2, and highlights line 2 in the file editor. This shows where
program execution currently is.
 Click Step Over again to execute the input() function call.
 The highlighting will go away while Mu waits for you to type something for the input() call into
the output pane.
 Enter 5 and press ENTER.
 The highlighting will return.
 Keep clicking Step Over, and enter 3 and 42 as the next two numbers.
 When the debugger reaches line 7, the final print() call in the program, the Debug Inspector pane
on the right side shows that the variables are set to the string values '5', '3', and '42' instead of
the integer values 5, 3, and 42.
 When the last line is executed, Python concatenates these strings instead of adding the numbers
together, causing the bug.
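Once the debugger has shown that the variables hold strings, one possible fix is to convert each value with int() before adding. The sketch below is only an illustration: the helper name add_numbers() is hypothetical and not part of the original program, and the three values are passed in directly instead of being read with input() so that the example runs without user interaction.

```python
def add_numbers(first, second, third):
    """Convert three numeric strings to integers and return their sum."""
    return int(first) + int(second) + int(third)


# input() always returns a string, so '+' joins the strings end to end.
print('5' + '3' + '42')

# Converting each value with int() first gives the arithmetic sum.
print('The sum is ' + str(add_numbers('5', '3', '42')))
```

In the real program, the same conversion could be applied to the values returned by input() before the final print() call.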
 Stepping through the program with the debugger is helpful but can also be slow. Often you’ll
want the program to run normally until it reaches a certain line of code. You can configure the
debugger to do this with breakpoints.
 A breakpoint can be set on a specific line of code and forces the debugger to pause whenever
the program execution reaches that line.
 Consider the following program, which simulates flipping a coin 1,000 times.
import random
heads = 0
for i in range(1, 1001):
  ➊ if random.randint(0, 1) == 1:
        heads = heads + 1
    if i == 500:
      ➋ print('Halfway done!')
print('Heads came up ' + str(heads) + ' times.')

 The random.randint(0, 1) call ➊ will return 0 half of the time and 1 the other half of the time.
 This can be used to simulate a 50/50 coin flip where 1 represents heads.
 When you run this program without the debugger, it quickly outputs something like the following:
Halfway done!
Heads came up 490 times.
 If you ran this program under the debugger, you would have to click the Step Over button
thousands of times before the program terminated.
 If you were interested in the value of heads at the halfway point of the program’s execution,
when 500 of 1,000 coin flips have been completed, then set a breakpoint on the line
print('Halfway done!') ➋.
 To set a breakpoint, click the line number in the file editor; a red dot appears next to that line.
 Don’t set a breakpoint on the if statement line, since the if statement is executed on every single
iteration through the loop.
 When you run the program under the debugger, it will start in a paused state at the first line, as
usual.
 But if you click Continue, the program will run at full speed until it reaches the line with the
breakpoint set on it.
 Then click Continue, Step Over, Step In, or Step Out to continue as normal.
 If you want to remove a breakpoint, click the line number again.
 The red dot will go away, and debugger will not break on that line in future.

Abstract of organising files and debugging related methods:


Organizing Files:
1. shutil.copy(source, destination)
   Meaning: copies the file at the source path to the folder at the destination path. If the
   destination is a filename, it is used as the name of the copied file.
   Example: shutil.copy(p / 'spam.txt', p / 'some_folder')
            shutil.copy(p / 'eggs.txt', p / 'some_folder/eggs2.txt')
2. shutil.copytree(source, destination)
   Meaning: copies the folder at the source path, along with all of its files and subfolders, to the
   folder at the destination path.
   Example: shutil.copytree(p / 'spam', p / 'spam_backup')
3. shutil.move(source, destination)
   Meaning: moves the file or folder at the source path to the destination path.
   Example: shutil.move('C:\\bacon.txt', 'C:\\eggs')
4. os.unlink(path)
   Meaning: deletes the file at path permanently.
   Example: os.unlink('C:\\bacon.txt')
5. os.rmdir(path)
   Meaning: deletes the folder at path, which must be empty of any files or folders.
   Example: os.rmdir('C:\\some_folder')
6. shutil.rmtree(path)
   Meaning: removes the folder at path; all files and folders it contains are also deleted.
   Example: shutil.rmtree('C:\\some_folder')
7. send2trash.send2trash(path)
   Meaning: sends folders/files to the computer’s recycle bin instead of permanently deleting
   them, so they can be restored later.
   Example: send2trash.send2trash('bacon.txt')
8. os.walk(path)
   Meaning: on each loop iteration, returns 3 values: the current folder’s name, the folders in the
   current folder, and the files in the current folder.
   Example:
   for folderName, subfolders, filenames in os.walk('C:\\delicious'):
       print('The current folder is ' + folderName)
       for subfolder in subfolders:
           print('SUBFOLDER OF ' + folderName + ': ' + subfolder)
       for filename in filenames:
           print('FILE INSIDE ' + folderName + ': ' + filename)

Compressing Files with the zipfile Module:
9. zipfile.ZipFile(path)
   Meaning: opens the ZIP file at the path passed to it (here ‘example.zip’ in folder ‘p’) and
   returns a ZipFile object.
   Example: exampleZip = zipfile.ZipFile(p / 'example.zip')
10. ZipFile object .namelist()
   Meaning: method of a ZipFile object that returns all the files and folders contained in the ZIP
   file as a list of strings.
   Example: exampleZip.namelist()
11. .getinfo('spam.txt')
   Meaning: returns a ‘ZipInfo’ object for the named file inside the ZIP file. The ‘ZipInfo’ object
   holds useful information about that file, such as file_size and compress_size.
   Example: exampleZip.getinfo('spam.txt')

Reading ZIP Files:
12. .extractall() method
   Meaning: extracts all files and folders from a ZIP file into the current working directory.
   Example: exampleZip.extractall()
13. .extract() method
   Meaning: extracts a single file from the ZIP file.
   Example: exampleZip.extract('spam.txt')

Creating and Adding to ZIP Files:
14. zipfile.ZipFile('new.zip', 'w')
   Meaning: opens the ‘new.zip’ ZIP file object in ‘write mode’.
   Example: zipfile.ZipFile('new.zip', 'w')
15. zipfile.ZipFile('new.zip', 'a')
   Meaning: opens the ‘new.zip’ ZIP file object in ‘append mode’.
   Example: zipfile.ZipFile('new.zip', 'a')
16. ZipFile object .close()
   Meaning: closes the ZipFile object.
   Example: backupZip.close()

Debugging:
Raising Exceptions:
17. raise Exception('Message')
   Meaning: raises a user-defined exception with the given message.
   Example: raise Exception('Symbol must be a single character string.')
18. traceback.format_exc()
   Meaning: returns the traceback of the exception being handled as a string.
   Example: errorFile.write(traceback.format_exc())
19. variablename = open(filename, 'w')
    variablename.write(traceback.format_exc())
   Meaning: gets the traceback as a string by calling traceback.format_exc() and writes it to a
   text file, 'errorInfo.txt'.
   Example: errorFile = open('errorInfo.txt', 'w')
            errorFile.write(traceback.format_exc())

Assertions:
20. assert condition, message
   Meaning: does nothing if the condition is True; raises an AssertionError with the message if the
   condition is False.
   Example: assert 'red' in stoplight.values(), 'Neither light is red!'

Logging:
21. logging.basicConfig()
   Meaning: configures the logging module’s level and message format. Each later call to
   logging.debug() then prints a line of information in the format specified in basicConfig(),
   including the message passed to debug().
   Example: logging.basicConfig(level=logging.DEBUG, format=' %(asctime)s - %(levelname)s - %(message)s')
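The debugging entries above can be combined in one short sketch. This is only an illustration under stated assumptions: the function box_symbol() and the variable error_text are hypothetical names chosen for the example, and the traceback is kept in a variable instead of being written to errorInfo.txt so that the example leaves no file behind.

```python
import logging
import traceback

# Configure logging once; each logging.debug() call then prints in this format.
logging.basicConfig(level=logging.DEBUG,
                    format=' %(asctime)s - %(levelname)s - %(message)s')


def box_symbol(symbol):
    """Hypothetical helper: repeat a one-character symbol three times."""
    if len(symbol) != 1:
        raise Exception('Symbol must be a single character string.')
    return symbol * 3


try:
    box_symbol('**')  # two characters, so the exception is raised
except Exception:
    # format_exc() returns the traceback text instead of printing it.
    error_text = traceback.format_exc()
    logging.debug('Caught an exception; traceback captured as a string.')

# The assertion does nothing when the condition is True.
assert 'Symbol must be a single character string.' in error_text, \
    'The traceback text does not contain the exception message.'
```

Writing error_text to a file opened with open('errorInfo.txt', 'w') would reproduce entry 19 of the summary.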

Project: Renaming Files with American-Style Dates to European-Style Dates:


(Before continuing, refer to 1. https://www.tutorialspoint.com/How-to-extract-date-from-text-using-Python-regular-expression
and 2. https://blog.finxter.com/regex-match-dates/ to know more about regular expression fundamentals.)

 This program renames thousands of files with American-style dates (MM-DD-YYYY) in their names to
European-style dates (DD-MM-YYYY).
 It searches all the filenames in the cwd for American-style dates.
 When one is found, it renames the file with the month & day swapped to make it European-style.
 Steps involved:
o 1. Create a regex that can identify the text pattern of American-style dates.
o 2. Call os.listdir() to find all the files in the working directory.
o 3. Loop over each filename, using the regex to check whether it has a date.
o 4. If it has a date, rename the file with shutil.move().

 Step 1: # Code to create a Regex for American-Style Dates:


import shutil, os, re   # the 're' module helps to create a regex (regular expression)
                        # that can identify American-style MM-DD-YYYY dates.
datePattern = re.compile(r"""^(.*?)   # all text before the date
    ((0|1)?\d)-                       # one or two digits for the month
    ((0|1|2|3)?\d)-                   # one or two digits for the day
    ((19|20)\d\d)                     # four digits for the year
    (.*?)$                            # all text after the date
    """, re.VERBOSE)   # 2nd argument 're.VERBOSE' allows whitespace and comments in
                       # the regex string, to make it more readable.
Explanation:
 The regular expression string begins with ^(.*?) to match any text at the beginning of the filename that
might come before date.
 The ((0|1)?\d) group matches the month. The first digit can be either 0 or 1, so the regex matches 12 for
December but also 02 for February. This digit is also optional so that the month can be 04 or 4 for April.
 The group for the day is ((0|1|2|3)?\d) and follows similar logic; 3, 03, and 31 are all valid numbers for
days. (This regex will still accept some invalid dates, such as 4-31-2014, 2-29-2013, and 0-15-2014.)
 The group for the year, ((19|20)\d\d), only looks for years in the 20th or 21st century. This keeps the
program from accidentally matching nondate filenames with a date-like format, such as 10-10-1000.txt.
 The (.*?)$ part of the regex will match any text that comes after the date.
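To see which parts of a filename each group captures, the regex can be tried on a few sample names. The sketch below is only an illustration: the helper split_date() and the sample filenames are hypothetical and are not part of the project code.

```python
import re

date_pattern = re.compile(r"""^(.*?)   # all text before the date
    ((0|1)?\d)-                        # one or two digits for the month
    ((0|1|2|3)?\d)-                    # one or two digits for the day
    ((19|20)\d\d)                      # four digits for the year
    (.*?)$                             # all text after the date
    """, re.VERBOSE)


def split_date(filename):
    """Return (before, month, day, year, after), or None if no date is found."""
    mo = date_pattern.search(filename)
    if mo is None:
        return None
    return mo.group(1), mo.group(2), mo.group(4), mo.group(6), mo.group(8)


print(split_date('spam4-4-1984.txt'))   # date in the middle of the name
print(split_date('10-10-1000.txt'))     # year outside 19xx/20xx: no match
print(split_date('notes.txt'))          # no date at all: no match
```

The first call returns the five pieces of the filename, while the other two return None because the year group or the whole date is missing.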

 Step 2: # Code to find all files in CWD and loop over them:
for amerFilename in os.listdir('.'):
    mo = datePattern.search(amerFilename)
    # Skip files without a date.
  ➊ if mo is None:
      ➋ continue
  ➌ # Get the different parts of the filename.
    beforePart = mo.group(1)
    monthPart = mo.group(2)
    dayPart = mo.group(4)
    yearPart = mo.group(6)
    afterPart = mo.group(8)

--snip--
Explanation:

 Next, the program will have to loop over the list of filename strings returned from os.listdir() and
match them against the regex.
 Any files that do not have a date in them should be skipped.
 For filenames that have a date, the matched text will be stored in several variables.
 If the Match object returned from the search() method is None ➊, then the filename in
amerFilename does not match the regular expression.
 The continue statement ➋ will skip the rest of the loop and move on to the next filename.
 Otherwise, the various strings matched in the regular expression groups are stored in variables
named beforePart, monthPart, dayPart, yearPart and afterPart ➌.
 The strings in these variables will be used to form the European-style filename in the next step.
 To understand the group numbers, count up each time you encounter an opening parenthesis. Ex:
datePattern = re.compile(r"""^(1)   # all text before the date
    (2 (3) )-                       # one or two digits for the month
    (4 (5) )-                       # one or two digits for the day
    (6 (7) )                        # four digits for the year
    (8)$                            # all text after the date
    """, re.VERBOSE)
 Here, the numbers 1 through 8 represent the groups in the regular expression.
 Making an outline of the regular expression, with just the parentheses and group numbers, can
give a clearer understanding of regex before moving on with the rest of the program.

 Step 3: # Code to form New Filename and Rename the Files:


    # Form the European-style filename.
  ➊ euroFilename = beforePart + dayPart + '-' + monthPart + '-' + yearPart + afterPart
    # Get the full, absolute file paths.
    absWorkingDir = os.path.abspath('.')
    amerFilename = os.path.join(absWorkingDir, amerFilename)
    euroFilename = os.path.join(absWorkingDir, euroFilename)
    # Rename the files.
  ➋ print(f'Renaming "{amerFilename}" to "{euroFilename}"...')
  ➌ shutil.move(amerFilename, euroFilename)   # comment this line out for a test run,
                                              # check the printed names, then restore it

Explanation:

 As the final step, concatenate the strings in the variables made in the previous step in the
European-style order: the day comes before the month.
 Store the concatenated string in a variable named euroFilename ➊.
 Then, pass the original filename ‘amerFilename’ and the new ‘euroFilename’ variable to the
‘shutil.move()’ function to rename the file ➌.
 Before running this program for real, comment out the ‘shutil.move()’ call and run it so that it only
prints the filenames that are to be renamed ➋.
 This lets you double-check that the files will be renamed correctly.
 Then you can uncomment the ‘shutil.move()’ call and run the program again to actually rename the
files.
 If this double-checking is not done, you may accidentally rename files that are not to be
renamed.
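The comment-out-then-uncomment check described above can also be expressed as a flag. The following is a minimal sketch, not the project's own code: the function rename_dates(), its dry_run parameter, and the sample filenames are assumptions made for illustration, and the demonstration runs inside a temporary folder so that no real files are touched.

```python
import os
import re
import shutil
import tempfile

DATE_PATTERN = re.compile(r'^(.*?)((0|1)?\d)-((0|1|2|3)?\d)-((19|20)\d\d)(.*?)$')


def rename_dates(folder, dry_run=True):
    """Return (old, new) name pairs; rename files only when dry_run is False."""
    planned = []
    for amer_name in sorted(os.listdir(folder)):
        mo = DATE_PATTERN.search(amer_name)
        if mo is None:
            continue  # skip files without a date
        # Swap month (group 2) and day (group 4) to form the European-style name.
        euro_name = (mo.group(1) + mo.group(4) + '-' + mo.group(2) + '-'
                     + mo.group(6) + mo.group(8))
        planned.append((amer_name, euro_name))
        if not dry_run:
            shutil.move(os.path.join(folder, amer_name),
                        os.path.join(folder, euro_name))
    return planned


with tempfile.TemporaryDirectory() as demo_dir:
    for name in ('report_03-14-2021.txt', 'notes.txt'):
        open(os.path.join(demo_dir, name), 'w').close()
    print(rename_dates(demo_dir))           # dry run: only reports the plan
    rename_dates(demo_dir, dry_run=False)   # second pass actually renames
    print(sorted(os.listdir(demo_dir)))
```

Keeping dry_run=True as the default means a forgotten argument can only print the plan, never rename anything.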

# Full program again at one place:


# Renames filenames with American MM-DD-YYYY date format to European DD-MM-YYYY.

import shutil, os, re

# Create a regex that matches files with the American date format.
datePattern = re.compile(r"""^(.*?)   # all text before the date
    ((0|1)?\d)-                       # one or two digits for the month
    ((0|1|2|3)?\d)-                   # one or two digits for the day
    ((19|20)\d\d)                     # four digits for the year
    (.*?)$                            # all text after the date
    """, re.VERBOSE)

# Loop over the files in the working directory.
for amerFilename in os.listdir('.'):
    mo = datePattern.search(amerFilename)

    # Skip files without a date.
    if mo is None:
        continue

    # Get the different parts of the filename.
    beforePart = mo.group(1)
    monthPart = mo.group(2)
    dayPart = mo.group(4)
    yearPart = mo.group(6)
    afterPart = mo.group(8)

    # Form the European-style filename.
    euroFilename = beforePart + dayPart + '-' + monthPart + '-' + yearPart + afterPart

    # Get the full, absolute file paths.
    absWorkingDir = os.path.abspath('.')
    amerFilename = os.path.join(absWorkingDir, amerFilename)
    euroFilename = os.path.join(absWorkingDir, euroFilename)

    # Rename the files.
    print(f'Renaming "{amerFilename}" to "{euroFilename}"...')
    shutil.move(amerFilename, euroFilename)

Project: Backing Up a Folder into a ZIP File


(This is also done in lab session)
Step 1: Figure Out the ZIP File’s Name
# Copies an entire folder & its contents into a ZIP file whose filename increments.
import zipfile, os

def backupToZip(folder):
    # Back up the entire contents of "folder" into a ZIP file.
    folder = os.path.abspath(folder)   # make sure folder is absolute
    # Figure out the filename this code should use based on
    # what files already exist.
  ➋ number = 1
  ➌ while True:
        zipFilename = os.path.basename(folder) + '_' + str(number) + '.zip'
        if not os.path.exists(zipFilename):
            break   # breaks when a non-existent filename is found
        number = number + 1
    print('Done.')

backupToZip('C:\\delicious')

Explanation:

 The code for this program will be placed into a function named backupToZip()
 The first part, naming the ZIP file, uses the base name of the absolute path of folder. Ex: If the
folder being backed up is C:\delicious (delicious is basename here), the ZIP file’s name should be
delicious_N.zip, where N = 1 is the first time you run the program, N = 2 is the second time, and
so on.
 You can determine what N should be by checking whether delicious_1.zip already exists, then
checking whether delicious_2.zip already exists, and so on.
 Use a variable named ‘number’ for ‘N’ ➋, and keep incrementing it inside the loop that calls
os.path.exists() to check whether the file exists ➌.
 The first nonexistent filename found will cause the loop to break, since it will have found the
filename of the new zip.
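The numbering logic above can be tried on its own before any ZIP file is created. The sketch below is an illustration only: the function next_backup_name() and its search_dir parameter are hypothetical (the project code checks the current working directory), and the demonstration uses a temporary folder with an empty placeholder file standing in for an earlier backup.

```python
import os
import tempfile


def next_backup_name(folder, search_dir='.'):
    """Return the first name of the form <basename>_N.zip not present in search_dir."""
    base = os.path.basename(os.path.abspath(folder))
    number = 1
    while True:
        zip_filename = base + '_' + str(number) + '.zip'
        if not os.path.exists(os.path.join(search_dir, zip_filename)):
            return zip_filename   # first nonexistent filename ends the search
        number = number + 1


with tempfile.TemporaryDirectory() as demo_dir:
    print(next_backup_name('delicious', demo_dir))
    # Simulate an earlier backup, then ask again.
    open(os.path.join(demo_dir, 'delicious_1.zip'), 'w').close()
    print(next_backup_name('delicious', demo_dir))
```

The first call reports delicious_1.zip and, once that name exists, the second call moves on to delicious_2.zip.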

Step 2: Create the New ZIP File


--snip--
    while True:
        zipFilename = os.path.basename(folder) + '_' + str(number) + '.zip'
        if not os.path.exists(zipFilename):
            break
        number = number + 1
    # Create the ZIP file.
    print(f'Creating {zipFilename}...')
  ➊ backupZip = zipfile.ZipFile(zipFilename, 'w')
    print('Done.')

backupToZip('C:\\delicious')
Explanation:

 With the new ZIP file’s name stored in the zipFilename variable, call ‘zipfile.ZipFile()’ to actually
create the ZIP file ➊.
 Pass 'w' as the second argument so that the ZIP file is opened in write mode.

Step 3: Walk the Directory Tree and Add to the ZIP File
--snip--
    # Walk the entire folder tree and compress the files in each folder.
  ➊ for foldername, subfolders, filenames in os.walk(folder):
        print(f'Adding files in {foldername}...')
        # Add the current folder to the ZIP file.
      ➋ backupZip.write(foldername)
        # Add all the files in this folder to the ZIP file.
      ➌ for filename in filenames:
            newBase = os.path.basename(folder) + '_'
            if filename.startswith(newBase) and filename.endswith('.zip'):
                continue   # don't back up the backup ZIP files
            backupZip.write(os.path.join(foldername, filename))
    backupZip.close()
    print('Done.')

backupToZip('C:\\delicious')

Explanation:

 Now you need to use the ‘os.walk()’ function to do the work of listing every file in the folder
and its subfolders.
 You can use os.walk() in a for loop ➊, and on each iteration it will return the iteration’s current
folder name, the subfolders in that folder, & filenames in that folder.
 In the for loop, the folder is added to the ZIP file ➋.
 The nested for loop can go through each filename in the ‘filenames’ list ➌.
 Each of these is added to the ZIP file, except for previously made backup ZIPs.
 When you run this program, it will produce output that will look like this:
Output:
Creating delicious_1.zip...
Adding files in C:\delicious...
Adding files in C:\delicious\cats...
Adding files in C:\delicious\waffles...
Adding files in C:\delicious\walnut...
Adding files in C:\delicious\walnut\waffles...
Done.

 The second time you run it, it will put all the files in C:\delicious into a ZIP file named
delicious_2.zip, and so on.
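One way to check what a backup contains is to list the archive with namelist() after writing it. The following is a sketch under stated assumptions, not the project's code: the function zip_folder() is hypothetical, it uses a with statement instead of an explicit close() call, it passes arcname to write() so that stored paths are relative to the folder's parent, and it adds only files (so, unlike the project code, empty folders are not recorded). The demonstration builds a small sample tree inside a temporary folder.

```python
import os
import tempfile
import zipfile


def zip_folder(folder, zip_path):
    """Write every file under folder into zip_path; return the archive's name list."""
    folder = os.path.abspath(folder)
    parent = os.path.dirname(folder)
    # The with statement closes the ZIP file even if an error occurs.
    with zipfile.ZipFile(zip_path, 'w') as backup_zip:
        for foldername, subfolders, filenames in os.walk(folder):
            for filename in filenames:
                full_path = os.path.join(foldername, filename)
                # Store a path relative to the parent, not the full absolute path.
                backup_zip.write(full_path,
                                 arcname=os.path.relpath(full_path, parent))
        return backup_zip.namelist()


with tempfile.TemporaryDirectory() as work_dir:
    source = os.path.join(work_dir, 'delicious')
    os.makedirs(os.path.join(source, 'cats'))
    for rel_path in ('spam.txt', os.path.join('cats', 'zophie.txt')):
        with open(os.path.join(source, rel_path), 'w') as sample_file:
            sample_file.write('hello')
    names = zip_folder(source, os.path.join(work_dir, 'delicious_1.zip'))
    print(sorted(names))
```

Because the ZIP file here is written outside the folder being backed up, the check that skips earlier backup ZIPs is not needed in this sketch.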

# Full program again at one place:


# Copies an entire folder & its contents into a ZIP file whose filename increments.
import zipfile, os

def backupToZip(folder):
    # Back up the entire contents of "folder" into a ZIP file.
    folder = os.path.abspath(folder)   # make sure folder is absolute

    # Figure out the filename this code should use based on what files already exist.
    number = 1
    while True:
        zipFilename = os.path.basename(folder) + '_' + str(number) + '.zip'
        if not os.path.exists(zipFilename):
            break   # breaks when a non-existent filename is found
        number = number + 1

    # Create the ZIP file.
    print(f'Creating {zipFilename}...')
    backupZip = zipfile.ZipFile(zipFilename, 'w')

    # Walk the entire folder tree and compress the files in each folder.
    for foldername, subfolders, filenames in os.walk(folder):
        print(f'Adding files in {foldername}...')
        # Add the current folder to the ZIP file.
        backupZip.write(foldername)
        # Add all the files in this folder to the ZIP file.
        for filename in filenames:
            newBase = os.path.basename(folder) + '_'
            if filename.startswith(newBase) and filename.endswith('.zip'):
                continue   # don't back up the backup ZIP files
            backupZip.write(os.path.join(foldername, filename))
    backupZip.close()
    print('Done.')

backupToZip('C:\\delicious')
