0% found this document useful (0 votes)
4 views

python module 4

The document provides an overview of the shutil module in Python, detailing functions for copying, moving, renaming, and deleting files and folders. It also introduces the send2trash module for safer deletions, the os.walk() function for traversing directory trees, and the zipfile module for handling ZIP files. Additionally, it includes projects for renaming files with American-style dates to European-style dates and backing up a folder into a ZIP file.

Uploaded by

Shaheena Begum
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
4 views

python module 4

The document provides an overview of the shutil module in Python, detailing functions for copying, moving, renaming, and deleting files and folders. It also introduces the send2trash module for safer deletions, the os.walk() function for traversing directory trees, and the zipfile module for handling ZIP files. Additionally, it includes projects for renaming files with American-style dates to European-style dates and backing up a folder into a ZIP file.

Uploaded by

Shaheena Begum
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 11

Organizing Files

The shutil Module


The shutil (or shell utilities) module has functions to let you copy, move,rename, and
delete files in your Python programs. To use the shutil functions, you will first need to
use import shutil.

Copying Files and Folders


The shutil module provides functions for copying files, as well as entire folders. Calling
shutil.copy(source, destination) will copy the file at the path source to the folder at the
path destination. (Both source and destination are strings.) If destination is a filename, it
will be used as the new name of the copied file. This function returns a string of the path
of the copied file. Enter the following into the interactive shell to see how shutil.copy()
works:
>>> import shutil, os
>>> os.chdir('C:\\')
u >>> shutil.copy('C:\\spam.txt', 'C:\\delicious')
'C:\\delicious\\spam.txt'
v >>> shutil.copy('eggs.txt', 'C:\\delicious\\eggs2.txt')
'C:\\delicious\\eggs2.txt'

The first shutil.copy() call copies the file at C:\spam.txt to the folder C:\delicious. The
return value is the path of the newly copied file. Note that since a folder was specified as
the destination, the original spam.txt filename is used for the new, copied file’s filename.
The second shutil.copy() call also copies the file at C:\eggs.txt to the folder C:\delicious
but gives the copied file the name eggs2.txt.

>>> import shutil, os


>>> os.chdir('C:\\')
>>> shutil.copytree('C:\\bacon', 'C:\\bacon_backup')
'C:\\bacon_backup'
The shutil.copytree() call creates a new folder named bacon_backup with the same
content as the original bacon folder. You have now safely backed up your precious,
precious bacon.

Moving and Renaming Files and Folders


Calling shutil.move(source, destination) will move the file or folder at the path source to
the path destination and will return a string of the absolute path of the new location.
If destination points to a folder, the source file gets moved into destination and keeps its
current filename. For example,
>>> import shutil
>>> shutil.move('C:\\bacon.txt', 'C:\\eggs')
'C:\\eggs\\bacon.txt'

Assuming a folder named eggs already exists in the C:\ directory, this shutil.move() calls
says, “Move C:\bacon.txt into the folder C:\eggs.” If there had been a bacon.txt file
already in C:\eggs, it would have been overwritten. Since it’s easy to accidentally
overwrite files in this way, you should take some care when using move().destination
path can also specify a filename. In the following example, the source file is moved and
renamed.
>>> shutil.move('C:\\bacon.txt', 'C:\\eggs\\new_bacon.txt')
'C:\\eggs\\new_bacon.txt'

This line says, “Move C:\bacon.txt into the folder C:\eggs, and while you’re at it, rename
that bacon.txt file to new_bacon.txt.” Both of the previous examples worked under the
assumption that there was a folder eggs in the C:\ directory. But if there is no eggs folder,
then move() will rename bacon.txt to a file named eggs.
>>> shutil.move('C:\\bacon.txt', 'C:\\eggs')
'C:\\eggs'

Here, move() can’t find a folder named eggs in the C:\ directory and so assumes that
destination must be specifying a filename, not a folder. So the bacon.txt text file is
renamed to eggs

Finally, the folders that make up the destination must already exist, or else Python will
throw an exception. Enter the following into the interactive shell:
>>> shutil.move('spam.txt', 'c:\\does_not_exist\\eggs\\ham')
Traceback (most recent call last):
File "C:\Python34\lib\shutil.py", line 521, in move
os.rename(src, real_dst)
FileNotFoundError: [WinError 3] The system cannot find the path specified:
'spam.txt' -> 'c:\\does_not_exist\\eggs\\ham'

Permanently Deleting Files and Folders


You can delete a single file or a single empty folder with functions in the os module,
whereas to delete a folder and all of its contents, you use the shutil module.
• Calling os.unlink(path) will delete the file at path.
• Calling os.rmdir(path) will delete the folder at path. This folder must be empty of any
files or folders.
• Calling shutil.rmtree(path) will remove the folder at path, and all files and folders it
contains will also be deleted.

Be careful when using these functions in your programs! It’s often a good idea to first
run your program with these calls commented out and with print() calls added to show
the files that would be deleted.
import os
for filename in os.listdir():
if filename.endswith('.rxt'):
os.unlink(filename)

If you had any important files ending with .rxt, they would have been accidentally,
permanently deleted. Instead, you should have first run the program like this:
import os
for filename in os.listdir():
if filename.endswith('.rxt'):
#os.unlink(filename)
print(filename)
Now the os.unlink() call is commented, so Python ignores it. Instead, you will print the
filename of the file that would have been deleted. Running this version of the program
first will show you that you’ve accidentally told the program to delete .rxt files instead
of .txt files.

Safe Deletes with the send2trash Module


Since Python’s built-in shutil.rmtree() function irreversibly deletes files and folders, it
can be dangerous to use. A much better way to delete files and folders is with the third-
party send2trash module. You can install this module by running pip install send2trash
from a Terminal window.
Using send2trash is much safer than Python’s regular delete functions,because it will
send folders and files to your computer’s trash or recycle bin instead of permanently
deleting them. If a bug in your program deletes something with send2trash you didn’t
intend to delete, you can later restore it from the recycle bin.

>>> import send2trash


>>> baconFile = open('bacon.txt', 'a') # creates the file
>>> baconFile.write('Bacon is not a vegetable.')
25
>>> baconFile.close()
>>> send2trash.send2trash('bacon.txt')

Walking a Directory Tree


Say you want to rename every file in some folder and also every file in every subfolder
of that folder. That is, you want to walk through the directory tree,touching each file as
you go. Writing a program to do this could get tricky; fortunately, Python provides a
function to handle this process for you.

Here is an example program that uses the os.walk() function on the directory tree from
Figure
import os
for folderName, subfolders, filenames in os.walk('C:\\delicious'):
print('The current folder is ' + folderName)
for subfolder in subfolders:
print('SUBFOLDER OF ' + folderName + ': ' + subfolder)
for filename in filenames:
print('FILE INSIDE ' + folderName + ': '+ filename)
print('')
The os.walk() function is passed a single string value: the path of a
folder. You can use os.walk() in a for loop statement to walk a directory tree, much like
how you can use the range() function to walk over a range of numbers. Unlike range(),
the os.walk() function will return three values on each iteration through the loop:
1. A string of the current folder’s name
2. A list of strings of the folders in the current folder
3. A list of strings of the files in the current folder

Just like you can choose the variable name i in the code for i in range(10):, you can also
choose the variable names for the three values listed earlier. I usually use the names
foldername, subfolders, and filenames. When you run this program, it will output the
following:
The current folder is C:\delicious
SUBFOLDER OF C:\delicious: cats
SUBFOLDER OF C:\delicious: walnut
FILE INSIDE C:\delicious: spam.txt
The current folder is C:\delicious\cats
FILE INSIDE C:\delicious\cats: catnames.txt
FILE INSIDE C:\delicious\cats: zophie.jpg
The current folder is C:\delicious\walnut
SUBFOLDER OF C:\delicious\walnut: waffles
The current folder is C:\delicious\walnut\waffles
FILE INSIDE C:\delicious\walnut\waffles: butter.txt.

Compressing Files with the zipfile Module


ZIP files (with the .zip file extension), which can hold the compressed contents of many
other files. Compressing a file reduces its size, which is useful when transferring it over
the Internet. And since a ZIP file can also contain multiple files and subfolders, it’s a
handy way to package several files into one. This single file, called an archive file, can
then be, say, attached to an email. Your Python programs can both create and open (or
extract) ZIP files using functions in the zipfile module

Reading ZIP Files


To read the contents of a ZIP file, first you must create a ZipFile object (note the capital
letters Z and F). ZipFile objects are conceptually similar to the File objects you saw
returned by the open() function, They are values through which the program interacts
with the file.
To create a ZipFile object, call the zipfile.ZipFile() function, passing it a string of the .zip
file’s filename. Note that zipfile is the name of the Python module, and ZipFile() is the
name of the function.
>>> import zipfile, os
>>> os.chdir('C:\\') # move to the folder with example.zip
>>> exampleZip = zipfile.ZipFile('example.zip')
>>> exampleZip.namelist()
['spam.txt', 'cats/', 'cats/catnames.txt', 'cats/zophie.jpg']
>>> spamInfo = exampleZip.getinfo('spam.txt')
>>> spamInfo.file_size
13908
>>> spamInfo.compress_size
3828
>>> 'Compressed file is %sx smaller!' % (round(spamInfo.file_size / spamInfo
.compress_size, 2))
'Compressed file is 3.63x smaller!'
>>> exampleZip.close()

A ZipFile object has a namelist() method that returns a list of strings for all the files and
folders contained in the ZIP file. These strings can be passed to the getinfo() ZipFile
method to return a ZipInfo object about that particular file. ZipInfo objects have their
own attributes, such as file_size and compress_size in bytes, which hold integers of the
original file size and compressed file size, respectively. While a ZipFile object represents
an entire archive file, a ZipInfo object holds useful information about a single file in the
archive.

Extracting from ZIP Files


The extractall() method for ZipFile objects extracts all the files and folders from a ZIP
file into the current working directory.
>>> import zipfile, os
>>> os.chdir('C:\\') # move to the folder with example.zip
>>> exampleZip = zipfile.ZipFile('example.zip')
>>> exampleZip.extractall()
>>> exampleZip.close()

After running this code, the contents of example.zip will be extracted to C:\. Optionally,
you can pass a folder name to extractall() to have it extract the files into a folder other
than the current working directory. If the folder passed to the extractall() method does
not exist, it will be created. For instance, if you replaced the call at u with
exampleZip.extractall('C:\\delicious'), the code would extract the files from example.zip
into a newly created C:\delicious folder.
The extract() method for ZipFile objects will extract a single file from the ZIP file.
Continue the interactive shell example:
>>> exampleZip.extract('spam.txt')
'C:\\spam.txt'
>>> exampleZip.extract('spam.txt', 'C:\\some\\new\\folders')
'C:\\some\\new\\folders\\spam.txt'
>>> exampleZip.close()

The string you pass to extract() must match one of the strings in the list returned by
namelist()

Creating and Adding to ZIP Files


To create your own compressed ZIP files, you must open the ZipFile object in write
mode by passing 'w' as the second argument. (This is similar to opening a text file in
write mode by passing 'w' to the open() function.)
>>> import zipfile
>>> newZip = zipfile.ZipFile('new.zip', 'w')
>>> newZip.write('spam.txt', compress_type=zipfile.ZIP_DEFLATED)
>>> newZip.close()

This code will create a new ZIP file named new.zip that has the compressed contents of
spam.txt. Keep in mind that, just as with writing to files, write mode will erase all
existing contents of a ZIP file. If you want to simply add files to an existing ZIP file, pass
'a' as the second argument to zipfile.ZipFile() to open the ZIP file in append mode.

Project: Renaming Files with American-Style Dates to European-Style Dates


Say your boss emails you thousands of files with American-style dates (MM-DD-YYYY)
in their names and needs them renamed to European style dates (DD-MM-YYYY). This
boring task could take all day to do by hand! Let’s write a program to do it instead.

Here’s what the program does:


• It searches all the filenames in the current working directory for American-style dates.
• When one is found, it renames the file with the month and day swapped
to make it European-style.
This means the code will need to do the following:
• Create a regex that can identify the text pattern of American-style dates.
• Call os.listdir() to find all the files in the working directory.
• Loop over each filename, using the regex to check whether it has a date.
• If it has a date, rename the file with shutil.move().
For this project, open a new file editor window and save your code as
renameDates.py.

Step 1: Create a Regex for American-Style Dates


The first part of the program will need to import the necessary modules and create a
regex that can identify MM-DD-YYYY dates.

Step 2: Identify the Date Parts from the Filenames


Next, the program will have to loop over the list of filename strings returned from
os.listdir() and match them against the regex. Any files that do not have a date in them
should be skipped. For filenames that have a date, the matched text will be stored in
several variables

Step 3: Form the New Filename and Rename the Files


As the final step, concatenate the strings in the variables made in the previous step with
the European-style date: The date comes before the month.

import shutil, os, re


# Create a regex that matches files with the American date format.
datePattern = re.compile(r"""^(.*?) # all text before the date
((0|1)?\d)- # one or two digits for the month
((0|1|2|3)?\d)- # one or two digits for the day
((19|20)\d\d) # four digits for the year
(.*?)$ # all text after the date
w """, re.VERBOSE)

# Loop over the files in the working directory.


for amerFilename in os.listdir('.'):
mo = datePattern.search(amerFilename)
# Skip files without a date.
if mo == None:
continue
w # Get the different parts of the filename.
beforePart = mo.group(1)
monthPart = mo.group(2)
dayPart = mo.group(4)
yearPart = mo.group(6)
afterPart = mo.group(8)

# Form the European-style filename.


euroFilename = beforePart + dayPart + '-' + monthPart + '-' + yearPart +
afterPart
# Get the full, absolute file paths.
absWorkingDir = os.path.abspath('.')
amerFilename = os.path.join(absWorkingDir, amerFilename)
euroFilename = os.path.join(absWorkingDir, euroFilename)
# Rename the files.
print('Renaming "%s" to "%s"...' % (amerFilename, euroFilename))
#shutil.move(amerFilename, euroFilename) # uncomment after testing

Project: Backing Up a Folder into a ZIP File

import os
import sys
import pathlib
import zipfile

dirName = input("Enter Directory name that you want to backup : ")


if not os.path.isdir(dirName):

print("Directory", dirName, "doesn’t exists")


sys.exit(0)

curDirectory = pathlib.Path(dirName)
with zipfile.ZipFile("myZip.zip", mode="w") as archive:

for file_path in curDirectory.rglob("*"):

archive.write(file_path, arcname=file_path.relative_to(curDirectory))

if os.path.isfile("myZip.zip"):
print("Archive", "myZip.zip", "created successfully")
else:
print("Error in creating zip archive")

Output
Enter Directory name that you want to backup : zipDemo
Archive myZip.zip created successfully
Debugging

Raising Exceptions
Python raises an exception whenever it tries to execute invalid code. Python handle’s
exceptions with try and except statements so that your program can recover from
exceptions that you anticipated. But you can also raise your own exceptions in your code.
Raising an exception is a way of saying, “Stop running the code in this function and
move the program execution to the except statement.”
Exceptions are raised with a raise statement. In code, a raise statement consists of the
following:
• The raise keyword
• A call to the Exception() function
• A string with a helpful error message passed to the Exception() function

For example, enter the following into the interactive shell:


>>> raise Exception('This is the error message.')
Traceback (most recent call last):
File "<pyshell#191>", line 1, in <module>
raise Exception('This is the error message.')
Exception: This is the error message.
If there are no try and except statements covering the raise statement that raised the
exception, the program simply crashes and displays the exception’s error message.

def boxPrint(symbol, width, height):


if len(symbol) != 1:
raise Exception('Symbol must be a single character string.')
if width <= 2:
raise Exception('Width must be greater than 2.')
if height <= 2:
raise Exception('Height must be greater than 2.')
print(symbol * width)
for i in range(height - 2):
print(symbol + (' ' * (width - 2)) + symbol)
print(symbol * width)
for sym, w, h in (('*', 4, 4), ('O', 20, 5), ('x', 1, 3), ('ZZ', 3, 3)):
try:
boxPrint(sym, w, h)
except Exception as err:
print('An exception happened: ' + str(err))

Here we’ve defined a boxPrint() function that takes a character, a width, and a height,
and uses the character to make a little picture of a box with that width and height. This
box shape is printed to the console.
Getting the Traceback as a String
When Python encounters an error, it produces a treasure trove of error information called
the traceback. The traceback includes the error message, the line number of the line that
caused the error, and the sequence of the function calls that led to the error. This
sequence of calls is called the call stack. Open a new file editor window in IDLE, enter
the following program, and save it as errorExample.py:

def spam():
bacon()
def bacon():
raise Exception('This is the error message.')
spam()

When you run errorExample.py, the output will look like this:
Traceback (most recent call last):
File "errorExample.py", line 7, in <module>
spam()
File "errorExample.py", line 2, in spam
bacon()
File "errorExample.py", line 5, in bacon
raise Exception('This is the error message.')
Exception: This is the error message.

From the traceback, you can see that the error happened on line 5, in the bacon() function.
This particular call to bacon() came from line 2, in the spam() function, which in turn
was called on line 7. In programs where functions can be called from multiple places, the
call stack can help you determine which call led to the error.
The traceback is displayed by Python whenever a raised exception goes unhandled. But
you can also obtain it as a string by calling traceback.format_exc(). This function is
useful if you want the information from an exception’s traceback but also want an except
statement to grace fully handle the exception. You will need to import Python’s
traceback module before calling this function.
For example, instead of crashing your program right when an exception occurs, you can
write the traceback information to a log file and keep your program running. You can
look at the log file later, when you’re ready to debug your program. Enter the following
into the interactive shell:
>>> import traceback
>>> try:
raise Exception('This is the error message.')
except:
errorFile = open('errorInfo.txt', 'w')
errorFile.write(traceback.format_exc())
errorFile.close()
print('The traceback info was written to errorInfo.txt.')

116
The traceback info was written to errorInfo.txt.

The 116 is the return value from the write() method, since 116 characters were written to
the file. The traceback text was written to errorInfo.txt.
Assertions
An assertion is a sanity check to make sure your code isn’t doing something obviously
wrong. These sanity checks are performed by assert statements. If the sanity check fails,
then an AssertionError exception is raised. In code, an assert statement consists of the
following:
• The assert keyword
• A condition (that is, an expression that evaluates to True or False)
• A comma
• A string to display when the condition is False

For example, enter the following into the interactive shell:


>>> podBayDoorStatus = 'open'
>>> assert podBayDoorStatus == 'open', 'The pod bay doors need to be "open".'
>>> podBayDoorStatus = 'I\'m sorry, Dave. I\'m afraid I can't do that.''
>>> assert podBayDoorStatus == 'open', 'The pod bay doors need to be "open".'

Traceback (most recent call last):


File "<pyshell#10>", line 1, in <module>
assert podBayDoorStatus == 'open', 'The pod bay doors need to be "open".'
AssertionError: The pod bay doors need to be "open".

Here we’ve set podBayDoorStatus to 'open', so from now on, we fully expect the value
of this variable to be 'open'. In a program that uses this variable, we might have written a
lot of code under the assumption that the value is 'open'—code that depends on its being
'open' in order to work as we expect. So we add an assertion to make sure we’re right to
assume podBayDoorStatus is 'open'. Here, we include the message 'The pod bay doors
need to be "open".' so it’ll be easy to see what’s wrong if the assertion fails

Logging
If you’ve ever put a print() statement in your code to output some variable’s value while
your program is running, you’ve used a form of logging to debug your code. Logging is a
great way to understand what’s happening in your program and in what order its
happening. Python’s logging module makes it easy to create a record of custom messages
that you write. These log mes sages will describe when the program execution has
reached the logging function call and list any variables you have specified at that point in
time. On the other hand, a missing log message indicates a part of the code was
skipped and never executed.

Using the logging Module


To enable the logging module to display log messages on your screen as your program
runs, copy the following to the top of your program (but under the #! python shebang
line):

import logging
logging.basicConfig(level=logging.DEBUG, format=' %(asctime)s - %(levelname)s
- %(message)s')

You don’t need to worry too much about how this works, but basically, when Python
logs an event, it creates a LogRecord object that holds information about that event. The
logging module’s basicConfig() function lets you specify what details about the
LogRecord object you want to see and how you want those details displayed.

Say you wrote a function to calculate the factorial of a number. In mathematics, factorial
4 is 1 × 2 × 3 × 4, or 24. Factorial 7 is 1 × 2 × 3 × 4 × 5 × 6 × 7, or 5,040. Open a new
file editor window and enter the following code. It has a bug in it, but you will also enter
several log messages to help yourself figure out what is going wrong.

import logging
logging.basicConfig(level=logging.DEBUG, format=' %(asctime)s - %(levelname)s
- %(message)s')
logging.debug('Start of program')
def factorial(n):
logging.debug('Start of factorial(%s%%)' % (n))
total = 1
for i in range(n + 1):
total *= i
logging.debug('i is ' + str(i) + ', total is ' + str(total))
logging.debug('End of factorial(%s%%)' % (n))
return total
print(factorial(5))
logging.debug('End of program')

Here, we use the logging.debug() function when we want to print login formation. This
debug() function will call basicConfig(), and a line of information will be printed. This
information will be in the format we specified in basicConfig() and will include the
messages we passed to debug(). The print(factorial(5)) call is part of the original program,
so the result is dis played even if logging messages are disabled.

The output of this program looks like this:

You might also like