Py4Inf 07 Files
Chapter 7
http://www.py4inf.com/code/mbox-short.txt
Opening a File
• Before we can read the contents of a file, we must tell Python which
file we are going to work with and what we will be doing with the file
• filename is a string (the first argument to open())
• mode is optional (the second argument) and should be 'r' if we are
planning to read the file and 'w' if we are going to write to the file
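The two modes described above can be tried out with a short sketch. It writes a scratch file with mode 'w' and then reopens it with the default mode. The file name demo.txt and the temporary directory are assumptions made for illustration (the slides use mbox.txt), and print() is called in the Python 3 style rather than with the Python 2 print statement used on the slides.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not from the slides).
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# Mode 'w' creates the file (or empties an existing one) for writing.
fout = open(path, 'w')
fout.write('first line\nsecond line\n')
fout.close()

# With no mode argument, open() defaults to 'r' (read).
fhand = open(path)
mode_used = fhand.mode
contents = fhand.read()
fhand.close()

print(mode_used)
print(contents)
```

Note that opening an existing file with 'w' discards its previous contents, so it is worth double-checking the file name before writing.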
http://docs.python.org/lib/built-in-funcs.html
What is a Handle?
>>> fhand = open('mbox.txt')
>>> print fhand
<open file 'mbox.txt', mode 'r' at 0x1005088b0>
When Files are Missing
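As a minimal sketch of what happens when the file is missing: open() raises an error instead of returning a handle. The file name below is hypothetical and is placed in a fresh temporary directory so that it is guaranteed not to exist. The example uses Python 3 style print(); catching IOError also covers Python 3's FileNotFoundError.

```python
import os
import tempfile

# Hypothetical file name inside a brand-new, empty directory,
# so the file is guaranteed to be missing.
missing = os.path.join(tempfile.mkdtemp(), 'no-such-file.txt')

try:
    fhand = open(missing)
    opened = True
    fhand.close()
except IOError as err:
    # Python 3 raises FileNotFoundError here, which IOError also catches.
    opened = False
    print('Could not open the file: %s' % err)
```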
• Remember - a sequence is an
ordered set
Counting Lines in a File
• Open a file read-only
• Use a for loop to read each line
• Count the lines and print out the number of lines
fhand = open('mbox.txt')
count = 0
for line in fhand:
    count = count + 1
print 'Line Count:', count

python open.py
Line Count: 132045
Reading the *Whole* File
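A minimal sketch of reading a whole file into one string with read(), assuming the file is small enough to fit in memory. The scratch file and its two lines are made up for illustration, and print() is used in the Python 3 style.

```python
import os
import tempfile

# Hypothetical two-line scratch file standing in for mbox-short.txt.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
fout = open(path, 'w')
fout.write('From: a\nSubject: hi\n')
fout.close()

fhand = open(path)
inp = fhand.read()   # the entire file as a single string, newlines included
fhand.close()

print(len(inp))      # number of characters, counting each newline as one
print(inp[:5])       # the first five characters
```

Because the result is an ordinary string, slicing and len() work on it directly.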
Searching Through a File
• We can put an if statement in our for loop to only print
lines that meet some criteria
fhand = open('mbox-short.txt')
for line in fhand:
    if line.startswith('From:') :
        print line
OOPS!
What are all these blank lines doing here?
From: [email protected]

From: [email protected]

From: [email protected]

From: [email protected]

...
OOPS!
What are all these blank lines doing here?
• Each line from the file also has a newline at the end.
• The print statement adds a newline to each line.
From: [email protected]\n
\n
From: [email protected]\n
\n
From: [email protected]\n
\n
From: [email protected]\n
...
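The doubled newline can be made visible with repr(). In this sketch a list of strings stands in for the lines a file handle would yield, and the addresses are made up. The variable printed rebuilds what printing each line would send to the screen: the line itself plus the newline that print adds.

```python
# Hypothetical lines, each ending in '\n' as lines read from a file do.
lines = ['From: alice@example.com\n', 'From: bob@example.com\n']

for line in lines:
    # repr() shows the trailing newline explicitly as \n
    print(repr(line))

# Printing a line outputs the line followed by one extra newline.
printed = ''
for line in lines:
    printed = printed + line + '\n'

# Every '\n\n' pair shows up on screen as a blank line.
blank_lines = printed.count('\n\n')
print(blank_lines)
```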
Searching Through a File (fixed)
• We can strip the whitespace from the right-hand side of
the string using rstrip() from the string library
fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if line.startswith('From:') :
        print line
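A small sketch of what rstrip() removes, using a made-up line with two trailing spaces and a newline. With no argument it strips all trailing whitespace; with '\n' as the argument it strips only trailing newlines. Leading whitespace is left alone in both cases.

```python
# Hypothetical line: two trailing spaces, then a newline.
line = 'From: alice@example.com  \n'

stripped = line.rstrip()          # removes trailing spaces and the newline
only_newline = line.rstrip('\n')  # removes the newline, keeps the spaces
indented = '  indented \n'.rstrip()  # leading spaces are untouched

print(repr(stripped))
print(repr(only_newline))
print(repr(indented))
```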
• We can conveniently skip a line by using the
continue statement
fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    # Skip 'uninteresting lines'
    if not line.startswith('From:') :
        continue
    # Process our 'interesting' line
    print line
Using in to select lines
• We can look for a string anywhere in a line as our
selection criteria
fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if not '@uct.ac.za' in line :
        continue
    print line
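To contrast the two selection tests, this sketch runs startswith() and in over the same made-up lines (a list stands in for the file handle). startswith() only matches at the beginning of a line, while in matches anywhere in it. The spelling '@uct.ac.za' not in line means the same thing as not '@uct.ac.za' in line.

```python
# Hypothetical lines standing in for the contents of mbox-short.txt.
lines = [
    'From: alice@uct.ac.za',
    'X-Authentication-Warning: set sender to bob@uct.ac.za using -f',
    'From: carol@example.com',
]

by_prefix = []
by_substring = []
for line in lines:
    if line.startswith('From:'):
        by_prefix.append(line)      # matches only at the start of the line
    if '@uct.ac.za' in line:
        by_substring.append(line)   # matches anywhere in the line

print(len(by_prefix))
print(len(by_substring))
```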
Bad File Names
fname = raw_input('Enter the file name: ')
try:
    fhand = open(fname)
except:
    print 'File cannot be opened:', fname
    exit()
count = 0
for line in fhand:
    if line.startswith('Subject:') :
        count = count + 1
print 'There were', count, 'subject lines in', fname

python search6.py
Enter the file name: mbox-short.txt
There were 27 subject lines in mbox-short.txt

Enter the file name: mbox.txt
There were 1797 subject lines in mbox.txt
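The same idea can be sketched as a reusable function. This is one possible variant, not the slide's program: the name count_subject_lines is hypothetical, it returns None instead of calling exit(), and it catches IOError rather than using a bare except so that unrelated mistakes (such as a typo in a variable name) are not silently hidden. The scratch files are made up, and print() is used in the Python 3 style.

```python
import os
import tempfile

def count_subject_lines(fname):
    """Count lines starting with 'Subject:'; return None if fname cannot be opened."""
    try:
        fhand = open(fname)
    except IOError:
        # Missing or unreadable file: report it and give up on this name.
        print('File cannot be opened: %s' % fname)
        return None
    count = 0
    for line in fhand:
        if line.startswith('Subject:'):
            count = count + 1
    fhand.close()
    return count

# Hypothetical scratch file containing two subject lines.
folder = tempfile.mkdtemp()
good = os.path.join(folder, 'demo-mbox.txt')
fout = open(good, 'w')
fout.write('From: a\nSubject: one\nTo: b\nSubject: two\n')
fout.close()

good_count = count_subject_lines(good)
bad_count = count_subject_lines(os.path.join(folder, 'na-na-boo-boo.txt'))
print(good_count)
```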
• Reading a file line-by-line with a for loop
• Reading a file and splitting lines
• Reading file names
• Reading the whole file as a string
• Dealing with bad files
• Searching for lines