File Processing
Outline
• Introduction
• Opening a file
• Reading from a file
• Writing to a file
• Closing a file
• Reading from the Web
Introduction
• File processing is the process of reading from or
writing to a file and manipulating its data.
• File processing operations include:
Opening a file
Reading from a file
Writing to a file
Closing the file
Opening a File
• Before we can read the contents of the file we must tell
Python which file we are going to work with and what we will
be doing with the file
• This is done using the open() function
• open() returns a “file handle” - a variable used to perform
operations on the file
• Kind of like “File -> Open” in a Word Processor
What is a Handle?
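To make the idea of a handle concrete, here is a minimal sketch, assuming Python 3. The file name handle_demo.txt and the temporary directory are placeholders invented for the illustration, not files from the slides. It suggests that open() gives back an object for working with the file rather than the file's text.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "handle_demo.txt")
with open(path, "w") as out:
    out.write("first line\n")

handle = open(path, "r")     # open() returns a handle, not the contents
handle_name = handle.name    # the handle remembers which file it refers to
handle_mode = handle.mode    # ... and the mode it was opened with
text = handle.read()         # the data only arrives when we ask for it
handle.close()

print(handle_mode)
print(text)
```

The handle is the thing every later operation (reading, writing, closing) is called on.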
Opening a file
• To open a file we use the function
    open(<file name>, <open mode>)
• The file name is a string with the actual name of the file on the disk.
• The open mode can be
    Read “r”
    Write “w”
    Append “a”
  depending on whether we are reading from, writing to, or appending to
  the file.
• Example (read mode):
    >>> fi = open ("C:\\file1.txt",'r')
    >>> print fi
    <open file 'file1.txt', mode 'r' at 0x00000000021BEA50>
    >>> type (fi)
    <type 'file'>
• In read mode the file must already exist; otherwise you will get an error.
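The error raised when a file opened for reading does not exist can be handled instead of crashing the program. The following is a small sketch under the assumption of Python 3; the path no_such_file.txt is a made-up name chosen so that the file is guaranteed to be missing.

```python
import os
import tempfile

# Hypothetical path inside a fresh, empty directory: it cannot exist yet.
missing = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")

try:
    f = open(missing, "r")
except IOError as err:       # FileNotFoundError is a kind of IOError in Python 3
    opened = False
    print("Could not open file:", err)
else:
    opened = True
    f.close()
```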
• Example (write mode):
    >>> fo = open ("C:\\file2.txt",'w')
    >>> print fo
    >>> type (fo)
• If the file does not exist, it will be created. If the file already
  exists, its contents will be overwritten.
• Example (append mode):
    >>> fo = open ("C:\\file2.txt",'a')
    >>> print fo
    >>> type (fo)
    <type 'file'>
• If the file does not exist, it will be created. If the file already
  exists, its contents are kept and everything you write is added at the
  end of the file. A file opened with “a” can only be written to, not read.
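The difference between the “w” and “a” modes can be checked with a short experiment. This is a sketch assuming Python 3; modes_demo.txt and the helper contents() are hypothetical names introduced only for the demonstration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")  # hypothetical file

def contents(p):
    """Return the whole text of the file at p (illustrative helper)."""
    with open(p, "r") as f:
        return f.read()

with open(path, "w") as f:   # file does not exist yet: "w" creates it
    f.write("one")
after_create = contents(path)

with open(path, "a") as f:   # "a" keeps the old text and adds at the end
    f.write("two")
after_append = contents(path)

with open(path, "w") as f:   # "w" on an existing file clears it first
    f.write("three")
after_overwrite = contents(path)

print(after_create, after_append, after_overwrite)
```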
Reading from a file
• How do we read data from a file?
• There are 3 different methods to read data from files:
    readline() returns the next line in the file as a string.
    readlines() returns all remaining lines in the file as a list of
    strings (lines).
    read() returns the entire file content as one string.
• Example:
    >>> fin = open ( 'file1.txt' , 'r' )
    >>> Line = fin.readline()
    >>> Lines = fin.readlines()
• If the file is empty, readline() and readlines() will return an empty
  string and an empty list, respectively.
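The three reading methods can be compared side by side on a small file. The sketch below assumes Python 3 and writes its own three-line sample file (read_demo.txt, a made-up name) so that it does not depend on the contents of file1.txt.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")  # hypothetical file
with open(path, "w") as out:
    out.write("alpha\nbeta\ngamma\n")

fin = open(path, "r")
first = fin.readline()     # the next (here: first) line, newline included
rest = fin.readlines()     # every line that is still unread, as a list
at_end = fin.readline()    # nothing left: an empty string
fin.close()

fin = open(path, "r")
whole = fin.read()         # the entire file as one string
fin.close()

print(repr(first), rest, repr(at_end), repr(whole))
```

Note that readlines() only returns what has not been read yet, which is why the first line is absent from rest.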
file1.txt
• Assume the following contents in file1.txt
The newline Character
• We use a special character to indicate when a line ends, called the
  "newline"
• We represent it as \n in strings
• Newline is still one character - not two
    >>> stuff = 'X\nY'
    >>> print (stuff)
    X
    Y
    >>> len (stuff)
    3
A closer look at reading lines
• strip() removes any whitespace (including the newline) at the
  beginning or end of the string.
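Since the slide's own example did not survive, here is a minimal sketch of stripping whitespace from lines as they are read, assuming Python 3. An in-memory io.StringIO object stands in for an open file, and the sample text is invented for the illustration.

```python
import io

raw = "   some text \n"
both_ends = raw.strip()     # whitespace removed from both ends
right_only = raw.rstrip()   # only the end (this drops the newline)
left_only = raw.lstrip()    # only the beginning

# io.StringIO behaves like a file handle opened for reading (stand-in data).
fake_file = io.StringIO("  first \nsecond\n")
cleaned = []
for line in fake_file:
    cleaned.append(line.strip())

print(repr(both_ends), repr(right_only), repr(left_only), cleaned)
```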
Reading from a file (cont.)
• read(): reads the entire contents of the file
as a single string, including the “newline”
characters “\n”.
Reading from a file (cont.)
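• seek(offset) method: sets the file's current
position to the given offset (character position),
counted from the start of the file, where offset 0 is the first character.
• For example, after seeking to offset 3 the next read will start
from the 4th character in the file.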
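A short sketch of seek() in action, assuming Python 3 and a plain ASCII file so that one character corresponds to one position. The file seek_demo.txt and its contents are made up for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "seek_demo.txt")  # hypothetical file
with open(path, "w") as out:
    out.write("abcdefgh\nsecond line\n")

f = open(path, "r")
f.seek(3)                    # skip positions 0, 1 and 2
from_fourth = f.readline()   # reading resumes at the 4th character, 'd'
f.seek(0)                    # go back to the very beginning
from_start = f.readline()
f.close()

print(repr(from_fourth), repr(from_start))
```

seek(0) is a common way to re-read a file without closing and reopening it.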
Searching Through a File
• We can put an if statement in our for loop to only print lines that
  meet some criteria
    fhand = open('mbox-short.txt')
    for line in fhand:
        if line.startswith('From:') :
            print (line)
Searching Through a File (fixed)
• We can strip the whitespace from the right-hand side of the string
  using rstrip() from the string library
• The newline is considered "white space" and is stripped
    fhand = open('mbox-short.txt')
    for line in fhand:
        line = line.rstrip()
        if line.startswith('From:') :
            print (line)
• Output:
    From: [email protected]
    From: [email protected]
    From: [email protected]
    From: [email protected]
    ....
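Why the first version needed fixing becomes visible when the matching lines are inspected with repr(). This is a sketch assuming Python 3; the sample lines and example.com addresses are invented stand-ins, not the real contents of mbox-short.txt.

```python
import io

# Stand-in for the open file handle, with hypothetical sample lines.
fhand = io.StringIO(
    "From: alice@example.com\n"
    "Subject: hello\n"
    "From: bob@example.com\n"
)

raw, cleaned = [], []
for line in fhand:
    if line.startswith("From:"):
        raw.append(line)               # still ends with "\n"
        cleaned.append(line.rstrip())  # trailing newline removed

print(repr(raw[0]))      # shows the "\n" that print() would double up
print(repr(cleaned[0]))
```

Printing a line that still carries its own newline produces an extra blank line, because print() adds a newline as well.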
Using in to select lines
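The slide's example is not available here, so the following is only a guess at its spirit: a sketch, assuming Python 3, of using the in operator to keep lines that contain a given piece of text anywhere in the line. The sample lines and the search text example.org are hypothetical.

```python
import io

# Stand-in for an open file handle, with invented sample lines.
fhand = io.StringIO(
    "From: alice@example.com\n"
    "X-Note: relayed via example.org\n"
    "From: bob@example.org\n"
)

selected = []
for line in fhand:
    line = line.rstrip()
    if "example.org" not in line:   # skip lines without the search text
        continue
    selected.append(line)

print(selected)
```

Unlike startswith(), in matches the text at any position, so the second sample line is selected too.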
Closing files
• Whether you are reading or writing, you always need to
close a file after you’re done processing it.
• In some cases, not properly closing a file could result in
data loss.
• To close a file, use the close() method
• Example:
    >>> f.close()
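Forgetting close() is easy, so Python also offers the with statement, which closes the file automatically when the block ends. It is not covered in the slides; the sketch below assumes Python 3 and uses a made-up file name, close_demo.txt.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "close_demo.txt")  # hypothetical file

with open(path, "w") as f:
    f.write("data")
    closed_inside = f.closed    # still open while inside the block

closed_outside = f.closed       # closed automatically on leaving the block

print(closed_inside, closed_outside)
```

The closed attribute of a handle reports whether close() has already happened.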
Writing to a file
• To write to a file, use the “w” open mode
• What will happen to the file in “w” mode?
    File exists: the file’s contents are cleared.
    File doesn’t exist: a new one is created.
• Example:
    >>> f =open ("file2.txt",'w')
    >>> f.write ("Hello")
    >>> f.close()
• When you are done writing, you have to close the file to see the
  updates.
Writing to a file (Cont.)
• What if you want to write data other than strings to your file? Let's
take the example of writing numbers.
• Example:
    >>> f =open ("file2.txt",'w')
    >>> f.write (7)
    TypeError: expected a character buffer object
• write() only accepts strings, so use the string formatting operator %
• Example:
    >>> f =open ("file2.txt",'w')
    >>> n= 7
    >>> f.write ("this is number %d" %(n))
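The same idea as a complete script, assuming Python 3 (where write(7) also raises TypeError, with a differently worded message). The file name numbers_demo.txt and the variable price are hypothetical; str() is shown as an alternative to the % operator.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "numbers_demo.txt")  # hypothetical file
n = 7
price = 3.5

f = open(path, "w")
try:
    f.write(n)                       # a number is not a string
    wrote_raw_number = True
except TypeError:
    wrote_raw_number = False
f.write("this is number %d\n" % n)   # formatting operator builds a string
f.write(str(price) + "\n")           # str() converts the number directly
f.close()

with open(path, "r") as check:
    written = check.read()
print(written)
```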
Writing to a file (Cont.)
• To add to the file content without deleting the
original content, use the “a” open mode
• Example:
    >>> f =open ("file2.txt",'a')
    >>> f.write ("Hi")
    >>> f.close()
  “Hi” will be added to the file content without deleting the
  previous data.
• To write a new line in the file, use “\n”
• Example:
    >>> f =open ("file2.txt",'a')
    >>> f.write ("\n Welcome")
    >>> f.close()
  “Welcome” will be added on a new line.
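One way to keep appended text on separate lines is to end every write with "\n". The sketch below assumes Python 3; append_line() is a hypothetical helper written for this example, and append_demo.txt is a made-up file name.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "append_demo.txt")  # hypothetical file

def append_line(file_path, text):
    """Add text to the end of the file as its own line (illustrative helper)."""
    f = open(file_path, "a")
    f.write(text + "\n")
    f.close()

with open(path, "w") as f:    # start with one line of content
    f.write("Hello\n")

append_line(path, "Hi")
append_line(path, "Welcome")

with open(path, "r") as f:
    lines = f.readlines()
print(lines)
```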
The flush method
• Data is usually buffered in memory before it is
actually written to a file.
• You will notice that you only see what you have
written to the file after closing it.
• If you want to force the contents of the buffer to be
written to the file, you can use the flush method.
• Example:
    >>> f =open ("file2.txt",'w')
  Open the file now. You’ll see that it is empty.
    >>> f.write("hello")
    >>> f.flush()
  Now, re-open the file. You’ll see ‘hello’ has been written.
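The effect of flush() can be observed without opening the file in an editor by checking the file's size on disk. This is a sketch assuming Python 3 and default buffering; flush_demo.txt is a made-up name. The size before the flush is typically 0 for such a small write, but that depends on the buffering in use, so only the size after the flush is relied on.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "flush_demo.txt")  # hypothetical file

writer = open(path, "w")
writer.write("hello")
size_before = os.path.getsize(path)   # usually 0: the text is still buffered
writer.flush()                        # push the buffer out to the file
size_after = os.path.getsize(path)    # the 5 characters are now in the file
writer.close()

print(size_before, size_after)
```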
Reading from URLs!!
import urllib
url = "http://www.nileu.edu.eg/"
web_page = urllib.urlopen(url)
for line in web_page:
    line = line.strip()
    print line
web_page.close()
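The code above is written for Python 2. A possible Python 3 equivalent is sketched below, where urlopen lives in urllib.request and each line arrives as bytes that must be decoded. The helper names clean_lines() and fetch_lines() are hypothetical, and UTF-8 is an assumed encoding. The network call is wrapped in a function and not executed here; an in-memory byte stream stands in for a web page.

```python
import io
from urllib.request import urlopen

def clean_lines(byte_lines, encoding="utf-8"):
    """Decode each raw line and strip surrounding whitespace."""
    return [raw.decode(encoding, errors="replace").strip() for raw in byte_lines]

def fetch_lines(url):
    """Download url and return its lines as strings (needs network access)."""
    web_page = urlopen(url)
    try:
        return clean_lines(web_page)
    finally:
        web_page.close()    # always close, even if decoding fails

# Offline check with a stand-in for the response object (invented content):
sample = io.BytesIO(b"<html>\n  <body>hi</body>  \n</html>\n")
sample_lines = clean_lines(sample)
print(sample_lines)
```

With a network connection, fetch_lines(url) can be called with the url from the slide and the result printed line by line.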