Pythonlearn 07 Files

This document discusses file processing and reading files in Python. It begins by explaining that a text file can be thought of as a sequence of lines, with each line ending in a newline character. It then covers opening a file using the open() function, which returns a file handle that can be used to read or write to the file. The rest of the document discusses reading the file line by line as a sequence, with each line represented as a string.

Uploaded by

Hưng Minh Phan

10/01/21

Reading Files
Chapter 7

Python for Everybody
www.py4e.com

What Next?

[Diagram: Software, Input and Output Devices, Central Processing Unit, Main Memory, Secondary Memory. Labels: "Files R Us", "if x < 3: print"]

It is time to go find some Data to mess with!

From [email protected] Sat Jan 5 09:14:16 2008
Return-Path: <[email protected]>
Date: Sat, 5 Jan 2008 09:12:18 -0500
To: [email protected]
From: [email protected]
Subject: [sakai] svn commit: r39772 - content/branches/
Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772
...

File Processing

A text file can be thought of as a sequence of lines

From [email protected] Sat Jan 5 09:14:16 2008
Return-Path: <[email protected]>
Date: Sat, 5 Jan 2008 09:12:18 -0500
To: [email protected]
From: [email protected]
Subject: [sakai] svn commit: r39772 - content/branches/

Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772

http://www.py4e.com/code/mbox-short.txt

Opening a File

• Before we can read the contents of the file, we must tell Python which file we are going to work with and what we will be doing with the file
• This is done with the open() function
• open() returns a “file handle” - a variable used to perform operations on the file
• Similar to “File -> Open” in a Word Processor

Using open()

fhand = open('mbox.txt', 'r')

• handle = open(filename, mode)
• returns a handle used to manipulate the file
• filename is a string
• mode is optional and should be 'r' if we are planning to read the file and 'w' if we are going to write to the file

What is a Handle?

>>> fhand = open('mbox.txt')
>>> print(fhand)
<_io.TextIOWrapper name='mbox.txt' mode='r' encoding='UTF-8'>
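
The two modes described on the Using open() slide can be tried without mbox.txt. The following is a minimal sketch, assuming a scratch file named demo.txt in a temporary directory (both are hypothetical and not part of the slides): it writes with 'w', then reads the same file back with 'r'.

```python
import os
import tempfile

# Hypothetical scratch file; the slides use 'mbox.txt' instead.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# Mode 'w' creates the file (or empties an existing one) for writing.
fhand = open(path, 'w')
fhand.write('Hello\nWorld!\n')
fhand.close()

# Mode 'r' is the default, so open(path) would behave the same way.
fhand = open(path, 'r')
mode = fhand.mode      # the mode recorded on the handle
text = fhand.read()    # the file contents as one string
fhand.close()

print(mode)
print(repr(text))
```

The handle is only a way to reach the file; the data itself arrives when read() is called on it.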

When Files are Missing

>>> fhand = open('stuff.txt')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'stuff.txt'

The newline Character

• We use a special character called the “newline” to indicate when a line ends
• We represent it as \n in strings
• Newline is still one character - not two

>>> stuff = 'Hello\nWorld!'
>>> stuff
'Hello\nWorld!'
>>> print(stuff)
Hello
World!
>>> stuff = 'X\nY'
>>> print(stuff)
X
Y
>>> len(stuff)
3
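
The traceback on the When Files are Missing slide comes from an exception that a program can catch. A minimal sketch, assuming a path inside a freshly created empty directory (a hypothetical setup chosen so the file is guaranteed not to exist):

```python
import os
import tempfile

# Hypothetical path inside a fresh, empty directory, so it cannot exist.
missing = os.path.join(tempfile.mkdtemp(), 'stuff.txt')

try:
    fhand = open(missing)
    opened = True
except FileNotFoundError as err:
    opened = False
    # errno.ENOENT, the "[Errno 2]" part of the traceback message
    error_number = err.errno
    print('Could not open:', err.filename)
```

Catching FileNotFoundError by name is one possible choice; it leaves other problems, such as a permissions error, to surface normally.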

File Processing

A text file can be thought of as a sequence of lines

From [email protected] Sat Jan 5 09:14:16 2008
Return-Path: <[email protected]>
Date: Sat, 5 Jan 2008 09:12:18 -0500
To: [email protected]
From: [email protected]
Subject: [sakai] svn commit: r39772 - content/branches/

Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772

File Processing

A text file has newlines at the end of each line

From [email protected] Sat Jan 5 09:14:16 2008\n
Return-Path: <[email protected]>\n
Date: Sat, 5 Jan 2008 09:12:18 -0500\n
To: [email protected]\n
From: [email protected]\n
Subject: [sakai] svn commit: r39772 - content/branches/\n
\n
Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772\n
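
The trailing newlines can be made visible with repr(). This is a small sketch, assuming a three-line scratch file with made-up header text (hypothetical, not taken from mbox-short.txt); it also uses a with block, which the slides do not, so that the file is closed automatically.

```python
import os
import tempfile

# Hypothetical scratch file with a blank line in the middle.
path = os.path.join(tempfile.mkdtemp(), 'headers.txt')
with open(path, 'w') as out:
    out.write('Subject: test\n\nDetails: none\n')

with open(path) as fhand:
    lines = fhand.readlines()   # list of lines, newlines included

for line in lines:
    print(repr(line))           # repr() shows the \n instead of rendering it
```

Even the blank line is a string of length one: it holds just the newline character.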

Reading Files in Python

File Handle as a Sequence

• A file handle open for read can be treated as a sequence of strings where each line in the file is a string in the sequence
• We can use the for statement to iterate through a sequence
• Remember - a sequence is an ordered set

xfile = open('mbox.txt')
for cheese in xfile:
    print(cheese)
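
One detail worth seeing when a handle is treated as a sequence: looping over it consumes the file, so a second loop over the same handle finds nothing left. A minimal sketch, assuming a hypothetical three-line scratch file:

```python
import os
import tempfile

# Hypothetical scratch file standing in for mbox.txt.
path = os.path.join(tempfile.mkdtemp(), 'lines.txt')
with open(path, 'w') as out:
    out.write('first\nsecond\nthird\n')

xfile = open(path)
first_pass = []
for cheese in xfile:            # same loop shape as the slide
    first_pass.append(cheese)

# The handle is now at end of file, so a second loop yields no lines.
second_pass = [cheese for cheese in xfile]
xfile.close()

print(first_pass)
print(second_pass)
```

To read the lines again, the file can simply be opened a second time.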

Counting Lines in a File

• Open a file read-only
• Use a for loop to read each line
• Count the lines and print out the number of lines

fhand = open('mbox.txt')
count = 0
for line in fhand:
    count = count + 1
print('Line Count:', count)

$ python open.py
Line Count: 132045

Reading the *Whole* File

We can read the whole file (newlines and all) into a single string

>>> fhand = open('mbox-short.txt')
>>> inp = fhand.read()
>>> print(len(inp))
94626
>>> print(inp[:20])
From stephen.marquar
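
The two approaches on these slides can be compared on the same file. A minimal sketch, assuming a hypothetical three-line scratch file in place of mbox.txt: the loop counts lines, while read() returns every character, newlines included.

```python
import os
import tempfile

# Hypothetical scratch file; each line ends in a newline.
path = os.path.join(tempfile.mkdtemp(), 'small.txt')
with open(path, 'w') as out:
    out.write('From: a\nTo: b\nSubject: c\n')

# Line-by-line: one string per line.
fhand = open(path)
count = 0
for line in fhand:
    count = count + 1
fhand.close()

# Whole file: one string holding everything.
fhand = open(path)
inp = fhand.read()
fhand.close()

print('Line Count:', count)
print('Characters:', len(inp))
print(inp[:4])
```

Because every line here ends in a newline, the number of '\n' characters in inp equals the line count.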

Searching Through a File

We can put an if statement in our for loop to only print lines that meet some criteria

fhand = open('mbox-short.txt')
for line in fhand:
    if line.startswith('From:') :
        print(line)

OOPS!

What are all these blank lines doing here?

From: [email protected]

From: [email protected]

From: [email protected]

From: [email protected]
...

OOPS!

What are all these blank lines doing here?

• Each line from the file has a newline at the end
• The print statement adds a newline to each line

From: [email protected]\n
\n
From: [email protected]\n
\n
From: [email protected]\n
\n
From: [email protected]\n
\n
...

Searching Through a File (fixed)

• We can strip the whitespace from the right-hand side of the string using rstrip() from the string library
• The newline is considered “white space” and is stripped

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if line.startswith('From:') :
        print(line)

From: [email protected]
From: [email protected]
From: [email protected]
From: [email protected]
....
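
The doubled newline can be observed directly by capturing what print() writes. This is a sketch under stated assumptions: io.StringIO stands in for the file handle, the addresses are made-up example.com ones, and show() is a hypothetical helper, none of which appear in the slides.

```python
import contextlib
import io

# Made-up sample data; io.StringIO iterates line by line like a file handle.
sample = 'From: a@example.com\nTo: b@example.com\nFrom: c@example.com\n'

def show(strip):
    """Run the slide's search loop and return everything it printed."""
    fhand = io.StringIO(sample)
    out = io.StringIO()
    with contextlib.redirect_stdout(out):
        for line in fhand:
            if strip:
                line = line.rstrip()   # drop the newline read from the file
            if line.startswith('From:'):
                print(line)            # print() adds its own newline
    return out.getvalue()

unstripped = show(strip=False)
stripped = show(strip=True)
print(repr(unstripped))
print(repr(stripped))
```

Without rstrip() each matching line is followed by two newlines, which is the blank line seen on the OOPS! slide; with rstrip() only the newline from print() remains.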

Skipping with continue

We can conveniently skip a line by using the continue statement

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if not line.startswith('From:') :
        continue
    print(line)

Using in to Select Lines

We can look for a string anywhere in a line as our selection criteria

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if not '@uct.ac.za' in line :
        continue
    print(line)

From [email protected] Sat Jan 5 09:14:16 2008
X-Authentication-Warning: set sender to [email protected] using -f
From: [email protected]
Author: [email protected]
From [email protected] Fri Jan 4 07:02:32 2008
X-Authentication-Warning: set sender to [email protected] using -f
...
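
The skip-with-continue pattern and a plain positive test select the same lines. A minimal sketch, assuming made-up sample lines and the search string '@example.org' as a stand-in for the slide's '@uct.ac.za'; io.StringIO again plays the role of the file handle.

```python
import io

# Made-up sample lines; only two contain the search string.
sample = (
    'From: a@example.org\n'
    'Subject: hello\n'
    'Author: b@example.org\n'
    'To: c@example.com\n'
)

# Style 1: skip lines that do not match, as on the slide.
with_continue = []
for line in io.StringIO(sample):
    line = line.rstrip()
    if '@example.org' not in line:
        continue
    with_continue.append(line)

# Style 2: keep lines that do match.
without_continue = []
for line in io.StringIO(sample):
    line = line.rstrip()
    if '@example.org' in line:
        without_continue.append(line)

print(with_continue)
print(without_continue)
```

Here `x not in line` is used instead of the slide's `not x in line`; the two expressions mean the same thing. The continue form tends to pay off when several unrelated conditions each rule a line out.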

Prompt for File Name

fname = input('Enter the file name: ')
fhand = open(fname)
count = 0
for line in fhand:
    if line.startswith('Subject:') :
        count = count + 1
print('There were', count, 'subject lines in', fname)

Enter the file name: mbox.txt
There were 1797 subject lines in mbox.txt

Enter the file name: mbox-short.txt
There were 27 subject lines in mbox-short.txt

Bad File Names

fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
    quit()

count = 0
for line in fhand:
    if line.startswith('Subject:') :
        count = count + 1
print('There were', count, 'subject lines in', fname)

Enter the file name: mbox.txt
There were 1797 subject lines in mbox.txt

Enter the file name: na na boo boo
File cannot be opened: na na boo boo
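
The same guard can be wrapped in a function so it can be exercised without typing at a prompt. This is a sketch, not the slides' program: count_subject_lines() is a hypothetical helper, it catches OSError rather than using a bare except, and it returns None instead of calling quit().

```python
import os
import tempfile

def count_subject_lines(fname):
    """Return how many lines start with 'Subject:', or None if the file cannot be opened."""
    try:
        fhand = open(fname)
    except OSError:
        print('File cannot be opened:', fname)
        return None
    count = 0
    with fhand:                 # closes the file when the loop finishes
        for line in fhand:
            if line.startswith('Subject:'):
                count = count + 1
    return count

# Hypothetical scratch file with two subject lines.
folder = tempfile.mkdtemp()
good = os.path.join(folder, 'mail.txt')
with open(good, 'w') as out:
    out.write('Subject: one\nFrom: x\nSubject: two\n')

good_count = count_subject_lines(good)
bad_count = count_subject_lines(os.path.join(folder, 'na na boo boo'))
print(good_count, bad_count)
```

Naming the exception keeps unrelated mistakes, such as a typo in a variable name inside the try block, from being reported as a missing file.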

Summary

• Secondary storage
• Opening a file - file handle
• File structure - newline character
• Reading a file line by line with a for loop
• Searching for lines
• Reading file names
• Dealing with bad files

Acknowledgements / Contributions

These slides are Copyright 2010- Charles R. Severance (www.dr-chuck.com) of the University of Michigan School of Information and open.umich.edu and made available under a Creative Commons Attribution 4.0 License. Please maintain this last slide in all copies of the document to comply with the attribution requirements of the license. If you make a change, feel free to add your name and organization to the list of contributors on this page as you republish the materials.

Initial Development: Charles Severance, University of Michigan School of Information

… Insert new Contributors and Translators here
...
