Text Files
What is a text file?
●
A text file is a file containing characters, structured as
individual lines of text.
●
It contains both printable characters and nonprintable
characters (such as the newline, \n).
●
Can be directly viewed and created using a text editor.
●
Your Python programs can output data to a text file.
●
The data in a text file can be viewed as characters, words,
numbers, or lines of text, depending on the text file’s
format and on the purposes for which the data are used.
●
When the data are numbers (either integers or floats), they
must be separated by whitespace characters—spaces,
tabs, and newlines—in the file.
●
All data output to or input from a text file must be strings.
●
So, numbers must be converted to strings before output,
and these strings must be converted back to numbers after
input.
Opening a file
●
Python provides the open() function which accepts two
arguments, file name and access mode in which the file is
accessed.
●
This function returns a file object which can be used to
perform various operations like reading, writing, etc.
●
The syntax to use the open() function is
fileobject = open(<file-name>, <access-mode>)
●
The various file opening modes are listed in the attached
doc.
Example
#opens the file f2.txt in read mode
fileptr = open("f2.txt", "r")    # Opening the file
if fileptr:
    print("File is opened successfully")
#closing the file
fileptr.close()    # Closing the file
Output:
File is opened successfully
- Once all the operations on the file are done, we must close it.
The close() method
●
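The close() method closes an open file.
●
Any unwritten (buffered) information is written to the file
when the close() method is called on a file object. After
that, the file object can no longer be read or written.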
●
It is good practice to close the file once all the operations
are done.
●
The syntax to use the close() method is
fileobject.close()
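The effect of close() can be sketched as follows. The file name demo.txt and the temporary directory are assumptions made only so that the example runs anywhere; the closed attribute of a file object reports whether the file has been closed.

```python
import os
import tempfile

# Hypothetical file inside a temporary directory, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

f = open(path, "w")
f.write("hello")     # the text may still be held in a buffer here
print(f.closed)      # the file is still open
f.close()            # writes out any buffered text and releases the file
print(f.closed)      # the file is now closed

f = open(path, "r")
saved = f.read()     # the text written before close() is in the file
f.close()
print(saved)
```

Once f.close() has run, calling f.write() or f.read() on the same file object raises a ValueError.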
Reading the file
●
The read() method reads a string from the file.
●
It can read data from files opened in text mode as well as in binary mode.
●
The syntax of the read() method is given below.
fileobj.read(<count>)
●
count is the number of characters (bytes, in binary mode) to
be read from the file, starting from the current position in
the file.
●
If the count is not specified, read() returns the content of
the file up to the end.
Example
fileptr = open("f1.txt", "r")    #open the file in read mode
content = fileptr.read()    #stores the data into a variable
#reads the full file as count is not specified
print(type(content))    #print the type of the value returned by read()
print(content)    #print the contents
fileptr.close()    #closing the file
Output
<class 'str'>
#file f1.txt
1 aaa
2 bbb
3 ccc
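The count argument of read() can be sketched as follows. The sample file is created inside a temporary directory so that the example is self-contained; that location is an assumption of this sketch, not part of the notes.

```python
import os
import tempfile

# Build a small sample file with the same three lines as f1.txt.
path = os.path.join(tempfile.mkdtemp(), "f1.txt")
f = open(path, "w")
f.write("1 aaa\n2 bbb\n3 ccc\n")
f.close()

f = open(path, "r")
first = f.read(5)     # the first 5 characters
second = f.read(3)    # the next 3 characters, continuing where the last read stopped
rest = f.read()       # no count: everything that is left
f.close()

print(repr(first))
print(repr(second))
print(repr(rest))
```

Each call continues from where the previous one stopped, so the three pieces together make up the whole file.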
Read Lines of the file
●
Python allows us to read the file line by line by
using the readline() method.
●
The readline() method reads one line per call, starting
from the beginning of the file,
- i.e., if we use the readline() method two times,
then we can get the first two lines of the file.
Example
#program to open the file in read mode and read it line by line
#using readline()
fileptr = open("f1.txt", "r")
content = fileptr.readline()    #stores the first line into a variable
print(content, end="")    #the line already ends with \n
content = fileptr.readline()    #reads the second line
print(content, end="")
fileptr.close()    #closing the file
Output
1 aaa
2 bbb
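A minimal sketch of what readline() returns at the end of the file: once no lines are left, it returns the empty string, which can be used to stop a loop. The sample file and its location are assumptions made for this sketch.

```python
import os
import tempfile

# Create a two-line sample file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "f1.txt")
f = open(path, "w")
f.write("1 aaa\n2 bbb\n")
f.close()

f = open(path, "r")
lines = []
line = f.readline()
while line != "":          # "" means the end of the file was reached
    lines.append(line)     # each line keeps its trailing \n
    line = f.readline()
f.close()

print(lines)
```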
Looping through the file
●
By looping through the lines of the file, we can read the
whole file.
Example
fileptr = open("f1.txt", "r")    #open file in read mode
for i in fileptr:
    print(i, end="")    #each line already ends with \n
fileptr.close()    #close the file
Output
1 aaa
2 bbb
3 ccc
Writing into a file
●
To write some text to a file, we need to open the file using
the open() function with one of the following access modes.
1) a: It will append to the existing file.
The file pointer is at the end of the file.
It creates a new file if no such file exists.
2) w: It will overwrite the file if the file exists.
The file pointer is at the beginning of the file.
It creates a new file if no such file exists.
Example : Use of mode ‘w’
fp=open('newFile.txt','w')
fp.write("Example for w mode") #write into the file
fp.close() #close the file
fp=open("newFile.txt","r") #display the contents
for i in fp:
    print(i)
fp.close()
Output
Example for w mode
Example : Use of mode ‘a’
fptr = open("newFile.txt", "a")    #open the file in append mode
fptr.write("\n how r u ")    #append to the file
fptr.close()    #close the file
#read the file newFile.txt to check whether the text was appended
fp = open("newFile.txt", "r")
for i in fp:
    print(i)
fp.close()
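The difference between the two modes can be sketched side by side. The file is placed in a temporary directory, which is an assumption made so that the example does not touch existing files.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "newFile.txt")

fp = open(path, "w")     # creates the file
fp.write("first\n")
fp.close()

fp = open(path, "w")     # 'w' again: the old content is overwritten
fp.write("second\n")
fp.close()

fp = open(path, "a")     # 'a': the new text goes after the old content
fp.write("third\n")
fp.close()

fp = open(path, "r")
contents = fp.read()
fp.close()
print(contents)
```

Only the text written by the second 'w' and by the 'a' survives; the line written first is lost.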
**********************************************************************
Note: Creating a new file
●
The new file can be created by using one of the following
access modes with the function open().
-- a: It creates a new file with the specified name if no such file
exists.
It appends the content to the file if the file already exists with
the specified name.
-- w: It creates a new file with the specified name if no such file
exists.
It overwrites the existing file.
Example (same as writing into a file)
So to create a file,
●
First, use open() in 'w' or 'a' mode
Example: f = open("myfile.txt", 'w')
●
Second, write into the file using write()
Example: f.write("Hello \n How are you")
●
Finally use close() to close the file.
Example : f.close()
Writing Numbers to a Text file
●
The file method write expects a string as an argument.
●
Therefore, other types of data, such as integers or
floating-point numbers, must first be converted to strings
before being written to an output file.
●
In Python, the values of most data types can be converted
to strings by using the str function.
●
The resulting strings are then written to a file with a space
or a newline as a separator character.
●
Example: Writing integers from 0 to 99 into a text file
f = open("integers.txt", 'w')
for count in range(100):
    f.write(str(count) + '\n')
f.close()
●
If you do not convert the number into a string, write raises a
TypeError.
●
Now the file integers.txt will contain the integers 0 to 99, one
per line.
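A minimal sketch of the conversion rule: passing a number directly to write fails, while the str version succeeds. The file name numbers.txt and the temporary directory are assumptions of this sketch.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "numbers.txt")
f = open(path, "w")

try:
    f.write(3.14)                 # not a string, so write refuses it
    error_raised = False
except TypeError as err:
    error_raised = True
    print("write refused the float:", err)

f.write(str(3.14) + "\n")         # converted first, so this works
f.write(str(42) + "\n")
f.close()

f = open(path, "r")
text = f.read()
f.close()
print(text)
```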
Note: Reading Numbers from a Text file
●
All of the file input operations return data to the program as
strings.
●
If these strings represent other types of data, such as
integers or floating-point numbers, the programmer must
convert them to the appropriate types before manipulating
them further.
●
In Python, the string representations of integers and
floating-point numbers can be converted to the numbers
themselves by using the functions int and float,
respectively.
●
When reading data from a file, another important
consideration is the format of the data items in the file.
●
When the numbers were written, they may have been separated by newline characters.
●
During input, these data can be read with a simple for
loop.
●
This loop accesses a line of text on each pass.
●
To convert this line to the integer contained in it, the
programmer runs the string method strip to remove the
newline, and then runs the int function to obtain the
integer value.
●
Example: Finding sum of numbers in a text file
f = open("integers.txt", 'r')
Sum = 0
for line in f:
    line = line.strip()    # removes the newline char
    number = int(line)     # convert string to integer
    Sum = Sum + number
f.close()
print("The sum is", Sum)
●
Obtaining numbers from a text file in which they are
separated by spaces is a bit trickier.
●
One method proceeds by reading lines in a for loop, as
before.
●
But each line now can contain several integers separated
by spaces.
●
You can use the string method split to obtain a list of the
strings representing these integers, and then process each
string in this list with another for loop.
●
Example:
f = open("integers.txt", 'r')
Sum = 0
for line in f:
    wordlist = line.split()    # list of the number strings on this line
    for word in wordlist:
        number = int(word)
        Sum = Sum + number
f.close()
print("The sum is", Sum)
●
Note that the line does not have to be stripped of the
newline, because split takes care of that automatically.
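The same pattern works for floating-point numbers by using float instead of int. The sketch below assumes a small sample file, created in a temporary directory, in which the numbers are separated by spaces and newlines.

```python
import os
import tempfile

# Sample data: several numbers per line, plus one empty line.
path = os.path.join(tempfile.mkdtemp(), "floats.txt")
f = open(path, "w")
f.write("1.5 2.5\n3.0\n\n4 5.25\n")
f.close()

f = open(path, "r")
total = 0.0
for line in f:
    for word in line.split():     # split() yields no words for an empty line
        total = total + float(word)
f.close()

print("The sum is", total)
```

Because split with no argument treats any run of spaces, tabs, or newlines as a separator, blank lines and extra spaces need no special handling.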
**************************************************************
Manipulating Files and Directories
●
The file system of a computer allows you to create folders
or directories, within which you can organize files and
other directories.
●
The complete set of directories and files forms a tree-like
structure, with a single root directory at the top and
branches down to nested files and subdirectories.
●
The following shows a portion of a file system, with
directories named lambertk, parent, current, sibling, and
child.
●
Each of the last four directories contains a distinct file
named Myfile.txt.
●
When you launch Python, either from the terminal or from
IDLE, the shell is connected to a current working directory.
●
At any point during the execution of a program, you can
open a file in this directory just by using the file’s name.
●
However, you can also access any other file or directory
within the computer’s file system by using a pathname.
●
A file’s pathname specifies the chain of directories needed
to access a file or directory.
●
When the chain starts with the root directory, it’s called an
absolute pathname.
●
When the chain starts from the current working directory, it’s
called a relative pathname.
●
An absolute pathname consists of one or more directory
names, separated by the '/' character (on Unix-based
systems and macOS) or the '\' character (on Windows-based
systems).
●
The root directory is the leftmost name and the target
directory or file name is the rightmost name.
●
An absolute pathname on Unix-based systems must begin
with the '/' character, and
with a disk drive letter in Windows-based systems.
●
If you write a pathname in a Python string, you must escape
each '\' character with another '\' character (or prefix the
string with r to make it a raw string).
●
For example, on a macOS file system, if Users is the root
directory above lambertk in the previous figure, then
/Users/lambertk/parent/current/child/Myfile.txt
is the absolute path to the file named Myfile.txt in the child
directory.
●
On the C: drive of a Windows file system, the same
pathname would be
C:\Users\lambertk\parent\current\child\Myfile.txt
●
Now we can use an absolute pathname to open a file
anywhere in the file system.
Example:
f = open("/Users/lambertk/parent/current/child/Myfile.txt", 'r')
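The escaping rule for Windows pathnames can be sketched without opening any file. The raw string literal (the r prefix) used here is a standard Python alternative to doubling each backslash.

```python
# Both literals describe the Windows pathname given in the text.
escaped = "C:\\Users\\lambertk\\parent\\current\\child\\Myfile.txt"
raw = r"C:\Users\lambertk\parent\current\child\Myfile.txt"

print(escaped)                 # each \\ in the source is one backslash in the string
print(escaped == raw)          # the two spellings give the same string
print(escaped.count("\\"))     # number of separators in the pathname
```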
●
You can abbreviate a path by providing a relative pathname.
●
Pathnames to files in directories below the current working
directory begin with a subdirectory name and are completed
with names and separator symbols on the way to the target
filename.
●
Paths to items in the other parts of the file system require you to
specify a move “up” to one or more ancestor directories, by
using the .. symbol between the separators.
●
To open the file named Myfile.txt in the child, parent, and sibling
directories, where current is the current working directory, you
could use relative pathnames as follows:
childFile = open("child/Myfile.txt", 'r')
parentFile = open("../Myfile.txt", 'r')
siblingFile = open("../sibling/Myfile.txt", 'r')
-- Note that relative pathnames do not begin with the separator
symbol.
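The relative pathnames above can be tried out with the following sketch. It rebuilds the parent, current, sibling, and child directories inside a temporary directory (an assumption made so that the example runs anywhere) and uses os.makedirs and os.chdir from the os module to set things up.

```python
import os
import tempfile

# Recreate the directory layout from the text under a temporary root.
root = tempfile.mkdtemp()
current = os.path.join(root, "parent", "current")
os.makedirs(os.path.join(current, "child"))
os.makedirs(os.path.join(root, "parent", "sibling"))

# Put a distinct Myfile.txt in each of the four directories.
for relpath in [".", "child", "..", "../sibling"]:
    f = open(os.path.join(current, relpath, "Myfile.txt"), "w")
    f.write("I live in " + relpath)
    f.close()

old_cwd = os.getcwd()
os.chdir(current)              # make current the working directory
contents = {}
for name in ["Myfile.txt", "child/Myfile.txt",
             "../Myfile.txt", "../sibling/Myfile.txt"]:
    f = open(name, "r")        # relative pathname, resolved from current
    contents[name] = f.read()
    f.close()
    print(name, "->", contents[name])
os.chdir(old_cwd)              # restore the original working directory
```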
Note:
●
When designing Python programs that interact with files,
it’s a good idea to include error recovery.
●
For example, before attempting to open a file for input, the
programmer should check to see if a file with the given
pathname exists on the disk.
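One way to make this check is sketched below, using os.path.exists from the os module. The helper name read_text and the file names are hypothetical and exist only for this example.

```python
import os
import tempfile

def read_text(pathname):
    """Return the text of the file, or None if no such file exists."""
    if not os.path.exists(pathname):
        print("No such file:", pathname)
        return None
    f = open(pathname, "r")
    text = f.read()
    f.close()
    return text

folder = tempfile.mkdtemp()
present = os.path.join(folder, "present.txt")
missing = os.path.join(folder, "missing.txt")

f = open(present, "w")
f.write("some data")
f.close()

found = read_text(present)
not_found = read_text(missing)
print(found, not_found)
```

Another common approach is to call open() inside a try statement and handle the FileNotFoundError that it raises when the file does not exist.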
OS and SYS Modules
●
The os and sys modules provide numerous tools to deal
with filenames, paths and directories.
●
The os module gives access to two other modules: os.sys
(which is simply the sys module, normally imported directly
with import sys) and os.path. They are dedicated to the
system and to pathnames and directories, respectively.
●
These modules are wrappers for platform-specific
modules,
--> so functions like os.path.split work on UNIX, Windows,
Mac OS, and any other platform supported by Python.
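A minimal sketch of this platform independence: os.path.join builds a pathname with the separator of the platform the code runs on, and os.path.split takes it apart again. The directory and file names are only illustrative.

```python
import os

# Build a pathname with the platform's own separator.
pathname = os.path.join("parent", "current", "Myfile.txt")
print(pathname)

# Split it into the directory part and the final name.
directory, filename = os.path.split(pathname)
print(directory)
print(filename)

# basename and dirname return the two parts individually.
print(os.path.basename(pathname))
print(os.path.dirname(pathname))
```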
The os Module:
●
Provides us with functions that are involved in directory and file
processing operations.
●
Manipulating Directories:
a) getcwd() - returns the current working directory as a string
(in Python 2, getcwdu() returned it in unicode format; Python 3
does not have getcwdu()).
-- The current directory can be changed using chdir()
-- syntax is os.getcwd()
Example:
>>> import os
>>> print(os.getcwd())
output : /home/casp
b) chdir() - changes the current working directory to path
- syntax is os.chdir(path)
c) listdir() - returns the contents of a directory.
- syntax is os.listdir()
d) mkdir() - creates a new directory
- syntax is os.mkdir("directory name")
e) rmdir() - deletes an empty directory
- syntax is os.rmdir("directory name")
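The directory functions can be combined as in the following sketch. It works inside a temporary directory, and the directory name reports is hypothetical.

```python
import os
import tempfile

start = os.getcwd()            # remember where we started
base = tempfile.mkdtemp()      # an empty scratch directory

os.chdir(base)                 # move into the scratch directory
os.mkdir("reports")            # create a new directory in it
after_mkdir = os.listdir()     # contents of the current directory
os.rmdir("reports")            # works because reports is empty
after_rmdir = os.listdir()
os.chdir(start)                # go back to the original directory

print(after_mkdir)
print(after_rmdir)
```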
●
Manipulating files
a) remove() - removes a file
- syntax is os.remove("filename")
b) rename() - renames a file
- syntax is os.rename("current-name", "new-name")
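A short sketch of rename() and remove(), again inside a temporary directory. The names draft.txt and final.txt are hypothetical.

```python
import os
import tempfile

base = tempfile.mkdtemp()
old_name = os.path.join(base, "draft.txt")
new_name = os.path.join(base, "final.txt")

f = open(old_name, "w")
f.write("some text")
f.close()

os.rename(old_name, new_name)      # draft.txt becomes final.txt
after_rename = os.listdir(base)

os.remove(new_name)                # delete the file
after_remove = os.listdir(base)

print(after_rename)
print(after_remove)
```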
●
User id and processes
a) os.getuid() - returns the current process’s user id.
b) os.getgid() - returns the current process’s group id.
c) os.geteuid() and os.getegid() - return the effective user
id and effective group id
(the four functions above are available on Unix-like systems only)
d) os.getpid() - returns the current process id
e) os.getppid() - returns the parent’s process id
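These calls can be tried as follows. Since the user id and group id functions exist only on Unix-like systems, the sketch checks for them with hasattr before calling them.

```python
import os

pid = os.getpid()
ppid = os.getppid()
print("process id:", pid)
print("parent process id:", ppid)

# getuid and its relatives are missing on Windows, so check first.
if hasattr(os, "getuid"):
    print("user id:", os.getuid(), "effective user id:", os.geteuid())
    print("group id:", os.getgid(), "effective group id:", os.getegid())
else:
    print("user and group ids are not available on this platform")
```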
sys Module
●
When starting a Python shell, Python provides 3 file objects
called standard input, standard output and standard error.
●
These are accessible via the sys module.
sys.stderr, sys.stdin, sys.stdout
●
sys.argv is the list of command-line arguments given when your
module is run as a program; sys.argv[0] is the program's name.
●
sys.path is a list that tells you where Python is searching for
modules on your system.
- syntax is sys.path
●
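sys.platform returns the platform identifier (e.g., linux in
Python 3; it was linux2 in Python 2)
●
sys.version returns the Python version as a string
●
sys.version_info returns the version as a named tuple (major,
minor, micro, releaselevel, serial)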
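The sys attributes listed above can be inspected with a short sketch. The values printed depend on the machine and on how the program is started, so no particular output is assumed here.

```python
import sys

# The standard streams are ordinary file objects with a write method.
sys.stdout.write("written to standard output\n")
sys.stderr.write("written to standard error\n")

# sys.argv[0] is the program name; any further items are user arguments.
script_name = sys.argv[0] if sys.argv else ""
extra_args = sys.argv[1:]
print("program:", script_name, "arguments:", extra_args)

print("platform:", sys.platform)
print("version:", sys.version_info.major, sys.version_info.minor)
print("number of module search directories:", len(sys.path))
```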
*******************************************************************