File Handling Notes
A file is a named location on disk that stores related information. It is used to store data permanently in
non-volatile memory (e.g. a hard disk).
Since random access memory (RAM) is volatile and loses its data when the computer is turned off,
we use files to keep data for future use.
When we want to read from or write to a file we need to open it first. When we are done, it needs
to be closed, so that resources that are tied with the file are freed.
Data File Operations:
In Python, a file operation takes place in the following order.
1. Open a file
2. Read or write (perform operation)
3. Close the file
File Types:
Python allows us to create and manage two types of files:
Text File
Binary File
Text Files: A text file consists of a sequence of lines. A line is a sequence of characters (ASCII or
UNICODE), stored on permanent storage media. In text file, each line is terminated by a special
character, known as End of Line (EOL).
Binary Files: Binary files are used to store binary data such as images, video files, audio files etc. In a
binary file, there is no delimiter for a line. Also, no character translation is carried out: bytes are read
and written exactly as they are stored. As a result, binary files are generally faster than text files for
reading and writing operations on data.
Opening and Closing Files:
To handle data files in Python, we need a file variable, also called a file object or file handle. In
Python 3 this object is created with the built-in open() function (the file() function existed only in Python 2).
open() - Opening a File
The open() function takes the name of the file as the first argument. The second argument indicates
the mode of accessing the file; if it is omitted, the file is opened for reading ("r").
The syntax for open is:
file_object = open(file_name, access_mode)
close() - Closing a File
The close() method of a file object flushes any unwritten information and closes the file object,
after which no more reading or writing can be done.
The syntax for close is:
file_object.close()
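The open, operate, close order described above can be sketched as follows. This is only an illustration: the file name notes_demo.txt and the use of the system temporary directory are assumptions made for the example, and the try/finally block is one common way to make sure close() runs even if the write fails.

```python
import os
import tempfile

# Hypothetical scratch file placed in the system temp directory (assumption for this sketch).
path = os.path.join(tempfile.gettempdir(), "notes_demo.txt")

# 1. Open the file, 2. write to it, 3. close it even if the write fails.
file_object = open(path, "w")
try:
    file_object.write("hello\n")
finally:
    file_object.close()

# Reopen in the default read mode ("r") to check what was stored.
file_object = open(path)
try:
    content = file_object.read()
finally:
    file_object.close()

print(content)
print("Is File Closed:", file_object.closed)

# Remove the scratch file so the example leaves nothing behind.
os.remove(path)
```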
Various properties of File Object
Once open() is successful and file object gets created, we can retrieve various details related to that
file using its associated properties:
a. name:- Name of the opened file
b. mode:- Mode in which the file gets opened
c. closed:- Boolean value, which indicates whether the file is closed or not.
d. readable():- method (note the parentheses) that returns a Boolean value, which indicates whether the file is readable or not.
Program to use the above properties:
file1=open("student.txt","r")
print("File Name: ",file1.name)
print("File Mode: ",file1.mode)
print("Is File Readable: ",file1.readable())
print("Is File Closed: ",file1.closed)
file1.close()
print("Is File Closed: ",file1.closed)
File Modes
A file can be opened in any of the following modes:
Mode   Description
r      Opens a file for reading only. The pointer is placed at the beginning of the file.
       This is the default mode. If the specified file does not exist, it will generate
       FileNotFoundError.
rb     Opens a file for reading only in binary format. The pointer is placed at the
       beginning of the file.
r+     Opens a file for both reading and writing. The file pointer will be at the
       beginning of the file. The file must already exist.
rb+    Opens a file for both reading and writing in binary format. The file pointer will
       be at the beginning of the file. The file must already exist.
w      Opens a file for writing only. The pointer is placed at the beginning of the file.
       Overwrites the file if the file exists. If the file does not exist, creates a new file
       for writing.
wb     Opens a file for writing only in binary format. The pointer is placed at the
       beginning of the file. Overwrites the file if the file exists. If the file does not
       exist, creates a new file for writing.
w+     Opens a file for both writing and reading. The pointer is placed at the
       beginning of the file. Overwrites the existing file if the file exists. If the file does
       not exist, creates a new file for writing and reading.
wb+    Opens a file for both writing and reading in binary format. The pointer is
       placed at the beginning of the file. Overwrites the existing file if the file exists.
       If the file does not exist, creates a new file for writing and reading.
a      Opens a file for appending. The pointer is placed at the end of the file if the
       file exists. If the file does not exist, it creates a new file for writing.
ab     Opens a file for appending in binary format. The pointer is placed at the end of
       the file if the file exists. If the file does not exist, it creates a new file for
       writing.
a+     Opens a file for both appending and reading. The file pointer is placed at the
       end of the file if the file exists. If the file does not exist, it creates a new file for
       reading and writing.
ab+    Opens a file for both appending and reading in binary format. The file pointer
       is placed at the end of the file if the file exists. If the file does not exist, it
       creates a new file for reading and writing.
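The difference between the "w", "a" and "r" modes can be seen in a small experiment. The sketch below is illustrative only: the file name modes_demo.txt and the temporary directory are assumptions chosen for the example.

```python
import os
import tempfile

# Hypothetical file inside a fresh temporary directory (assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# "w" creates the file because it does not exist yet.
f = open(path, "w")
f.write("first\n")
f.close()

# "a" keeps the existing content and writes at the end of the file.
f = open(path, "a")
f.write("second\n")
f.close()

f = open(path, "r")
after_append = f.read()
f.close()

# Opening with "w" again discards the old content before writing.
f = open(path, "w")
f.write("third\n")
f.close()

f = open(path, "r")
after_rewrite = f.read()
f.close()

# "r" on a missing file raises FileNotFoundError instead of creating the file.
try:
    open(path + ".missing", "r")
    missing_error = False
except FileNotFoundError:
    missing_error = True

print(repr(after_append))
print(repr(after_rewrite))
print("Error for missing file:", missing_error)
```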
Reading from a file
Python provides various methods for reading data from a file.
a. read():- To read the entire data from the file (from the current position to the end).
b. read(n):- To read at most the next 'n' characters from the current position. On a freshly opened file these are the first 'n' characters.
c. readline():- To read only one line from the file. It returns the line read as a string, including the trailing \n.
d. readlines():- To read all lines from the file into a list. This method returns a list of strings,
each ending with \n (except possibly the last one).
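The four reading methods listed above can be compared on one small file. This is a minimal sketch; the file name read_demo.txt, its three-line content and the temporary directory are assumptions made for the example.

```python
import os
import tempfile

# Hypothetical three-line file in a temporary directory (assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")
f = open(path, "w")
f.write("alpha\nbeta\ngamma\n")
f.close()

f = open(path, "r")
first_three = f.read(3)       # next 3 characters from the start of the file
rest_of_line = f.readline()   # remainder of the first line, including "\n"
remaining = f.readlines()     # every remaining line, as a list of strings
at_end = f.read()             # nothing is left, so an empty string is returned
f.close()

print(repr(first_three))
print(repr(rest_of_line))
print(remaining)
print(repr(at_end))
```

Each call continues from where the previous one stopped, which is why read(3) followed by readline() returns only the rest of the first line.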
Writing to File
We can write character data into a file in Python using the following methods:
a. write(string)
b. writelines(sequence of lines)
write(): This method writes a string to an open file and returns the number of characters
written. This method does not add a newline character ('\n') to the end of the string.
Program to create a file demo2.txt using write() function.
file1=open("demo2.txt","w")
file1.write("This is the first line\n")
file1.write("This is the second line\n")
file1.write("This is the third line\n")
file1.write("This is the fourth line\n")
file1.close()
writelines() :- The write() method writes a single string at a time; it can't be used for writing a
list, tuple etc. into a file. A sequence of strings (list, tuple etc.) can be written in one call using the
writelines() method. Like write(), it does not add newline characters, so each string must carry its own '\n'.
Program to write data to a file using writelines() method
file1=open("demo3.txt","w")
list1=["Physics\n","Chemistry\n","Maths\n","Computer Science\n"]
file1.writelines(list1)
file1.close()
Here we created a list and passed that list to the writelines() function.
With statement: Apart from calling open() and close() explicitly, the with statement can also be
used to open a file; it closes the file automatically.
The above program can be written using with block as:
with open("demo4.txt","w") as file1:
    list1=["Physics\n","Chemistry\n","Maths\n","Computer Science"]
    file1.writelines(list1)
    print("is the file closed",file1.closed)    # False: still inside the block
print("is the file closed",file1.closed)        # True: the block has ended
Here we don't need to close the file ourselves: the file is open inside the "with" block and is
closed automatically as soon as the block ends.
6. Write a program that reads characters from the keyboard one by one. All lowercase characters
get stored inside the file LOWER, all upper case characters get stored inside the file UPPER and all
other characters get stored inside the file OTHERS.
file1=open("lower.txt","a")
file2=open("upper.txt","a")
file3=open("other.txt","a")
ch='y'
while ch=='y' or ch=='Y':
    data=input("Enter character")
    if data.islower():
        file1.write(data)
    elif data.isupper():
        file2.write(data)
    else:
        file3.write(data)
    ch=input("Press y/Y for continue")
file1.close()
file2.close()
file3.close()
7: Write a function in Python that counts the number of “Me” or “My” words present in a text
file “STORY.TXT”.
If the “STORY.TXT” contents are as follows:
My first book was Me and My Family. It gave me chance to be Known to the world.
The output of the function should be:
Count of Me/My in file: 3
def displayMeMy():
    count=0
    file1=open("story.txt","r")
    for line in file1:
        words=line.split()
        for i in words:
            # Exact, case-sensitive match: "me" is not counted
            if i=="Me" or i=="My":
                count+=1
    file1.close()
    print("Count of Me/My in file:",count)
displayMeMy()
8: Write a function in Python that counts the number of “Me” and “My” words present in a text
file “STORY.TXT” separately.
If the “STORY.TXT” contents are as follows:
My first book was Me and My Family. It gave me chance to be Known to the world.
The output of the function should be:
Count of Me in file: 1
Count of My in file: 2
def displayMeMy():
    count1=count2=0
    file1=open("story.txt","r")
    for line in file1:
        words=line.split()
        for i in words:
            if i=="Me":
                count1+=1
            elif i=="My":
                count2+=1
    file1.close()
    print("Count of Me in file:",count1)
    print("Count of My in file:",count2)
displayMeMy()
9: Write a function AMCount() in Python, which should read each character of a text file
STORY.TXT, should count and display the occurrence of alphabets A and M (including small cases a
and m too).
Example:
If the file content is as follows:
Updated information
As simplified by official websites.
The AMCount() function should display the output as:
A or a: 4
M or m: 2
def AMCount():
    count1=count2=0
    file1=open("story.txt","r")
    for line in file1:
        words=line.split()
        for word in words:
            # Check every character of every word
            for j in word:
                if j=="A" or j=='a':
                    count1+=1
                elif j=="M" or j=='m':
                    count2+=1
    file1.close()
    print("A or a:",count1)
    print("M or m:",count2)
AMCount()
RANDOM ACCESS IN FILES USING TELL() AND SEEK()
Till now in all our programs we laid stress on the sequential processing of data in a text and binary
file. But files in Python allow random access of the data as well using built-in methods seek() and
tell().
seek()—seek() function is used to change the position of the file handle (file pointer) to a given
specific position. File pointer is like a cursor, which defines from where the data has to be read or
written in the file.
Syntax: f.seek(offset, from_what), where f is the file object
Parameters:
offset: Number of positions (bytes) to move; a negative offset moves backward.
from_what: It defines the point of reference. This argument is optional.
Returns: The new absolute position.
The reference point is defined by the "from_what" argument. It can have any of the 3 values:
0: sets the reference point at the beginning of the file, which is the default (absolute file positioning).
1: sets the reference point at the current file position.
2: sets the reference point at the end of the file
Note: Reference point at current position / end of file cannot be set in text mode except when
offset is equal to 0. To use a non-zero offset with 1 or 2, open the file in binary mode (e.g. "rb").
For example:
f.seek(20) moves the file pointer to the 20th byte in the file, no matter where it currently is, and returns the new position.
f.seek(-10,1) from current position, move 10 bytes backward
f.seek(10,1) from current position, move 10 bytes forward
f.seek(-20,1) from current position, move 20 bytes backward
f.seek(10,0) from beginning of file, move 10 bytes forward
tell()—tell() returns the current position of the file read/write pointer within the file.
Its syntax is:
f.tell() #where f is file pointer
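The behaviour of seek() and tell() can be traced on a small file. The sketch below is illustrative: the file name seek_demo.bin and its 20-byte content are assumptions, and the file is opened in binary mode so that offsets relative to the current position and to the end of the file are allowed.

```python
import os
import tempfile

# Hypothetical 20-byte file in a temporary directory (assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), "seek_demo.bin")
f = open(path, "wb")
f.write(b"0123456789ABCDEFGHIJ")
f.close()

f = open(path, "rb")
start = f.tell()          # a freshly opened file starts at position 0

f.seek(10)                # absolute move: 10 bytes from the beginning
after_abs = f.tell()
chunk = f.read(3)         # reading 3 bytes advances the pointer to 13

f.seek(-5, 1)             # relative move: 5 bytes back from position 13
after_rel = f.tell()

new_pos = f.seek(-4, 2)   # 4 bytes before the end; seek() returns the new position
tail = f.read()
f.close()

print(start, after_abs, chunk, after_rel, new_pos, tail)
```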
Object serialization is the process of converting the state of an object into a byte stream. This byte
stream can further be stored in any file-like object such as a disk file or memory stream.
With the pickle module, Python refers to serialization and deserialization by the terms pickling and unpickling respectively.
Python provides three different modules which allow us to serialize and deserialize objects:
1. Marshal Module
2. Pickle Module
3. JSON Module
1. Marshal Module: It is the oldest module among these three. It is mainly used to read and write
the compiled byte code of Python modules. We can even use marshal to serialize Python objects,
but this is not recommended. It is mainly used by the interpreter, and the official
documentation warns that the Python maintainers may modify the format in backward-incompatible ways.
2. Pickle Module: It is another way to serialize and deserialize Python objects. It serializes the Python
object in a binary format, due to which it is not human-readable. It is fast and it also works with
custom-defined objects. The Python pickle module is a better choice for serialization and
deserialization of Python objects. If you don't need a human-readable format or if you need to
serialize custom objects then it is recommended to use the pickle module.
3. JSON Module: It is the newest of the three modules. It allows us to work with standard JSON files. JSON is
a widely used format for data exchange and it is very convenient. It is human-readable and
language-independent, and it's lighter than XML. Using the JSON module we can serialize and
deserialize several standard Python types like bool, dict, int, float, list, str, tuple, None etc. (a tuple is
written as a JSON array and is read back as a list). JSON and XML are good choices if we want
interoperability among different languages.
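The difference between the pickle and JSON formats can be seen by round-tripping the same object through both. This is a minimal sketch: the record dictionary and its keys are made up for the example, and the in-memory functions dumps()/loads() are used here instead of the file-based dump()/load().

```python
import json
import pickle

# Hypothetical record used only for this illustration.
record = {"item_no": 1, "name": "Pen", "prices": (10, 12), "in_stock": True, "note": None}

# pickle produces bytes (not human-readable) and restores the exact Python types.
pickled = pickle.dumps(record)
from_pickle = pickle.loads(pickled)

# json produces readable text; the tuple is written as an array and comes back as a list.
as_json = json.dumps(record)
from_json = json.loads(as_json)

print(type(pickled), from_pickle["prices"])
print(as_json)
print(from_json["prices"])
```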
import pickle
def Add_Item():
    # Collect item records from the user and save them as one pickled list
    file1=open("stock.dat","wb")
    rec=[]
    while True:
        ItemNo=int(input("enter ItemNo"))
        Item_Name=input("enter Item_Name")
        CostPrice=int(input("enter CostPrice"))
        SellingPrice=int(input("enter SellingPrice"))
        rec.append([ItemNo,Item_Name,CostPrice,SellingPrice])
        choice=input('Press y to continue ')
        if choice=="y":
            continue
        else:
            break
    pickle.dump(rec,file1)
    file1.close()
    print("data is saved")
Add_Item()
def display():
    # Print every record stored in stock.dat
    file1=open("stock.dat","rb")
    while True:
        try:
            data=pickle.load(file1)
            for i in data:
                print(i)
        except EOFError:
            # No more pickled objects in the file
            break
    file1.close()
def Count_Item(Item_Name):
    # Count and print the records whose name matches Item_Name
    file1=open("stock.dat","rb")
    count=0
    while True:
        try:
            data=pickle.load(file1)
            for i in data:
                if Item_Name==i[1]:
                    count+=1
                    print(i)
        except EOFError:
            break
    file1.close()
    if count==0:
        print("No Item found")
    else:
        print("No of items found",count)
Item_Name=input("enter Item_Name")
Count_Item(Item_Name)
Relative and Absolute Paths
Files are kept in directories, which are also known as folders.
Every running program has a current (working) directory. When a file is given by a bare name or a
relative path, Python looks for it in this directory.
The os module provides many functions which can be used to work with files and directories. OS
means Operating System. os.getcwd() returns the name of the current directory.
import os
print(os.getcwd())
# cwd stands for current working directory.
A Relative path starts from the current directory, whereas an absolute path starts from the
topmost directory in the file system.
Example:
Create a file myfile.txt with some data on desktop.
Now create a folder named python on desktop.
Now write a python program, saved and run inside the python folder, to read the data from the file myfile.txt.
file1=open("myfile.txt","r")
for data in file1:
    print(data)
# It will show an error:
# FileNotFoundError: [Errno 2] No such file or directory: 'myfile.txt'
Here the path is relative, so Python looks for myfile.txt in the current directory and not on the desktop.
Now again change the same program with the absolute path.
file1=open("C:\\Users\\navjeet ji\\Desktop\\myfile.txt","r")
for data in file1:
    print(data)
# It will print the data.
NOTE: The Python program and the external file must be in the same directory, else we will need
to enter the entire path.
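How a relative path is resolved against the current directory can be checked with the os.path functions. The sketch below is illustrative only: the folder name data and the file name myfile.txt are assumptions, and no file is actually opened.

```python
import os

# The directory against which relative paths are resolved.
cwd = os.getcwd()

# Hypothetical relative path, built in a way that works on any operating system.
relative_path = os.path.join("data", "myfile.txt")

# abspath() joins the relative path onto the current working directory.
absolute_path = os.path.abspath(relative_path)

print("Current directory:", cwd)
print("Relative path:", relative_path, "| absolute?", os.path.isabs(relative_path))
print("Absolute path:", absolute_path, "| absolute?", os.path.isabs(absolute_path))
```

Printing the absolute path before calling open() is a quick way to see where Python will actually look for the file.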
Python CSV
CSV (Comma Separated Values) is a simple flat file in a human readable format which is extensively
used to store tabular data, in a spreadsheet or database. A CSV file stores tabular data (numbers
and text) in plain text.
To represent a CSV file, it must be saved with the .csv file extension.
WHY USE CSV?
The widespread use of social networking sites and their associated applications
requires the handling of huge amounts of data. The problem arises as to how to handle and organize this
large unstructured data as shown in the figure given below.
The solution to the above problem is CSV. CSV organizes data into a structured, tabular form, so that
a large amount of data can be arranged properly and systematically. Since
CSV files are plain text, it is very easy for website developers to create
applications that implement CSV.
The several advantages that are offered by CSV files are as follows:
• CSV is faster to handle.
• CSV is smaller in size.
• CSV is easy to generate and import onto a spreadsheet or database.
• CSV is human readable and easy to edit manually.
• CSV is simple to implement and parse.
• CSV is processed by almost all existing applications.
CSV FILE HANDLING IN PYTHON
For working with CSV files in Python, there is a built-in module called csv. It is used to read and
write tabular data in CSV format.
Therefore, to perform read and write operations on a CSV file, we must import the csv module.
The csv module can handle CSV files correctly regardless of the operating system on which the files
were created.
Along with this module, the open() function is used to open a CSV file and return a file object. We load
the module in the usual way using import:
>>> import csv
Like other files (text and binary) in Python, there are two basic operations that can be carried out
on a CSV file.
1. Reading a CSV
2. Writing to a CSV.
writer(): This function in the csv module returns a writer object that converts data into a delimited
string and stores it in a file object. The function needs, as a parameter, a file object created with the
open() function and opened with write permission. The writer ends every row with its own line terminator,
so to prevent an additional blank line between rows, the file is opened with the newline parameter set to ''.
import csv
file1 = open("file_name.csv","w", newline='')
obj = csv.writer(file1)
writerow():
This function allows us to write a list of fields to the file as one row. The fields can be strings or numbers or both.
Also while using writerow(), you do not need to add a newline character to indicate the end of the
line.
obj.writerow(["SN", "SNAME", "SUBJECTS"])
writerows():
This function writes each sequence in a list as a comma separated line of items in the file.
obj.writerows(data)
reader():
This function returns a reader object which is an iterator of lines in the csv file. We can use a for
loop to display lines in the file. The file should be opened in 'r' mode.
import csv
file1=open('class12.csv','w',newline='')
obj=csv.writer(file1)
data=[]
header=['Roll No','Student Name','Class','Section']
data.append(header)
for i in range(3):
    rollno=input("enter your roll no")
    sname=input("enter your name")
    sclass=input("enter your class")
    section=input("enter your section")
    record=[rollno,sname,sclass,section]
    data.append(record)
for j in data:
    obj.writerow(j)
file1.close()
# Code to read CSV file
import csv
file1=open('student.csv', 'r')
data = csv.reader(file1)
# Each row is returned as a list of strings
for i in data:
    print(i)
file1.close()
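Writing and reading can be combined into one round trip to see what the csv module stores. This is a minimal sketch: the file name class12_demo.csv, the sample rows and the temporary directory are assumptions, and writerows() is used to write the whole list in one call.

```python
import csv
import os
import tempfile

# Hypothetical CSV file in a temporary directory (assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), "class12_demo.csv")

# Made-up sample data: a header row followed by two records.
rows = [
    ["Roll No", "Student Name", "Class", "Section"],
    [1, "Asha", 12, "A"],
    [2, "Ravi", 12, "B"],
]

# newline='' lets the csv module control line endings, so no blank rows appear.
with open(path, "w", newline="") as file1:
    obj = csv.writer(file1)
    obj.writerows(rows)

# The reader yields one list per row; every field comes back as a string.
with open(path, "r", newline="") as file1:
    read_back = [row for row in csv.reader(file1)]

for row in read_back:
    print(row)
```

Numbers written to a CSV file are read back as strings, so they need to be converted with int() or float() before any arithmetic.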