LECTURE – VI
BY: ABHISHEK BHARDWAJ
PGT- COMPUTER SCIENCE
Introduction to File Handling
A File in itself is a bunch of bytes stored on some storage device
like hard-disk, thumb-drive etc.
Need of Data Files:-
Program → Execute → Output
1. When we write a program and run it, we see the output only
temporarily, and we assume that the program has executed
successfully.
2. Once we close IDLE, the output is gone until we repeat
Step 1 from the execution part (Run).
3. If we consider the example of a banking system, this
practice is not acceptable, because every piece of data is
important.
From the previous example, it is clear that we must store the
output of our program.
Program → Execute → Output → STORE (in a Database or in Data Files)
We may store data pertaining to a specific application in
databases or in data files for later use. First we will discuss
storing data in data files.
Data file handling is about how we store our data in files.
Data Files
Data files can be stored in two ways:
1. Text Files
2. Binary Files
Text Files :- A text file stores information in the form of a stream
of ASCII or Unicode characters (whichever is the default for the
programming platform).
In text files, each line of text is terminated (delimited) with a
special character known as the EOL (End of Line) character.
Some internal translation takes place when this EOL character
is read or written (e.g. when we press Enter, the input area moves
to the next line).
In Python, by default, the EOL character is the newline character
('\n'). On some platforms (e.g. Windows) it is stored on disk as the
carriage-return (moving the cursor to the beginning of the line)
and newline combination ('\r\n').
The text files can be of following types :
Regular Text files : These are the text files which store the text in
the same form as typed. Here EOL is translated and ends a line.
File Extension .txt
Delimited Text files : A specific character is stored to separate the
values. e.g. a tab (TSV- tab separated value files) or a comma
(CSV- comma separated value file) after every value.
Regular text file content : I am simple text.
TSV file content : I<tab>am<tab>simple<tab>text. (<tab> stands for one tab character)
CSV file content : I, am, simple, text.
NOTE : Some setup files (e.g. Initialization .INI files) and rich text format files
(.RTF files) are also text files.
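As a small illustration of delimited text, the sketch below splits one TSV line and one CSV line into their values. The two sample lines are made up for this example and simply mirror the "I am simple text." content above.

```python
# Hypothetical sample lines; "\t" is the tab character used by TSV files.
tsv_line = "I\tam\tsimple\ttext."
csv_line = "I,am,simple,text."

# split() cuts the string wherever the delimiter occurs.
tsv_values = tsv_line.split("\t")
csv_values = csv_line.split(",")

print(tsv_values)
print(csv_values)
```

Both lines hold the same four values; only the delimiter differs.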
Binary Files : A binary file stores information in the form of a
stream of bytes.
It holds the information in the same format in which the
information is held in memory.
File contents are raw (no translation and no specific
encoding).
There is no delimiter (a blank space, comma, or other character or
symbol that indicates the beginning or end of a character string,
word, or data item) for a line.
As no translation occurs in binary files, they are faster and
easier for a program to read and write than text files.
Text files can be opened in any text editor and are in
human-readable form, while binary files are not in
human-readable form.
Difference between Text Files and Binary Files
Ser | Text Files | Binary Files
1. | Stores information in the form of a stream of ASCII or Unicode characters. | Stores information in the form of a stream of bytes.
2. | Each line of text is terminated (delimited) with a special character known as the EOL (End of Line) character. | No delimiter for a line.
3. | Some internal translation takes place when the EOL character is read or written. | As no translation occurs in binary files, they are faster and easier for a program to read and write than text files.
4. | Text files can be opened in any text editor and are in human-readable form. | Binary files are not in human-readable form.
Working With Data Files
The most basic file manipulation tasks include adding,
modifying or deleting data in a file.
Any one or combination of operations may be performed : -
Reading data from files
Writing data to files
Appending data to files
NOTE : In order to work with a file first open it in a specific mode.
File Access Modes
Text File Mode | Binary File Mode | Description | Notes
'r' | 'rb' | read only | Default mode. File must already exist, otherwise Python will raise FileNotFoundError.
'w' | 'wb' | write only | If the file does not exist, it is created. If the file exists, its existing data is truncated, so this mode must be used with caution.
'a' | 'ab' | append | File is in write-only mode. If the file exists, its data is retained and new data is appended to the end. If the file does not exist, it is created.
'r+' | 'r+b' or 'rb+' | read and write | File must exist, otherwise an error is raised. Both reading and writing operations can take place.
'w+' | 'w+b' or 'wb+' | write and read | If the file does not exist, it is created. If the file exists, it is truncated. Both reading and writing operations can take place.
'a+' | 'a+b' or 'ab+' | write and read | If the file does not exist, it is created. If the file exists, its data is retained and new data is appended. Both reading and writing operations can take place.
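The behaviour described in the table can be checked with a short sketch like the one below. The file name demo.txt and the temporary folder are assumptions made only for this illustration.

```python
import os
import tempfile

# Work inside a throwaway folder so that no real file is touched.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.txt")   # hypothetical file name

    # 'r' on a file that does not exist raises FileNotFoundError.
    try:
        open(path, "r")
        missing_error = None
    except FileNotFoundError as err:
        missing_error = type(err).__name__

    # 'w' creates the file (or truncates it if it already exists).
    f = open(path, "w")
    f.write("first\n")
    f.close()

    # 'a' keeps the old data and adds the new data at the end.
    f = open(path, "a")
    f.write("second\n")
    f.close()

    # 'r+' allows reading and writing on an existing file.
    f = open(path, "r+")
    after_append = f.read()
    f.close()

    # Opening with 'w' again discards the earlier contents.
    f = open(path, "w")
    f.write("third\n")
    f.close()

    f = open(path, "r")
    after_truncate = f.read()
    f.close()

print(missing_error)
print(repr(after_append))
print(repr(after_truncate))
```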
Opening and Closing Files
A file is opened with the open() function, using one of the following syntaxes :-
<file_objectname> = open(<filename>)
<file_objectname> = open(<filename>, <mode>)
e.g. myfile = open("student.txt")
A file object, also known as a file handle, is a reference to a file
on disk. open() opens the file and makes it available for different tasks.
[Python will look for this file in the current working directory (usually
the directory in which we store our program or module file)]
The opened file is attached to its file object, e.g. myfile (file object).
The default mode of an opened file is read mode (or we may pass the
mode as "r" for read mode).
NOTE : In read mode, the given file must exist in the folder,
otherwise Python will raise FileNotFoundError.
<file_objectname>=open(<filename>, <mode>)
myfile = open("student.txt", "r")
myfile1 = open("student.txt", "w")
myfile2 = open("e:\\main\\student.txt", "w")
Path : Python will look in the \main folder of the E: drive.
myfile3 = open(r"e:\main\student.txt", "r")
A double backslash (\\) stands for one literal backslash. The prefix r in front of a
string makes it a raw string, which means no special meaning is attached to the backslash.
f = open("c:\temp\data.txt", "r") In this example \t will be treated as a tab character, so the path is wrong.
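The effect of \\ and of the r prefix can be seen without opening any file. The sketch below only compares three spellings of the same short path; the path c:\temp is just an example.

```python
plain = "c:\temp"      # \t is read as ONE tab character
escaped = "c:\\temp"   # \\ stands for one literal backslash
raw = r"c:\temp"       # raw string: the backslash has no special meaning

print(len(plain), len(escaped), len(raw))
print("\t" in plain)      # the tab character is really in there
print(escaped == raw)     # both spell the same path
```

The plain string is one character shorter because the two characters \ and t were merged into a single tab.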
File objects are used to read and write data to a file on disk.
The file object is used to obtain a reference to the file on disk
and open it for a number of different tasks.
All the functions we perform on a data file are performed
through file-objects.
File mode governs the type of operations (e.g.
read/write/append) possible on the opened file, i.e. it refers to
how the file will be used once it is opened.
The close() method is used to close a file. In Python, files are
automatically closed at the end of the program, but it is good
practice to close files explicitly, because if the program exits
unexpectedly there is a danger that data may not have been
written to the file.
The close() method breaks the link between the file object and the
file on the disk. After close(), no tasks can be performed on that
file through the file object (file handle).
<file_object>.close()
e.g. myfile3.close()
NOTE : open() is a built-in function (used stand-alone), while
close() is a method used with a file-handle object.
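A minimal sketch of the open/close cycle is given below. The temporary folder and the name written to the file are assumptions for illustration; the closed attribute of the file object tells whether the link to the file is still active.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "student.txt")

    myfile = open(path, "w")        # open() is a built-in function
    myfile.write("Asha\n")          # hypothetical data
    closed_before = myfile.closed   # False: the file is still open
    myfile.close()                  # close() is a method of the file object
    closed_after = myfile.closed    # True: the link is broken

    # After close(), the file object can no longer be used for reading or writing.
    try:
        myfile.write("Ravi\n")
        error_name = None
    except ValueError as err:
        error_name = type(err).__name__

print(closed_before, closed_after, error_name)
```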
Working with Text Files
Reading from Text Files
Ser | Method | Syntax | Description
1. | read() | <fileobject>.read([n]) | Reads at most n bytes (characters, for a text file); if n is not specified, reads the entire file. Returns the data read in the form of a string.
2. | readline() | <fileobject>.readline([n]) | Reads a line of input; if n is specified, reads at most n bytes. Returns the data read in the form of a string ending with the newline ('\n') character, or returns an empty string if nothing is left to read in the file.
3. | readlines() | <fileobject>.readlines() | Reads all lines and returns them in a list.
Reading a file's first 30 bytes and printing it.
[Slide images: text file ssc.txt, Code Snippet 1, Code Snippet 2, Output]
Code Snippet 1 : used if ssc.txt is in the same folder as the program file.
Code Snippet 2 : used if ssc.txt is stored in some other drive/ location.
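The slide's snippets are not available in this text, so here is a minimal sketch of the same task. The contents of ssc.txt are not known; the sample text below is a stand-in, and the file is created in a temporary folder so that the example runs anywhere.

```python
import os
import tempfile

# Stand-in contents; the real ssc.txt from the slide is not reproduced here.
sample = "Python file handling is easy.\nSecond line of text.\n"

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "ssc.txt")
    with open(path, "w") as f:
        f.write(sample)

    myfile = open(path, "r")
    first_part = myfile.read(30)   # at most 30 characters from the start
    myfile.close()

print(first_part)
```

If the file is stored elsewhere, only the path passed to open() changes, e.g. a raw string with the full drive and folder.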
Reading n bytes and then reading more bytes from
the last position read.
[Slide images: text file ssc.txt, Code Snippet, Output]
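A possible version of this program is sketched below, again with stand-in contents for ssc.txt. The second read() continues from where the first one stopped.

```python
import os
import tempfile

sample = "Python file handling is easy.\n"   # stand-in contents

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "ssc.txt")
    with open(path, "w") as f:
        f.write(sample)

    myfile = open(path, "r")
    part1 = myfile.read(6)   # first 6 characters
    part2 = myfile.read(5)   # next 5 characters, from the last position read
    myfile.close()

print(part1)
print(part2)
```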
Reading a file's entire content.
[Slide images: text file ssc.txt, Code Snippet, Output]
Reading a file's first three lines, line by line.
[Slide images: text file ssc.txt, Code Snippet, Output]
Reading a complete file, line by line.
[Slide images: text file ssc.txt, Code Snippet, Output]
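Both line-by-line tasks can be sketched with readline() as follows. The four sample lines are made up; the loop relies on readline() returning an empty string at the end of the file.

```python
import os
import tempfile

sample = "Line one\nLine two\nLine three\nLine four\n"   # stand-in contents

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "ssc.txt")
    with open(path, "w") as f:
        f.write(sample)

    # First three lines only.
    myfile = open(path, "r")
    first_three = []
    for _ in range(3):
        first_three.append(myfile.readline())
    myfile.close()

    # Complete file, one line at a time.
    myfile = open(path, "r")
    all_lines = []
    line = myfile.readline()
    while line:                      # '' (empty string) means end of file
        all_lines.append(line)
        line = myfile.readline()
    myfile.close()

print(first_three)
print(len(all_lines))
```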
Displaying the size of a file after removing EOL (\n)
characters, leading and trailing white spaces and blank lines.
[Slide images: text file ssc.txt, Code Snippet, Output]
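One way to do this is sketched below, assuming that "size" means the number of characters left after stripping. The sample contents, with extra spaces and blank lines, are invented for the example.

```python
import os
import tempfile

# Stand-in contents with leading/trailing spaces and blank lines.
sample = "  Line one  \n\nLine two\n   \nLine three\n"

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "ssc.txt")
    with open(path, "w") as f:
        f.write(sample)

    myfile = open(path, "r")
    total = 0
    for line in myfile:
        stripped = line.strip()   # drops '\n' and surrounding white space
        if stripped:              # blank lines become '' and are skipped
            total += len(stripped)
    myfile.close()

print("Size after stripping:", total)
```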
Reading a complete file in a List.
[Slide images: text file ssc.txt, Code Snippet, Output]
NOTE : read() and readline() read bytes and return them in a string.
readlines() reads lines and returns them in a list.
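A short sketch of readlines() follows, using stand-in contents for ssc.txt. Each list element is one line and still carries its '\n'.

```python
import os
import tempfile

sample = "Line one\nLine two\nLine three\n"   # stand-in contents

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "ssc.txt")
    with open(path, "w") as f:
        f.write(sample)

    myfile = open(path, "r")
    lines = myfile.readlines()   # the whole file as a list of strings
    myfile.close()

print(lines)
```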
Write a program to display the size of a file in bytes.
[Slide images: text file ssc.txt, Code Snippet, Output]
Write a program to display the number of lines in the file.
[Slide images: text file ssc.txt, Code Snippet, Output]
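The two programs might look like the sketch below. It opens the file in 'rb' mode to count bytes, which is a choice made here so that the count matches the size on disk on every platform; the slide's own snippet may differ. The sample contents are a stand-in.

```python
import os
import tempfile

sample = "Line one\nLine two\nLine three\n"   # stand-in contents

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "ssc.txt")
    with open(path, "w") as f:
        f.write(sample)

    # Size in bytes: read the raw bytes and count them.
    myfile = open(path, "rb")
    size_in_bytes = len(myfile.read())
    myfile.close()
    disk_size = os.path.getsize(path)   # size reported by the operating system

    # Number of lines: readlines() returns one list element per line.
    myfile = open(path, "r")
    line_count = len(myfile.readlines())
    myfile.close()

print("Size of file:", size_in_bytes, "bytes")
print("Number of lines:", line_count)
```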
Working with Text Files
Writing onto Text Files
Ser | Method | Syntax | Description
1. | write() | <fileobject>.write(str) | Writes string str to the file referred to by <fileobject>.
2. | writelines() | <fileobject>.writelines(L) | Writes all strings in list L to the file referred to by <fileobject>. No newline characters are added automatically.
Create a file to hold data of 5 student names.
[Slide images: Code Snippet, Output]
Create a file to hold data of 5 names separated as lines.
[Slide images: Code Snippet, Output]
Creating a file with some names separated by newline
characters without using write() function.
[Slide images: Code Snippet, Output]
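The sketch below covers the line-separated variants with write() and with writelines(). The five names are hypothetical and are hard-coded here instead of being read with input(), so that the example runs without user interaction.

```python
import os
import tempfile

names = ["Asha", "Ravi", "Meena", "John", "Sara"]   # hypothetical names

with tempfile.TemporaryDirectory() as folder:
    # Using write(): the '\n' must be added by us.
    path1 = os.path.join(folder, "student.txt")
    fileout = open(path1, "w")
    for name in names:
        fileout.write(name + "\n")
    fileout.close()

    # Using writelines(): pass a list of strings that already end in '\n'.
    path2 = os.path.join(folder, "student2.txt")
    fileout = open(path2, "w")
    fileout.writelines([name + "\n" for name in names])
    fileout.close()

    filein = open(path1, "r")
    content1 = filein.read()
    filein.close()

    filein = open(path2, "r")
    content2 = filein.read()
    filein.close()

print(content1)
```

Leaving out the "\n" would store all five names on a single line.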
The flush() Function
When we write to a file using any of the write functions,
Python holds the data in a buffer and pushes it onto the
actual file on the storage device at a later time.
The flush() function can be used to force Python to write the
contents of the buffer onto storage.
Python automatically flushes the file buffers when closing a file,
i.e. this function is implicitly called by the close() function.
With flush(), however, we can push the data to storage before
closing the file.
The syntax to use flush() function is:
<fileobject>.flush()
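A small sketch of flush() is shown below. The file name notes.txt is an assumption. The size before flush() is often 0 because the text may still be sitting in the buffer, but that value depends on buffering details, so only the size after flush() should be relied on.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")   # hypothetical file name

    fileout = open(path, "w")
    fileout.write("buffered text\n")
    size_before_flush = os.path.getsize(path)  # often 0: data may still be buffered

    fileout.flush()                            # force the buffer onto storage
    size_after_flush = os.path.getsize(path)   # data is now in the file

    fileout.close()                            # close() would have flushed anyway

print(size_before_flush, size_after_flush)
```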
Write a program to get roll numbers, names and marks of
the students of a class (prompt user) and store these details
in a file called "Marks.txt".
[Slide images: Code Snippet, inputs given by the user at runtime, Output]
NOTE : In this program, values are separated by commas. This is
called CSV (Comma Separated Values) format.
Write a program to add two more students' details to the
file "Marks.txt" created in the last program.
[Slide images: Code Snippet, inputs given by the user at runtime, Output]
NOTE : Open the file in append ("a") mode, as the old data must be
retained.
Write a program to display the contents of the file "Marks.txt"
created in the last two programs.
[Slide images: Code Snippet, Input File, Output]
NOTE : The extra blank line between lines of output appears because
print() adds its own newline after the '\n' already present in each
line. To avoid this, use end="" in the print() function.
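The three Marks.txt programs can be sketched together as below. The student records are hypothetical and are taken from lists rather than from input(), so the example runs unattended; in the classroom version each value would come from the user.

```python
import os
import tempfile

# Hypothetical records: (roll number, name, marks).
first_batch = [(1, "Asha", 78.5), (2, "Ravi", 64.0)]
second_batch = [(3, "Meena", 91.0), (4, "John", 55.5)]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "Marks.txt")

    # Program 1: create the file ("w") and store comma separated values.
    fileout = open(path, "w")
    for roll, name, marks in first_batch:
        fileout.write(str(roll) + "," + name + "," + str(marks) + "\n")
    fileout.close()

    # Program 2: append ("a") so that the old data is retained.
    fileout = open(path, "a")
    for roll, name, marks in second_batch:
        fileout.write(str(roll) + "," + name + "," + str(marks) + "\n")
    fileout.close()

    # Program 3: display the contents line by line.
    filein = open(path, "r")
    shown = []
    line = filein.readline()
    while line:
        print(line, end="")   # end="" avoids a blank line after each record
        shown.append(line)
        line = filein.readline()
    filein.close()
```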
Read, Write and Search CSV (Comma Separated Value) Files
CSV files are delimited files that store tabular data (data stored
in rows and columns).
The separator character of CSV files is called a delimiter. The default
and most popular delimiter is the comma.
Others are tab (\t), colon (:), pipe (|) and semi-colon (;)
characters.
Since CSV files are text files, we may apply text file procedures
to them and then split values using the split() function, but
Python's csv module handles CSV files more conveniently.
The csv module of Python provides functionality to read and
write tabular data in CSV format.
It provides two specific types of objects, the reader and writer
objects, to read from and write into CSV files.
Why CSV files are popular?
Easier to create.
Preferred export and import format for databases and
spreadsheets.
Capable of storing large amounts of data.
Opening and Closing CSV Files:
obj = open("student.csv", "w")    CSV file opened in write mode with the file handle obj
fobj = open("student.csv", "r")   CSV file opened in read mode with the file handle fobj
obj.close()                       CSV file is closed in the same manner as any other file.
Writing in CSV files.
ROLE OF THE CSV WRITER OBJECT
Input User Data → writerow() → csv.writer object (in memory) → Delimited Data → CSV File on storage disk
The csv.writer object converts the user data into CSV-writable form, i.e.
a delimited string as per the csv settings.
<writerobject>.writerow() is used to write onto the writer object.
FUNCTIONS
csv.writer() | returns a writer object which writes data into a CSV file
<writerobject>.writerow() | writes one row of data onto the writer object
<writerobject>.writerows() | writes multiple rows of data onto the writer object
Reading in CSV files.
ROLE OF THE CSV READER OBJECT
CSV File on storage disk → csv.reader object (in memory) → Iterable → one row of data at a time
The csv.reader object parses the delimited CSV file data and loads it into
an iterable. A loop fetches one row at a time from the reader object.
FUNCTION
csv.reader() | returns a reader object which loads data from a CSV file into an iterable after parsing the delimited data
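The writer and reader objects described above can be tried with a sketch like the following. The rows are hypothetical, and the file is opened with newline="" as the csv module's documentation recommends. Note that every value comes back from the reader as a string.

```python
import csv
import os
import tempfile

# Hypothetical tabular data: a header row and two records.
rows = [["Roll_No", "Name", "Total_Marks"], [1, "Asha", 92], [2, "Ravi", 75]]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "student.csv")

    fobj = open(path, "w", newline="")
    writer = csv.writer(fobj)       # writer object tied to the open file
    writer.writerow(rows[0])        # one row
    writer.writerows(rows[1:])      # several rows in one call
    fobj.close()

    fobj = open(path, "r", newline="")
    reader = csv.reader(fobj)       # iterable that yields one row at a time
    read_back = []
    for row in reader:
        read_back.append(row)       # each row is a list of strings
    fobj.close()

print(read_back)
```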
Python Program to Write, Read and Search into CSV file
[Slide image: program code with numbered callouts 1 to 8]
1. If you use with open(), there is no need to call close() at the end
of the program.
2. fobj is the file handle/ object/ pointer for the opened file.
3, 5. writerow() takes only one argument, so pass a list to enter
multiple values.
4. True means the loop will execute until we terminate it.
6. break terminates the loop.
7. next() skips the first line, so searching starts from line no. 2.
The first line ('Roll_No', 'Name', 'Total_Marks') is skipped because it
has no number to compare in the condition i[2]>=90.
8. i[2] checks index no. 2, i.e. value no. 3 in the list.
NOTE : Additional parameter newline=''
The csv.writer writes its own line terminator ('\r\n' by default) into
the file directly, so open the file with the additional parameter
newline='' (empty string).
If we do not use this extra parameter, then on some platforms (e.g.
Windows) the CSV file will show every alternate line blank (data
in the first row, second row blank, data in the third row, fourth
row blank and so on).
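Since the program on the slide is an image, the following is only a sketch of a write-then-search program along the lines of the callouts above. The records are hypothetical and hard-coded instead of being entered in a while True loop, and the marks are converted with int() before comparing because the reader returns strings.

```python
import csv
import os
import tempfile

records = [[1, "Asha", 92], [2, "Ravi", 75], [3, "Meena", 95]]   # hypothetical data

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "student.csv")

    # with open() closes the file automatically at the end of the block.
    with open(path, "w", newline="") as fobj:
        writer = csv.writer(fobj)
        writer.writerow(["Roll_No", "Name", "Total_Marks"])   # header line
        for rec in records:
            writer.writerow(rec)     # one list per row

    toppers = []
    with open(path, "r", newline="") as fobj:
        reader = csv.reader(fobj)
        next(reader)                 # skip the header line
        for i in reader:
            if int(i[2]) >= 90:      # i[2] is the third value, read as a string
                toppers.append(i)

print(toppers)
```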
Python Program to write records of students and search the
CSV file by the roll no. given by the user.
[Slide image: program code]
Python Program to write records of students and search the
CSV file to print the record of the student having 'MAX' marks.
[Slide image: program code]
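Both searches can be sketched in one pass over the file, as below. The records and the wanted roll number are assumptions; in the classroom version the roll number would come from input().

```python
import csv
import os
import tempfile

records = [[1, "Asha", 92], [2, "Ravi", 75], [3, "Meena", 95]]   # hypothetical data
wanted = "2"   # roll number to search for, kept as a string like the CSV values

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "student.csv")

    with open(path, "w", newline="") as fobj:
        writer = csv.writer(fobj)
        writer.writerow(["Roll_No", "Name", "Total_Marks"])
        writer.writerows(records)

    found = None   # record with the wanted roll number
    best = None    # record with the highest marks seen so far
    with open(path, "r", newline="") as fobj:
        reader = csv.reader(fobj)
        next(reader)                                   # skip the header line
        for row in reader:
            if row[0] == wanted:
                found = row
            if best is None or int(row[2]) > int(best[2]):
                best = row

print("Record found:", found)
print("Record with MAX marks:", best)
```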
Binary Files in Python
A binary file stores information in the form of a stream of bytes.
There is no delimiter for a line.
As no translation occurs in binary files, they are faster
and easier for a program to read and write than text
files.
Binary files are not in human-readable form.
As data in binary files is stored as a stream of bytes, it is
necessary to store non-simple objects like dictionaries,
tuples and lists in such a way that their structure/ hierarchy is
maintained.
For this purpose, objects are often serialized and then
stored in binary files.
Structure (List/ Dictionary) → Pickling / Serialisation → Byte Stream
Byte Stream → Unpickling / De-Serialisation → Structure (List/ Dictionary)
“The pickle module implements a fundamental, but powerful algorithm
for serializing and de-serializing a Python object structure.”
Pickling and Unpickling
Pickling/ Serialisation : is the process of converting Python
object hierarchy into a byte stream so that it can be written
into a file.
Unpickling/ De-Serialisation : is the inverse of Pickling where
a byte stream is converted into an object hierarchy.
Unpickling produces the exact replica of the original object.
In order to work with the pickle module, import it.
import pickle
pickle.dump(Structure, file_object)        dump() : to write to a binary file.
Structure_var = pickle.load(file_object)   load() : to read from a binary file.
Working with pickle module
Process of working with binary files :
(i) Import pickle module.
(ii) Open binary file in the required file mode (read or write
mode).
(iii) Process binary file by writing/ reading objects using
pickle module’s methods.
(iv) Once done, close the file.
Writing into Binary File (Structure : List)
[Slide images: Program Code, Output (file saved in the directory where the Python program was saved)]
Reading in Binary File (Structure : List)
[Slide images: Program Code, Output]
Reading and Writing into Binary File (Structure : Dictionary)
[Slide images: Program Code, Output]
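The four steps of working with the pickle module can be sketched for a list and a dictionary as follows. The data values and the file name data.dat are assumptions made for illustration.

```python
import os
import pickle                      # step (i): import the pickle module
import tempfile

marks = [78, 91, 64]                                   # hypothetical list
student = {"roll": 1, "name": "Asha", "marks": 92}     # hypothetical dictionary

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "data.dat")

    fout = open(path, "wb")        # step (ii): open in binary write mode
    pickle.dump(marks, fout)       # step (iii): pickle the list
    pickle.dump(student, fout)     # ...and then the dictionary
    fout.close()                   # step (iv): close the file

    fin = open(path, "rb")         # binary read mode
    marks_back = pickle.load(fin)      # objects come back in the order written
    student_back = pickle.load(fin)
    fin.close()

print(marks_back)
print(student_back)
```

The loaded objects are equal to the originals but are separate copies.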
Reading, Writing (Multiple Records) & Searching into Binary
File (Structure : Nested List)
[Slide images: program code continued over three slides, Output]
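The original program is not available in this text, so the sketch below shows one simple way to do it: the whole nested list is pickled as a single object and searched after loading. The records, the roll number searched for and the file name student.dat are all assumptions.

```python
import os
import pickle
import tempfile

# Hypothetical nested list: one inner list per student record.
records = [[1, "Asha", 92], [2, "Ravi", 75], [3, "Meena", 95]]
wanted_roll = 2   # would normally be entered by the user

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "student.dat")

    fout = open(path, "wb")
    pickle.dump(records, fout)     # the nested list is stored as one object
    fout.close()

    fin = open(path, "rb")
    loaded = pickle.load(fin)      # the same structure comes back
    fin.close()

    match = None
    for rec in loaded:             # search record by record
        if rec[0] == wanted_roll:
            match = rec
            break

print(match)
```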
Setting Offsets in a File
The functions (read/ write) which we have used till now
are used to access the data sequentially from a file.
But if we want to access data in a random fashion, then
Python gives us seek() and tell() functions to do so.
tell() : This function returns an integer that specifies the
current position of the file object in the file.
The position so specified is the byte position from the
beginning of the file till the current position of the file
object.
The syntax of using tell() is: file_object.tell()
seek(): This method is used to position the file object at
a particular position in a file.
Syntax: file_object.seek(offset [, reference_point])
(reference_point is also called from_what; the Python documentation
names it whence)
offset is the number of bytes by which the file object is to be
moved.
reference_point indicates the position from which the offset is
counted:
0 - Beginning of the file
1 - Current position in the file
2 - End of the file
By default, the value of reference_point is 0, i.e. the
offset is counted from the beginning of the file.
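A short sketch of tell() and seek() on a text file is given below, using stand-in contents for ssc.txt. It uses only the default reference point (the beginning of the file) and jumps either to 0 or to a position that tell() returned earlier.

```python
import os
import tempfile

sample = "Python file handling\n"   # stand-in contents

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "ssc.txt")
    with open(path, "w") as f:
        f.write(sample)

    myfile = open(path, "r")
    start = myfile.tell()        # position right after opening
    first = myfile.read(6)       # read() moves the position forward
    after_read = myfile.tell()   # position after reading 6 characters

    myfile.seek(0)               # jump back to the beginning of the file
    again = myfile.read(6)

    myfile.seek(after_read)      # jump to the position saved earlier
    rest = myfile.read(5)
    myfile.close()

print(start, after_read)
print(first, again, rest)
```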
Program to know the Position of your File Pointer
[Slide images: five variations, each with text file ssc.txt, Code Snippet, Output]
Program to Position your File Pointer at Specified Location in
File
[Slide images: text file ssc.txt, Code Snippet with callouts 1 and 2, Output]
1. By default, read mode keeps the file pointer at the start of the
file.
2. The default value of from_what/ reference point is also 0, which
also keeps the file pointer at the start of the file.
Difference between read() and seek()
A read() call reads the specified number of bytes from
a file.
A read() call also advances the position by the number of
bytes it read.
Seeking is the equivalent of just scrolling the bar to
whatever position you want.
It doesn't read anything in between during the jump.
Program to Position your File Pointer at Specified Location in
File
[Slide images: three variations, each with text file ssc.txt, Code Snippet, Output]
NOTE : While opening the text file, mode 'rb' has been used in place
of 'r', because in Python 3 a file opened in text mode raises
io.UnsupportedOperation if a non-zero offset is used with any
reference_point other than the default 0.
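The note above can be checked with a sketch like this one. The file contents are a stand-in for ssc.txt and are written as bytes so that positions are the same on every platform.

```python
import io
import os
import tempfile

sample = b"Python file handling"   # stand-in contents, 20 bytes

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "ssc.txt")
    with open(path, "wb") as f:
        f.write(sample)

    myfile = open(path, "rb")      # binary mode: all reference points allowed
    myfile.seek(7)                 # 7 bytes from the beginning (reference point 0)
    myfile.seek(5, 1)              # 5 bytes forward from the current position
    pos_current = myfile.tell()
    word = myfile.read(8)

    myfile.seek(-8, 2)             # 8 bytes back from the end of the file
    pos_end = myfile.tell()
    tail = myfile.read()
    myfile.close()

    # In text mode the same relative seek is rejected.
    myfile = open(path, "r")
    try:
        myfile.seek(5, 1)
        text_error = None
    except io.UnsupportedOperation as err:
        text_error = type(err).__name__
    myfile.close()

print(pos_current, word)
print(pos_end, tail)
print(text_error)
```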
Program to Position your File Pointer at Specified Location in
File
[Slide images: four more variations, each with text file ssc.txt, Code Snippet, Output]