LECTURE – VI
BY: ABHISHEK BHARDWAJ
PGT- COMPUTER SCIENCE
Introduction to File Handling
A File in itself is a bunch of bytes stored on some storage
device like hard-disk, thumb-drive etc.
Need of Data Files:
Program -> Execute -> Output
1. When we write a program and run it, we see the output only
temporarily and assume that the program has executed successfully.
2. Once we close IDLE, the output is lost; to see it again we must
repeat Step 1 from the execution part (Run).
3. In a banking system, for example, this practice is not acceptable,
because each and every piece of data is important.
Introduction to File Handling
From the previous example, it is clear that we must store the
output of our program.
Program -> Execute -> Output -> STORE (in a Database or in Data Files)
We may store data pertaining to a specific application in databases
or in data files for later use. First we will discuss storing
data in data files.
Data file handling means the way we store our data in files.
Data Files
Data files can be stored in two ways:
1. Text Files
2. Binary Files
Text Files :- A text file stores information in the form of a stream
of ASCII or Unicode characters (whichever is the default for the
programming platform).
In text files, each line of text is terminated (delimited) with
a special character known as the EOL (End of Line) character.
Some internal translation takes place when this EOL character is
read or written (e.g. when we press Enter, the current line ends and
the next line becomes our input area).
In Python, by default, the EOL character is the newline
character ('\n'). On some platforms a line ends with the
carriage-return, newline combination ('\r\n'), where carriage return
means moving the cursor to the beginning of the line.
Data Files
Text files can be of the following types:
Regular Text files : These are text files which store the text in
the same form as typed. Here EOL is translated and ends a line.
File extension: .txt
Delimited Text files : A specific character is stored to separate the
values, e.g. a tab (TSV, tab separated values files) or a comma
(CSV, comma separated values files) after every value.
Regular text file content : I am simple text.
TSV file content : I    am    simple    text. (values separated by tabs)
CSV file content : I, am, simple, text.
NOTE : Some setup files (e.g. initialization .INI files) and rich text format
files (.RTF files) are also text files.
Data Files
Binary Files : A binary file stores information in the form of a stream of
bytes.
It holds the information in the same format in which the
information is held in memory.
File contents are raw (no translation and no specific encoding).
There is no delimiter for a line (a delimiter is a blank space, comma,
or other character or symbol that indicates the beginning or end of a
character string, word, or data item).
As no translation occurs in binary files, they are faster
and easier for a program to read and write than text files.
Text files can be opened in any text editor and are
in human-readable form, while binary files are not in
human-readable form.
Difference between Text Files and Binary Files
Ser | Text Files | Binary Files
1. | A text file stores information in the form of a stream of ASCII or Unicode characters. | Stores the information in the form of a stream of bytes.
2. | Each line of text is terminated (delimited) with a special character known as the EOL (End of Line) character. | No delimiter for a line.
3. | Some internal translation takes place when the EOL character is read or written. | As no translation occurs in binary files, they are faster and easier for a program to read and write than text files.
4. | Text files can be opened in any text editor and are in human-readable form. | Binary files are not in human-readable form.
Working With Data Files
The most basic file manipulation tasks include adding,
modifying or deleting data in a file.
Any one or a combination of the following operations may be performed:
Reading data from files
Writing data to files
Appending data to files
NOTE : In order to work with a file, first open it in a specific mode.
File Access Modes
Text File Mode | Binary File Mode | Description | Notes
'r' | 'rb' | read only | Default mode. The file must already exist, otherwise Python raises FileNotFoundError (an I/O error).
'w' | 'wb' | write only | If the file does not exist, it is created. If the file exists, its existing data is truncated, so this mode must be used with caution.
'a' | 'ab' | append | The file is in write-only mode. If the file exists, its data is retained and new data is appended to the end. If the file does not exist, it is created.
'r+' | 'r+b' or 'rb+' | read and write | The file must exist, otherwise an error is raised. Both reading and writing operations can take place.
'w+' | 'w+b' or 'wb+' | write and read | If the file does not exist, it is created. If the file exists, it is truncated. Both reading and writing operations can take place.
'a+' | 'a+b' or 'ab+' | write and read | If the file does not exist, it is created. If the file exists, its data is retained and new data is appended. Both reading and writing operations can take place.
Opening and Closing Files
The open() function is used as per one of the following syntaxes:
<file_objectname> = open(<filename>)
<file_objectname> = open(<filename>, <mode>)
e.g. myfile = open("student.txt")
A file object, also known as a file handle, is a reference to a
file on disk. open() opens the file and makes it available for different tasks.
[Python looks for this file in the current working directory (usually
the directory in which we store our program or module file).]
The opened file is attached to its file object, e.g. myfile.
The default mode for opening files is read mode (we may also pass the
mode explicitly as "r").
NOTE: In read mode, the given file must exist in the folder,
otherwise Python will raise FileNotFoundError.
Opening and Closing Files
<file_objectname> = open(<filename>, <mode>)
myfile = open("student.txt", "r")
myfile1 = open("student.txt", "w")
myfile2 = open("e:\\main\\student.txt", "w")
Path : Python will look in the \main folder on the E: drive.
myfile3 = open(r"e:\main\student.txt", "r")
A double backslash (\\) stands for one literal backslash. The prefix r in
front of a string makes it a raw string, which means backslashes have no
special meaning in it.
f = open("c:\temp\data.txt", "r") In this example \t will be treated as a tab
character, so the path is not the one intended.
Opening and Closing Files
File objects are used to read and write data to a file on disk.
A file object holds a reference to the file on
disk and opens it for a number of different tasks.
All the operations we perform on a data file are
performed through file objects.
File mode governs the type of operations
(e.g. read/write/append) possible on the opened file, i.e. it
refers to how the file will be used once it is opened.
The close() method is used to close a file. In Python, files
are automatically closed at the end of the program, but it is
good practice to close files explicitly, because if the program
exits unexpectedly there is a danger that data may not have
been written to the file.
Opening and Closing Files
The close() method breaks the link between the file object and the file
on disk. After close(), no tasks can be performed on that
file through the file object (file handle).
<file_object>.close()
e.g. myfile3.close()
NOTE : open() is a built-in function (used stand-alone), while
close() is a method used with a file-handle object.
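The open/close steps above can be sketched as follows. This is only an illustration: the file is created in a temporary folder, and the records written to it are made up for the example.

```python
import os
import tempfile

# Hypothetical file inside a temporary folder, so no real file is overwritten.
path = os.path.join(tempfile.mkdtemp(), "student.txt")

# "w" creates the file (or truncates an existing one).
myfile = open(path, "w")
myfile.write("Roll 1: Asha\n")
myfile.close()                      # breaks the link between myfile and the file
closed_after_close = myfile.closed

# "a" keeps the old data and adds the new data at the end.
myfile = open(path, "a")
myfile.write("Roll 2: Ravi\n")
myfile.close()

# A with-block closes the file automatically when the block ends.
with open(path) as myfile:          # default mode is "r"
    content = myfile.read()
closed_after_with = myfile.closed

print(content)
```

The with form is usually preferred because the file is closed even if an error occurs inside the block.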
Working with Text Files
Reading from Text Files
Ser | Method | Syntax | Description
1. | read() | <fileobject>.read([n]) | Reads at most n bytes (characters, for a file opened in text mode); if n is not specified, reads the entire file. Returns the data read in the form of a string.
2. | readline() | <fileobject>.readline([n]) | Reads a line of input; if n is specified, reads at most n bytes. Returns the data read in the form of a string ending with the newline ('\n') character, or returns an empty string if no more data is left for reading in the file.
3. | readlines() | <fileobject>.readlines() | Reads all lines and returns them in a list.
Reading a file's first 30 bytes and printing it.
[Slide figures: text file ssc.txt, Code Snippet 1, Code Snippet 2, output]
Code Snippet 1: used if ssc.txt is in the same folder as the program file.
Code Snippet 2: used if ssc.txt is stored in some other drive/location.
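A minimal sketch of reading the first 30 characters is given here. The content of ssc.txt on the slide is not known, so the example writes its own stand-in file into a temporary folder. Because a full path is passed to open(), the same call works whether or not the file sits next to the program.

```python
import os
import tempfile

# Hypothetical stand-in for ssc.txt; the slide's real file content is unknown.
path = os.path.join(tempfile.mkdtemp(), "ssc.txt")
with open(path, "w") as f:
    f.write("Staff Selection Commission\nconducts examinations\nfor many posts.\n")

myfile = open(path, "r")
first30 = myfile.read(30)   # at most 30 characters, starting from the beginning
myfile.close()

print(first30)
```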
Reading n bytes and then reading more bytes from
the last position read.
Text File : ssc.txt Code Snippet
Output
Reading a file entire content.
Text File : ssc.txt Code Snippet
Output
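The two reading patterns above (reading n bytes and continuing from the last position, then reading everything that is left) can be sketched as follows, again with a made-up stand-in for ssc.txt.

```python
import os
import tempfile

# Hypothetical stand-in for ssc.txt.
full_text = "Staff Selection Commission\nconducts examinations\n"
path = os.path.join(tempfile.mkdtemp(), "ssc.txt")
with open(path, "w") as f:
    f.write(full_text)

myfile = open(path, "r")
part1 = myfile.read(5)    # first 5 characters
part2 = myfile.read(10)   # next 10 characters, continuing where part1 stopped
rest = myfile.read()      # no n given: everything that is left
myfile.close()

print(part1)
print(part2)
print(rest)
```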
Reading a file’s first three lines- line by line.
Text File : ssc.txt Code Snippet
Output
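One possible way to read the first three lines with readline() is shown here. The four-line sample file is an assumption made for the example.

```python
import os
import tempfile

# Hypothetical stand-in for ssc.txt.
path = os.path.join(tempfile.mkdtemp(), "ssc.txt")
with open(path, "w") as f:
    f.write("Line one\nLine two\nLine three\nLine four\n")

myfile = open(path, "r")
first_three = []
for _ in range(3):
    line = myfile.readline()    # one line, including its trailing '\n'
    first_three.append(line)
    print(line, end="")         # end="" avoids printing a second newline
myfile.close()
```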
Reading a complete file – line by line.
Text File : ssc.txt Code Snippet
Output
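A sketch of reading a whole file line by line follows. It relies on readline() returning an empty string only at the end of the file; a blank line inside the file is returned as '\n', so it does not stop the loop. The sample content is made up.

```python
import os
import tempfile

# Hypothetical stand-in for ssc.txt, with one blank line in the middle.
path = os.path.join(tempfile.mkdtemp(), "ssc.txt")
with open(path, "w") as f:
    f.write("Line one\n\nLine three\n")

myfile = open(path, "r")
collected = []
line = myfile.readline()
while line != "":               # "" means end of file
    collected.append(line)
    print(line, end="")
    line = myfile.readline()
myfile.close()
```

The shorter form `for line in myfile:` visits the same lines.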
Displaying the size of a file after removing EOL (\n)
characters, leading and trailing white spaces and blank lines.
Text File : ssc.txt Code Snippet
Output
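The size calculation described above might look like this. The sample text, with its extra spaces and blank lines, is an assumption chosen to show the effect of strip().

```python
import os
import tempfile

# Hypothetical stand-in for ssc.txt with padding and blank lines.
text = "  Staff Selection  \n\nCommission\n   \nSSC\n"
path = os.path.join(tempfile.mkdtemp(), "ssc.txt")
with open(path, "w") as f:
    f.write(text)

size = 0
with open(path, "r") as f:
    for line in f:
        stripped = line.strip()   # removes '\n' and leading/trailing whitespace
        if stripped:              # skip lines that are blank after stripping
            size += len(stripped)

raw_size = len(text)
print("Size with everything counted:", raw_size)
print("Size after stripping:", size)
```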
Reading a complete file in a List.
Text File : ssc.txt Code Snippet
NOTE : read() and readline() read bytes and return them as a string.
readlines() reads lines and returns them in a list.
Output
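A short sketch of reading a complete file into a list with readlines() is given below, using a made-up three-line file.

```python
import os
import tempfile

# Hypothetical stand-in for ssc.txt.
path = os.path.join(tempfile.mkdtemp(), "ssc.txt")
with open(path, "w") as f:
    f.write("Line one\nLine two\nLine three\n")

with open(path, "r") as f:
    lines = f.readlines()   # list of strings, each still ending with '\n'

print(lines)
```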
Write a program to display the size of a file in bytes.
Text File : ssc.txt Code Snippet
Output
Write a program to display the number of lines in the file.
Text File : ssc.txt Code Snippet
Output
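The last two exercises (size in bytes and number of lines) could be approached as follows. The sample file is written in binary mode so that its byte count is the same on every platform; its content is an assumption.

```python
import os
import tempfile

# Hypothetical stand-in for ssc.txt, written as raw bytes.
sample = b"Staff Selection Commission\nconducts examinations\nfor many posts.\n"
path = os.path.join(tempfile.mkdtemp(), "ssc.txt")
with open(path, "wb") as f:
    f.write(sample)

# Size in bytes: read the raw bytes and count them.
with open(path, "rb") as f:
    size_in_bytes = len(f.read())

# Number of lines: readlines() returns one list item per line.
with open(path, "r") as f:
    line_count = len(f.readlines())

print("Size in bytes:", size_in_bytes)
print("Number of lines:", line_count)
```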
Working with Text Files
Writing onto Text Files
Ser | Method | Syntax | Description
1. | write() | <fileobject>.write(str) | Writes string str to the file referred to by <fileobject>.
2. | writelines() | <fileobject>.writelines(L) | Writes all strings in list L as lines to the file referred to by <fileobject>.
Create a file to hold data of 5 student names.
Code Snippet
Output
Create a file to hold data of 5 names separated as lines.
Code Snippet
Output
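The two exercises above can be sketched with write(). The slide program presumably reads the names with input(); here five made-up names are hard-coded so the example runs without a user, and the file names are hypothetical.

```python
import os
import tempfile

names = ["Asha", "Ravi", "Meena", "Karan", "Zoya"]   # hypothetical names
folder = tempfile.mkdtemp()
run_together = os.path.join(folder, "student.txt")
one_per_line = os.path.join(folder, "student_lines.txt")

# write() adds nothing by itself, so the names run together.
with open(run_together, "w") as f:
    for name in names:
        f.write(name)

# Adding '\n' after each name puts every name on its own line.
with open(one_per_line, "w") as f:
    for name in names:
        f.write(name + "\n")

with open(run_together) as f:
    text1 = f.read()
with open(one_per_line) as f:
    text2 = f.read()
print(text1)
print(text2)
```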
Creating a file with some names separated by newline
characters without using write() function.
Code Snippet
Output
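A sketch of the same task with writelines() instead of write() follows. Note that writelines() does not add newline characters on its own, so each string in the list must already end with '\n'. The names are made up.

```python
import os
import tempfile

names = ["Asha", "Ravi", "Meena"]            # hypothetical names
lines = [name + "\n" for name in names]      # each item carries its own '\n'

path = os.path.join(tempfile.mkdtemp(), "student.txt")
with open(path, "w") as f:
    f.writelines(lines)                      # writes the strings one after another

with open(path) as f:
    written = f.read()
print(written)
```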
The flush() Function
When we write onto a file using any of the write
functions, Python holds the data to be written in a buffer
and pushes it onto the actual file on the storage device at a later time.
The flush() function can be used to force Python to write
the contents of the buffer onto storage.
Python automatically flushes the file buffers when closing files,
i.e. this function is implicitly called by the close() function.
Calling flush() explicitly lets us push the data out before the file is closed.
The syntax to use flush() function is:
<fileobject>.flush()
Write a program to get roll numbers, names and marks of
the students of a class (prompt user) and store these details
in a file called “Marks.txt”.
[Slide figures: code snippet, inputs given by the user at runtime, output]
NOTE : In this program the values are separated by commas.
This is called CSV (Comma Separated Values) format.
Write a program to add two more students’ details to the
file “Marks.txt” created in the last program.
[Slide figures: code snippet, inputs given by the user at runtime, output]
NOTE : Open the file in append ("a") mode, as the old data must be retained.
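Write a program to display the contents of the file “Marks.txt”
created in the last two programs.
[Slide figures: code snippet, input file, output]
NOTE : The extra blank line between the lines of output appears because
each line read from the file already ends with '\n' and print() adds another one.
If you don't want this, use end="" (an empty string) in the print() function.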
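The three Marks.txt programs above can be sketched in one piece. The slide versions prompt the user with input(); this sketch uses made-up records and a hypothetical helper write_records() so that it runs without a user, and it writes Marks.txt into a temporary folder.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Marks.txt")

def write_records(path, records, mode):
    """Hypothetical helper: store (roll, name, marks) tuples as comma separated lines."""
    with open(path, mode) as f:
        for roll, name, marks in records:
            f.write(f"{roll},{name},{marks}\n")

# Program 1: create the file ("w" mode).
write_records(path, [(1, "Asha", 78.5), (2, "Ravi", 64.0), (3, "Meena", 91.0)], "w")

# Program 2: add two more students ("a" mode keeps the old data).
write_records(path, [(4, "Karan", 55.5), (5, "Zoya", 88.0)], "a")

# Program 3: display the contents without extra blank lines.
shown = []
with open(path, "r") as f:
    for line in f:
        print(line, end="")     # the line already ends with '\n'
        shown.append(line)
```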
Read, Write and Search CSV (Comma Separated Value) Files
CSV files are delimited files that store tabular data (data
stored in rows and columns).
The separator character of a CSV file is called a delimiter.
The default and most popular delimiter is the comma.
Others are the tab (\t), colon (:), pipe (|) and semi-colon (;) characters.
Since CSV files are text files, we may apply text file
procedures to them and then split values using the split() function,
but Python's csv module is the more convenient way to handle CSV files.
The csv module of Python provides functionality to read
and write tabular data in CSV format.
It provides two specific types of objects, the reader and writer objects,
to read from and write into CSV files.
Why are CSV files popular?
Easier to create.
Preferred export and import format for databases and spreadsheets.
Capable of storing large amounts of data.
Opening and Closing CSV Files:
obj = open("student.csv", "w") : CSV file opened in write mode with the file handle obj.
fobj = open("student.csv", "r") : CSV file opened in read mode with the file handle fobj.
obj.close() : A CSV file is closed in the same manner as any other file.
Writing in CSV files.
ROLE OF THE CSV WRITER OBJECT
User data (input, in memory) -> csv.writer object -> delimited data -> CSV file on storage disk
The csv.writer object converts the user data into csv-writable form, i.e.
delimited string form as per the csv settings.
<writerobject>.writerow() is used to write a row onto the writer object.
FUNCTIONS
csv.writer() returns a writer object which writes data into a CSV file.
<writerobject>.writerow() writes one row of data onto the writer object.
<writerobject>.writerows() writes multiple rows of data onto the writer object.
Reading in CSV files.
ROLE OF THE CSV READER OBJECT
CSV file on storage disk -> csv.reader object -> iterable (in memory) -> one row of data at a time
The csv.reader object parses the delimited CSV file data and loads it into
an iterable.
A loop fetches one row at a time from the reader object.
FUNCTION
csv.reader() returns a reader object which loads data from a CSV file into
an iterable after parsing the delimited data.
Python Program to Write, Read and Search into CSV file
[Slide figure: program code with numbered callouts 1 to 8, explained below]
1. If you use with open(), there is no need to call close() at the end of
the program.
2. fobj is the file handle/object/pointer for the opened file.
3, 5. writerow() takes only one argument, so pass a list to write
multiple values.
4. True means the loop will execute until we terminate it.
6. break terminates the loop.
7. next() skips the first line, so searching starts from line
no. 2. The first line is skipped because it holds the headings
'Roll_No', 'Name', 'Total_Marks', so it has no number to compare in
the condition i[2]>=90.
8. i[2] checks index no. 2, i.e. value no. 3 in the list.
NOTE : Additional parameter newline=''
The csv.writer writes its own line terminator (by default '\r\n') into the
file directly, so open the file with the additional parameter newline=''
(empty string).
If we do not use this extra parameter, then on platforms that translate
'\n' into '\r\n' (such as Windows) the CSV file will show every alternate
line blank (data in the first row, second row blank, data in the third row,
fourth row blank, and so on).
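A minimal sketch of writing, reading and searching a CSV file is given below. It is not the slide's program: the slide version collects rows with input() inside a while True loop, whereas this sketch uses made-up rows so it can run unattended, and it stores student.csv in a temporary folder. Values read back by csv.reader are strings, so the marks are converted with int() before the comparison.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "student.csv")
header = ["Roll_No", "Name", "Total_Marks"]
rows = [[1, "Asha", 92], [2, "Ravi", 74], [3, "Meena", 95]]   # hypothetical data

# Write: newline="" lets csv.writer control the line endings itself.
with open(path, "w", newline="") as fobj:
    writer = csv.writer(fobj)
    writer.writerow(header)     # one row
    writer.writerows(rows)      # many rows at once

# Read and search: students with 90 marks or more.
toppers = []
with open(path, "r", newline="") as fobj:
    reader = csv.reader(fobj)
    next(reader)                # skip the heading row
    for i in reader:
        if int(i[2]) >= 90:     # i[2] is the third value, read as a string
            toppers.append(i)

print(toppers)
```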
Python Program to Write record of students and Search it
into CSV file by roll no. given by user.
Python Program to Write record of students and Search into
CSV file to Print record of student having ‘MAX’ marks.
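The two search programs named above might be sketched as follows. The function names find_by_roll() and find_max() and the sample records are assumptions for illustration; the slide programs take the roll number from the user.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "student.csv")
with open(path, "w", newline="") as fobj:
    writer = csv.writer(fobj)
    writer.writerow(["Roll_No", "Name", "Total_Marks"])
    writer.writerows([[1, "Asha", 92], [2, "Ravi", 74], [3, "Meena", 95]])

def find_by_roll(path, roll_no):
    """Return the row whose first value equals roll_no, or None if absent."""
    with open(path, "r", newline="") as fobj:
        reader = csv.reader(fobj)
        next(reader)                        # skip the heading row
        for row in reader:
            if int(row[0]) == roll_no:
                return row
    return None

def find_max(path):
    """Return the row with the highest marks (assumes at least one record)."""
    with open(path, "r", newline="") as fobj:
        reader = csv.reader(fobj)
        next(reader)
        return max(reader, key=lambda row: int(row[2]))

print(find_by_roll(path, 2))
print(find_max(path))
```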
Binary Files in Python
A binary file stores information in the form of a stream of bytes.
There is no delimiter for a line.
As no translation occurs in binary files, they are
faster and easier for a program to read and write than
text files.
Binary files are not in human-readable form.
Binary Files in Python
As data in binary files is stored as a stream of bytes, it
is necessary to store non-simple objects like
dictionaries, tuples and lists in such a way that their
structure/hierarchy is maintained.
For this purpose, objects are often serialized and
then stored in binary files.
Structure (List/Dictionary) -> Pickling/Serialisation -> Byte Stream
Byte Stream -> Unpickling/De-Serialisation -> Structure (List/Dictionary)
“The pickle module implements a fundamental, but powerful algorithm
for serializing and de-serializing a Python object structure.”
Pickling and Unpickling
Pickling/Serialisation : the process of converting a Python
object hierarchy into a byte stream so that it can be written
into a file.
Unpickling/De-Serialisation : the inverse of pickling, where
a byte stream is converted back into an object hierarchy.
Unpickling produces an exact replica of the original object.
In order to work with the pickle module, import it.
import pickle
pickle.dump(Structure, file_object) : dump() writes to a binary file.
Structure_var = pickle.load(file_object) : load() reads from a binary file.
Working with pickle module
Process of working with binary files :
(i) Import pickle module.
(ii) Open binary file in the required file mode (read or
write mode).
(iii) Process binary file by writing/ reading objects
using pickle module’s methods.
(iv) Once done, close the file.
Writing into Binary File (Structure : List)
Program Code
Output (File saved in the directory where Python program was
saved)
Reading in Binary File (Structure : List)
Program Code
Output
Reading and Writing into Binary File (Structure : Dictionary)
Program Code
Output
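The four steps listed earlier (import pickle, open the binary file, dump or load, close) can be sketched for a list and a dictionary as follows. The file name and the data are made up for the example.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "student.dat")   # hypothetical file name

marks = [78, 64, 91]
student = {"Roll_No": 1, "Name": "Asha", "Marks": marks}

# Writing: binary write mode, one dump() call per object.
with open(path, "wb") as f:
    pickle.dump(marks, f)
    pickle.dump(student, f)

# Reading: binary read mode, load() returns objects in the order they were dumped.
with open(path, "rb") as f:
    marks_back = pickle.load(f)
    student_back = pickle.load(f)

print(marks_back)
print(student_back)
```

The loaded objects are equal to the originals but are new objects in memory.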
Reading ,Writing (Multiple Records) & Searching into Binary
File (Structure : Nested List)
Output
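A sketch of storing several records as a nested list and searching them is given below. It assumes the whole nested list is pickled with a single dump() call; the slide program may be organised differently, and the records and roll number searched for are made up.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "records.dat")   # hypothetical file name

# Nested list: one inner list [roll, name, marks] per student.
records = [[1, "Asha", 92], [2, "Ravi", 74], [3, "Meena", 95]]

# Writing all records at once.
with open(path, "wb") as f:
    pickle.dump(records, f)

# Reading them back.
with open(path, "rb") as f:
    loaded = pickle.load(f)

# Searching by roll number.
search_roll = 2
found = None
for rec in loaded:
    if rec[0] == search_roll:
        found = rec
        break

print(found if found is not None else "Record not found")
```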
Setting Offsets in a File
The functions (read/write) which we have used till
now access the data in a file sequentially.
If we want to access data in a random fashion,
Python gives us the seek() and tell() functions to do so.
tell() : This function returns an integer that specifies the
current position of the file object in the file.
The position so specified is the byte position counted from
the beginning of the file up to the current position of the
file object.
The syntax of using tell() is: file_object.tell()
Setting Offsets in a File
seek() : This method is used to position the file object at
a particular position in a file.
Syntax: file_object.seek(offset [, reference_point])
file_object.seek(offset, from_what)
offset is the number of bytes (characters) by which the
file object is to be moved.
reference_point (from_what) indicates the position from which
the offset is counted:
0 - Beginning of the file
1 - Current position in the file
2 - End of the file
By default, the value of reference_point is 0, i.e. the
offset is counted from the beginning of the file.
Program to know the Position of your File Pointer
Text File : ssc.txt Code Snippet
Output
Program to Position your File Pointer at Specified Location in
File
[Slide figures: text file ssc.txt, code snippet with callouts 1 and 2, output]
1. By default, read mode keeps the file pointer at the start of
the file.
2. The default value of from_what/reference point is
also 0, which also keeps the file pointer at the start of the
file.
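A small sketch of tell() and seek() on a file opened in read mode follows. The one-line stand-in for ssc.txt is an assumption; it contains only plain ASCII characters and no line breaks, so each character occupies one byte.

```python
import os
import tempfile

# Hypothetical stand-in for ssc.txt.
path = os.path.join(tempfile.mkdtemp(), "ssc.txt")
with open(path, "w") as f:
    f.write("Staff Selection Commission")

f = open(path, "r")
pos_start = f.tell()        # read mode starts at the beginning of the file
f.read(5)                   # reads "Staff" and moves the pointer forward
pos_after_read = f.tell()

f.seek(6)                   # reference point defaults to 0 (beginning of file)
word = f.read(9)            # 9 characters starting at position 6
pos_end = f.tell()
f.close()

print(pos_start, pos_after_read, word, pos_end)
```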
Difference between read() and seek()
A read call reads the specified number of bytes
from a file.
A read call also advances the position of the file pointer
according to how many bytes it read.
Seeking is the equivalent of just scrolling the bar to
whatever position you want.
It doesn't read anything in between the jump.
Program to Position your File Pointer at Specified Location in
File
Text File : ssc.txt Code Snippet
Output
NOTE : While opening the text file, mode 'rb' has been used in place of 'r'
because in Python 3, a file opened in text mode raises
io.UnsupportedOperation if a non-zero offset is used with a
reference_point other than the default 0.
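Seeking relative to the current position or the end of the file can be sketched in 'rb' mode as follows. The reference point values used here are Python's own: 0 for the beginning, 1 for the current position and 2 for the end. The file content is a made-up stand-in for ssc.txt.

```python
import os
import tempfile

# Hypothetical stand-in for ssc.txt, written as raw bytes.
path = os.path.join(tempfile.mkdtemp(), "ssc.txt")
with open(path, "wb") as f:
    f.write(b"Staff Selection Commission")

f = open(path, "rb")
f.seek(-10, 2)              # 10 bytes back from the end of the file
last_word = f.read()        # everything from there to the end

f.seek(6, 0)                # 6 bytes from the beginning
four = f.read(4)            # pointer is now at position 10
f.seek(5, 1)                # 5 bytes forward from the current position
pos = f.tell()
tail = f.read()
f.close()

print(last_word, four, pos, tail)
```

In binary mode read() returns bytes objects, which is why the results print with a b prefix.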