Lesson-5-File Handling-CSV Files
Difference between Comma-Separated Values (CSV) and Excel sheets (XLS) file
formats
1. Excel: Excel is a binary file that holds information about all the
   worksheets in a file, including both content and formatting.
   CSV: CSV format is a plain text format with a series of values separated by
   commas.
2. Excel: XLS files can only be read by applications that have been especially
   written to read their format, and can only be written in the same way.
   CSV: CSV can be opened with any text editor in Windows like Notepad, MS
   Excel, OpenOffice, etc.
3. Excel: Excel is a spreadsheet that saves files into its own proprietary
   format viz. xls or xlsx.
   CSV: CSV is a format for saving tabular information into a delimited text
   file with extension .csv.
4. Excel: Excel consumes more memory while importing data.
   CSV: Importing CSV files can be much faster, and it also consumes less
   memory.
format, it makes it very easy for website developers to create applications that
implement CSV.
CSV files are commonly used because they are easy to read and manage, small
in size, and fast to process/transfer. Because of these salient features, they are
frequently used in software applications, ranging anywhere from online e-
commerce stores to mobile apps to desktop tools.
For example, Magento, an e-commerce platform, is known for its support of
CSV.
Thus, in a nutshell, the several advantages that are offered by CSV files are as
follows:
CSV is faster to handle.
CSV is smaller in size.
CSV is easy to generate and import onto a spreadsheet or database.
CSV is human readable and easy to edit manually.
CSV is simple to implement and parse (resolve into component parts).
CSV is processed by almost all existing applications.
CSV files have been used extensively in e-commerce applications
because they are considered very easy to process.
Creating CSV Normal File
To create a CSV file in Notepad, first open a new file using File → New or
Ctrl+N.
Then enter the data you want the file to contain, separating each value with a
comma and each row with a new line.
For example consider the following details:
Topic1,Topic2,Topic3
one,two,three
Example1,Example2,Example3
We can then open the same using Microsoft Excel or any other spreadsheet
program. Here the file has been opened using Microsoft Excel. It would
create a table of data similar to the following:
In the above CSV file, the fields of data were separated by commas. But what
happens if the data itself contains commas? If a field in the CSV file contains
commas, it can be protected by enclosing that field in double quotes ("). The
commas that are part of the data are then kept separate from the commas that
delimit the fields.
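As a small illustration of this rule, the sketch below parses one line in which
the second field contains commas and is therefore enclosed in double quotes.
The sample text is made up for this example, and Python's csv module (covered
later in this lesson) is used here only to show that the quoted field stays in
one piece, whereas a plain split on commas would break it apart.

```python
import csv
import io

# Hypothetical sample line: the second field contains commas, so it is quoted
sample = 'Pen,"Blue, black and red ink",25\n'

# csv.reader accepts any iterable of lines; io.StringIO stands in for a file
fields = next(csv.reader(io.StringIO(sample)))
print(fields)        # the quoted field is kept as a single value
print(len(fields))   # 3

# A plain split on commas ignores the quotes and cuts the field in two
naive = sample.strip().split(',')
print(len(naive))    # 4
```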
Creating CSV File that contains comma with data
For example, let's say that one of our fields contains commas in the
description. Suppose the data looks like the example below:
As we can see, only the fields that contain commas are enclosed in quotes. When
this file is opened in MS Excel, it looks as shown below:
Creating CSV file that contains double quotes with data
If the fields contain double quotes as part of their data, the internal
quotation marks need to be doubled so that they can be interpreted correctly.
For example, given the following data:
The output will be
2. The last record in the file may or may not have an ending line break.
For example:
3. There may be an optional header line appearing as the first line of the file
with the same format as normal record lines. The header will contain
names corresponding to the fields in the file and should contain the same
number of fields as the records in the rest of the file. For example:
field_name1,field_name2,field_name3
4. Within the header and each record, there may be one or more fields,
separated by commas. Spaces are considered part of a field and should
not be ignored. The last field in the record must not be followed by a
comma. For example:
Red , Blue
5. Each field may or may not be enclosed in double quotes. If fields are not
enclosed with double quotes, then double quotes may not appear inside
the fields.
For example:
Note
The last row in the above example begins with two commas because the
first two fields of that row were empty in our spreadsheet. Don't delete
them — the two commas are required so that the fields correspond from
row to row. They cannot be omitted.
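The effect of the empty fields described in this note can be checked with a
short sketch. The two lines of text below are illustrative values, not the
contents of any file used in this lesson, and the csv module used to read them
is introduced later.

```python
import csv
import io

# Illustrative data: the last row starts with two empty fields
text = "Item,Quantity,Label,Value\n,,Total,150\n"

rows = list(csv.reader(io.StringIO(text)))
for row in rows:
    print(row)

# Both rows have the same number of fields because the leading
# commas hold the places of the two empty values
print(len(rows[0]), len(rows[1]))
```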
To create a CSV file using Microsoft Excel, launch Excel and then open
the file you want to save in CSV format. For example, below is the data
contained in the sample Excel worksheet:
Once the data is entered in the worksheet, select the File → Save As option,
and for the "Save as type" option, select CSV (Comma delimited) or type
the file name along with the extension .csv.
Save as dialog box
After you save the file, you are free to open it up in a text editor to view it
or to edit it manually. Its contents will resemble the following:
Microsoft Excel to open a CSV file
Alternatively, open Microsoft Excel and, in the menu bar, select
File → Open, then select the CSV file. If the file is not listed, make sure to
change the file type to be opened to Text Files (*.prn, *.txt, *.csv).
For working with CSV files in Python, there is a built-in module called csv. It
is used to read and write tabular data in CSV format. Therefore, to perform read
and write operations on a CSV file, we must import the csv module.
The csv module can handle CSV files correctly regardless of the operating
system on which the files were created. Along with this module, the open()
function is used to open a CSV file and return a file object. We load the
module in the usual way using import:
1. Reading a CSV
2. Writing to a CSV.
The file is saved in the same folder as the Python program.
Write a program to read the contents of “student.csv” file.
import csv
f=open("C:\\Users\\Udhaya Khumari\\AppData\\Local\\Programs\\Python\\Python36-32\\student.csv")
csv_reader=csv.reader(f)
for row in csv_reader:
    print(row)
f.close()
OR
import csv
f=open(r"C:\Users\Udhaya Khumari\AppData\Local\Programs\Python\Python36-32\student.csv")
csv_reader=csv.reader(f)
for row in csv_reader:
    print(row)
f.close()
OR
import csv
f=open(r"C:\Users\Udhaya Khumari\AppData\Local\Programs\Python\Python36-32\student.csv",'r')
csv_reader=csv.reader(f)
for row in csv_reader:
    print(row)
f.close()
Note
r – Prefixing the string with r makes it a raw string, i.e., the backslash has
no special meaning inside it, so single backslashes can be used in path names.
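A short sketch may make the note concrete. The path below is hypothetical and
was chosen because it contains \t and \n, which would turn into a tab and a
newline in an ordinary string unless every backslash is doubled.

```python
# Hypothetical Windows path used only for illustration
escaped = "C:\\temp\\new\\student.csv"   # ordinary string: every backslash is doubled
raw = r"C:\temp\new\student.csv"         # raw string: backslashes are kept as typed

print(escaped == raw)   # both spellings give the same string
print(raw)
```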
Output
Every record is returned by the reader object in the form of a list. In the
above code, we first open the CSV file in read mode. The file object is named
f. The file object is passed to csv.reader(), and the resulting reader object is
saved as csv_reader. The reader object is used to read records as lists from a
CSV file. Now, we iterate through all the rows using a for loop. When we print
each row, we find that a row is nothing but a list containing all the field
values as strings. Thus, each record is displayed as a list whose items are
separated by commas.
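Since every field arrives as a string, numeric columns have to be converted
before any arithmetic is done. The following is a minimal sketch of that point;
the data is made up and io.StringIO is used as a stand-in for a file on disk.

```python
import csv
import io

# Made-up sample data standing in for a CSV file on disk
data = "Name,Marks\nAnish,90\nAkash,98\n"

rows = list(csv.reader(io.StringIO(data)))
header = rows[0]        # first row holds the column names
total = 0
for row in rows[1:]:
    # row[1] is the string '90', not the number 90, so convert it first
    total += int(row[1])

print(header)
print(total)
```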
Similarly, open Notepad and enter the data for student.csv, which will be the
equivalent of student.xls.
In the student.csv (Notepad) file, the first line is the header and the
remaining lines are the data/records. The fields are separated by a comma,
which we may call the separator character. In general, the separator character
is called a delimiter, and the comma is not the only one used. Other popular
delimiters include tab (\t), colon (:) and semicolon (;).
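To read data that uses one of these other delimiters, csv.reader() accepts a
delimiter argument. The sketch below assumes two small made-up samples, one
separated by semicolons and one by tabs, and shows that both give the same rows.

```python
import csv
import io

# Made-up sample data using semicolons and tabs instead of commas
semicolon_text = "Name;Class;Marks\nAnish;XII;90\n"
tab_text = "Name\tClass\tMarks\nAnish\tXII\t90\n"

semicolon_rows = list(csv.reader(io.StringIO(semicolon_text), delimiter=';'))
tab_rows = list(csv.reader(io.StringIO(tab_text), delimiter='\t'))

print(semicolon_rows)
print(tab_rows)
```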
#To find the Python path
import sys
locate_python = sys.exec_prefix
print(locate_python)
Output
C:\Python368
import csv
f=open(r"C:\Python368\EX-1.csv")
csv_reader=csv.reader(f)
for row in csv_reader:
    print(row)
f.close()
Output
Write a program to read the contents of the file student.csv using with
open()
import csv
with open(r"C:\Users\Udhaya Khumari\AppData\Local\Programs\Python\Python36-32\student.csv",'r') as csv_file:
    csv_reader=csv.reader(csv_file)
    rows=[] # list to store the file data
    for rec in csv_reader: # iterate over the reader, not the raw file
        rows.append(rec)
print(rows)
Output
The above modified code uses the with statement together with open(). The only
difference is that a file opened using with open() is closed automatically as
soon as the with block ends, unlike a plain open(), where we need to call
close() explicitly.
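The automatic closing can be observed through the closed attribute of the file
object. This is only a sketch: it writes a small temporary file (the name
demo.csv is an assumption) so that it does not depend on any path from this
lesson.

```python
import csv
import os
import tempfile

# A temporary file is created so the sketch does not depend on any real path
path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(path, 'w', newline='') as f:
    csv.writer(f).writerow(['Name', 'Marks'])

with open(path, 'r', newline='') as csv_file:
    rows = list(csv.reader(csv_file))
    inside = csv_file.closed     # False: the file is still open inside the block

outside = csv_file.closed        # True: closed as soon as the block ended
print(rows, inside, outside)
```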
Save the file as given.
import csv
f=open(r"C:\Users\Udhaya Khumari\AppData\Local\Programs\Python\Python36-32\shop.csv",'r')
csv_reader=csv.reader(f)
for row in csv_reader:
    print(row)
f.close()
Output
['Item Name', 'Cost - RS', 'Quantity', 'Profit']
['Keyboard', '480', '12', '1152']
['Monitor', '5200', '10', '10400']
['Mouse', '200', '50', '2000']
['', '', 'Total Profit', '13552']
Now use the same program and output.
import csv
f=open(r"C:\Users\Udhaya Khumari\AppData\Local\Programs\Python\Python36-32\student.csv",'r')
csv_reader=csv.reader(f)
c=0
rows=next(csv_reader)
print(rows)
for row in csv_reader:
    c=c+1
print("\n No.of records :",c)
f.close()
Output
In the above program, a special type of object is created to access the CSV
file: the reader object csv_reader, created using the reader() function. The
reader object is an iterator that gives us access to each line of the CSV file
as a list of fields. The built-in function next() is called on the reader to
fetch the first line of the CSV file, which is the header. next() returns the
current row and advances the iterator to the next row (explained later). The
variable c is used as a counter to count the rows that remain, and its value is
printed at the end. Because the header has already been consumed by next(),
only the data records are counted. If the call to next() were left out, the
header (first line) of the student.csv file would also be treated as a record,
and the count would be displayed as 9 instead of 8. The next implementation
shows another way to exclude the header.
Write a program to count the exact number of records present in the csv
file excluding the header.
import csv
f=open('employee_det.csv')
csv_reader=csv.reader(f)
csvrows=[]
value=0
for row in csv_reader:
    if csv_reader.line_num == 1: # skip first row
        continue
    csvrows.append(row)
value=len(csvrows)
print("\n No.of records :",value)
print(csvrows)
f.close()
Output
No.of records : 5
[['1', 'Amit', '6000'], ['2', 'Suresh Kumar', '8000'], ['3', 'Gabbar', '75000'],
['4', 'Aman', '80000'], ['5', 'Jacky', '60000']]
In the above program we have used the line_num attribute of the reader object.
csv_reader.line_num holds the number of lines read from the file so far. The if
statement checks whether the current line is the first line. If the condition
is true, i.e., if it is the header line, it is skipped using the continue
statement and the counting of records resumes from the second line onwards.
Since line_num always reflects the line currently under consideration, the
correct output of 5 records is obtained.
Note
line_num is nothing but a counter that holds the number of lines that have been
read so far.
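How the counter grows during iteration can be seen in the sketch below. The data
is made up (one header line and three records) and is read from an in-memory
string rather than from a file.

```python
import csv
import io

# Made-up data: one header line followed by three records
data = "Name,Marks\nAnish,90\nAkash,98\nHeera,87\n"

reader = csv.reader(io.StringIO(data))
seen = []
for row in reader:
    # line_num is updated each time the reader consumes a line
    seen.append((reader.line_num, row[0]))

print(seen)
print(reader.line_num)   # total number of lines read, header included
```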
import csv
f=open(r"C:\Users\Udhaya Khumari\AppData\Local\Programs\Python\Python36-32\student.csv",'r')
csv_reader=csv.reader(f)
for row in csv_reader:
    print(','.join(row))
f.close()
Output
Name,Class ,Marks
Anish,XII,90
Akash,XII,98
Duruv,XI,67
Heera,XII,87
Vimal,XI,54
Gini,XII,45
Amit,XI,65
Kamal,XI,43
In the above program, we have used a new function join(). join() is a string
method that joins all values of each row with comma separator. Thus, all the
records are displayed as a string separated by a comma separator and not as a
list and hence the output is so obtained.
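The behaviour of join() can be tried on a single row without opening any file.
In this sketch the row values are taken from the output above, and the second
part assumes a row that contains a number, which join() cannot handle until it
is converted to a string.

```python
row = ['Anish', 'XII', '90']      # one record as returned by csv.reader
line = ','.join(row)
print(line)

# join() works only on strings, so numbers have to be converted first
mixed = ['Anish', 'XII', 90]
line2 = ','.join(str(item) for item in mixed)
print(line2)
```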
Write a program to search the record of a particular student from CSV file
on the basis of inputted name.
import csv
f=open(r"C:\Users\Udhaya Khumari\AppData\Local\Programs\Python\Python36-32\student.csv",'r')
csv_reader=csv.reader(f)
name=input("Enter the name to be searched for:")
for row in csv_reader:
    if row[0] == name:
        print(row)
f.close()
Output
Write a program to search the record of a particular student from CSV file
on the basis of inputted class.
import csv
f=open(r"C:\Users\Udhaya Khumari\AppData\Local\Programs\Python\Python36-32\student.csv",'r')
csv_reader=csv.reader(f)
Class=input("Enter the class to be searched for:")
for row in csv_reader:
    if row[1] == Class:
        print(row)
f.close()
Output
Write a program to search the record of a particular student from CSV file
on the basis of inputted marks. Display an appropriate message if mark is
not found.
import csv
f=open(r"C:\Users\Udhaya Khumari\AppData\Local\Programs\Python\Python36-32\student.csv",'r')
csv_reader=csv.reader(f)
found=0
m=input("Enter the marks to be searched for:")
for row in csv_reader:
    if row[2] == m: #both in string format
        found=1
        print(row)
if found==0:
    print("Record not found")
f.close()
Output
Steps
The csv module’s reader and writer objects read and write sequences.
The writer object presents two methods, namely writerow() and writerows().
The difference between them is that writerow() writes only one row, while
writerows() writes several rows at once.
To write to a CSV file in Python, we can use the csv.writer() function. The
csv.writer() function returns a writer object that converts the user's data
into delimited strings. These strings are written to the CSV file using the
writerow() method.
In order to write to a CSV file, we create a special type of object, the
"writer object", which is defined in the csv module and which we create using
the writer() function. The writerow() method allows us to write a list of
fields to the file. The fields can be strings or numbers or both. Also, while
using writerow(), we do not need to add a newline character (or other EOL
indicator) to indicate the end of the line; writerow() does it by default.
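What the writer object actually produces can be inspected by writing to an
in-memory buffer instead of a file. The sketch below is illustrative only: the
rows are made up, and io.StringIO stands in for a file opened with newline=''.
It shows the line ending added by writerow(), a number written alongside
strings, and the automatic quoting of fields that contain a comma or a double
quote.

```python
import csv
import io

# io.StringIO is an in-memory stand-in for a file opened with newline=''
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)

writer.writerow(['Name', 'Remark', 'Marks'])
writer.writerow(['Rohit', 'Good, but talkative', 92])   # strings and a number
writer.writerow(['Deep', 'Said "thank you"', 82])       # field with double quotes

written = buffer.getvalue()
print(repr(written))   # repr() makes the '\r\n' line endings visible
```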
import csv
#field names
fields=['Name','Class','Year','Percent']
#data rows of csv files
rows=[
    ['Rohit','XII','2003','92'],
    ['Shourya','XI','2004','82'],
    ['Deep','XII','2002','82'],
    ['Pranathi','XI','2006','85'],
    ['Lakshaya','XII','2005','72']]
fname=r"D:\Details2.csv"
with open(fname,'w',newline='') as f:
    #newline='' avoids blank lines between rows; the writer ends each row with '\r\n'
    #creating a csv writer object
    csv_w=csv.writer(f,delimiter=',')
    #writing the fields once
    csv_w.writerow(fields)
    for i in rows:
        #writing the data rowwise
        csv_w.writerow(i)
print("File created")
Output
File created
Right click on the file and select open with option and select Excel.
In the above program, the very first line imports the csv module into the
program. Next, the column headings for the data are given as a list in the
variable fields. All the data to be stored under these headings is placed in
the variable rows. The name of the file is then given in the variable fname;
the file is created in the current working directory or at the path that is
mentioned (D:\ in this example). 'w' stands for write mode, and the file is
opened using "with open", since a file opened this way does not need to be
closed explicitly. The next statement contains the most important function used
for writing to a CSV file, viz., csv.writer(), which returns a writer object
that is stored in the variable csv_w. writer() takes the file object f as its
argument. By default, the delimiter is a comma (,).
writerow(fields) writes the column headings into the file; this has to be done
only once. Using a for loop, the rows are traversed from the list rows, and
writerow(i) writes the data row-wise inside the loop. At the end, the file is
closed automatically. The delimiter can be changed whenever required by
changing the value passed to the delimiter argument of csv.writer(), for
example delimiter = "|" (pipe symbol). Any single character can be used as a
delimiter, and if nothing is given, a comma is used by default.
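Changing the delimiter can be sketched as follows. The rows are sample values,
and an in-memory buffer is used in place of a file so that the written text can
be printed directly. Note that the same delimiter has to be supplied again when
the data is read back.

```python
import csv
import io

buffer = io.StringIO(newline='')   # in-memory stand-in for a file
pipe_writer = csv.writer(buffer, delimiter='|')
pipe_writer.writerow(['Name', 'Class', 'Year'])
pipe_writer.writerow(['Rohit', 'XII', '2003'])
pipe_text = buffer.getvalue()
print(pipe_text)

# The same delimiter has to be given again when the data is read back
rows_back = list(csv.reader(io.StringIO(pipe_text), delimiter='|'))
print(rows_back)
```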
The for loop can be avoided and all rows/records can be written in one go. This
can be done by using writerows() method. writerows() writes all the rows in one
go, hence there is no need for a for loop and iterations.
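That the two approaches give the same result can be checked with a small
sketch. The records below are sample values, and both versions write to
in-memory buffers so that their output can be compared.

```python
import csv
import io

rows = [['Rohit', 'XII'], ['Shourya', 'XI'], ['Deep', 'XII']]

# Version 1: one writerow() call per record
loop_buffer = io.StringIO(newline='')
loop_writer = csv.writer(loop_buffer)
for record in rows:
    loop_writer.writerow(record)

# Version 2: a single writerows() call for the whole list
bulk_buffer = io.StringIO(newline='')
csv.writer(bulk_buffer).writerows(rows)

same = loop_buffer.getvalue() == bulk_buffer.getvalue()
print(same)
```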
import csv
#field names
fields=['Name','Class','Year','Percent']
#data rows of csv files
rows=[
    ['Rohit','XII','2003','92'],
    ['Shourya','XI','2004','82'],
    ['Deep','XII','2002','82'],
    ['Pranathi','XI','2006','85'],
    ['Lakshaya','XII','2005','72']]
fname=r"D:\Details3.csv"
with open(fname,'w',newline='') as f:
    #newline='' avoids blank lines between rows; the writer ends each row with '\r\n'
    #creating a csv writer object
    csv_w=csv.writer(f,delimiter=',')
    #writing the fields once
    csv_w.writerow(fields)
    #writing all the rows in one go
    csv_w.writerows(rows)
print("All rows written in one go")
Output
Write a program to write data onto a CSV file ( university_records.csv )
using writerows() method.
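import csv
# field names
fields = ['Name', 'Branch', 'Year', 'CGPA']
# data rows of csv file
rows = [ ['Nikhil', 'COE', '2', '9.0'],
         ['Sanchit', 'COE', '2', '9.1'],
         ['Aditya', 'IT', '2', '9.3'],
         ['Sagar', 'SE', '1', '9.5'],
         ['Prateek', 'MCE', '3', '7.8'],
         ['Sahil', 'EP', '2', '9.1']]
# assumed setup: open the file named in the question and create the writer
filename = "university_records.csv"
with open(filename, 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)
    # writing the fields once
    csvwriter.writerow(fields)
    # writing all the data rows in one go
    csvwriter.writerows(rows)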
import csv
with open(r'D:\myfile1.csv','a',newline='') as csvfile:
    mywriter=csv.writer(csvfile,delimiter='|')
    ans='y'
    while ans.lower()=='y':
        eno=int(input("Enter empno:"))
        ename=input("Enter name:")
        sal=int(input("Enter salary:"))
        mywriter.writerow([eno,ename,sal])
        ans=input("Want to add more records?")
Now open the file in notepad – select D: drive and open in notepad.
Output
Python program to read the file
import csv
f=open(r"D:\myfile1.csv")
csv_reader=csv.reader(f,delimiter='|') # same delimiter that was used for writing
for row in csv_reader:
    print('|'.join(row))
f.close()
Output
3|Jinu|20000
4|Hema|56000
5|Tinu|98760
Write a program to search for any employee number from the file myfile1
and display the details of the employee. If the employee number is not
existing display an appropriate message.
import csv
f=open(r"D:\myfile1.csv")
csv_reader=csv.reader(f,delimiter='|')
found=0
empno=input("Enter the employee number to be searched for:")
for row in csv_reader:
    if len(row)!=0:
        if row[0] == empno: #both in string format
            found=1
            print(row)
if found==0:
    print("Record not found")
f.close()
Output
Write a program to search for any employee number from the file myfile1
and display the details of the employee. If the employee number is not
existing display an appropriate message. The program should continue
until the user wants to.
import csv
f=open(r"D:\myfile1.csv")
csv_reader=csv.reader(f,delimiter='|')
while True:
    found=False
    f.seek(0) # go back to the start of the file for every new search
    empno=input("Enter the employee number to be searched for:")
    for row in csv_reader:
        if len(row)!=0:
            if row[0] == empno: #both in string format
                print("Name:",row[1])
                print("Salary:",row[2])
                found=True
                break
    if not found:
        print("Record not found")
    ans=input("Search more?")
    if ans=='n':
        break
f.close()
Output
next() function
The next() function returns the next item from an iterator.
Syntax
next(iterator,default)
If the iterator is exhausted, it returns the default value passed as an
argument.
If the default parameter is omitted and the iterator is exhausted, it raises
the StopIteration exception.
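Both cases described above can be sketched in a few lines. The list values and
the default message are chosen only for illustration.

```python
# Sample values chosen only for illustration
numbers = iter([10, 20])

first = next(numbers, 'No item is present')    # 10
second = next(numbers, 'No item is present')   # 20
third = next(numbers, 'No item is present')    # iterator exhausted: default returned
print(first, second, third)

# Without a default, an exhausted iterator raises StopIteration
try:
    next(numbers)
    raised = False
except StopIteration:
    raised = True
print(raised)
```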
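Example - 1
random_list = [5, 9, 'cat']
# converting the list to an iterator
random_iterator = iter(random_list)
print(random_iterator)
# Output: 5
print(next(random_iterator))
# Output: 9
print(next(random_iterator))
# Output: 'cat'
print(next(random_iterator))
# This raises StopIteration because the iterator is exhausted
print(next(random_iterator))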
Output
<list_iterator object at 0x03C9A650>
5
9
cat
Traceback (most recent call last):
  File "C:\Users\Udhaya Khumari\AppData\Local\Programs\Python\Python36-32\p111.py", line 18, in <module>
    print(next(random_iterator))
StopIteration
Example – 2
Output
256
32
82
Example – 3
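number = iter([256, 32, 82]) # creating an iterator
# first item
item = next(number)
print(item)
# second item
item = next(number)
print(item)
# third item
item = next(number)
print(item)
# fourth item
item = next(number) # error, no item is present
print(item)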
Output
256
32
82
Traceback (most recent call last):
  File "C:/Users/Udhaya Khumari/AppData/Local/Programs/Python/Python36-32/p111.py", line 14, in <module>
    item = next(number) # error, no item is present
StopIteration
Example – 4
Output
256
javatpoint
82
No item is present
Note