CSV File Handling
A CSV file (Comma Separated Values file) is a type of plain text file that uses specific
structuring to arrange tabular data. Because it’s a plain text file, it can contain only
actual text data.
The structure of a CSV file is given away by its name. Normally, CSV files use a comma
to separate each specific data value.
CSV files are normally created by programs that handle large amounts of data. They
are a convenient way to export data from spreadsheets and databases as well as import
or use it in other programs.
CSV files are very easy to work with programmatically. Any language that supports text
file input and string manipulation (like Python) can work with CSV files directly.
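As a minimal sketch of this point, the following splits comma-separated text using only string methods. The sample text and the variable names are illustrative assumptions, and this naive approach breaks as soon as a field itself contains a comma or a quote.

```python
# Naive CSV parsing with plain string manipulation (illustrative only).
text = "Eno,Name,Salary\n15,Anubhav Singh,45000\n20,Shivesh Tripathi,55000\n"

rows = []
for line in text.splitlines():
    # Split each line on commas; quoted fields are NOT handled here.
    rows.append(line.split(","))

for row in rows:
    print(row)
```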
Python's built-in csv library provides functionality to both read from and write to CSV
files. Designed to work out of the box with Excel-generated CSV files, it is easily
adapted to work with a variety of CSV formats.
Advantages of CSV
1. It is easy to generate
2. It is human readable and easy to edit
3. It is faster to handle
4. It is smaller in size
5. It is simple to implement and parse
6. It is processed by almost all existing applications
Disadvantages of CSV
1. No standard way to represent binary data
2. Poor support of special characters
3. No standard way to represent control characters
4. Problems with importing CSV into SQL (no distinction between NULL and
empty quotes "")
5. Lack of universal standard
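The NULL problem in point 4 can be seen in a small sketch that uses Python's csv module with an in-memory buffer. The sample row is hypothetical. Both None and an empty string are written as an empty field, so they cannot be told apart when the data is read back.

```python
import csv
import io

buffer = io.StringIO()            # in-memory stand-in for a file
writer = csv.writer(buffer)
writer.writerow([15, None, ""])   # None and "" both become empty fields

written = buffer.getvalue()

# Read the same text back: every field comes back as a string.
buffer.seek(0)
row_read = next(csv.reader(buffer))

print(repr(written))
print(row_read)
```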
The CSV (Comma Separated Values) format is one of the simplest and most common ways
to store tabular data. By convention, a CSV file is saved with the .csv file
extension.
Because a CSV file keeps data in a table-structured format, it is useful when you need
to manage large amounts of data.
Suppose we would like to create a CSV file (stud.csv) for the following data:
Method 1 Using Spreadsheet
Method 2 Using Text Editor (Notepad)
Suppose we would like to create a CSV file (emp.csv) for the following data:
Eno Name Salary
Step 1: Open Notepad and type each of your headers/field names separated by
commas onto the first line. E.g. Eno,Name,Salary and so on.
Step 2: Type your data onto the second line, using the same format as your field names on
the first line. E.g. 15,Anubhav Singh,45000 and so on.
Step 3: Continue typing your data for each individual element onto each subsequent
line. If leaving any field empty, make sure you include the comma.
Step 4: Click on File and select Save option.
Step 5: Type the name of your file with .csv extension.
Step 6: Click on Save. Now your CSV file is created in Notepad.
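The rule in Step 3 about keeping the comma for an empty field can be checked from Python. This sketch assumes hypothetical file content in which the second employee has no salary; the text is held in a string instead of a file saved from Notepad.

```python
import csv
import io

# Hypothetical content of emp.csv; the second record has an empty Salary field.
content = "Eno,Name,Salary\n15,Anubhav Singh,45000\n20,Shivesh Tripathi,\n"

records = list(csv.reader(io.StringIO(content)))
for record in records:
    # Every record has three fields because the trailing comma was kept.
    print(len(record), record)
```

Without the trailing comma, the last record would have only two fields and would no longer line up with the header.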
Method 3 Using Python Program
1. To perform read and write operations on a CSV file, we must import the csv module.
2. The open() function is used to open a file; it returns a file object.
While we could use the built in open() function to work with CSV files in Python, there is
a dedicated csv module that makes working with CSV files much easier.
Before we can use the methods of csv module, we need to import the module first
using: import csv
Here you will learn to write CSV files with different formats in Python with the help of
examples.
The csv module is used for reading and writing CSV files. It mainly provides the following
classes and functions:
1. writer()
2. reader()
3. DictWriter()
4. DictReader()
The writer() function takes a file object and optional formatting arguments, and returns a writer object.
Argument Description
dialect (optional) Dialect refers to the different ways of formatting the CSV
document. By default, the csv module uses the same format as
Microsoft Excel.
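As a rough illustration of dialects, the sketch below lists the dialects registered with the csv module and writes the same row with each of the three built-in ones. The sample row and the outputs dictionary are assumptions made for illustration; an in-memory buffer stands in for a real file.

```python
import csv
import io

print(csv.list_dialects())  # names of the dialects registered with the module

row = ["eno", "name", "salary"]
outputs = {}
for name in ("excel", "excel-tab", "unix"):
    buffer = io.StringIO()                      # in-memory stand-in for a file
    csv.writer(buffer, dialect=name).writerow(row)
    outputs[name] = buffer.getvalue()
    print(name, repr(outputs[name]))
```

The excel dialect separates fields with commas, excel-tab uses tabs, and unix quotes every field and ends each line with a plain newline.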
The writer() instance provides the following two methods to write data:
Method Description
writerow(row) Writes a single row of data and returns the number of characters
written. The row must be a sequence of strings and numbers.
writerows(rows) Writes multiple rows of data and returns None. The rows must be a
sequence.
Example 1:
# Creating CSV file through Python program
# Using writerow()
import csv
header=['eno','name','salary']
rows=[
    [15,'Anubhav Singh',45000],
    [20,'Shivesh Tripathi',55000],
    [40,'Abhinav Sagar',62000],
    [45,'Dhruv Sharma',48000],
    [60,'Aryan Rana',64000],
]
# newline='' prevents blank lines between rows on Windows
f=open('emp3.csv','w',newline='')
csv_writer=csv.writer(f)
csv_writer.writerow(header) # To write header
for r in rows:
    csv_writer.writerow(r) # To write one data row at a time
f.close()
Example 2:
# Creating CSV file through Python program
# Using writerows()
import csv
header=['eno','name','salary']
rows=[
[15,'Anubhav Singh',45000],
[20,'Shivesh Tripathi',55000],
[40,'Abhinav Sagar',62000],
[45,'Dhruv Sharma',48000],
[60,'Aryan Rana',64000],
]
f=open('emp3.csv','w',newline='') # newline='' prevents blank lines between rows on Windows
csv_writer=csv.writer(f)
csv_writer.writerow(header) # To write header
csv_writer.writerows(rows)
f.close()
Both listings generate the same output. When we run either program, an emp3.csv file is
created with the following content:
eno,name,salary
15,Anubhav Singh,45000
20,Shivesh Tripathi,55000
40,Abhinav Sagar,62000
45,Dhruv Sharma,48000
60,Aryan Rana,64000
In the above program, we have opened the file in writing mode.
Then, we have passed each row as a list. These lists are converted to a delimited string
and written into the CSV file.
If we need to write the contents of a 2-dimensional list to a CSV file, we can pass the
whole list to the mywriter.writerows() method, which writes each inner list as one row
of the CSV file.
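A minimal sketch of that idea follows. The list contents and the file name marks.csv are assumptions made for illustration, and the file is placed in the system temporary directory so that no existing file is overwritten. The writer is named mywriter to match the text.

```python
import csv
import os
import tempfile

# Hypothetical 2-dimensional list: the first inner list is the header.
data = [
    ["rno", "name", "marks"],
    [10, "Ananya Pandey", 85],
    [20, "Divyansh Kumar", 70],
]

# Assumed file name, placed in the temporary directory.
path = os.path.join(tempfile.gettempdir(), "marks.csv")

with open(path, "w", newline="") as f:
    mywriter = csv.writer(f)
    mywriter.writerows(data)   # writes every inner list as one row

# Read the raw text back to see what was stored.
with open(path, newline="") as f:
    saved = f.read()
print(saved)
```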
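By default, a comma is used as the delimiter. To use a different one, pass the optional
parameter delimiter to the csv.writer() function, for example delimiter='\t' for
tab-separated output.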
Argument Description
dialect (optional) Dialect refers to the different ways of formatting the CSV
document. By default, the csv module uses the same format as
Microsoft Excel.
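The delimiter parameter can be sketched as follows. The rows are sample data chosen for illustration, and an in-memory buffer stands in for an open file so the result can be inspected directly.

```python
import csv
import io

rows = [
    ["SN", "Name", "City"],
    [1, "Amit Gupta", "Mumbai"],
    [2, "Shriya Jha", "Delhi"],
]

buffer = io.StringIO()                          # in-memory stand-in for a file
mywriter = csv.writer(buffer, delimiter="\t")   # separate fields with tabs
mywriter.writerows(rows)

tab_text = buffer.getvalue()
print(tab_text)
```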
Suppose we have a csv file named student.csv in the current directory with the
following entries.
SN,Name,City
1,Amit Gupta,Mumbai
2,Shriya Jha,Delhi
3,Dev Pathak,Chandigarh
4,Manav Chauhan,Lucknow
5,Sakshi Rawat,Chennai
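A minimal sketch of reading this file with csv.reader() is given here. To keep it self-contained, the sketch first recreates student.csv in the system temporary directory instead of the current directory; the names path, myreader and lines are assumptions.

```python
import csv
import os
import tempfile

# Recreate student.csv in the temporary directory (assumed location).
path = os.path.join(tempfile.gettempdir(), "student.csv")
with open(path, "w", newline="") as f:
    f.write("SN,Name,City\n1,Amit Gupta,Mumbai\n2,Shriya Jha,Delhi\n"
            "3,Dev Pathak,Chandigarh\n4,Manav Chauhan,Lucknow\n"
            "5,Sakshi Rawat,Chennai\n")

lines = []
with open(path, newline="") as f:
    myreader = csv.reader(f)          # each row comes back as a list of strings
    for row in myreader:
        lines.append(" ".join(row))   # join the fields with a single space

for line in lines:
    print(line)
```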
Output:
SN Name City
1 Amit Gupta Mumbai
2 Shriya Jha Delhi
3 Dev Pathak Chandigarh
4 Manav Chauhan Lucknow
5 Sakshi Rawat Chennai
To read files that use a delimiter other than a comma, such as a tab, we can pass optional
parameters to the csv.reader() function. Let's take an example.
Example 7: Read CSV file Having Tab Delimiter
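A minimal sketch of such a program, assuming the tab-separated data is stored in a hypothetical file named student_tab.csv (recreated here in the system temporary directory so the sketch is self-contained):

```python
import csv
import os
import tempfile

# Hypothetical tab-delimited file, created here for the sake of the example.
path = os.path.join(tempfile.gettempdir(), "student_tab.csv")
with open(path, "w", newline="") as f:
    f.write("SN\tName\tCity\n1\tAmit Gupta\tMumbai\n2\tShriya Jha\tDelhi\n"
            "3\tDev Pathak\tChandigarh\n4\tManav Chauhan\tLucknow\n"
            "5\tSakshi Rawat\tChennai\n")

rows_read = []
with open(path, newline="") as f:
    myreader = csv.reader(f, delimiter="\t")   # split fields on tabs
    for row in myreader:
        rows_read.append(row)
        print(row)
```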
Output:
['SN', 'Name', 'City']
['1', 'Amit Gupta', 'Mumbai']
['2', 'Shriya Jha', 'Delhi']
['3', 'Dev Pathak', 'Chandigarh']
['4', 'Manav Chauhan', 'Lucknow']
['5', 'Sakshi Rawat', 'Chennai']
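The csv.DictWriter() class writes rows that are given as dictionaries. Its two main
arguments are:
file - the CSV file object we want to write to
fieldnames - a list object which should contain the column headers, specifying the
order in which data should be written in the CSV file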
Example 8: Python csv.DictWriter()
# Program using DictWriter()
import csv
with open('stud1.csv', 'w', newline='') as f:
    fieldnames = ['rno', 'name', 'marks']
    mywriter = csv.DictWriter(f, fieldnames=fieldnames)
    mywriter.writeheader() # To write the header row
    # Each row is a dictionary whose keys match fieldnames
    mywriter.writerow({'rno': '10', 'name': 'Ananya Pandey', 'marks': 85})
    mywriter.writerow({'rno': '20', 'name': 'Divyansh Kumar', 'marks': 70})
    mywriter.writerow({'rno': '30', 'name': 'Swayam Khanduri', 'marks': 80})
The program creates a stud1.csv file with the following entries:
rno,name,marks
10,Ananya Pandey,85
20,Divyansh Kumar,70
30,Swayam Khanduri,80
Let's see how csv.DictReader() can be used.
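A minimal sketch follows, assuming a file with the same layout as stud1.csv. The file is recreated in the system temporary directory so the sketch is self-contained; the records list is an assumption added to collect the rows.

```python
import csv
import os
import tempfile

# Recreate a small stud1.csv in the temporary directory (assumed location).
path = os.path.join(tempfile.gettempdir(), "stud1.csv")
with open(path, "w", newline="") as f:
    f.write("rno,name,marks\n10,Ananya Pandey,85\n"
            "20,Divyansh Kumar,70\n30,Swayam Khanduri,80\n")

records = []
with open(path, newline="") as f:
    csv_file = csv.DictReader(f)   # the first row supplies the dictionary keys
    for r in csv_file:
        records.append(dict(r))    # every value is read as a string
        print(dict(r))
```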
Output:
As we can see, the entries of the first row are the dictionary keys. And, the entries in the
other rows are the dictionary values.
Here, csv_file is a csv.DictReader() object, which can be iterated over using
a for loop. In Python 3.6 and 3.7, csv.DictReader() returned an OrderedDict for each
row. That is why dict() is used explicitly inside the for loop to convert each row to a
regular dictionary:
print(dict(r))
Note: Starting from Python 3.8, csv.DictReader() returns a regular dictionary for
each row, so we do not need to use dict() explicitly.