
CSV File Handling

A CSV (comma separated values) file is a plain text file format used to store tabular data such as data exported from spreadsheets and databases. CSV files use commas to separate each data value into fields and can be generated by programs that handle large amounts of data. CSV files are easy to work with programmatically using languages like Python that support text file input and string manipulation. The csv library in Python provides functionality to read from and write to CSV files.

Uploaded by

koopr

CSV File Handling

A CSV file (Comma Separated Values file) is a type of plain text file that uses specific
structuring to arrange tabular data. Because it’s a plain text file, it can contain only
actual text data.

The structure of a CSV file is given away by its name. Normally, CSV files use a comma
to separate each specific data value.

CSV files are normally created by programs that handle large amounts of data. They
are a convenient way to export data from spreadsheets and databases as well as import
or use it in other programs.

CSV files are very easy to work with programmatically. Any language that supports text
file input and string manipulation (like Python) can work with CSV files directly.

The csv library provides functionality to both read from and write to CSV files. Designed
to work out of the box with Excel-generated CSV files, it is easily adapted to work with a
variety of CSV formats. The csv library contains objects and other code to read, write,
and process data from and to CSV files.
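As a rough illustration of why a dedicated library helps, the sketch below parses the same
line twice: once with plain string splitting and once with csv.reader(). The sample line is
made up for this illustration and is not taken from any of the files used later.

```python
import csv
import io

# Hypothetical sample line: the second field contains a comma, so it is quoted.
line = '10,"Bisht, Aaditya",XII,80'

# Plain string manipulation splits inside the quoted field.
naive_fields = line.split(',')

# csv.reader() understands the quoting rules and keeps the field whole.
csv_fields = next(csv.reader(io.StringIO(line)))

print(naive_fields)  # ['10', '"Bisht', ' Aaditya"', 'XII', '80']
print(csv_fields)    # ['10', 'Bisht, Aaditya', 'XII', '80']
```

Plain splitting is fine for simple data, but it breaks as soon as a field contains the
delimiter itself.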

Features (characteristics) of CSV files:


1. One line for each record
2. Fields are separated by commas
3. Space characters adjacent to commas are ignored by some programs (Python's csv
module keeps them unless skipinitialspace=True is set)
4. Fields that contain commas must be surrounded by double quotes
5. Fields that contain double quotes must be surrounded by double quotes. Each
embedded double quote must be represented by a pair of consecutive double quotes.
6. Fields that contain line breaks must be surrounded by double quotes
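The quoting rules above can be seen in a small sketch. It assumes Python's csv module with
its default settings and writes to an in-memory buffer instead of a real file. The record
is a made-up example.

```python
import csv
import io

# Hypothetical record whose fields contain a comma, double quotes and a line break.
row = ['Bisht, Aaditya', 'He said "hello"', 'line one\nline two']

buffer = io.StringIO()              # in-memory stand-in for a file
csv.writer(buffer).writerow(row)
text = buffer.getvalue()
print(text)

# Reading the text back recovers the original fields.
restored = next(csv.reader(io.StringIO(text, newline='')))
print(restored == row)  # True
```

Each of the three fields is wrapped in double quotes in the output, and the embedded quotes
around hello are doubled.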

Advantages of CSV
1. It is easy to generate
2. It is human readable and easy to edit
3. It is faster to handle
4. It is smaller in size
5. It is simple to implement and parse
6. It is processed by almost all existing applications

Disadvantages of CSV

1. No standard way to represent binary data
2. Poor support of special characters
3. No standard way to represent control characters
4. Problems with importing CSV into SQL (no distinction between NULL and an
empty quoted string)
5. Lack of universal standard
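The NULL problem from point 4 can be reproduced with Python's csv module. This is a minimal
sketch with a made-up record, where None plays the role of a missing (NULL) value.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer)
# Hypothetical record: None stands for a missing value, '' for an empty string.
writer.writerow([15, None, ''])

buffer.seek(0)
fields = next(csv.reader(buffer))
print(fields)  # ['15', '', '']
```

After the round trip, the missing value and the empty string can no longer be told apart.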

CSV (Comma Separated Values) is one of the simplest and most common formats for
storing tabular data. By convention, a CSV file is saved with the .csv file
extension.

How to Create a CSV File

A CSV file, which is a “comma separated values” file, allows you to save your data in a
table-structured format, which is useful when you need to manage a large database.

CSV files can be created in three ways:

1. Using a spreadsheet (Microsoft Excel / OpenOffice Calc / Google Spreadsheets)
2. Using a text editor (Notepad)
3. Using a Python program

Suppose we would like to create a CSV file (stud.csv) for the following data:

RNo   Name              Class   Marks
10    Aaditya Bisht     XII     80
15    Brinda Pathak     XII     75
20    Shriya Jha        XII     90
25    Gaurav Adhikari   XII     85
40    Swayam Khanduri   XII     80

Method 1 (Using Spreadsheet)

Step 1: Open a new spreadsheet/worksheet.
Step 2: Type each of your headers/field names into the cells in row 1 at the top of the
worksheet. For example, type RNo into cell A1, Name into cell B1, Class into
cell C1, Marks into cell D1 and so on.
Step 3: Enter your data into the worksheet under each column as needed. Using the
example outlined in step 2, enter the roll no into cell A2, the name of the student
into cell B2, the class into C2, the marks into D2 and so on.
Step 4: After you have entered all the data into the spreadsheet, click on the File tab
and select Save As. If using Google Spreadsheets, this option reads
File > Download as.
Step 5: Select CSV under the Save as type dropdown menu.
Step 6: Type a name for your CSV file (like stud.csv), then select the Save option. Your
CSV file is now created, and commas are automatically added to the file to
separate the fields.

Method 2 Using Text Editor (Notepad)

Suppose we would like to create a CSV file (emp.csv) for the following data:

Eno   Name               Salary
15    Anubhav Singh      45000
20    Shivesh Tripathi   55000
40    Abhinav Sagar      62000
45    Dhruv Sharma       48000
60    Aryan Rana         64000

Step 1: Open Notepad and type your headers/field names, separated by commas, on the
first line. E.g. Eno,Name,Salary and so on.
Step 2: Type your data on the second line, using the same format as the field names on
the first line. E.g. 15,Anubhav Singh,45000 and so on.
Step 3: Continue typing the data for each record on its own subsequent line. If you
leave any field empty, make sure you still include the comma.
Step 4: Click on File and select Save option.
Step 5: Type the name of your file with .csv extension.
Step 6: Click on Save. Now your CSV file is created in Notepad.

Method 3 Using Python Program

Working with CSV file in Python

1. To perform read and write operations on a CSV file, we must import the csv module.
2. The open() function is used to open the file; it returns a file object.

While we could use the built in open() function to work with CSV files in Python, there is
a dedicated csv module that makes working with CSV files much easier.

Before we can use the methods of csv module, we need to import the module first
using: import csv

csv module functions and constants:

a) csv.field_size_limit() - Returns the current maximum field size allowed by the parser
b) csv.get_dialect() - Returns the dialect associated with a name
c) csv.list_dialects() - Returns the names of all registered dialects
d) csv.reader() - Reads data from a CSV file
e) csv.writer() - Writes data to a CSV file
f) csv.register_dialect() - Associates a dialect with a name
g) csv.unregister_dialect() - Deletes the dialect associated with a name from the
dialect registry
h) csv.QUOTE_ALL - Quote every field, regardless of type
i) csv.QUOTE_MINIMAL - Quote only fields that contain special characters
j) csv.QUOTE_NONNUMERIC - Quote all fields that are not numbers
k) csv.QUOTE_NONE - Do not quote anything in the output
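Several of these functions and constants work together. The sketch below registers a
dialect under the hypothetical name "pipes" (the name, the '|' delimiter and the sample
record are choices made for this illustration), uses it once, and removes it again.

```python
import csv
import io

# Register a dialect: '|' as delimiter, every field quoted.
csv.register_dialect("pipes", delimiter="|", quoting=csv.QUOTE_ALL)
registered = "pipes" in csv.list_dialects()
delimiter = csv.get_dialect("pipes").delimiter

buffer = io.StringIO()              # in-memory stand-in for a file
csv.writer(buffer, dialect="pipes").writerow([10, "Aaditya Bisht", "XII", 80])
line = buffer.getvalue()
print(line)                         # "10"|"Aaditya Bisht"|"XII"|"80"

# Remove the dialect from the registry again.
csv.unregister_dialect("pipes")
still_registered = "pipes" in csv.list_dialects()

# With no argument, field_size_limit() only reports the current limit.
limit = csv.field_size_limit()
print(registered, delimiter, still_registered, limit)
```

Because QUOTE_ALL is in effect, even the numeric fields are written inside double quotes.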

Writing CSV files in Python

Here you will learn to write CSV files with different formats in Python with the help of
examples.

Writing CSV files Using csv.writer()

To write to a CSV file in Python, csv.writer() is used. csv.writer() returns a writer
object that converts the user's data into delimited strings and writes them to the
given file object.

The csv module is used for reading and writing CSV files. It mainly provides the following
classes and functions:

1. writer()
2. reader()
3. DictWriter()
4. DictReader()

Creating and writing csv files with writer()

The writer() function takes the following arguments and returns a writer object.

Syntax: writer(fileobj[, dialect='excel'[, **fmtparam]])

Argument               Description
fileobj (required)     The file object to write to.
dialect (optional)     Dialect refers to the different ways of formatting the CSV
                       document. By default, the csv module uses the same format as
                       Microsoft Excel.
fmtparam (optional)    Formatting parameters: a set of keyword arguments that
                       customize the dialect.

The writer object provides the following two methods to write data:

Method             Description
writerow(row)      Writes a single row of data and returns the value returned by the
                   file object's write() method (for a text file, the number of
                   characters written). The row must be a sequence of strings and
                   numbers.
writerows(rows)    Writes multiple rows of data and returns None. rows must be a
                   sequence of rows, each of which is itself a sequence.
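The return values of the two methods can be checked with a short sketch. It assumes an
in-memory io.StringIO buffer in place of a real file, so nothing is written to disk.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer)

count = writer.writerow(['eno', 'name', 'salary'])
result = writer.writerows([[15, 'Anubhav Singh', 45000],
                           [20, 'Shivesh Tripathi', 55000]])

print(count)    # 17: the 15 characters of the line plus '\r\n'
print(result)   # None
print(buffer.getvalue())
```

The count includes the line terminator, which is '\r\n' in the default excel dialect.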

Let's take some examples:

Example 1: Write to a CSV file

# Creating CSV file through Python program
# Using writerow()
import csv

header = ['eno', 'name', 'salary']
rows = [
    [15, 'Anubhav Singh', 45000],
    [20, 'Shivesh Tripathi', 55000],
    [40, 'Abhinav Sagar', 62000],
    [45, 'Dhruv Sharma', 48000],
    [60, 'Aryan Rana', 64000],
]
# newline='' stops extra blank lines appearing between rows on Windows
with open('emp3.csv', 'w', newline='') as f:
    csv_writer = csv.writer(f)
    csv_writer.writerow(header)    # To write header
    for r in rows:
        csv_writer.writerow(r)     # To write one record at a time

Example 2: Write to a CSV file using writerows()

# Creating CSV file through Python program
# Using writerows()
import csv

header = ['eno', 'name', 'salary']
rows = [
    [15, 'Anubhav Singh', 45000],
    [20, 'Shivesh Tripathi', 55000],
    [40, 'Abhinav Sagar', 62000],
    [45, 'Dhruv Sharma', 48000],
    [60, 'Aryan Rana', 64000],
]
# newline='' stops extra blank lines appearing between rows on Windows
with open('emp3.csv', 'w', newline='') as f:
    csv_writer = csv.writer(f)
    csv_writer.writerow(header)    # To write header
    csv_writer.writerows(rows)     # To write all records in one call

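The output generated by both listings is the same. The emp3.csv file contains:

eno,name,salary
15,Anubhav Singh,45000
20,Shivesh Tripathi,55000
40,Abhinav Sagar,62000
45,Dhruv Sharma,48000
60,Aryan Rana,64000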

Example 3: Write to a CSV file (interactive mode)

# Creating CSV file through Python program
# Using interactive mode
import csv

with open("emp.csv", 'w', newline='') as f:
    mywriter = csv.writer(f)
    mywriter.writerow(["eno", "name", "salary"])
    while True:
        eno = int(input("Enter Employee no "))
        name = input("Enter Employee name ")
        salary = int(input("Enter Salary "))
        choice = input("Enter more records Y/N : ")
        mywriter.writerow([eno, name, salary])
        if choice.upper() == 'N':
            break

When we run the above program, an emp.csv file is created containing the header row
followed by one line for each record entered.

In the above program, we have opened the file in writing mode.

Then, we have passed each row as a list. These lists are converted to a delimited string
and written into the CSV file.

Example 4: Writing multiple rows with writerows()

If we need to write the contents of the 2-dimensional list to a CSV file, here's how we
can do it.

# Program to write multiple rows
import csv

with open("stud.csv", 'w', newline='') as f:
    mywriter = csv.writer(f)
    lst = []
    lst.append(["rno", "name", "marks"])
    while True:
        rno = int(input("Enter roll no "))
        name = input("Enter name ")
        marks = int(input("Enter marks "))
        choice = input("Enter more records Y/N : ")
        lst.append([rno, name, marks])
        if choice.upper() == 'N':
            break
    mywriter.writerows(lst)    # Write the header and all records in one call

The stud.csv file will contain the header row followed by one line for each record
entered.

Here, the list lst is passed to the mywriter.writerows() method, which writes the
contents of the list to the CSV file.

Example 5: Writing to a CSV File with Tab Delimiter

# Program to write with tab delimiter
import csv

with open("stud.csv", 'w', newline='') as f:
    mywriter = csv.writer(f, delimiter='\t')
    mywriter.writerow(["rno", "name", "marks"])
    while True:
        rno = int(input("Enter roll no "))
        name = input("Enter name ")
        marks = int(input("Enter marks "))
        choice = input("Enter more records Y/N : ")
        mywriter.writerow([rno, name, marks])
        if choice.upper() == 'N':
            break

Notice the optional parameter delimiter='\t' in the csv.writer() function.
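To see exactly what the tab delimiter produces, here is a non-interactive sketch. It uses
two fixed sample records instead of input(), and writes to a temporary directory under the
hypothetical file name stud_tab.csv so that no existing file is overwritten.

```python
import csv
import os
import tempfile

# Fixed sample records instead of input(), so the sketch runs unattended.
records = [[10, 'Aaditya Bisht', 80], [15, 'Brinda Pathak', 75]]

# A temporary directory is used here only to avoid touching real files.
path = os.path.join(tempfile.mkdtemp(), 'stud_tab.csv')
with open(path, 'w', newline='') as f:
    mywriter = csv.writer(f, delimiter='\t')
    mywriter.writerow(['rno', 'name', 'marks'])
    mywriter.writerows(records)

# Read the raw text back; repr() makes the tabs and line endings visible.
with open(path, newline='') as f:
    content = f.read()
print(repr(content))
```

In the printed text, each field is separated by \t rather than a comma.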

Reading a CSV File with reader()


The reader() function takes a file object and returns a csv.reader object that can be
used to iterate over the contents of a CSV file. The syntax of reader() function is as
follows:

Syntax: reader(fileobj[,dialect='excel'[, **fmtparam]])

Argument               Description
fileobj (required)     The file object to read from.
dialect (optional)     Dialect refers to the different ways of formatting the CSV
                       document. By default, the csv module uses the same format as
                       Microsoft Excel.
fmtparam (optional)    Formatting parameters: a set of keyword arguments that
                       customize the dialect.

Reading CSV files Using csv.reader()

Suppose we have a csv file named student.csv in the current directory with the
following entries.

SN,Name,City
1,Amit Gupta,Mumbai
2,Shriya Jha,Delhi
3,Dev Pathak,Chandigarh
4,Manav Chauhan,Lucknow
5,Sakshi Rawat,Chennai

Let's read this file using csv.reader():

Example 6: Read CSV file Having Comma Delimiter

# Reading csv file with comma delimiter
import csv

with open("student.csv", 'r', newline='') as f:
    reader = csv.reader(f)
    for r in reader:
        print(r)    # Each row is returned as a list of strings

Output:

['SN', 'Name', 'City']
['1', 'Amit Gupta', 'Mumbai']
['2', 'Shriya Jha', 'Delhi']
['3', 'Dev Pathak', 'Chandigarh']
['4', 'Manav Chauhan', 'Lucknow']
['5', 'Sakshi Rawat', 'Chennai']

Here, we have opened the student.csv file in reading mode. Then, csv.reader() is used
to read the file; it returns an iterable reader object. The reader object is then
iterated using a for loop to print the contents of each row.

In the above example, we are using the csv.reader() function in its default mode, for
CSV files having a comma delimiter. However, the function is much more customizable.
Suppose our CSV file was using tab as a delimiter.

SN Name City
1 Amit Gupta Mumbai
2 Shriya Jha Delhi
3 Dev Pathak Chandigarh
4 Manav Chauhan Lucknow
5 Sakshi Rawat Chennai

To read such files, we can pass optional parameters to the csv.reader() function.
Let's take an example.

Example 7: Read CSV file Having Tab Delimiter

# Reading csv file with tab delimiter
import csv

with open("student.csv", 'r', newline='') as f:
    reader = csv.reader(f, delimiter='\t')
    for r in reader:
        print(r)

Output:
['SN', 'Name', 'City']
['1', 'Amit Gupta', 'Mumbai']
['2', 'Shriya Jha', 'Delhi']
['3', 'Dev Pathak', 'Chandigarh']
['4', 'Manav Chauhan', 'Lucknow']
['5', 'Sakshi Rawat', 'Chennai']

Notice the optional parameter delimiter='\t' in the above example.
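Every value returned by csv.reader() is a string, and the header comes back as an ordinary
row. A common pattern, sketched here on an in-memory copy of the first three student.csv
records, is to take the header with next() and convert the numeric column by hand. The
variable names are choices made for this illustration.

```python
import csv
import io

# In-memory copy of the student.csv sample (first three records).
data = "SN,Name,City\n1,Amit Gupta,Mumbai\n2,Shriya Jha,Delhi\n3,Dev Pathak,Chandigarh\n"

reader = csv.reader(io.StringIO(data))
header = next(reader)               # consume the header row first
students = [(int(sn), name, city) for sn, name, city in reader]

print(header)    # ['SN', 'Name', 'City']
print(students)
```

After next(reader), the for loop (here a list comprehension) sees only the data rows.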

Python csv.DictWriter() Class


The objects of the csv.DictWriter() class can be used to write to a CSV file from a
Python dictionary.

The minimal syntax of the csv.DictWriter() class is:

csv.DictWriter(file, fieldnames)

Here,
• file - the file object of the CSV file we want to write to
• fieldnames - a list object which should contain the column headers, specifying the
order in which data should be written in the CSV file
Example 8: Python csv.DictWriter()

# Program using DictWriter()
import csv

with open('stud1.csv', 'w', newline='') as f:
    fieldnames = ['rno', 'name', 'marks']
    mywriter = csv.DictWriter(f, fieldnames=fieldnames)
    mywriter.writeheader()    # Writes the field names as the first row
    mywriter.writerow({'rno': '10', 'name': 'Ananya Pandey', 'marks': 85})
    mywriter.writerow({'rno': '20', 'name': 'Divyansh Kumar', 'marks': 70})
    mywriter.writerow({'rno': '30', 'name': 'Swayam Khanduri', 'marks': 80})

The program creates a stud1.csv file with the following entries:

rno,name,marks
10,Ananya Pandey,85
20,Divyansh Kumar,70
30,Swayam Khanduri,80
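Beyond the single-row calls in Example 8, DictWriter also has a writerows() method and a
restval parameter that supplies a value for missing keys. The sketch below is an
illustration with made-up records; the 'NA' placeholder is an arbitrary choice, and an
in-memory buffer stands in for the file.

```python
import csv
import io

# Hypothetical records; the second one has no 'marks' key.
records = [
    {'rno': 10, 'name': 'Ananya Pandey', 'marks': 85},
    {'rno': 20, 'name': 'Divyansh Kumar'},
]

buffer = io.StringIO()
mywriter = csv.DictWriter(buffer, fieldnames=['rno', 'name', 'marks'], restval='NA')
mywriter.writeheader()
mywriter.writerows(records)     # writes every dictionary in the list
output = buffer.getvalue()
print(output)
```

The second record is written as 20,Divyansh Kumar,NA because restval fills the missing
marks column.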

Python csv.DictReader() Class

The objects of the csv.DictReader() class can be used to read a CSV file as a
dictionary.

Example 9: Python csv.DictReader()

Suppose we have the same file student.csv as in Example 6. Let's see how
csv.DictReader() can be used.

# Program using csv.DictReader()
import csv

with open("student.csv", 'r', newline='') as f:
    csv_file = csv.DictReader(f)
    for r in csv_file:
        print(dict(r))    # The first row of the file supplies the keys

Output:

{'SN': '1', 'Name': 'Amit Gupta', 'City': 'Mumbai'}
{'SN': '2', 'Name': 'Shriya Jha', 'City': 'Delhi'}
{'SN': '3', 'Name': 'Dev Pathak', 'City': 'Chandigarh'}
{'SN': '4', 'Name': 'Manav Chauhan', 'City': 'Lucknow'}
{'SN': '5', 'Name': 'Sakshi Rawat', 'City': 'Chennai'}

As we can see, the entries of the first row are the dictionary keys, and the entries in
the other rows are the dictionary values.
Here, csv_file is a csv.DictReader() object, which can be iterated over using a for
loop. In Python 3.6 and 3.7, csv.DictReader() returned an OrderedDict for each row,
which is why dict() is used inside the for loop to convert each row to a plain
dictionary:
print(dict(r))
Note: Starting from Python 3.8, csv.DictReader() returns a dictionary for each row,
and we do not need to use dict() explicitly.
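When a file has no header row, the keys can be supplied through the fieldnames parameter of
csv.DictReader(). The following is a small sketch on made-up, headerless employee data held
in memory; the column names are assumptions chosen for this illustration.

```python
import csv
import io

# Hypothetical headerless data: employee number, name, salary.
data = "15,Anubhav Singh,45000\n20,Shivesh Tripathi,55000\n"

reader = csv.DictReader(io.StringIO(data), fieldnames=['eno', 'name', 'salary'])
employees = [dict(row) for row in reader]

# Values are still strings, so convert before doing arithmetic.
total_salary = sum(int(e['salary']) for e in employees)

print(employees)
print(total_salary)  # 100000
```

With fieldnames given, the first line of the data is treated as a record rather than as
the header.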
