12 Ip

Uploaded by

Vimala Rajendran

CLASS NOTES

Class: XII                      Date: 08/07/2020
Subject: Informatics Practices
Topic: Importing/Exporting Data Between CSV Files/MySQL and Pandas

1. Transferring Data between .csv files and DataFrames


CSV stands for Comma-Separated Values.
A CSV file stores tabular data as plain text, with the values in each row separated by commas.

Advantages of CSV file:


1. A simple, compact and ubiquitous format for data storage.
2. A common format for data interchange.
3. It can be opened in spreadsheet programs such as MS Excel and Calc.
4. Nearly all spreadsheets and databases support import/export to CSV format.

Loading data from csv to DataFrames


Steps :
1. Prepare a file in Notepad/Excel and save it in CSV file format.
2. If the file was typed in Notepad, ensure that the values are separated by commas.
3. Open the Python shell window and import pandas.
4. Now issue the following statement to create a DataFrame from the CSV file:
DF = pandas.read_csv(<filepath>)
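The steps above can be sketched end to end as follows. This is only a minimal illustration: the file name marks.csv and its contents are made up for the example, and the file is written to a temporary folder so that the code runs on any machine.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data; the file name and values are illustrative only.
csv_text = "Name,Marks\nAsha,82\nRavi,74\n"

# Steps 1-2: save comma-separated text as a .csv file.
folder = tempfile.mkdtemp()
filepath = os.path.join(folder, "marks.csv")
with open(filepath, "w") as f:
    f.write(csv_text)

# Steps 3-4: with pandas imported, load the file into a DataFrame.
DF = pd.read_csv(filepath)
print(DF)
```

By default read_csv() treats the first line of the file as the column header, so here the DataFrame gets the columns Name and Marks and two data rows.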

To specify your own column names in the DataFrame


If the column names are missing in the CSV file, we can specify the column headers
of the DataFrame by using the names argument.
DF = pandas.read_csv(<filepath>, names=<Sequence containing column names>)

Example :
import pandas as pd
df = pd.read_csv("C:\\Users\\Desktop\\covid19.csv")

Output : Without column headers (the first data row is taken as the header)
   1  Maharashtra  174761  75979  90911
0  2    TamilNadu   94049  39859  52926
1  3        Delhi   89802  27007  59992
2  4      Gujraat   32643   7125  23670

df = pd.read_csv("C:\\Users\\Desktop\\covid19.csv", names=['sno', 'State', 'Total', 'Active', 'Recovered'])

Output : With specified column headers
   sno        State   Total  Active  Recovered
0    1  Maharashtra  174761   75979      90911
1    2    TamilNadu   94049   39859      52926
2    3        Delhi   89802   27007      59992
3    4      Gujraat   32643    7125      23670

The header=None argument

With header=None, no row of the file is used as the header and the columns are numbered 0, 1, 2, ... instead.

df = pd.read_csv("C:\\Users\\Desktop\\covid19.csv", header=None)

Output : With header=None argument
   0            1       2      3      4
0  1  Maharashtra  174761  75979  90911
1  2    TamilNadu   94049  39859  52926
2  3        Delhi   89802  27007  59992
3  4      Gujraat   32643   7125  23670
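The three calls shown above can be compared side by side. The sketch below assumes that covid19.csv holds the four rows from the example as comma-separated text; it reads them from an in-memory string (io.StringIO) instead of the file on disk, so it can be run without the file.

```python
import io

import pandas as pd

# The four rows of the covid19.csv example, assumed to be comma-separated
# and to have no header line.
data = (
    "1,Maharashtra,174761,75979,90911\n"
    "2,TamilNadu,94049,39859,52926\n"
    "3,Delhi,89802,27007,59992\n"
    "4,Gujraat,32643,7125,23670\n"
)

# Default: the first data row is consumed as the header, leaving 3 rows.
df_default = pd.read_csv(io.StringIO(data))

# names=: all 4 rows are kept and the columns get our own labels.
cols = ["sno", "State", "Total", "Active", "Recovered"]
df_named = pd.read_csv(io.StringIO(data), names=cols)

# header=None: all 4 rows are kept and the columns are numbered 0 to 4.
df_numbered = pd.read_csv(io.StringIO(data), header=None)

print(len(df_default), len(df_named), len(df_numbered))
print(list(df_numbered.columns))
```

The row counts show why the default call is a problem for a file without a header line: one row of data is lost to the header.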

"If the CSV file has some column headings but you don't want to use them as header"
Solution: Use the skiprows argument with the read_csv() function.
The skiprows argument can take either a number, giving the number of rows to be skipped from
the beginning of the file, or a list of row numbers to be skipped. Row numbers in the list are
counted from 0, so 0 is the first line of the file. E.g., to skip the rows numbered 3, 5 and 7, write :
skiprows = [3,5,7]

So, to skip the default header and apply our own header row, use the following code :

df = pd.read_csv(<filepath>, names=<Sequence>, skiprows=<value>)
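Both forms of skiprows can be tried on a small sample. The data below is a hypothetical two-column file used only for illustration, read from an in-memory string so the code is self-contained; the new column names ID and Name are also made up.

```python
import io

import pandas as pd

# Hypothetical file contents: one header line followed by five data lines.
data = (
    "Empno,Ename\n"    # line 0 (header)
    "2001,Tanisha\n"   # line 1
    "2002,Gaurav\n"    # line 2
    "2003,Rajani\n"    # line 3
    "2004,Lokesh\n"    # line 4
    "2005,Anjali\n"    # line 5
)

# An integer skips that many lines from the top, here only the header line.
# names= then supplies the replacement header.
df_renamed = pd.read_csv(io.StringIO(data), names=["ID", "Name"], skiprows=1)

# A list skips the lines with those 0-based numbers; line 0 is not skipped,
# so it is still used as the header.
df_some = pd.read_csv(io.StringIO(data), skiprows=[1, 3])

print(df_renamed)
print(df_some)
```

In the second call the rows for 2001 and 2003 are dropped, which confirms that the numbers in the list refer to lines of the file counted from 0.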


Get DataFrame index labels from CSV
If you want to use a column of the CSV file as the index of the DataFrame, use the
index_col argument.
df = pd.read_csv(<filepath>, index_col=<columnName>)
Example :
Write a program to read the CSV file Employee.csv and create a DataFrame from it.
Also set the column Empno as the index label.
Contents of Employee.csv file

Empno  Ename    Desig       Salary
2001   Tanisha  Manager     23000
2002   Gaurav   Programmer  20000
2003   Rajani   Assistant   18000
2004   Lokesh   Clerk       15000
2005   Anjali   Clerk       16000

>>> import pandas as pd
>>> D5 = pd.read_csv("E:\\online Classes July2020\\Employee.csv")
>>> D5
   Empno    Ename       Desig  Salary
0   2001  Tanisha     Manager   23000
1   2002   Gaurav  Programmer   20000
2   2003   Rajani   Assistant   18000
3   2004   Lokesh       Clerk   15000
4   2005   Anjali       Clerk   16000

To set the Empno column as the index label, issue the following command at the prompt :

>>> D5 = pd.read_csv("E:\\online Classes July2020\\Employee.csv", index_col='Empno')
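The effect of index_col can be checked with a short runnable sketch. It assumes Employee.csv is comma-separated with the contents listed above and reads them from an in-memory string rather than from the E: drive. The .loc lookup at the end goes slightly beyond the notes and is included only to show what the index is useful for.

```python
import io

import pandas as pd

# Contents of Employee.csv from the example, assumed to be comma-separated.
data = (
    "Empno,Ename,Desig,Salary\n"
    "2001,Tanisha,Manager,23000\n"
    "2002,Gaurav,Programmer,20000\n"
    "2003,Rajani,Assistant,18000\n"
    "2004,Lokesh,Clerk,15000\n"
    "2005,Anjali,Clerk,16000\n"
)

D5 = pd.read_csv(io.StringIO(data), index_col="Empno")

# Empno is now the index, not a regular column.
print(D5.index.name)
print(list(D5.columns))

# Rows can be looked up directly by employee number with .loc.
print(D5.loc[2003, "Ename"])
```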


Practice Question: Referring to the previous example, create a DataFrame that does not use the
header row of the file but uses column numbers 0, 1, 2, ... instead.

Answer :

import pandas as pd

D5 = pd.read_csv("E:\\online Classes July2020\\Employee.csv", header=None, skiprows=1)

To specify a column of csv file as index labels


Referring to the previous example, the DataFrame read from the CSV file is :
   Empno    Ename       Desig  Salary
0   2001  Tanisha     Manager   23000
1   2002   Gaurav  Programmer   20000
2   2003   Rajani   Assistant   18000
3   2004   Lokesh       Clerk   15000
4   2005   Anjali       Clerk   16000
Now we want the column Empno to become the index label. To do this, issue the
following command at the prompt :
>>> D6 = pd.read_csv("E:\\online Classes July2020\\Employee.csv", index_col="Empno")
>>> D6
         Ename       Desig  Salary
Empno
2001   Tanisha     Manager   23000
2002    Gaurav  Programmer   20000
2003    Rajani   Assistant   18000
2004    Lokesh       Clerk   15000
2005    Anjali       Clerk   16000

Practice Question: Using the following CSV file, create a DataFrame with the desired
column names and print the highest salary value :
   Empno    Ename       Desig  Salary
0   2001  Tanisha     Manager   23000
1   2002   Gaurav  Programmer   20000
2   2003   Rajani   Assistant   18000
3   2004   Lokesh       Clerk   15000
4   2005   Anjali       Clerk   16000
>>> D7 = pd.read_csv("E:\\online Classes July2020\\Employee.csv", names=["EmpID","Name","Desig","Sal"], skiprows=1)
>>> D7
   EmpID     Name       Desig    Sal
0   2001  Tanisha     Manager  23000
1   2002   Gaurav  Programmer  20000
2   2003   Rajani   Assistant  18000
3   2004   Lokesh       Clerk  15000
4   2005   Anjali       Clerk  16000
>>> print("The Maximum Salary is :", D7.Sal.max())
The Maximum Salary is : 23000
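A self-contained version of this answer is sketched below, again assuming comma-separated contents for Employee.csv and reading them from an in-memory string. It also shows, as an illustration that is not part of the original answer, what happens when skiprows=1 is left out.

```python
import io

import pandas as pd

# Contents of Employee.csv from the example, assumed to be comma-separated.
data = (
    "Empno,Ename,Desig,Salary\n"
    "2001,Tanisha,Manager,23000\n"
    "2002,Gaurav,Programmer,20000\n"
    "2003,Rajani,Assistant,18000\n"
    "2004,Lokesh,Clerk,15000\n"
    "2005,Anjali,Clerk,16000\n"
)
new_names = ["EmpID", "Name", "Desig", "Sal"]

# skiprows=1 drops the file's own header so names= can replace it.
D7 = pd.read_csv(io.StringIO(data), names=new_names, skiprows=1)
print("The Maximum Salary is :", D7.Sal.max())

# Without skiprows=1 the old header line is read as an ordinary data row,
# so the Sal column starts with the text "Salary" and is no longer numeric.
D7_wrong = pd.read_csv(io.StringIO(data), names=new_names)
print(len(D7), len(D7_wrong))
print(D7_wrong.loc[0, "Sal"])
```

This is why names and skiprows are used together whenever the file already has a header line that is to be replaced.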
