Dataframe Notes
Introduction to DataFrame
A DataFrame is a two-dimensional labelled data structure, like a spreadsheet or a table in a database.
It contains rows and columns, and therefore has both a row index and a column index.
Each column can hold a different type of value, such as numeric, string, Boolean, etc.
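The description above can be illustrated with a minimal sketch. The sample values and the Passed column are assumptions made only for illustration.

```python
import pandas as pd

# Hypothetical sample data: each key becomes a column,
# each list holds that column's values.
data = {
    'Name': ['Jaya', 'Abi', 'Kavi'],   # string column
    'Marks': [87, 87, 76],             # integer column
    'Passed': [True, True, False],     # Boolean column
}
df = pd.DataFrame(data)

print(df)
print(df.index)    # row index (default labels 0, 1, 2)
print(df.columns)  # column index
print(df.dtypes)   # one data type per column
```

Each column keeps its own data type, while the row index and the column index label the two dimensions.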
Series: ... you want to add or remove an element, internally a new Series object will be created.
DataFrame: ... change in place. If you want to add/remove an element, it will change in the existing DataFrame object.
4. Creation of DataFrame from numpy ndarrays with custom index label
CODE
import pandas as pd
import numpy as np
arr=np.array([11,12,13,14])
arr1=np.array([1,2,3,4])
arr2=np.array([101,102,103,104])
df=pd.DataFrame([arr,arr1,arr2],index=['R1','R2','R3'],
columns=['C1','C2','C3','C4'])
print(df)
OUTPUT
C1 C2 C3 C4
R1 11 12 13 14
R2 1 2 3 4
R3 101 102 103 104
CODE
import pandas as pd
D1={'Name':'Jaya','Marks':87}
D2={'Name':'Abi','Age':17,'Marks':87}
D3={'Name':'Kavi','Age':18,'Marks':76}
l=[D1,D2,D3]
df=pd.DataFrame(l)
print(df)
OUTPUT
Name Marks Age
0 Jaya 87 NaN
1 Abi 87 17.0
2 Kavi 76 18.0
Note: Dictionary keys become column labels by default in a DataFrame, and each dictionary in the list becomes a row. A key that is missing from a dictionary gives NaN in that row.
6. Creation of DataFrame from List of Dictionaries with custom index value for rows
CODE
import pandas as pd
D1={'Name':'sai','Marks':87}
D2={'Name':'Abi','Age':17,'Marks':87}
D3={'Name':'Kavi','Age':18,'Marks':76}
l=[D1,D2,D3]
df=pd.DataFrame(l,index=['R1','R2','R3'])
print(df)
OUTPUT
Name Marks Age
R1 sai 87 NaN
R2 Abi 87 17.0
R3 Kavi 76 18.0
7. Creation of DataFrame from List of Dictionaries with custom index value for columns
CODE
import pandas as pd
D1={'Name':'Jaya','Marks':87}
D2={'Name':'Abi','Age':17,'Marks':87}
D3={'Name':'Kavi','Age':18,'Marks':76}
L=[D1,D2,D3]
df=pd.DataFrame(L,index=['R1','R2','R3'],
columns=['a1','a2','a3'])
print(df)
OUTPUT
a1 a2 a3
R1 NaN NaN NaN
R2 NaN NaN NaN
R3 NaN NaN NaN
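The all-NaN result above arises because, for dictionary input, the columns argument selects existing keys rather than renaming them, and 'a1', 'a2', 'a3' match no key. A minimal sketch of the two behaviours follows. The mapping of a1, a2, a3 to Name, Marks, Age is an assumption made only for illustration.

```python
import pandas as pd

rows = [
    {'Name': 'Jaya', 'Marks': 87},
    {'Name': 'Abi', 'Age': 17, 'Marks': 87},
    {'Name': 'Kavi', 'Age': 18, 'Marks': 76},
]

# columns= picks (and orders) existing keys; it does not rename them.
picked = pd.DataFrame(rows, index=['R1', 'R2', 'R3'], columns=['Marks', 'Name'])
print(picked)

# To get new labels, build the DataFrame first and then rename the columns.
# The mapping below is hypothetical.
renamed = pd.DataFrame(rows, index=['R1', 'R2', 'R3'])
renamed = renamed.rename(columns={'Name': 'a1', 'Marks': 'a2', 'Age': 'a3'})
print(renamed)
```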
10. Creation of DataFrame from Dictionary of List with custom index value for rows
CODE
import pandas as pd
N=['jaya','bala','krish']
A=[14,17,15]
M=[98,78,68]
D={'Name':N,'Age':A,'Marks':M}
df=pd.DataFrame(D,index=['R1','R2','R3'])
print(df)
OUTPUT
Name Age Marks
R1 jaya 14 98
R2 bala 17 78
R3 krish 15 68
11. Creation of DataFrame from Dictionary of List with custom index value for columns
CODE
import pandas as pd
N=['jaya','bala','krish']
A=[14,17,15]
M=[98,78,68]
D={'Name':N,'Age':A,'Marks':M}
df=pd.DataFrame(D,index=['R1','R2','R3'],
columns=['a1','a2','a3'])
print(df)
OUTPUT
a1 a2 a3
R1 NaN NaN NaN
R2 NaN NaN NaN
R3 NaN NaN NaN
13. Creation of DataFrame from Series (includes dtype)
To create a DataFrame from more than one Series, we need to pass the Series objects in a list.
The labels in the Series objects become the column names in the DataFrame object.
Each Series becomes a row in the DataFrame.
If a particular Series does not have a corresponding value for a label, NaN is inserted in the DataFrame column.
CODE
import pandas as pd
L=[14,17,15]
s=pd.Series(L)
print(s)
OUTPUT
0 14
1 17
2 15
dtype: int64
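The NaN rule described above can be seen in a small sketch. The three Series and their labels are hypothetical and chosen so that each one is missing a different label.

```python
import pandas as pd

# Hypothetical Series with partly different labels.
s1 = pd.Series([11, 12, 13], index=['a', 'b', 'c'])
s2 = pd.Series([21, 22], index=['a', 'b'])           # no value for label 'c'
s3 = pd.Series([31, 33, 34], index=['a', 'c', 'd'])  # no value for 'b', adds 'd'

# Each Series becomes one row; the labels become the column names.
df = pd.DataFrame([s1, s2, s3])
print(df)
```

Every label that appears in any Series becomes a column, and a row shows NaN wherever its Series had no value for that label.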
15. Creation of DataFrame from Series (dtype not included) with Custom index label for rows
CODE
import pandas as pd
s1=pd.Series([11,12,13,14,15])
s2=pd.Series([1,2,3,4,5])
s3=pd.Series([111,122,133,144,155])
s4=pd.Series([21,22,23,24,9])
df=pd.DataFrame([s1,s2,s3,s4],index=['a','b','c','d'])
print(df)
OUTPUT
0 1 2 3 4
a 11 12 13 14 15
b 1 2 3 4 5
c 111 122 133 144 155
d 21 22 23 24 9
16. Creation of DataFrame from Series with Custom index label for columns
CODE
import pandas as pd
s1=pd.Series([11,12,13,14,15],index=['a','b','c','d','e'])
s2=pd.Series([1,2,3,4,5],index=['a','b','c','d','e'])
s3=pd.Series([31,32,33,34,35],index=['a','b','c','d','e'])
s4=pd.Series([21,22,23,24,45],index=['a','b','c','d','e'])
df=pd.DataFrame([s1,s2,s3,s4])
print(df)
OUTPUT
a b c d e
0 11 12 13 14 15
1 1 2 3 4 5
2 31 32 33 34 35
3 21 22 23 24 45
17. Creation of DataFrame from Series with Custom index label for columns and rows
CODE
import pandas as pd
s1=pd.Series([11,12,13,14,15],index=['a','b','c','d','e'])
s2=pd.Series([1,2,3,4,5],index=['a','b','c','d','e'])
s3=pd.Series([41,42,43,44,45],index=['a','b','c','d','e'])
s4=pd.Series([21,22,23,24,45],index=['a','b','c','d','e'])
df=pd.DataFrame([s1,s2,s3,s4],index=['aa','bb','cc','dd'])
print(df)
OUTPUT
a b c d e
aa 11 12 13 14 15
bb 1 2 3 4 5
cc 41 42 43 44 45
dd 21 22 23 24 45
18. Creation of DataFrame from Series (includes dtype) with Custom index label for rows
CODE
import pandas as pd
L=[14,17,15]
s=pd.Series(L,index=['R1','R2','R3'])
print(s)
OUTPUT
R1 14
R2 17
R3 15
dtype: int64
19. Creation of DataFrame from Series (includes dtype) with Custom index label for column
CODE
import pandas as pd
L=[14,17,15]
s=pd.Series(L,index=['R1','R2','R3'])
df=pd.DataFrame(s,columns=['C1'])
print(df)
OUTPUT
C1
R1 14
R2 17
R3 15
CODE
import pandas as pd
D1={'Name':'Jaya','Marks':87}
D2={'Name':'Abi','Age':17,'Marks':87}
D3={'Name':'Kavi','Age':18,'Marks':76}
DD={"Humanities":D1,"Medical":D2,"Non Med":D3}
df=pd.DataFrame(DD)
print(df)
OUTPUT
Humanities Medical Non Med
Name Jaya Abi Kavi
Marks 87 87 76
Age NaN 17 18
The keys of the outer dictionary become the column labels, and the keys of the inner dictionaries become the index (row) labels.
CODE
#Select options in rows and columns
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df)
OUTPUT
Name Age Marks Subject
R1 jaya 14 98 cs
R2 bala 17 78 bio
R3 krish 15 68 pe
R4 sakthi 15 65 ip
R5 abi 13 87 cs
R6 bharathi 14 98 ip
R7 geetha 13 76 bio
R8 sandhya 12 65 cs
Syntax: DataFrameObject[ColumnName]
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df['Name'])
print(df['Marks'])
OUTPUT
R1 jaya
R2 bala
R3 krish
R4 sakthi
R5 abi
R6 bharathi
R7 geetha
R8 sandhya
Name: Name, dtype: object
R1 98
R2 78
R3 68
R4 65
R5 87
R6 98
R7 76
R8 65
Name: Marks, dtype: int64
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df['Marks'])
OUTPUT
R1 98
R2 78
R3 68
R4 65
R5 87
R6 98
R7 76
R8 65
Name: Marks, dtype: int64
Syntax: DataFrameObject.ColumnName
Note: When using dot notation, the column name is written without quotes.
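Dot notation only works when the column label is a valid Python identifier that does not clash with an existing DataFrame attribute or method. A minimal sketch follows. The 'Non Med' and 'count' columns are hypothetical and chosen to show both limits.

```python
import pandas as pd

# Hypothetical data: one label contains a space,
# another matches the name of a DataFrame method.
df = pd.DataFrame({'Name': ['jaya', 'bala'], 'Non Med': [1, 0], 'count': [5, 7]})

print(df.Name)        # works: 'Name' is a valid identifier
print(df['Non Med'])  # a label with a space needs square brackets
print(df['count'])    # the column; df.count refers to the method instead

print(callable(df.count))  # True: dot access returned the method, not the column
```

Square brackets work for every column label, so they are the safer choice when labels are not known in advance.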
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.Name)
OUTPUT
R1 jaya
R2 bala
R3 krish
R4 sakthi
R5 abi
R6 bharathi
R7 geetha
R8 sandhya
Name: Name, dtype: object
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df[['Name','Age']])
OUTPUT
Name Age
R1 jaya 14
R2 bala 17
R3 krish 15
R4 sakthi 15
R5 abi 13
R6 bharathi 14
R7 geetha 13
R8 sandhya 12
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.loc['R2':'R4']) #includes stop value
print(df.loc['R2':'R4',:]) #includes stop value
OUTPUT
Name Age Marks Subject
R2 bala 17 78 bio
R3 krish 15 68 pe
R4 sakthi 15 65 ip
Name Age Marks Subject
R2 bala 17 78 bio
R3 krish 15 68 pe
R4 sakthi 15 65 ip
Method-2
loc is used to select rows, columns, or a combination of rows and columns from the DataFrame by label.
Syntax:
DataFrameObject.loc[StartRow:EndRow:StepValue,StartColumn:EndColumn:StepValue]
Note: Unlike ordinary Python slicing, a loc slice includes the stop label.
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.loc['Name':'Marks']) #row slice: 'Name' and 'Marks' are not row labels, so no rows are selected
print(df.loc[:,'Name':'Marks']) #column slice from 'Name' to 'Marks' for all rows
OUTPUT
Empty DataFrame
Columns: [Name, Age, Marks, Subject]
Index: []
Name Age Marks
R1 jaya 14 98
R2 bala 17 78
R3 krish 15 68
R4 sakthi 15 65
R5 abi 13 87
R6 bharathi 14 98
R7 geetha 13 76
R8 sandhya 12 65
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.loc['R2':'R4',:])
OUTPUT
Name Age Marks Subject
R2 bala 17 78 bio
R3 krish 15 68 pe
R4 sakthi 15 65 ip
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.loc['R5'])
OUTPUT
Name abi
Age 13
Marks 87
Subject cs
Name: R5, dtype: object
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.loc[:,'Name':'Subject':2]) #Columns
print("************************************")
print(df.loc['R1':'R7':3]) #rows
OUTPUT
Name Marks
R1 jaya 98
R2 bala 78
R3 krish 68
R4 sakthi 65
R5 abi 87
R6 bharathi 98
R7 geetha 76
R8 sandhya 65
************************************
Name Age Marks Subject
R1 jaya 14 98 cs
R4 sakthi 15 65 ip
R7 geetha 13 76 bio
KeyError
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
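D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df['R2']) #KeyError: 'R2' is a row label, not a column label
print(df[['R2','R3']]) #KeyError: row labels cannot be used to select columns
print(df.loc['Name']) #KeyError: 'Name' is a column label, not a row label
OUTPUT
KeyError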
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.loc[['R5']]) #a list of row labels returns a DataFrame
print("************************************")
print(df.loc['R3']) #a single row label returns a Series
OUTPUT
Name Age Marks Subject
R5 abi 13 87 cs
************************************
Name krish
Age 15
Marks 68
Subject pe
Name: R3, dtype: object
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.loc['R2':'R6','Name':'Marks'])
print("************************************")
print(df.loc['R2':'R6',['Name','Marks']])
print("************************************")
print(df.loc['R2':'R6',['Name','Subject']])
print("************************************")
print(df.loc[['R2','R4'],['Name','Subject']])
print("************************************")
print(df.loc['R1':'R7':2,'Name':'Marks':2])
print("************************************")
print(df.loc['R2':'R5'])
OUTPUT
Name Age Marks
R2 bala 17 78
R3 krish 15 68
R4 sakthi 15 65
R5 abi 13 87
R6 bharathi 14 98
************************************
Name Marks
R2 bala 78
R3 krish 68
R4 sakthi 65
R5 abi 87
R6 bharathi 98
************************************
Name Subject
R2 bala bio
R3 krish pe
R4 sakthi ip
R5 abi cs
R6 bharathi ip
************************************
Name Subject
R2 bala bio
R4 sakthi ip
************************************
Name Marks
R1 jaya 98
R3 krish 68
R5 abi 87
R7 geetha 76
************************************
Name Age Marks Subject
R2 bala 17 78 bio
R3 krish 15 68 pe
R4 sakthi 15 65 ip
R5 abi 13 87 cs
If we want to extract a subset from a DataFrame using the numeric row and column index/position, then we can use iloc.
Syntax:
Df.iloc[StartRowIndex:EndRowIndex:StepValue,StartColumnIndex:EndColumnIndex:StepValue]
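The main practical difference between loc and iloc slices is how the stop value is treated. A minimal sketch follows, using a hypothetical four-row subset of the student data from these notes.

```python
import pandas as pd

# Hypothetical four-row subset in the same style as the examples in these notes.
df = pd.DataFrame(
    {'Name': ['jaya', 'bala', 'krish', 'sakthi'],
     'Age': [14, 17, 15, 15],
     'Marks': [98, 78, 68, 65]},
    index=['R1', 'R2', 'R3', 'R4'])

by_label = df.loc['R2':'R3', 'Name':'Age']  # label slice: stop labels are included
by_position = df.iloc[1:3, 0:2]             # position slice: stop positions are excluded

print(by_label)
print(by_position)
print(by_label.equals(by_position))  # True: both select rows R2-R3, columns Name-Age
```

To reach row R3 (position 2), the iloc slice has to stop at 3, while the loc slice names 'R3' directly.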
Display Rows at Index
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.iloc[:,0:2])
print("************************************")
print(df.iloc[0:3])
print("************************************")
print(df.iloc[0:4,0:2])
print("************************************")
print(df.iloc[2:6,0:3])
OUTPUT
Name Age
R1 jaya 14
R2 bala 17
R3 krish 15
R4 sakthi 15
R5 abi 13
R6 bharathi 14
R7 geetha 13
R8 sandhya 12
************************************
Name Age Marks Subject
R1 jaya 14 98 cs
R2 bala 17 78 bio
R3 krish 15 68 pe
************************************
Name Age
R1 jaya 14
R2 bala 17
R3 krish 15
R4 sakthi 15
************************************
Name Age Marks
R3 krish 15 68
R4 sakthi 15 65
R5 abi 13 87
R6 bharathi 14 98
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.iloc[[1,5],[0,2]])
print("####################################")
print(df.iloc[[1,5],0:3:2])
print("************************************")
print(df.iloc[0:4:2,0:2])
print("************************************")
print(df.iloc[:])
print("************************************")
OUTPUT
Name Marks
R2 bala 78
R6 bharathi 98
####################################
Name Marks
R2 bala 78
R6 bharathi 98
************************************
Name Age
R1 jaya 14
R3 krish 15
************************************
Name Age Marks Subject
R1 jaya 14 98 cs
R2 bala 17 78 bio
R3 krish 15 68 pe
R4 sakthi 15 65 ip
R5 abi 13 87 cs
R6 bharathi 14 98 ip
R7 geetha 13 76 bio
R8 sandhya 12 65 cs
************************************
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
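df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.iloc[2:5])
print("************************************")
print(df.iloc[2:5,:])
print("************************************")
print(df.iloc[2:5,[0,3]])
print("************************************")
print(df.iloc[2:5,['Name','Marks']]) #IndexError: iloc accepts only integer positions, not labels
print("************************************")
OUTPUT
Name Age Marks Subject
R3 krish 15 68 pe
R4 sakthi 15 65 ip
R5 abi 13 87 cs
************************************
Name Age Marks Subject
R3 krish 15 68 pe
R4 sakthi 15 65 ip
R5 abi 13 87 cs
************************************
Name Subject
R3 krish pe
R4 sakthi ip
R5 abi cs
************************************
IndexError: .iloc requires numeric indexers, got ['Name' 'Marks']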
Selecting / Accessing Individual Value
(i) Give either the row label or the numeric index in square brackets after the column name
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.Name[3])
print("************************************")
print(df.Name[3])
print("************************************")
print(df.Marks[2])
print("************************************")
print(df.Marks['R5'])
print("************************************")
print(df.Marks[2])
print("************************************")
OUTPUT
sakthi
************************************
sakthi
************************************
68
************************************
87
************************************
68
************************************
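When the row labels are strings, an integer inside the square brackets (as in df.Name[3]) is treated as a position. Recent pandas versions warn that this fallback is deprecated, so it may be safer to state the intent explicitly. A minimal sketch, assuming a four-row subset of the data above:

```python
import pandas as pd

# Hypothetical subset of the data used in these notes.
df = pd.DataFrame(
    {'Name': ['jaya', 'bala', 'krish', 'sakthi'],
     'Marks': [98, 78, 68, 65]},
    index=['R1', 'R2', 'R3', 'R4'])

print(df.Name.loc['R4'])  # by row label
print(df.Name.iloc[3])    # by integer position
print(df.Marks.iloc[2])   # third row of Marks
```

Here .loc always means "by label" and .iloc always means "by position", whatever the type of the row labels.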
Using at
Syntax:
<DF object>.at[Rowlabel,Columnlabel]
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
#Display subject of geetha
print(df.at['R7','Subject'])
OUTPUT
bio
Using iat: It is used to access a single value for a row/column pair by integer position.
Syntax: <DF object>.iat[Rowindex,Columnindex]
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.iat[3,2])
print(df.iat[4,3])
OUTPUT
65
cs
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.Marks['R6'])
print(df.Age[4])
OUTPUT
98
13
Attributes of DataFrame
When we create a DataFrame object, all information related to it (such as its size, its
datatype, its dimensions etc.) is available through its attributes.
Syntax:
<DataFrameObject>.AttributeName
in the DataFrame df = pd.DataFrame(data)
print(df.ndim)
# Output: 2
print(df.T)
#Output
0 1 2
Age 21 20 22
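The commonly used attributes can be tried out together in a short sketch. Only the Age values echo the fragment above. The Name column is a hypothetical addition.

```python
import pandas as pd

# Hypothetical data; the Name column is assumed for illustration.
data = {'Name': ['A', 'B', 'C'], 'Age': [21, 20, 22]}
df = pd.DataFrame(data)

print(df.shape)    # (rows, columns)
print(df.size)     # total number of values = rows * columns
print(df.ndim)     # number of dimensions, 2 for a DataFrame
print(df.dtypes)   # data type of each column
print(df.index)    # row labels
print(df.columns)  # column labels
print(df.T)        # transpose: rows become columns
```

Attributes are written without parentheses because they are properties of the object, not method calls.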
Methods in DataFrame
head() function
CODE
import pandas as pd
data = {
'Name': ['A', 'B', 'C','D','E','F','G'],
'Age': [25, 30, 35,24,35,22,34],
'City': ['Delhi', 'Goa', 'Mumbai','AP','MP','TN','Goa']
}
df = pd.DataFrame(data,
index=['Stud1','Stud2','Stud3','Stud4','Stud5','Stud6','Stud7'])
print(df.head(3))
print(df.head())
print(df.head(-1))
OUTPUT
Name Age City
Stud1 A 25 Delhi
Stud2 B 30 Goa
Stud3 C 35 Mumbai
Name Age City
Stud1 A 25 Delhi
Stud2 B 30 Goa
Stud3 C 35 Mumbai
Stud4 D 24 AP
Stud5 E 35 MP
Name Age City
Stud1 A 25 Delhi
Stud2 B 30 Goa
Stud3 C 35 Mumbai
Stud4 D 24 AP
Stud5 E 35 MP
Stud6 F 22 TN
tail() function
If the value for n is not passed, then by default n takes 5 and the last five rows are
displayed.
CODE
import pandas as pd
data = {
'Name': ['A', 'B', 'C','D','E','F','G'],
'Age': [25, 30, 35,24,35,22,34],
'City': ['Delhi', 'Goa', 'Mumbai','AP','MP','TN','Goa']
}
df = pd.DataFrame(data,
index=['Stud1','Stud2','Stud3','Stud4','Stud5','Stud6','Stud7'])
print(df.tail(1))
print(df.tail())
print(df.tail(-3))
OUTPUT
Name Age City
Stud7 G 34 Goa
Name Age City
Stud3 C 35 Mumbai
Stud4 D 24 AP
Stud5 E 35 MP
Stud6 F 22 TN
Stud7 G 34 Goa
Name Age City
Stud4 D 24 AP
Stud5 E 35 MP
Stud6 F 22 TN
Stud7 G 34 Goa
Note: If you pass a negative integer n to head(), it will return all rows except the last
n rows and if you pass a negative integer n to tail(), it will return all rows except the
first n rows.
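The note on negative n can be checked with a short sketch. The single Age column and the generated labels are assumptions that mirror the seven-row example above.

```python
import pandas as pd

# Hypothetical seven-row frame with labels Stud1..Stud7.
df = pd.DataFrame({'Age': [25, 30, 35, 24, 35, 22, 34]},
                  index=['Stud' + str(i) for i in range(1, 8)])

print(df.head(-2))  # all rows except the last 2
print(df.tail(-2))  # all rows except the first 2

# For this frame the same rows can be selected with positional slices.
print(df.head(-2).equals(df.iloc[:-2]))
print(df.tail(-2).equals(df.iloc[2:]))
```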
Label Indexing
In label indexing, we can access the elements of the DataFrame with the help of either row or column labels.
There are various methods to access the elements of a DataFrame using labels.
loc and at are the two popular techniques for label-based indexing.
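A minimal sketch of label-based access with loc and at, reusing the sample data from the head() example in these notes:

```python
import pandas as pd

data = {
    'Name': ['A', 'B', 'C', 'D', 'E', 'F', 'G'],
    'Age': [25, 30, 35, 24, 35, 22, 34],
    'City': ['Delhi', 'Goa', 'Mumbai', 'AP', 'MP', 'TN', 'Goa'],
}
df = pd.DataFrame(data,
                  index=['Stud1', 'Stud2', 'Stud3', 'Stud4', 'Stud5', 'Stud6', 'Stud7'])

print(df.loc['Stud2'])                  # one row, returned as a Series
print(df.loc[['Stud2', 'Stud4']])       # several rows, returned as a DataFrame
print(df.loc[:, 'Age'])                 # one column for all rows
print(df.loc['Stud4':'Stud7', 'Name'])  # label slice, stop label included
print(df.at['Stud5', 'City'])           # a single value
```

loc accepts single labels, lists of labels and label slices, while at takes exactly one row label and one column label.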
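import pandas as pd
data = {
'Name': ['A', 'B', 'C','D','E','F','G'],
'Age': [25, 30, 35,24,35,22,34],
'City': ['Delhi', 'Goa', 'Mumbai','AP','MP','TN','Goa']
}
df = pd.DataFrame(data,
index=['Stud1','Stud2','Stud3','Stud4','Stud5','Stud6','Stud7'])
print(df)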
OUTPUT
Stud4 D 24 AP
Stud5 E 35 MP
Stud6 F 22 TN
Stud7 G 34 Goa
Stud5 E
Stud6 F
Stud7 G
Name: Name, dtype: object
print(df['Age'])
print(df.Age)
print(df.loc[:,'Age'])
Stud2 30
Stud3 35
Stud4 24
Stud5 35
Stud6 22
Stud7 34
Stud2 B 30 Goa
Stud4 D 24 AP
Boolean Indexing
import pandas as pd
data = {
'Marks': [75, 82, 68, 91, 80]
}
# Create DataFrame
df = pd.DataFrame(data)
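A minimal sketch of Boolean indexing with a condition. Only the Marks values come from these notes. The Name column and the thresholds are assumptions made for illustration.

```python
import pandas as pd

# Only the Marks values come from the notes; Name is a hypothetical addition.
data = {
    'Name': ['A', 'B', 'C', 'D', 'E'],
    'Marks': [75, 82, 68, 91, 80],
}
df = pd.DataFrame(data)

mask = df['Marks'] > 80  # Boolean Series: True where the condition holds
print(mask)
print(df[mask])          # keeps only the rows where mask is True

# Combine conditions with & (and) or | (or); each condition needs parentheses.
in_range = df[(df['Marks'] >= 75) & (df['Marks'] <= 82)]
print(in_range)
```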
Method-2
Code
import pandas as pd
#Create a dictionary
#Create a dataframe with boolean values
print(df)
print(df.loc[True])
OUTPUT
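The listing above outlines a DataFrame whose row labels are themselves Boolean values, selected with loc[True]. A minimal sketch of that idea follows. The data and the meaning of the labels are assumptions.

```python
import pandas as pd

# Hypothetical data; the Boolean row labels mark which students passed.
data = {'Name': ['A', 'B', 'C', 'D'], 'Marks': [75, 32, 68, 28]}
df = pd.DataFrame(data, index=[True, False, True, False])

passed = df.loc[True]    # rows whose label is True
failed = df.loc[False]   # rows whose label is False

print(df)
print(passed)
print(failed)
```

This only works when the index really holds Boolean labels. With ordinary labels, pass a Boolean condition instead.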