Dataframe Notes

The document provides an overview of DataFrames, a two-dimensional labeled data structure in Python's pandas library, including their creation from various data sources such as numpy arrays, lists of dictionaries, and series. It also highlights the differences between Series and DataFrames, emphasizing their dimensionality and data type flexibility. Additionally, it includes code examples demonstrating the creation of DataFrames with custom indices and column labels.

Uploaded by

Jayabharathi

DataFrame

Introduction to DataFrame
• A DataFrame is a two-dimensional labelled data structure, like a spreadsheet.
• It contains rows and columns and therefore has both a row index and a column index.
• Each column can hold a different type of value, such as numeric, string, Boolean, etc., as in the tables of a database.

Difference Between Series and DataFrame

Sno | Property | Series | DataFrame
1 | Dimensions | 1-dimensional | 2-dimensional
2 | Type of data | Homogeneous, i.e. all elements must be of the same data type. | Heterogeneous, i.e. a DataFrame object can have elements of different data types.
3 | Mutability | Values are mutable, i.e. the values of its elements can change. | Values are mutable, i.e. the values of its elements can change.
4 | Size | Size-immutable, i.e. the size of a Series object, once created, cannot change. If you want to add or remove an element, a new Series object is created internally. | Size-mutable, i.e. the size of a DataFrame object, once created, can change in place. If you want to add or remove an element, the change is made in the existing DataFrame object.
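
The comparison above can be checked with a short sketch. The data values and the variable names s and df are only illustrative assumptions; the sketch shows the number of dimensions, the per-column data types, value mutability, and a column being added to an existing DataFrame.

```python
import pandas as pd

# A Series is 1-dimensional; a DataFrame is 2-dimensional.
s = pd.Series([11, 12, 13])
df = pd.DataFrame({'Name': ['jaya', 'bala', 'krish'], 'Marks': [98, 78, 68]})
print(s.ndim, df.ndim)

# Each DataFrame column carries its own data type.
print(df.dtypes)

# Values are mutable in both structures.
s[0] = 99
df.loc[0, 'Marks'] = 100

# Adding a column changes the size of the existing DataFrame object.
df['Age'] = [14, 17, 15]
print(df.shape)
```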

1.Creation of Empty DataFrame

CODE
import pandas as pd
df=pd.DataFrame()
print(df)
OUTPUT
Empty DataFrame
Columns: []
Index: []

2.Creation of DataFrame from numpy ndarrays

CODE
import pandas as pd
import numpy as np
arr=np.array([11,12,13,14])
df=pd.DataFrame(arr) #DataFrame created from a 1D numpy array
print(df)
OUTPUT
    0
0  11
1  12
2  13
3  14

3.Creation of DataFrame from numpy ndarrays (list of 1D arrays)

CODE
import pandas as pd
import numpy as np
arr=np.array([11,12,13,14])
arr1=np.array([1,2,3,4])
arr2=np.array([101,102,103,104])
df=pd.DataFrame([arr,arr1,arr2]) #each array becomes one row
print(df)
OUTPUT
     0    1    2    3
0   11   12   13   14
1    1    2    3    4
2  101  102  103  104

4.Creation of DataFrame from numpy ndarrays with custom index label
CODE
import pandas as pd
import numpy as np
arr=np.array([11,12,13,14])
arr1=np.array([1,2,3,4])
arr2=np.array([101,102,103,104])
df=pd.DataFrame([arr,arr1,arr2],index=['R1','R2','R3'],
columns=['C1','C2','C3','C4'])
print(df)
OUTPUT
     C1   C2   C3   C4
R1   11   12   13   14
R2    1    2    3    4
R3  101  102  103  104

5.Creation of DataFrame from List of Dictionaries

CODE
import pandas as pd
D1={'Name':'Jaya','Marks':87}
D2={'Name':'Abi','Age':17,'Marks':87}
D3={'Name':'Kavi','Age':18,'Marks':76}
l=[D1,D2,D3]
df=pd.DataFrame(l)
print(df)
OUTPUT
   Name  Marks   Age
0  Jaya     87   NaN
1   Abi     87  17.0
2  Kavi     76  18.0

Note: Dictionary keys become column labels by default in a DataFrame, and each dictionary in the list becomes a row. A key that is missing from a dictionary gives NaN in that row.

6.Creation of DataFrame from List of Dictionaries with custom index value for
rows
CODE
import pandas as pd
D1={'Name':'sai','Marks':87}
D2={'Name':'Abi','Age':17,'Marks':87}
D3={'Name':'Kavi','Age':18,'Marks':76}
l=[D1,D2,D3]
df=pd.DataFrame(l,index=['R1','R2','R3'])
print(df)
OUTPUT
    Name  Marks   Age
R1   sai     87   NaN
R2   Abi     87  17.0
R3  Kavi     76  18.0
7.Creation of DataFrame from List of Dictionaries with custom index value for
columns
CODE
import pandas as pd
D1={'Name':'Jaya','Marks':87}
D2={'Name':'Abi','Age':17,'Marks':87}
D3={'Name':'Kavi','Age':18,'Marks':76}
L=[D1,D2,D3]
#None of a1, a2, a3 is a dictionary key, so every value is NaN
df=pd.DataFrame(L,index=['R1','R2','R3'],
columns=['a1','a2','a3'])
print(df)
OUTPUT
     a1   a2   a3
R1  NaN  NaN  NaN
R2  NaN  NaN  NaN
R3  NaN  NaN  NaN

8.Creation of DataFrame from List of Dictionaries with key values as column but changing sequence
CODE
import pandas as pd
D1={'Name':'Jaya','Marks':87}
D2={'Name':'Abi','Age':17,'Marks':87}
D3={'Name':'Kavi','Age':18,'Marks':76}
l=[D1,D2,D3]
df=pd.DataFrame(l,index=['R1','R2','R3'],
columns=['Name','Age','Marks'])
print(df)
OUTPUT
    Name   Age  Marks
R1  Jaya   NaN     87
R2   Abi  17.0     87
R3  Kavi  18.0     76

9.Creation of DataFrame from Dictionary of Lists

CODE
import pandas as pd
N=['jaya','bala','krish']
A=[14,17,15]
M=[98,78,68]
D={'Name':N,'Age':A,'Marks':M}
df=pd.DataFrame(D)
print(df)
OUTPUT
    Name  Age  Marks
0   jaya   14     98
1   bala   17     78
2  krish   15     68

10.Creation of DataFrame from Dictionary of List with custom index value for rows
CODE
import pandas as pd
N=['jaya','bala','krish']
A=[14,17,15]
M=[98,78,68]
D={'Name':N,'Age':A,'Marks':M}
df=pd.DataFrame(D,index=['R1','R2','R3'])
print(df)
OUTPUT
     Name  Age  Marks
R1   jaya   14     98
R2   bala   17     78
R3  krish   15     68

11.Creation of DataFrame from Dictionary of List with custom index value for
columns
CODE
import pandas as pd
N=['jaya','bala','krish']
A=[14,17,15]
M=[98,78,68]
D={'Name':N,'Age':A,'Marks':M}
#None of a1, a2, a3 is a dictionary key, so every value is NaN
df=pd.DataFrame(D,index=['R1','R2','R3'],
columns=['a1','a2','a3'])
print(df)
OUTPUT
     a1   a2   a3
R1  NaN  NaN  NaN
R2  NaN  NaN  NaN
R3  NaN  NaN  NaN

12.Creation of DataFrame from Dictionary of List with changing sequence of column
CODE
import pandas as pd
N=['jaya','bala','krish']
A=[14,17,15]
M=[98,78,68]
D={'Name':N,'Age':A,'Marks':M}
df=pd.DataFrame(D,index=['R1','R2','R3'],
columns=['Marks','Age','Name'])
print(df)
OUTPUT
    Marks  Age   Name
R1     98   14   jaya
R2     78   17   bala
R3     68   15  krish
13. Creation of DataFrame from Series (includes dtype)
• To create a DataFrame using more than one Series, we need to pass multiple Series in a list.
• The labels in the Series objects become the column names in the DataFrame object.
• Each Series becomes a row in the DataFrame.
• If a particular Series does not have a corresponding value for a label, NaN is inserted in the DataFrame column.
CODE
import pandas as pd
L=[14,17,15]
s=pd.Series(L) #a single Series; its printed form ends with the dtype
print(s)
OUTPUT
0    14
1    17
2    15
dtype: int64
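
The NaN rule in the points above is not exercised by the numbered examples, so here is a minimal sketch of it. The Series s1 and s2 and their labels are made up for illustration; s2 deliberately has no value for the label 'c'.

```python
import pandas as pd

s1 = pd.Series([11, 12, 13], index=['a', 'b', 'c'])
s2 = pd.Series([1, 2], index=['a', 'b'])   # no value for label 'c'

# Each Series becomes a row; the Series labels become the column names.
df = pd.DataFrame([s1, s2])
print(df)

# The missing label shows up as NaN in the second row.
print(pd.isna(df.loc[1, 'c']))
```

Because NaN is a floating-point value, the columns of such a DataFrame are usually shown as floats (for example 11.0 instead of 11).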

14. Creation of DataFrame from Series(dtype not include)

CODE
import pandas as pd
s1=pd.Series([11,12,13,14,15])
s2=pd.Series([1,2,3,4,5])
s3=pd.Series([111,122,133,144,155])
s4=pd.Series([21,22,23,24,9])
df=pd.DataFrame([s1,s2,s3,s4])
print(df)
OUTPUT
     0    1    2    3    4
0   11   12   13   14   15
1    1    2    3    4    5
2  111  122  133  144  155
3   21   22   23   24    9

15. Creation of DataFrame from Series (dtype not include) with Custom index label for rows
CODE
import pandas as pd
s1=pd.Series([11,12,13,14,15])
s2=pd.Series([1,2,3,4,5])
s3=pd.Series([111,122,133,144,155])
s4=pd.Series([21,22,23,24,9])
df=pd.DataFrame([s1,s2,s3,s4],index=['a','b','c','d'])
print(df)
OUTPUT
     0    1    2    3    4
a   11   12   13   14   15
b    1    2    3    4    5
c  111  122  133  144  155
d   21   22   23   24    9
16. Creation of DataFrame from Series with Custom index label for columns
CODE
import pandas as pd
s1=pd.Series([11,12,13,14,15],index=['a','b','c','d','e'])
s2=pd.Series([1,2,3,4,5],index=['a','b','c','d','e'])
s3=pd.Series([31,32,33,34,35],index=['a','b','c','d','e'])
s4=pd.Series([21,22,23,24,45],index=['a','b','c','d','e'])
df=pd.DataFrame([s1,s2,s3,s4])
print(df)
OUTPUT
    a   b   c   d   e
0  11  12  13  14  15
1   1   2   3   4   5
2  31  32  33  34  35
3  21  22  23  24  45

17.Creation of DataFrame from Series with Custom index label for columns
and rows
CODE
import pandas as pd
s1=pd.Series([11,12,13,14,15],index=['a','b','c','d','e'])
s2=pd.Series([1,2,3,4,5],index=['a','b','c','d','e'])
s3=pd.Series([41,42,43,44,45],index=['a','b','c','d','e'])
s4=pd.Series([21,22,23,24,45],index=['a','b','c','d','e'])
df=pd.DataFrame([s1,s2,s3,s4],index=['aa','bb','cc','dd'])
print(df)
OUTPUT
     a   b   c   d   e
aa  11  12  13  14  15
bb   1   2   3   4   5
cc  41  42  43  44  45
dd  21  22  23  24  45

18.Creation of DataFrame from Series (includes dtype) with Custom index label
for rows
CODE
import pandas as pd
L=[14,17,15]
s=pd.Series(L,index=['R1','R2','R3'])
print(s)
OUTPUT
R1    14
R2    17
R3    15
dtype: int64

19.Creation of DataFrame from Series (includes dtype) with Custom index label
for column
CODE
import pandas as pd
L=[14,17,15]
s=pd.Series(L,index=['R1','R2','R3'])
df=pd.DataFrame(s,columns=['C1'])
print(df)
OUTPUT
    C1
R1  14
R2  17
R3  15

20.Creation of DataFrame from Dictionary of Series

CODE
import pandas as pd
s1=pd.Series([11,12,13,14,15])
s2=pd.Series([1,2,3,4,5])
s3=pd.Series([111,122,133,144,155])
s4=pd.Series([21,22,23,24,9])
D={'key1':s1,'key2':s2,'key3':s3,'key4':s4}
df=pd.DataFrame(D)
print(df)
OUTPUT
   key1  key2  key3  key4
0    11     1   111    21
1    12     2   122    22
2    13     3   133    23
3    14     4   144    24
4    15     5   155     9

21.Creation of DataFrame from Dictionary of Dictionary

CODE
import pandas as pd
D1={'Name':'Jaya','Marks':87}
D2={'Name':'Abi','Age':17,'Marks':87}
D3={'Name':'Kavi','Age':18,'Marks':76}
DD={"Humanities":D1,"Medical":D2,"Non Med":D3}
df=pd.DataFrame(DD)
print(df)
OUTPUT
      Humanities Medical Non Med
Name        Jaya     Abi    Kavi
Marks         87      87      76
Age          NaN      17      18

The keys of the outer dictionary become the column labels, and the keys of the inner dictionaries become the index (row labels).

Select Option in rows and columns

CODE
#Select options in rows and columns
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])
print(df)

OUTPUT
        Name  Age  Marks Subject
R1      jaya   14     98      cs
R2      bala   17     78     bio
R3     krish   15     68      pe
R4    sakthi   15     65      ip
R5       abi   13     87      cs
R6  bharathi   14     98      ip
R7    geetha   13     76     bio
R8   sandhya   12     65      cs

Selecting a Single Column

Method-1: Using Square Brackets

Syntax: DataFrameObject[ColumnName]

CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])
print(df['Name'])
print(df['Marks'])
OUTPUT
R1        jaya
R2        bala
R3       krish
R4      sakthi
R5         abi
R6    bharathi
R7      geetha
R8     sandhya
Name: Name, dtype: object
R1    98
R2    78
R3    68
R4    65
R5    87
R6    98
R7    76
R8    65
Name: Marks, dtype: int64

CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])
print(df['Marks'])
OUTPUT
R1 98
R2 78
R3 68
R4 65
R5 87
R6 98
R7 76
R8 65
Name: Marks, dtype: int64

Method-2: Using Dot Notation

Syntax: DataFrameObject.ColumnName

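Note: While using dot notation, the column name is written without quotes. Dot notation works only when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method.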

CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])
print(df.Name)
OUTPUT
R1 jaya
R2 bala
R3 krish
R4 sakthi
R5 abi
R6 bharathi
R7 geetha
R8 sandhya
Name: Name, dtype: object

Selecting Multiple Columns

Method-1: Using Double Square Brackets
To select multiple columns, pass a list of column names inside the square brackets of the DataFrame object.
Syntax: DataFrameObject[[col1,col2,col3…]]
CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']

D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])
print(df[['Name','Age']])
OUTPUT
Name Age
R1 jaya 14
R2 bala 17
R3 krish 15
R4 sakthi 15
R5 abi 13
R6 bharathi 14
R7 geetha 13
R8 sandhya 12

Selecting Multiple Rows

CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])
print(df.loc['R2':'R4']) #includes stop value
print(df.loc['R2':'R4',:]) #includes stop value

OUTPUT
Name Age Marks Subject
R2 bala 17 78 bio
R3 krish 15 68 pe
R4 sakthi 15 65 ip
Name Age Marks Subject
R2 bala 17 78 bio
R3 krish 15 68 pe
R4 sakthi 15 65 ip

Method-2

Accessing Data using loc

loc is used to select rows, columns, or a combination of rows and columns from the DataFrame by label.

Syntax:

DataFrameObject.loc[StartRow:EndRow:StepValue,StartColumn:EndColumn:StepValue]

Note:

With loc, both the start and the stop labels are included in the result.

CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
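df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])
print(df.loc['Name':'Marks']) #row slice: 'Name' and 'Marks' are not row labels, so no rows are selected
print(df.loc[:,'Name':'Marks']) #column slice: all rows, columns from 'Name' to 'Marks'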
OUTPUT
Empty DataFrame
Columns: [Name, Age, Marks, Subject]
Index: []
        Name  Age  Marks
R1      jaya   14     98
R2      bala   17     78
R3     krish   15     68
R4    sakthi   15     65
R5       abi   13     87
R6  bharathi   14     98
R7    geetha   13     76
R8   sandhya   12     65

Selecting Individual Element

CODE

import pandas as pd

N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']

A=[14,17,15,15,13,14,13,12]

M=[98,78,68,65,87,98,76,65]

S=['cs','bio','pe','ip','cs','ip','bio','cs']

D={'Name':N,'Age':A,'Marks':M,'Subject':S}

df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])

print(df.loc['R5'])
OUTPUT
Name abi
Age 13
Marks 87
Subject cs
Name: R5, dtype: object

Select Using Step Values

CODE

import pandas as pd

N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']

A=[14,17,15,15,13,14,13,12]

M=[98,78,68,65,87,98,76,65]

S=['cs','bio','pe','ip','cs','ip','bio','cs']

D={'Name':N,'Age':A,'Marks':M,'Subject':S}

df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])

print(df.loc[:,'Name':'Subject':2]) #Columns

print("************************************")

print(df.loc['R1':'R7':3]) #rows
OUTPUT
Name Marks
R1 jaya 98
R2 bala 78
R3 krish 68
R4 sakthi 65
R5 abi 87
R6 bharathi 98
R7 geetha 76
R8 sandhya 65
************************************
Name Age Marks Subject
R1 jaya 14 98 cs
R4 sakthi 15 65 ip
R7 geetha 13 76 bio

KeyError

CODE

import pandas as pd

N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']

A=[14,17,15,15,13,14,13,12]

M=[98,78,68,65,87,98,76,65]

S=['cs','bio','pe','ip','cs','ip','bio','cs']

D={'Name':N,'Age':A,'Marks':M,'Subject':S}

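df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])

#Each statement below raises KeyError on its own; execution stops at the first one

print(df['R2']) #KeyError: 'R2' is a row label, but [] looks up column labels

print(df[['R2','R3']]) #KeyError: neither 'R2' nor 'R3' is a column label

print(df.loc['Name']) #KeyError: 'Name' is a column label, but loc with one label looks up row labels
OUTPUT

KeyError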
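
The reason for these errors can be seen in a small sketch: square brackets look up column labels, while loc with a single label looks up row labels. The two-row DataFrame below is a shortened, assumed version of the student data, and try/except is used only so that the script keeps running after each error.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['jaya', 'bala'], 'Marks': [98, 78]},
                  index=['R1', 'R2'])

try:
    df['R2']                  # [] looks for a column called 'R2'
except KeyError as err:
    print('KeyError:', err)

try:
    df.loc['Name']            # loc looks for a row called 'Name'
except KeyError as err:
    print('KeyError:', err)

print(df.loc['R2'])           # correct: row selected by its label
print(df['Name'])             # correct: column selected by its label
```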

Selecting A Single Row

CODE

import pandas as pd

N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']

A=[14,17,15,15,13,14,13,12]

M=[98,78,68,65,87,98,76,65]

S=['cs','bio','pe','ip','cs','ip','bio','cs']

D={'Name':N,'Age':A,'Marks':M,'Subject':S}

df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])

#The statements below are inferred from the printed output

print(df.loc[['R5']]) #a list with one label returns a one-row DataFrame

print("************************************")

print(df.loc['R3']) #a single label returns the row as a Series
OUTPUT
   Name  Age  Marks Subject
R5  abi   13     87      cs
************************************
Name       krish
Age           15
Marks         68
Subject       pe
Name: R3, dtype: object

Selecting Rows And Columns At A Time

CODE

import pandas as pd

N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']

A=[14,17,15,15,13,14,13,12]

M=[98,78,68,65,87,98,76,65]

S=['cs','bio','pe','ip','cs','ip','bio','cs']

D={'Name':N,'Age':A,'Marks':M,'Subject':S}

df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])

print(df.loc['R2':'R6','Name':'Marks'])

print("************************************")

print(df.loc['R2':'R6','Name':'Marks':2])#includes step value

print("************************************")

print(df.loc['R2':'R6',['Name','Subject']])

print("************************************")

print(df.loc[['R2','R4'],['Name','Subject']])

print("************************************")

print(df.loc['R1':'R7':2,'Name':'Marks':2])

print("************************************")

print(df.loc['R2':'R5'])

print(df.iloc[0:3])#positional ,iloc , Excludes stop value

OUTPUT
        Name  Age  Marks
R2      bala   17     78
R3     krish   15     68
R4    sakthi   15     65
R5       abi   13     87
R6  bharathi   14     98
************************************
        Name  Marks
R2      bala     78
R3     krish     68
R4    sakthi     65
R5       abi     87
R6  bharathi     98
************************************
        Name Subject
R2      bala     bio
R3     krish      pe
R4    sakthi      ip
R5       abi      cs
R6  bharathi      ip
************************************
      Name Subject
R2    bala     bio
R4  sakthi      ip
************************************
      Name  Marks
R1    jaya     98
R3   krish     68
R5     abi     87
R7  geetha     76
************************************
      Name  Age  Marks Subject
R2    bala   17     78     bio
R3   krish   15     68      pe
R4  sakthi   15     65      ip
R5     abi   13     87      cs
     Name  Age  Marks Subject
R1   jaya   14     98      cs
R2   bala   17     78     bio
R3  krish   15     68      pe


Accessing Elements Using iloc

If we want to extract a subset from a DataFrame using the numeric row and column index/position, then we can use iloc.

Syntax:

df.iloc[StartRowIndex:EndRowIndex:StepValue,StartColumnIndex:EndColumnIndex:StepValue]

iloc works like the slicing operation.

Here, the EndRowIndex and EndColumnIndex values are not included.

Comparison Between iloc and loc
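
The main difference can be seen in a minimal sketch: loc slices by label and includes the stop label, while iloc slices by position and excludes the stop position. The four-row DataFrame and the names by_label and by_position are assumptions made for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['jaya', 'bala', 'krish', 'sakthi'],
                   'Age': [14, 17, 15, 15],
                   'Marks': [98, 78, 68, 65]},
                  index=['R1', 'R2', 'R3', 'R4'])

# loc: labels, stop label 'R3' and stop column 'Age' are included
by_label = df.loc['R1':'R3', 'Name':'Age']

# iloc: integer positions, stop positions 3 and 2 are excluded
by_position = df.iloc[0:3, 0:2]

print(by_label)
print(by_position)
print(by_label.equals(by_position))   # both select the same rows and columns
```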

Display Rows at Index

CODE

import pandas as pd

N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']

A=[14,17,15,15,13,14,13,12]

M=[98,78,68,65,87,98,76,65]

S=['cs','bio','pe','ip','cs','ip','bio','cs']

D={'Name':N,'Age':A,'Marks':M,'Subject':S}

df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])

#Display all rows, columns at index 0 and 1

print(df.iloc[:,0:2])

print("************************************")

# Display Rows at index 0 to 2

print(df.iloc[0:3])

print("************************************")

# Display Rows at index 0 to 3

print(df.iloc[0:4,0:2])

print("************************************")

#Display Rows at index 2 to 5

print(df.iloc[2:6,0:3])

OUTPUT
        Name  Age
R1      jaya   14
R2      bala   17
R3     krish   15
R4    sakthi   15
R5       abi   13
R6  bharathi   14
R7    geetha   13
R8   sandhya   12
************************************
     Name  Age  Marks Subject
R1   jaya   14     98      cs
R2   bala   17     78     bio
R3  krish   15     68      pe
************************************
      Name  Age
R1    jaya   14
R2    bala   17
R3   krish   15
R4  sakthi   15
************************************
        Name  Age  Marks
R3     krish   15     68
R4    sakthi   15     65
R5       abi   13     87
R6  bharathi   14     98

CODE

import pandas as pd

N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']

A=[14,17,15,15,13,14,13,12]

M=[98,78,68,65,87,98,76,65]

S=['cs','bio','pe','ip','cs','ip','bio','cs']

D={'Name':N,'Age':A,'Marks':M,'Subject':S}

df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])

#Display Rows at index1 and 5 and name and marks column

print(df.iloc[[1,5],[0,2]])

print("####################################")

print(df.iloc[[1,5],0:3:2])

print("************************************")

#Display Rows at index 0 and 2

print(df.iloc[0:4:2,0:2])

print("************************************")

#Complete dataframe row and column part

print(df.iloc[:])

print("************************************")
OUTPUT
        Name  Marks
R2      bala     78
R6  bharathi     98
####################################
        Name  Marks
R2      bala     78
R6  bharathi     98
************************************
     Name  Age
R1   jaya   14
R3  krish   15
************************************
        Name  Age  Marks Subject
R1      jaya   14     98      cs
R2      bala   17     78     bio
R3     krish   15     68      pe
R4    sakthi   15     65      ip
R5       abi   13     87      cs
R6  bharathi   14     98      ip
R7    geetha   13     76     bio
R8   sandhya   12     65      cs
************************************

CODE

import pandas as pd

N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']

A=[14,17,15,15,13,14,13,12]

M=[98,78,68,65,87,98,76,65]

S=['cs','bio','pe','ip','cs','ip','bio','cs']

D={'Name':N,'Age':A,'Marks':M,'Subject':S}

df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])

#Display Rows at index 2 to 4 (both inclusive)

print(df.iloc[2:5])

print("************************************")

#From rows at index 2 to 4, display all columns

print(df.iloc[2:5,:])

print("************************************")

#Row index from 2 to 4 and column index only 0 and 3

print(df.iloc[2:5,[0,3]])

print("************************************")

#IndexError: iloc accepts only integer positions, not custom index labels

print(df.iloc[2:5,['Name','Marks']])

print("************************************")

OUTPUT
      Name  Age  Marks Subject
R3   krish   15     68      pe
R4  sakthi   15     65      ip
R5     abi   13     87      cs
************************************
      Name  Age  Marks Subject
R3   krish   15     68      pe
R4  sakthi   15     65      ip
R5     abi   13     87      cs
************************************
      Name Subject
R3   krish      pe
R4  sakthi      ip
R5     abi      cs
************************************
IndexError

Selecting / Accessing Individual Value

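(i) Either give the name of the row or its numeric index in square brackets after the column name.

<df object>.<column>[Row name or row numeric index]

Note: In recent pandas 2.x releases, using a numeric position this way on a column whose row labels are not integers is deprecated and gives a FutureWarning; <df object>.<column>.iloc[position] is the position-based form.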

CODE

import pandas as pd

N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']

A=[14,17,15,15,13,14,13,12]

M=[98,78,68,65,87,98,76,65]

S=['cs','bio','pe','ip','cs','ip','bio','cs']

D={'Name':N,'Age':A,'Marks':M,'Subject':S}

df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],columns=['Name','Age','Marks','Subject'])

print(df.Name[3])

print("************************************")

print(df.Name[3])

print("************************************")

print(df.Marks[2])

print("************************************")

print(df.Marks['R5'])

print("************************************")

print(df.Marks[2])

print("************************************")

OUTPUT
sakthi
************************************
sakthi
************************************
68
************************************
87
************************************
68
************************************

ii)We can use at or iat attributes with DataFrame object

Using at

It is used to access a single value for row/column label pair

Syntax:

<DF object>.at[Rowlabel,Columnlabel]

CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
#Display subject of geetha
print(df.at['R7','Subject'])
OUTPUT
bio

Using iat: It is used to access a single value for a row/column pair by integer position.

Syntax: <DF object>.iat[Rowindex,Columnindex]

CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
print(df.iat[3,2])
print(df.iat[4,3])
OUTPUT
65
cs
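
As a small sketch of how at and iat relate, the same cell can be reached by its label pair or by its integer position, and both can also be used to assign a single value. The three-row DataFrame here is an assumed, shortened version of the student data.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['jaya', 'bala', 'krish'],
                   'Marks': [98, 78, 68]},
                  index=['R1', 'R2', 'R3'])

print(df.at['R2', 'Marks'])    # cell addressed by row label and column label
print(df.iat[1, 1])            # the same cell addressed by integer position

df.at['R2', 'Marks'] = 80      # at (and iat) can also set a single value
print(df.iat[1, 1])
```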

Individual Elements in a List

CODE
import pandas as pd
N=['jaya','bala','krish','sakthi','abi','bharathi','geetha','sandhya']
A=[14,17,15,15,13,14,13,12]
M=[98,78,68,65,87,98,76,65]
S=['cs','bio','pe','ip','cs','ip','bio','cs']
D={'Name':N,'Age':A,'Marks':M,'Subject':S}
df=pd.DataFrame(D,index=['R1','R2','R3','R4','R5','R6','R7','R8'],
columns=['Name','Age','Marks','Subject'])
#Display mark of bharathi
print(df.Marks['R6'])
#Display age of abi
print(df.Age[4])
OUTPUT
98
13

Attributes of DataFrame

When we create a DataFrame object, all information related to it (such as its size, its
datatype, its dimensions etc.) is available through its attributes.

Syntax:

<DataFrameObject>.AttributeName

Attribute | Description
index | Returns the index (row labels) of the DataFrame
columns | Returns the column labels of the DataFrame
size | Returns the total number of elements in the DataFrame
shape | Returns a tuple representing the dimensionality of the DataFrame
empty | Checks if the DataFrame is empty
ndim | Returns the number of dimensions of the DataFrame
T | Used to transpose the DataFrame (switching rows and columns)

Example

import pandas as pd
data = {
'Student Name': ['Ravi', 'Priya', 'Rahul'],
'Age': [21, 20, 22],
'City': ['Mumbai', 'Delhi', 'Bangalore']
}
df = pd.DataFrame(data)
print("Row labels (index):", df.index)
#Row labels (index): RangeIndex(start=0, stop=3, step=1)
print("Column labels:", df.columns)
#Column labels: Index(['Student Name', 'Age', 'City'], dtype='object')
print(df.size)
#Output: 9 (3*3)
print(df.shape)
#Output: (3,3) (3 rows, 3 columns)
print(df.empty)
#Output: False (since df has data)
print(df.ndim)
#Output: 2
print(df.T)
#Output
#                   0      1          2
#Student Name    Ravi  Priya      Rahul
#Age               21     20         22
#City          Mumbai  Delhi  Bangalore

Methods in DataFrame

head() function

• Returns the first n rows of the DataFrame.

• If the value for n is not passed, then by default n takes 5 and the first five rows are displayed.

CODE
import pandas as pd
data = {
'Name': ['A', 'B', 'C','D','E','F','G'],
'Age': [25, 30, 35,24,35,22,34],
'City': ['Delhi', 'Goa', 'Mumbai','AP','MP','TN','Goa']
}
df = pd.DataFrame(data,
index=['Stud1','Stud2','Stud3','Stud4','Stud5','Stud6','Stud7'])
print(df.head(3))
print(df.head())
print(df.head(-1))
OUTPUT
      Name  Age    City
Stud1    A   25   Delhi
Stud2    B   30     Goa
Stud3    C   35  Mumbai
      Name  Age    City
Stud1    A   25   Delhi
Stud2    B   30     Goa
Stud3    C   35  Mumbai
Stud4    D   24      AP
Stud5    E   35      MP
      Name  Age    City
Stud1    A   25   Delhi
Stud2    B   30     Goa
Stud3    C   35  Mumbai
Stud4    D   24      AP
Stud5    E   35      MP
Stud6    F   22      TN

tail() function

• Returns the last n rows of the DataFrame.

• If the value for n is not passed, then by default n takes 5 and the last five rows are displayed.

CODE
import pandas as pd
data = {

'Name': ['A', 'B', 'C','D','E','F','G'],
'Age': [25, 30, 35,24,35,22,34],
'City': ['Delhi', 'Goa', 'Mumbai','AP','MP','TN','Goa']
}
df = pd.DataFrame(data,
index=['Stud1','Stud2','Stud3','Stud4','Stud5','Stud6','Stud7'])
print(df.tail(1))
print(df.tail())
print(df.tail(-3))
OUTPUT
      Name  Age City
Stud7    G   34  Goa
      Name  Age    City
Stud3    C   35  Mumbai
Stud4    D   24      AP
Stud5    E   35      MP
Stud6    F   22      TN
Stud7    G   34     Goa
      Name  Age City
Stud4    D   24   AP
Stud5    E   35   MP
Stud6    F   22   TN
Stud7    G   34  Goa

Note: If you pass a negative integer n to head(), it will return all rows except the last
n rows and if you pass a negative integer n to tail(), it will return all rows except the
first n rows.
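
The note above can be verified with a short sketch that compares head() and tail() with equivalent iloc slices. The one-column DataFrame and the variable n are assumptions used only for this check.

```python
import pandas as pd

df = pd.DataFrame({'Name': list('ABCDEFG')},
                  index=['Stud' + str(i) for i in range(1, 8)])
n = 3

# head(-n): everything except the last n rows
print(len(df.head(-n)))
print(df.head(-n).equals(df.iloc[:-n]))

# tail(-n): everything except the first n rows
print(len(df.tail(-n)))
print(df.tail(-n).equals(df.iloc[n:]))
```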

Accessing Elements through Indexing

Two Types of Indexing

Label Indexing

• In label indexing, we can access the elements of the DataFrame with the help of either row or column labels.
• There are various methods to access the elements of a DataFrame using labels.
• loc and at are the two popular techniques for label-based indexing.

Code to access columns and rows using labels

CODE
import pandas as pd
data = {
'Name': ['A', 'B', 'C','D','E','F','G'],
'Age': [25, 30, 35,24,35,22,34],
'City': ['Delhi', 'Goa', 'Mumbai','AP','MP','TN','Goa']
}
df = pd.DataFrame(data,
index=['Stud1','Stud2','Stud3','Stud4','Stud5','Stud6','Stud7'])
print(df)
#Displaying a column
print(df.Name) #or print(df['Name']) or print(df.loc[:,'Name'])
#Displaying a column
print(df['Age']) #or print(df.Age) or print(df.loc[:,'Age'])
#Displaying rows
print(df.loc[['Stud2','Stud4']])
OUTPUT
      Name  Age    City
Stud1    A   25   Delhi
Stud2    B   30     Goa
Stud3    C   35  Mumbai
Stud4    D   24      AP
Stud5    E   35      MP
Stud6    F   22      TN
Stud7    G   34     Goa
Stud1    A
Stud2    B
Stud3    C
Stud4    D
Stud5    E
Stud6    F
Stud7    G
Name: Name, dtype: object
Stud1    25
Stud2    30
Stud3    35
Stud4    24
Stud5    35
Stud6    22
Stud7    34
Name: Age, dtype: int64
      Name  Age City
Stud2    B   30  Goa
Stud4    D   24   AP

Boolean Indexing

• Boolean indexing in pandas DataFrames allows you to filter data based on specific conditions.
• It involves creating a boolean Series (a Series of True/False values) and using it to select rows that meet the condition.
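
The two steps described above (build the boolean Series, then use it to select rows) can be written out separately. This is a minimal sketch; the variable name mask is an assumption, and the data is a small sample of names and marks.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ravi', 'Priya', 'Rahul', 'Sneha', 'Amit'],
                   'Marks': [75, 82, 68, 91, 80]})

# Step 1: the condition gives a boolean Series, one True/False per row
mask = df['Marks'] > 80
print(mask.tolist())

# Step 2: the boolean Series selects the rows where it is True
print(df[mask])

# The same mask can be combined with a column label in loc
print(df.loc[mask, 'Name'])
```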

Code to display details of students who scored more than 80 marks

import pandas as pd
data = {
'Name': ['Ravi', 'Priya', 'Rahul', 'Sneha', 'Amit'],
'Marks': [75, 82, 68, 91, 80]
}
# Create DataFrame
df = pd.DataFrame(data)
print("Names with more than 80 marks:")
print(df[df['Marks'] > 80])
print(df['Marks'] > 80)

OUTPUT
Names with more than 80 marks:
    Name  Marks
1  Priya     82
3  Sneha     91
0    False
1     True
2    False
3     True
4    False
Name: Marks, dtype: bool

Method-2

Code

import pandas as pd
#Create a dictionary (named data so that the built-in name dict is not overwritten)
data = {'name':["Rachel", "Monica", "Joey", "Phoebe"],
'job': ["Doctor", "Chef", "Actor", "Singer"],
'Age':[28, 28, 30, 31]}
#Create a dataframe with boolean values as the index
df = pd.DataFrame(data, index = [False, True, True, False])
print(df)
print(df.loc[True])
OUTPUT
         name     job  Age
False  Rachel  Doctor   28
True   Monica    Chef   28
True     Joey   Actor   30
False  Phoebe  Singer   31
        name    job  Age
True  Monica   Chef   28
True    Joey  Actor   30

No ratings yet
Copy of Copy of Black Doodle Group Project Presentation - 20230903 - 211147 - 0000
32 pages
The Pandas Library
No ratings yet
The Pandas Library
39 pages
Creating A Series Using Scalar Values
No ratings yet
Creating A Series Using Scalar Values
15 pages
L1_DataFrames_I
No ratings yet
L1_DataFrames_I
24 pages
DATA HANDLING AND CSV 2024- 2025
No ratings yet
DATA HANDLING AND CSV 2024- 2025
12 pages
UNIT 1 PYTHON PROGRAMMING-II
No ratings yet
UNIT 1 PYTHON PROGRAMMING-II
15 pages
Python Pandas-Data Frames
No ratings yet
Python Pandas-Data Frames
41 pages
Python UnitIV
No ratings yet
Python UnitIV
20 pages
Practical Xii 11-25
No ratings yet
Practical Xii 11-25
14 pages
Practical File Python
No ratings yet
Practical File Python
25 pages
ML UNIT-2 NOTES
No ratings yet
ML UNIT-2 NOTES
17 pages
Data Handling Using Pandas-I-ORG
No ratings yet
Data Handling Using Pandas-I-ORG
44 pages
Practical File 2024-25
No ratings yet
Practical File 2024-25
25 pages
P Unit-4 NP
No ratings yet
P Unit-4 NP
30 pages
Cheat Sheet: The Pandas Dataframe Object: Preliminaries Get Your Data Into A Dataframe
100% (1)
Cheat Sheet: The Pandas Dataframe Object: Preliminaries Get Your Data Into A Dataframe
10 pages
Pandas DataFrameObject
No ratings yet
Pandas DataFrameObject
4 pages
Pandas Class 12 Ncertttt
No ratings yet
Pandas Class 12 Ncertttt
48 pages
18_Pandas
No ratings yet
18_Pandas
33 pages
Data Science - Unit-3-Part-2
No ratings yet
Data Science - Unit-3-Part-2
32 pages
P03 Introduction To Pandas Ans
No ratings yet
P03 Introduction To Pandas Ans
45 pages
The Essential R Reference
From Everand
The Essential R Reference
Mark Gardener
No ratings yet
R Fast Track Guide - 86 Key Points Every Programmer from Other Languages Should Master
From Everand
R Fast Track Guide - 86 Key Points Every Programmer from Other Languages Should Master
Ginno
No ratings yet
Class 11 Lesson Plan
No ratings yet
Class 11 Lesson Plan
25 pages
Section
No ratings yet
Section
6 pages
T1 IP qp
No ratings yet
T1 IP qp
8 pages
Term 1 IP AK
No ratings yet
Term 1 IP AK
6 pages
Term 1 IP AK
No ratings yet
Term 1 IP AK
6 pages
Section physical education
No ratings yet
Section physical education
5 pages
Iteration Over DataFrame
No ratings yet
Iteration Over DataFrame
10 pages
Full Download Spring 5 Recipes: A Problem-Solution Approach 4th Edition Marten Deinum PDF
100% (9)
Full Download Spring 5 Recipes: A Problem-Solution Approach 4th Edition Marten Deinum PDF
53 pages
My Gita
No ratings yet
My Gita
201 pages
DLL Matter G7 Q1.W1.D2
No ratings yet
DLL Matter G7 Q1.W1.D2
4 pages
25 basic Linux commands for beginners
No ratings yet
25 basic Linux commands for beginners
9 pages
Coding Decoding 02
No ratings yet
Coding Decoding 02
6 pages
Dll Matatag _music&Arts 7 q3 w5
No ratings yet
Dll Matatag _music&Arts 7 q3 w5
10 pages
Virtual Private Network
No ratings yet
Virtual Private Network
8 pages
Defining and Non-Defining Relative Clauses
No ratings yet
Defining and Non-Defining Relative Clauses
8 pages
VI Editor
No ratings yet
VI Editor
23 pages
EAM Process Flow
100% (1)
EAM Process Flow
3 pages
Unit 3 Lesson 2 4e Anglais
No ratings yet
Unit 3 Lesson 2 4e Anglais
4 pages
Off course
No ratings yet
Off course
2 pages
Tartakower's Poetry
No ratings yet
Tartakower's Poetry
5 pages
Revised Term Test Time Table 2024 November
No ratings yet
Revised Term Test Time Table 2024 November
14 pages
Paper 2 Foundation
No ratings yet
Paper 2 Foundation
18 pages
Baghel Resume
No ratings yet
Baghel Resume
2 pages
Speaking Thanh Loan
No ratings yet
Speaking Thanh Loan
208 pages
Mitsubishi FX5U
No ratings yet
Mitsubishi FX5U
5 pages
Absolute Beginner S1 #13 Finding A Bulgarian Bathroom: Lesson Notes
No ratings yet
Absolute Beginner S1 #13 Finding A Bulgarian Bathroom: Lesson Notes
5 pages
Faith and Patience - Kenneth Copeland Ministries PDF
100% (1)
Faith and Patience - Kenneth Copeland Ministries PDF
36 pages
(Ebook) Caribbean Critique: Antillean Critical Theory from Toussaint to Glissant (Contemporary French and Francophone Cultures LUP) by Nick Nesbitt ISBN 9781846318665, 1846318661 - Quickly download the ebook to start your content journey
100% (2)
(Ebook) Caribbean Critique: Antillean Critical Theory from Toussaint to Glissant (Contemporary French and Francophone Cultures LUP) by Nick Nesbitt ISBN 9781846318665, 1846318661 - Quickly download the ebook to start your content journey
51 pages
(Maa 3.1-3.3) 3D Geometry - Triangles
No ratings yet
(Maa 3.1-3.3) 3D Geometry - Triangles
9 pages
Self Recognition Is The Key To Recognition of One's Creator
No ratings yet
Self Recognition Is The Key To Recognition of One's Creator
2 pages
God's Presence PDF
No ratings yet
God's Presence PDF
66 pages
9CU SINIF AYLIQ SINAQ 4
No ratings yet
9CU SINIF AYLIQ SINAQ 4
3 pages
DrWeb Crash
No ratings yet
DrWeb Crash
9 pages
Advice to the Newly Married Lady
No ratings yet
Advice to the Newly Married Lady
7 pages
A Reading of T.S.Eliot's Ash-Wednesday1
100% (1)
A Reading of T.S.Eliot's Ash-Wednesday1
22 pages
Answer Sheet English 3 Module 1 JMC
No ratings yet
Answer Sheet English 3 Module 1 JMC
2 pages