Data Handling With Pandas - 1 Notes (XII IP)
Pandas is one of the most important and useful open-source Python libraries for data
science. It is mainly used to handle large and complex data efficiently and
easily. Pandas derives its name from Panel Data System, where a panel represents a 3D
data structure.
A data structure is a way of organizing, storing and managing data efficiently. Python Pandas
provides different data structures to manage data of different types, which are depicted in
the following diagram-
A Series object has the following properties:
It is a 1D data structure
It is size immutable
It is value mutable
It stores homogeneous data
It supports explicit indexing
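The properties listed above can be checked with a short sketch. The values and the names s, mixed and labelled are illustrative assumptions, not part of the original notes.

```python
import pandas as pd

# A small Series built from a list of integers (illustrative values).
s = pd.Series([23, 34, 55])

# Homogeneous data: every element shares one dtype.
print(s.dtype)

# Value mutable: an existing element can be overwritten in place.
s.loc[1] = 99
print(s.tolist())         # [23, 99, 55]

# Mixing ints and floats still gives a single dtype (the ints are upcast).
mixed = pd.Series([1, 2.5, 3])
print(mixed.dtype)        # float64

# Explicit indexing: labels can replace the default 0, 1, 2, ...
labelled = pd.Series([23, 34, 55], index=["jan", "feb", "mar"])
print(labelled["feb"])    # 34
```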
Use the Series() method of the pandas library to create a Series object, as per the syntax
given below:
While creating a Series object using the Series() method, keep the following points in
mind:
Syntax:
Without Indexing-
Exp-1:
import pandas as pd
S = pd.Series([23,34,55])
print("My Series is")
print(S)
Output –
My Series is
0    23
1    34
2    55
dtype: int64
With index –
Exp-2:
import pandas as pd
S = pd.Series([23,34,55], index=['jan','feb','mar'])
print("My Series is")
print(S)
Output –
My Series is
jan    23
feb    34
mar    55
dtype: int64
Exp-3: Create a Series object using a list that stores the total number of students of three
sections 'A', 'B', 'C'.
import pandas as pd
section = ['A','B','C']
students = [40,50,45]
S = pd.Series(students, index=section)
print("My Series is")
print(S)
Output –
My Series is
A    40
B    50
C    45
dtype: int64
To use a NumPy array as data for a Series object, make sure the NumPy library is imported as
per the syntax given below:
import numpy as np
Syntax:
Without Indexing-
Exp-1:
import pandas as pd
import numpy as np
n = np.array([10.8,12.6,8.2])
S = pd.Series(n)
print("My Series is")
print(S)
Output –
My Series is
0    10.8
1    12.6
2     8.2
dtype: float64
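Exp-2:
import pandas as pd
import numpy as np
n = np.arange(10,50,10)
S = pd.Series(n, index=[1,2,3,4])
print("My Series is")
print(S)
Output –
My Series is
1    10
2    20
3    30
4    40
dtype: int32
Note – the integer dtype shown depends on the platform and NumPy version; on many systems it is int64.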
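Exp-3: Create a Series object that stores the amount paid as values and the name of the
customer as index. The amount paid is taken as a NumPy array.
import pandas as pd
import numpy as np
amount = np.array([200.6,350.4,760.2])
S = pd.Series(amount, index=['Raj','Sunil','kamal'])
print("My Series is")
print(S)
Output –
My Series is
Raj      200.6
Sunil    350.4
kamal    760.2
dtype: float64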
A scalar value is a single value passed to create a Series object.
The index argument should be passed when creating a Series object from a scalar value; the
scalar is repeated for every index label.
Syntax:
Exp-1:
import pandas as pd
S = pd.Series(2021, index=['Jan','Feb','Mar'])
print("My Series is")
print(S)
Output –
My Series is
Jan    2021
Feb    2021
Mar    2021
dtype: int64
Exp-2:
import pandas as pd
S = pd.Series(2000, index=[0,1,2,3])
print("My Series is")
print(S)
Output –
My Series is
0    2000
1    2000
2    2000
3    2000
dtype: int64
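A minimal sketch of how the index controls scalar repetition. The labels 'a', 'b', 'c' and the names repeated and single are assumptions chosen for illustration.

```python
import pandas as pd

# With an index, the scalar is repeated once per label.
repeated = pd.Series(2000, index=["a", "b", "c"])
print(repeated.tolist())   # [2000, 2000, 2000]

# Without an index, pandas builds a Series holding just the one value.
single = pd.Series(2000)
print(len(single))         # 1
```

This is why the index matters for scalars: it is the only thing that tells pandas how many elements to create.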
Syntax:
import pandas as pd
S = pd.Series(['m','o','l'])
print("My Series is")
print(S)
Output –
My Series is
0    m
1    o
2    l
dtype: object
import pandas as pd
S = pd.Series('techtipnow')
print("My Series is")
print(S)
Output –
My Series is
0    techtipnow
dtype: object
import pandas as pd
S = pd.Series(['I','am','Indian'], index=[1,2,3])
print("My Series is")
print(S)
Output –
My Series is
1         I
2        am
3    Indian
dtype: object
10 | D A T A H A N D L I N G W I T H P A N D A S - 1 ( X I I – I P ( 0 6 5 )
Creating Series object using Dictionaries
Exp-1: Create a Series object using a dictionary that stores the number of votes secured by
each party in election 21-22.
import pandas as pd
d = {'bjp':234,'inc':210,'jdu':80}
S = pd.Series(d)
print("My Series is")
print(S)
Output –
My Series is
bjp    234
inc    210
jdu     80
dtype: int64
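As a hedged sketch of how dictionary keys map to the index: the keys become labels, and an explicit index argument can select or reorder them. The key 'aap' is a hypothetical label that is deliberately absent from the dictionary.

```python
import pandas as pd

d = {"bjp": 234, "inc": 210, "jdu": 80}

# Keys become the index, values become the data.
s = pd.Series(d)
print(list(s.index))       # ['bjp', 'inc', 'jdu']

# An explicit index selects and reorders keys.
# "aap" is hypothetical and not in d, so its value is NaN.
picked = pd.Series(d, index=["inc", "bjp", "aap"])
print(picked)
```

Because one value is missing, the values in picked are stored as floats rather than integers.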
Specifying Missing values in Series object
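Missing values can be specified in a Series object using either of the following:
None
np.nan (spelled np.NaN in older NumPy versions; that alias was removed in NumPy 2.0)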
Exp-1: Create a Series object using a dictionary that stores the number of votes secured by
each party in election 21-22. Make sure nota is included with a missing value.
import pandas as pd
d = {'bjp':234,'inc':210,'jdu':80,'nota':None}
S = pd.Series(d)
print("My Series is")
print(S)
Output –
My Series is
bjp     234.0
inc     210.0
jdu      80.0
nota      NaN
dtype: float64
Exp-1: Create a Series object using a tuple that stores the month-wise percentage of absent
students of class 12th. Missing values should be represented properly.
import pandas as pd
import numpy as np
month = ['jan','feb','mar']
attend = (50,np.nan,70)
S = pd.Series(attend, index=month)
print("My Series is")
print(S)
Output –
My Series is
jan    50.0
feb     NaN
mar    70.0
dtype: float64
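Once a Series holds missing values, they can be detected and handled. The sketch below reuses the data from the example above; the methods isna(), count(), sum() and fillna() are standard pandas calls, while the choice of 0 as a replacement value is only an assumption for illustration.

```python
import numpy as np
import pandas as pd

attend = pd.Series((50, np.nan, 70), index=["jan", "feb", "mar"])

# isna() flags each missing entry.
print(attend.isna().tolist())      # [False, True, False]

# count() ignores missing values.
print(attend.count())              # 2

# sum() skips NaN by default.
print(attend.sum())                # 120.0

# fillna() returns a new Series with NaN replaced (0 is an arbitrary choice).
print(attend.fillna(0).tolist())   # [50.0, 0.0, 70.0]
```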
Mathematical Expressions to Create Data in Series
We can provide data for the Series() method by writing a mathematical expression that
calculates the values of the data sequence, as per the syntax given below:
Exp-1:
import pandas as pd
import numpy as np
n = np.array([12.2,23.3,40.0])
S = pd.Series(n*2, index=[1,2,3])
print("My Series is")
print(S)
Output –
My Series is
1    24.4
2    46.6
3    80.0
dtype: float64
Exp-2:
import pandas as pd
L = [12,23,None]
S = pd.Series(L*2)
print("My Series is")
print(S)
Output –
My Series is
0    12.0
1    23.0
2     NaN
3    12.0
4    23.0
5     NaN
dtype: float64
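The two examples above behave differently because the * operator means different things for a Python list and a NumPy array. A small sketch of the contrast, with illustrative values:

```python
import numpy as np
import pandas as pd

values = [12, 23, 40]

# list * 2 repeats the list, so the Series has 6 elements.
from_list = pd.Series(values * 2)
print(from_list.tolist())    # [12, 23, 40, 12, 23, 40]

# ndarray * 2 multiplies each element, so the Series has 3 elements.
from_array = pd.Series(np.array(values) * 2)
print(from_array.tolist())   # [24, 46, 80]
```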
Accessing Elements of Series Object
To access an individual element, provide the index of the element within the square
brackets of the Series object.
Syntax:
<Series object>[<Index>]
Exp-1:
import pandas as pd
S = pd.Series(range(10,101,10))
print("We have Accessed")
print(S[4])
Output –
We have Accessed
50
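When the index labels are not the default 0, 1, 2, ..., it helps to be explicit about whether a label or a position is meant. The sketch below uses the standard .loc (label) and .iloc (position) accessors; the Series contents are assumptions for illustration.

```python
import pandas as pd

s = pd.Series([12, 23, 34], index=["m1", "m2", "m3"])

# Access by label.
print(s["m2"])        # 23
print(s.loc["m2"])    # 23

# Access by position (0-based), regardless of the labels.
print(s.iloc[1])      # 23

# With integer labels that do not start at 0, label and position differ.
t = pd.Series([10, 20, 30], index=[5, 6, 7])
print(t.loc[5])       # 10 (label 5)
print(t.iloc[0])      # 10 (first position)
```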
Accessing Multiple Elements Using Index
To access multiple elements, provide the index of each element as a list within the
square brackets of the Series object.
Syntax:
<Series object>[[<Index1>, <Index2>, ...]]
Exp-1:
import pandas as pd
S = pd.Series([12,23,34,45,55],index = ['m1','m2','m3','m4','m5'])
print("We have Accessed")
print(S[['m1','m4']])
Output –
We have Accessed
m1    12
m4    45
dtype: int64
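A brief sketch of list-based selection, assuming the same Series as above. Passing a list always returns a new Series (not a single value), and the result follows the order of the list.

```python
import pandas as pd

s = pd.Series([12, 23, 34, 45, 55], index=["m1", "m2", "m3", "m4", "m5"])

# Select by a list of labels or by a list of positions.
by_label = s.loc[["m1", "m4"]]
by_position = s.iloc[[0, 3]]
print(by_label.equals(by_position))   # True

# The result is itself a Series.
print(isinstance(by_label, pd.Series))   # True

# The order of the list decides the order of the result.
reordered = s[["m5", "m1"]]
print(reordered.tolist())             # [55, 12]
```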
Slicing
Exp-1:
import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[2:6])
Output –
Slicing demo
2    34
3    45
4    55
5    76
dtype: int64
Exp-2:
import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[:])
Output –
Slicing demo
0     12
1     23
2     34
3     45
4     55
5     76
6     80
7     92
8     41
9     69
10    56
dtype: int64
Exp-3:
import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[:4])
Output –
Slicing demo
0    12
1    23
2    34
3    45
dtype: int64
Exp-4:
import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[5:])
Output –
Slicing demo
5     76
6     80
7     92
8     41
9     69
10    56
dtype: int64
Exp-5:
import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[1:9:2])
Output –
Slicing demo
1    23
3    45
5    76
7    92
dtype: int64
Exp-6:
import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[7:15])
Output –
Slicing demo
7     92
8     41
9     69
10    56
dtype: int64
Note – even if the end index is out of bounds, slicing still returns the elements that exist.
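The note above applies to slices only. As a hedged sketch of the difference, a slice that runs past the end is simply cut short, while asking for a single out-of-range position raises an error. The positions 15 and 20:30 are arbitrary out-of-range choices.

```python
import pandas as pd

s = pd.Series([12, 23, 34, 45, 55, 76, 80, 92, 41, 69, 56])

# A slice past the end returns only the elements that exist.
print(len(s[7:15]))    # 4

# A slice entirely out of range returns an empty Series.
print(len(s[20:30]))   # 0

# A single out-of-range position is an error, not an empty result.
try:
    s.iloc[15]
except IndexError as err:
    print("IndexError:", err)
```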
Exp-7:
import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[3:-5])
Output –
Slicing demo
3    45
4    55
5    76
dtype: int64
Exp-8:
import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[:-6])
Output –
Slicing demo
0    12
1    23
2    34
3    45
4    55
dtype: int64
Exp-9:
import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[-4:])
Output –
Slicing demo
7     92
8     41
9     69
10    56
dtype: int64
Exp-10:
import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[-7:-2])
Output –
Slicing demo
4    55
5    76
6    80
7    92
8    41
dtype: int64
Exp-11:
import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[::3])
Output –
Slicing demo
0    12
3    45
6    80
9    69
dtype: int64
Exp-12:
import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[::-3])
Output –
Slicing demo
10    56
7     92
4     55
1     23
dtype: int64
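All slicing examples above use positions, where the end position is excluded. One related point, sketched here under the assumption of a Series with string labels: slicing by label with .loc includes the end label.

```python
import pandas as pd

s = pd.Series([12, 23, 34, 45, 55], index=["m1", "m2", "m3", "m4", "m5"])

# Positional slice: position 4 is excluded.
print(s.iloc[1:4].tolist())        # [23, 34, 45]

# Label slice: the end label "m4" is included.
print(s.loc["m2":"m4"].tolist())   # [23, 34, 45]
```

Both slices return the same three elements, but the positional slice stops before position 4 while the label slice stops at "m4".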
Mathematical Operation on Series object
Exp-1:
import pandas as pd
S1 = pd.Series([12,23,34])
S2 = pd.Series([10,20,10])
print(S1 + S2)
Output –
0    22
1    43
2    44
dtype: int64
Exp-2:
import pandas as pd
S1 = pd.Series([12,23,34,56])
S2 = pd.Series([10,20,10])
print(S1 + S2)
Output –
0    22.0
1    43.0
2    44.0
3     NaN
dtype: float64
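The NaN in the last row appears because index 3 exists in only one of the two Series. If the unmatched value should be kept instead, one option is the Series.add() method with fill_value. This is a sketch of that alternative, not something the original notes prescribe; the fill value 0 is an assumption.

```python
import pandas as pd

s1 = pd.Series([12, 23, 34, 56])
s2 = pd.Series([10, 20, 10])

# The + operator aligns on index; unmatched labels give NaN.
plain = s1 + s2
print(plain.tolist())    # [22.0, 43.0, 44.0, nan]

# add() with fill_value treats a missing operand as 0 before adding.
filled = s1.add(s2, fill_value=0)
print(filled.tolist())   # [22.0, 43.0, 44.0, 56.0]
```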
Exp-3:
import pandas as pd
S1 = pd.Series([12,23,34])
S2 = pd.Series([10,20,10],index=['a','b','c'])
print(S1 + S2)
Output –