
Data Handling With Pandas - 1 Notes Xii Ip

Pandas is an essential open-source Python library for data handling, designed to manage large datasets efficiently. It provides various data structures, including Series, which can store one-dimensional data with customizable indexing. The document covers how to create Series objects from lists, NumPy arrays, scalars, strings, and dictionaries, as well as accessing and manipulating data within these structures.

Uploaded by

gaminggalaxy133

CHAPTER – 1

DATA HANDLING WITH PANDAS-1

What is Pandas in Python?

Pandas is one of the most important and useful open-source Python libraries for data
science. It is used for handling large and complex data efficiently and easily. The name
Pandas is derived from the term "panel data", where a panel represents a 3D data structure.

It was mainly developed by Wes McKinney.

Why Pandas | Features of Pandas

• It supports data visualization well.
• It handles large data efficiently.
• It can read data in many formats.
• It makes data customizable and flexible.
• It handles missing and duplicate data efficiently.
• It gets more work done with less code.

Pandas Data Structure

A data structure is a way of organizing, storing and managing data efficiently. Pandas
provides different data structures to manage data of different types, which are depicted in
the following diagram:

1|DATA HANDLING WITH PANDAS -1(XII – IP(065)


Series Data Structure

• It is a 1D data structure.
• It is size immutable.
• It is value mutable.
• It stores homogeneous data.
• It supports explicit indexing.
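
As a minimal sketch of the value-mutable and homogeneous properties listed above, the snippet below overwrites one element and then checks that the length and the dtype are unchanged. The variable name s, the labels and the values are illustrative assumptions, not taken from the notes.

```python
import pandas as pd

# Illustrative data; labels and values are assumptions for this sketch.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

s["b"] = 25          # value mutable: an existing element can be overwritten

print(s)
print(len(s))        # the number of elements is unchanged
print(s.dtype)       # a single dtype is shared by all elements
```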

Create or Declare Series object

To create a Series object, do the following:

Import the pandas library in your program as per the syntax given below:

import pandas as <object name>

For example:
import pandas as pd

Use the Series() method of the pandas library to create a Series object as per the syntax
given below:

<Series object> = <panda_object>.Series(data = <value>, index = <value>)

For example:
S = pd.Series(data = [12,23,34], index = ['a','b','c'])

While creating a Series object using the Series() method, keep the following points in mind:

• Series() can take two arguments, data and index.
• When passed as keyword arguments, they can be given in any order.
• The index argument is optional.
• A Series can be made from any data such as a Python sequence (list, tuple, string), ndarray,
  dictionary or scalar value.
• We can assign values of any data type as the index.
• We can skip the keyword data when assigning values to a Series object.
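
The points above can be checked with a short sketch. The names s1 to s4 are illustrative assumptions; the data reuses the values from the earlier example. It builds the same Series with the keywords in both orders and without keywords, then builds one without an index.

```python
import pandas as pd

# Keyword arguments can be written in either order.
s1 = pd.Series(data=[12, 23, 34], index=["a", "b", "c"])
s2 = pd.Series(index=["a", "b", "c"], data=[12, 23, 34])

# The keyword "data" can be skipped; the first positional argument is the data.
s3 = pd.Series([12, 23, 34], index=["a", "b", "c"])

# The index is optional; pandas then numbers the elements from 0.
s4 = pd.Series([12, 23, 34])

print(s1.equals(s2), s1.equals(s3))
print(list(s4.index))
```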



Creating Series object using List

Syntax:

<Series object> = <panda_object>.Series(<any list of values>)

Without index:

Exp-1:

import pandas as pd
S = pd.Series([23,34,55])
print("My Series is")
print(S)

Output:

My Series is
0    23
1    34
2    55
dtype: int64

With index:

Exp-2:

import pandas as pd
S = pd.Series([23,34,55], index = ['jan','feb','mar'])
print("My Series is")
print(S)

Output:

My Series is
jan    23
feb    34
mar    55
dtype: int64

Exp-3: Create a Series object using a list that stores the total number of students of three
sections 'A', 'B' and 'C'.

import pandas as pd
section = ['A','B','C']
students = [40,50,45]
S = pd.Series(students, index = section)
print("My Series is")
print(S)

Output:

My Series is
A    40
B    50
C    45
dtype: int64


Creating Series object using NumPy Array

To use a NumPy array as data for a Series object, make sure the NumPy library is imported as
per the syntax given below:

import numpy as np

Syntax:

<Series object> = <panda_object>.Series(data = <nparray>, index = <sequence>)

Without index:

Exp-1:

import pandas as pd
import numpy as np
n = np.array([10.8,12.6,8.2])
S = pd.Series(n)
print("My Series is")
print(S)

Output:

My Series is
0    10.8
1    12.6
2     8.2
dtype: float64


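With index:

Exp-2:

import pandas as pd
import numpy as np
n = np.arange(10,50,10)
S = pd.Series(n, index = range(1,5))
print("My Series is")
print(S)

Output:

My Series is
1    10
2    20
3    30
4    40
dtype: int64

Note: The integer dtype of a NumPy array depends on the platform, so it may be shown as int32
instead of int64.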

Exp-3: Create a Series object that stores the amount paid as values and the name of the
customer as index. The amount paid is taken as an ndarray.

import pandas as pd
import numpy as np
amount = np.array([200.6, 350.4, 760.2])
names = ['raj', 'sunil', 'kamal']
S = pd.Series(amount, index = names)
print("My Series is")
print(S)

Output:

My Series is
raj      200.6
sunil    350.4
kamal    760.2
dtype: float64

Creating Series object using Scalar value

• A scalar value is a single value passed for creating a Series object.
• The index argument should be passed when creating a Series object from a scalar value; the
  value is then repeated for every index label.

Syntax:

<Series object> = <panda_object>.Series(scalar value, index = <sequence>)

Exp-1:

import pandas as pd
S = pd.Series(2021, index = ['jan','feb','mar'])
print("My Series is")
print(S)

Output:

My Series is
jan    2021
feb    2021
mar    2021
dtype: int64

Exp-2:

import pandas as pd
S = pd.Series(2000, index = range(5))
print("My Series is")
print(S)

Output:

My Series is
0    2000
1    2000
2    2000
3    2000
4    2000
dtype: int64


Creating Series object using String

Syntax:

<Series object> = <panda_object>.Series(<string value>, index = <sequence>)

Exp-1: Create a Series object using individual characters

import pandas as pd
S = pd.Series(['m','o','l'])
print("My Series is")
print(S)

Output:

My Series is
0    m
1    o
2    l
dtype: object

Exp-2: Create a Series object using a string

import pandas as pd
S = pd.Series('techtipnow')
print("My Series is")
print(S)

Output:

My Series is
0    techtipnow
dtype: object

Exp-3: Create a Series object using multiple strings

import pandas as pd
S = pd.Series(['I','am','Indian'], index = [1,2,3])
print("My Series is")
print(S)

Output:

My Series is
1         I
2        am
3    Indian
dtype: object

Creating Series object using Dictionaries

• The keys of the dictionary are treated as the index of the Series object.
• The values of the dictionary are treated as the data of the Series object.

Syntax:

<Series object> = <panda_object>.Series(<dictionary object>)

Exp-1: Create a Series object using a dictionary that stores the number of votes secured by
each party in election 21-22.

import pandas as pd
d = {'bjp':234, 'inc':210, 'jdu':80}
S = pd.Series(d)
print("My Series is")
print(S)

Output:

My Series is
bjp    234
inc    210
jdu     80
dtype: int64

Specifying Missing values in Series object

To represent missing values in Series objects, we can use:

• None
• np.nan (the alias np.NaN was removed in NumPy 2.0)

Exp-1: Create a Series object using a dictionary that stores the number of votes secured by
each party in election 21-22. Make sure NOTA is included with a missing value.

import pandas as pd
d = {'bjp':234, 'inc':210, 'jdu':80, 'nota':None}
S = pd.Series(d)
print("My Series is")
print(S)

Output:

My Series is
bjp     234.0
inc     210.0
jdu      80.0
nota      NaN
dtype: float64

Note: NaN is a floating-point value, so the other values are shown as floats.

Exp-2: Create a Series object using a tuple that stores the month-wise percentage of absent
students of class 12. The missing value should be represented properly.

import pandas as pd
import numpy as np
month = ['jan','feb','mar']
attend = (50,np.nan,70)
S = pd.Series(attend, index = month)
print("My Series is")
print(S)

Output:

My Series is
jan    50.0
feb     NaN
mar    70.0
dtype: float64
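
As a small sketch of how the two markers for missing values behave, the snippet below builds the same Series once with None and once with np.nan and compares them. The names from_none and from_nan are illustrative assumptions, and the isna() method is not covered in these notes; it is used here only to flag the missing positions.

```python
import numpy as np
import pandas as pd

months = ["jan", "feb", "mar"]

# None in numeric data is stored as NaN, just like np.nan.
from_none = pd.Series([50, None, 70], index=months)
from_nan = pd.Series([50, np.nan, 70], index=months)

print(from_none.equals(from_nan))   # both Series hold the same data
print(from_none.isna())             # True only where the value is missing
```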

Mathematical Expressions to Create Data in Series

We can provide data for the Series() method by writing a mathematical expression that
calculates the values of the data sequence, as per the syntax given below:

<Series object> = <panda_object>.Series(<mathematical expression>, index = <sequence>)

Exp-1:

import pandas as pd
import numpy as np
n = np.array([12.2,23.3,40.0])
S = pd.Series(n*2, index = range(1,4))
print("My Series is")
print(S)

Output:

My Series is
1    24.4
2    46.6
3    80.0
dtype: float64

Note: For a NumPy array, n*2 is applied to each value of n.

Exp-2:

import pandas as pd
L = [12,23,None]
S = pd.Series(L*2)
print("My Series is")
print(S)

Output:

My Series is
0    12.0
1    23.0
2     NaN
3    12.0
4    23.0
5     NaN
dtype: float64

Note: Here L*2 replicates the values of the list L two times.
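
The difference between the two notes can be seen side by side in the sketch below. The variable names and sample values are illustrative assumptions. The list comprehension is one way to double each value when the data is a plain list rather than a NumPy array.

```python
import numpy as np
import pandas as pd

values = [12, 23, 40]

# List * 2 repeats the list, giving six elements.
replicated = pd.Series(values * 2)

# ndarray * 2 multiplies each element, giving three elements.
doubled = pd.Series(np.array(values) * 2)

# For a plain list, a comprehension doubles each element.
from_comprehension = pd.Series([v * 2 for v in values])

print(len(replicated), len(doubled))
print(doubled.tolist(), from_comprehension.tolist())
```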

Accessing Elements of Series Object

We can access elements of a Series object in two ways:

• Index: Using indexing we can access
  • an individual element of a Series object.
  • multiple elements of a Series object, which need not be contiguous.
  • Indexing can be used in two ways: positional index and labelled index.
  • In a positional index, an integer is used that represents the position of a specific element.
  • In a labelled index, any user-defined label is used as the index.
• Slice: Using a slice we can access
  • a subset of the Series object whose elements are contiguous.
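
The two kinds of index can be compared in a short sketch. The Series name marks and its data are illustrative assumptions. The .loc and .iloc accessors are not covered in these notes; they are used here because they state explicitly whether a label or a position is meant.

```python
import pandas as pd

marks = pd.Series([12, 23, 34, 45, 55], index=["m1", "m2", "m3", "m4", "m5"])

by_label = marks["m3"]          # labelled index with square brackets
by_label_loc = marks.loc["m3"]  # the same access, label stated explicitly
by_position = marks.iloc[2]     # positional index: third element

several = marks[["m1", "m4"]]   # multiple, non-contiguous elements

print(by_label, by_label_loc, by_position)
print(several)
```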

Accessing Individual Element

To access an individual element, provide the index of the element within square brackets
after the Series object.

Syntax:

<Series object>[<Index>]

Exp-1:

import pandas as pd
S = pd.Series(range(10,101,10))
print("We have Accessed")
print(S[4])

Output:

We have Accessed
50

Accessing Multiple Elements Using Index

To access multiple elements, provide the index of each element as a list within the square
brackets of the Series object.

Syntax:

<Series object>[[<Index>, <Index>, …]]

Exp-1:

import pandas as pd
S = pd.Series([12,23,34,45,55], index = ['m1','m2','m3','m4','m5'])
print("We have Accessed")
print(S[['m1','m4']])

Output:

We have Accessed
m1    12
m4    45
dtype: int64

Slicing

Extracting a specific part of a Series object is called slicing.

• The subset produced by slicing contains contiguous elements, or evenly spaced elements when
  a step value is given.
• With integer start and end values, slicing is done using positions, not the index labels of
  the Series object.
• In positional slicing, the value at the end index is excluded.
• If labels are used in slicing, then the value at the end label is also included.
• Slicing can also be used to extract elements in reverse order.

We can retrieve a subset of a Series object as per the syntax given below:

<Series object>[start_index : end_index : step-value]

Exp-1:

import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[2:6])

Output:

Slicing demo
2    34
3    45
4    55
5    76
dtype: int64
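
The slicing examples in this section use the default integer index. As a minimal sketch of how positional and label slicing differ at the end point, the snippet below uses a Series with string labels. The name s and its data are illustrative assumptions, and .iloc (not covered in these notes) is used only to make the positional slice explicit.

```python
import pandas as pd

s = pd.Series([12, 23, 34, 45, 55], index=["m1", "m2", "m3", "m4", "m5"])

positional = s.iloc[1:3]   # positions 1 and 2; the end position 3 is excluded
labelled = s["m2":"m4"]    # labels m2, m3 and m4; the end label is included
reverse = s.iloc[::-1]     # a negative step walks the Series backwards

print(positional)
print(labelled)
print(reverse)
```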

Exp-2:

import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[:])

Output:

Slicing demo
0     12
1     23
2     34
3     45
4     55
5     76
6     80
7     92
8     41
9     69
10    56
dtype: int64

Exp-3:

import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[:4])

Output:

Slicing demo
0    12
1    23
2    34
3    45
dtype: int64

Exp-4:

import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[5:])

Output:

Slicing demo
5     76
6     80
7     92
8     41
9     69
10    56
dtype: int64

Exp-5:

import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[1:9:2])

Output:

Slicing demo
1    23
3    45
5    76
7    92
dtype: int64

Exp-6:

import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[7:15])

Output:

Slicing demo
7     92
8     41
9     69
10    56
dtype: int64

Note: If the end index is out of bounds, slicing still produces the subset up to the last
element.

Exp-7:

import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[3:-5])

Output:

Slicing demo
3    45
4    55
5    76
dtype: int64

Exp-8:

import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[:-6])

Output:

Slicing demo
0    12
1    23
2    34
3    45
4    55
dtype: int64

Exp-9:

import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[-4:])

Output:

Slicing demo
7     92
8     41
9     69
10    56
dtype: int64

Exp-10:

import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[-7:-2])

Output:

Slicing demo
4    55
5    76
6    80
7    92
8    41
dtype: int64

Exp-11:

import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[::3])

Output:

Slicing demo
0    12
3    45
6    80
9    69
dtype: int64

Exp-12:

import pandas as pd
S = pd.Series([12,23,34,45,55,76,80,92,41,69,56])
print("Slicing demo")
print(S[::-3])

Output:

Slicing demo
10    56
7     92
4     55
1     23
dtype: int64

Mathematical Operation on Series object

• We can do arithmetic operations (+, -, *, /) on two or more Series objects.
• The arithmetic operation is performed only on matching indexes.
• For non-matching indexes it produces NaN values.
• If the data items at matching indexes are not compatible for the operation (for example, a
  number and a string), pandas raises a TypeError instead of producing a value.

Exp-1:

import pandas as pd
S1 = pd.Series([12,23,34])
S2 = pd.Series([10,20,10])
print("Addition of Series with matching indexes")
print(S1 + S2)

Output:

Addition of Series with matching indexes
0    22
1    43
2    44
dtype: int64

Exp-2:

import pandas as pd
S1 = pd.Series([12,23,34,56])
S2 = pd.Series([10,20,10])
print("Addition of Series of Different sizes")
print(S1 + S2)

Output:

Addition of Series of Different sizes
0    22.0
1    43.0
2    44.0
3     NaN
dtype: float64

Exp-3:

import pandas as pd
S1 = pd.Series([12,23,34])
S2 = pd.Series([10,20,10], index = ['a','b','c'])
print("Addition of Series With Non Matching Index")
print(S1 + S2)

Output:

Addition of Series With Non Matching Index
0   NaN
1   NaN
2   NaN
a   NaN
b   NaN
c   NaN
dtype: float64
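
The examples above cover fully matching and fully non-matching indexes. As a minimal sketch of the in-between case, the snippet below adds two Series whose labels only partly overlap. The names s1, s2 and total and the labels are illustrative assumptions.

```python
import pandas as pd

s1 = pd.Series([12, 23, 34], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 10], index=["b", "c", "d"])

# Values are added label by label; labels present in only one Series give NaN.
total = s1 + s2

print(total)
```

Here only the labels b and c are present in both Series, so only those two positions hold a sum; a and d hold NaN and the result is of float type.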
