Chapter 2 Data Handling using pandas - I(Series)

Chapter 2 focuses on data handling using the Pandas library, highlighting its capabilities for data manipulation and analysis. It covers key data structures like Series and DataFrame, differences between Pandas and NumPy, methods for creating Series, accessing elements, and performing mathematical operations. The chapter also explains attributes and methods of Series, including indexing, slicing, and the use of iloc() and loc() for data retrieval.

Uploaded by

vmichaelarmstrong2200

Chapter 2

Data Handling using pandas – I

NumPy, Pandas and Matplotlib are three well-established Python libraries for scientific and
analytical use.

PANDAS (PANEL DATA)

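➢ High-level data manipulation tool used for analysing data.
➢ It is very easy to import and export data using the Pandas library.
➢ It is built on packages such as NumPy and Matplotlib, and gives us a single place to do most of our data analysis and visualisation work.
➢ Pandas is traditionally described as having three important data structures, namely Series, DataFrame and Panel. Panel has been removed from current versions of Pandas, so only Series and DataFrame are in use.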

Differences between Pandas and Numpy:

1. A NumPy array requires homogeneous data, while a Pandas DataFrame can hold a different data type (int, float, string, etc.) in each column.
2. Pandas has an interface for operations like file loading, plotting, selection, joining and GROUP BY.
3. Pandas DataFrames (with column names) make it very easy to keep track of data.
4. Pandas stores data in a labelled, tabular format, whereas a NumPy array is a plain n-dimensional array, mostly of numeric data.
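
The first difference can be checked directly. The following is a minimal sketch with made-up sample data (the column names roll_no and name are only for illustration): a NumPy array converts mixed values to one common type, while each DataFrame column keeps its own type.

```python
import numpy as np
import pandas as pd

# A NumPy array holds a single dtype, so mixing integers and a string
# converts every element to a string.
arr = np.array([1, 2, "three"])
print(arr.dtype.kind)        # 'U' means a Unicode string dtype

# A DataFrame keeps a separate dtype for each column.
df = pd.DataFrame({"roll_no": [1, 2, 3],
                   "name": ["Kavi", "Shyam", "Asha"]})
print(df.dtypes)
```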

Installing Pandas
To install Pandas from the command line, we need to type in:
pip install pandas
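
A quick way to confirm that the installation worked is to import the library and print its version. This is only a small check and assumes Pandas was installed into the Python environment being used.

```python
import pandas as pd

# If the import succeeds, Pandas is installed; the version string
# depends on what pip downloaded.
print(pd.__version__)
```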

Data Structure in Pandas

A data structure is a collection of data values and operations that can be applied to that data. It
enables efficient storage, retrieval and modification of the data.
Two commonly used data structures in Pandas are Series and DataFrame.

Series
➢ one-dimensional array
➢ homogeneous data: all values share one dtype (mixed values are stored with dtype object)
➢ containing a sequence of values with index
➢ values can be of any data type (int, float, list, string, etc.)
➢ data is mutable
➢ size is immutable
The data label associated with a particular value is called its index.
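
The points above can be seen in a short sketch. The names and marks are made-up sample data, chosen only for illustration.

```python
import pandas as pd

# Each value carries a data label (its index).
marks = pd.Series([71, 85, 90], index=["Kavi", "Shyam", "Asha"])
print(marks.index.tolist())

# Data is mutable: an existing value can be changed in place.
marks["Shyam"] = 88
print(marks["Shyam"])        # 88

# Values of different types are stored together with dtype object.
mixed = pd.Series([1, "two", 3.0])
print(mixed.dtype)           # object
```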

Creation of Series
(A) Creation of Series from List
Program:
import pandas as pd
l=[10,20,30]
series1 = pd.Series(l)
# series1 = pd.Series([10,20,30])
print(series1)
Output:
0 10
1 20
2 30
dtype: int64

User-defined labels can be assigned to the index and then used to access elements of a Series.
Program:
import pandas as pd
series2 = pd.Series(["Kavi","Shyam"], index=[3,5])
print(series2)
Output:
3 Kavi
5 Shyam
dtype: object

We can also use letters or strings as indices.

Program:
import pandas as pd
series2 = pd.Series([2,3],index=["Feb","Mar"])
print(series2)
Output:
Feb 2
Mar 3
dtype: int64
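(B) Creation of empty Series
Program:
import pandas as pd
print("Creation of empty Series")
s=pd.Series(dtype=int)
print(s)
# Note: each of the statements below also creates an empty Series
s=pd.Series() # no dtype given: the default dtype depends on the pandas version, and some versions show a warning
s=pd.Series([],dtype=int)
s=pd.Series({},dtype=int)
s=pd.Series((),dtype=int)
Output:
Creation of empty Series
Series([], dtype: int32)
Note: Whether an integer Series is shown as int32 or int64 in the outputs of this chapter depends on the platform and the NumPy version in use.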
(C) Creation of Series from Scalar value
Program:
import pandas as pd
print("create a series from scalar value")
s4=pd.Series(25,index=[10,11,12])
print(s4)
Output:
create a series from scalar value
10 25
11 25
12 25
dtype: int64
(D) Creation of Series from dictionary
Keys of the dictionary will become indices in the series.
Program:
import pandas as pd
print("create a series from dictionary")
d={'a':'ant','b':'bat'}
s5=pd.Series(d)
print(s5)
Output:
create a series from dictionary
a ant
b bat
dtype: object
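
Since the keys become the index, an index argument passed along with a dictionary selects and orders the keys. The sketch below reuses the dictionary from the program above; the label 'z' is a deliberately missing key added for illustration.

```python
import pandas as pd

d = {'a': 'ant', 'b': 'bat'}

# The index list picks keys from the dictionary in the given order.
# A label that is not a key ('z' here) gets the value NaN.
s = pd.Series(d, index=['b', 'a', 'z'])
print(s)
```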

(E) Creation of Series from ndarray
Program:
import numpy as np
import pandas as pd
print("create a series from ndarray")
a=np.array([10,20,30])
s6=pd.Series(a)
print(s6)
Output:
create a series from ndarray
0 10
1 20
2 30
dtype: int32
OR
Program:
import numpy as np
import pandas as pd
print("create a series from ndarray using arange function")
b=np.arange(10,20,3)
s7=pd.Series(b)
print(s7)
Output:
create a series from ndarray using arange function
0 10
1 13
2 16
3 19
dtype: int32

Accessing Elements of a Series

There are two common ways for accessing the elements of a series: Indexing and Slicing.
(A) Indexing
Indexing in Series is used to access elements in a series. Indexes are of two types: positional
index and labelled index. Positional index takes an integer value that corresponds to its position
in the series starting from 0, whereas labelled index takes any user-defined label as index.

Program:
import pandas as pd
s1=pd.Series([10,20,30,40,50],index=['I','II','III','IV','V'])
print(s1)
print("Assigning new index values")
s1.index=['one','two','three','four','five']
print(s1)
print("To access an element 30 using labelled indexing")
print(s1['three'])
print("To access the element 50 using positional indexing")
# s1[4] also works in older pandas, but positional access with [] on a
# labelled Series is deprecated in newer versions, so iloc is used here
print(s1.iloc[4])
print("To access the elements 20 and 40 using labelled indexing")
print(s1[['two','four']])
print("To access the elements 20 and 40 using positional indexing")
print(s1.iloc[[1,3]])
print("To change the value at positional index 4")
s1.iloc[4]=55
print(s1)

Output:
I 10
II 20
III 30
IV 40
V 50
dtype: int64
Assigning new index values
one 10
two 20
three 30
four 40
five 50
dtype: int64
To access an element 30 using labelled indexing
30
To access the element 50 using positional indexing
50
To access the elements 20 and 40 using labelled indexing
two 20
four 40
dtype: int64
To access the elements 20 and 40 using positional indexing
two 20
four 40
dtype: int64
To change the value at positional index 4
one 10
two 20
three 30
four 40
five 55
dtype: int64
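
One detail of the indexing shown above is worth a separate look: a single label gives back one value, while a list of labels gives back a smaller Series. A minimal sketch, using the same labels as the program above:

```python
import pandas as pd

s1 = pd.Series([10, 20, 30, 40, 50],
               index=['one', 'two', 'three', 'four', 'five'])

single = s1['three']            # one label -> a single value
several = s1[['two', 'four']]   # list of labels -> a Series

print(single)
print(type(several).__name__)   # Series
```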

(B) Slicing
A part of a series can be extracted through slicing. We define which part of the series
is to be sliced by specifying the start and end parameters [start:end] with the series name.
When we use positional indices for slicing, the value at the end index position is excluded. If
labelled indexes are used for slicing, then the value at the end index label is also included in the
output.
Program:
import pandas as pd
s1=pd.Series([10,20,30,40,55],index=['one','two','three','four','five'])
print("Positional index used for slicing")
print(s1[1:4])#excludes the value at index position 4
print("Labelled index used for slicing")
print(s1['one':'three'])
print("The series in reverse order")
print(s1[::-1])
print("To give same values for a given slice")
s1[1:4]=5
print(s1)
print("To give different values for a given slice")
s1[1:4]=[5,10,15]
print(s1)

Output:
Positional index used for slicing
two 20
three 30
four 40
dtype: int64
Labelled index used for slicing
one 10
two 20
three 30
dtype: int64
The series in reverse order
five 55
four 40
three 30
two 20
one 10
dtype: int64
To give same values for a given slice
one 10
two 5
three 5
four 5
five 55
dtype: int64
To give different values for a given slice
one 10
two 5
three 10
four 15
five 55
dtype: int64
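
The difference between the two kinds of slice can be checked by selecting the same three elements both ways. This is a small sketch on the series used above; the variable names by_position and by_label are only illustrative.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30, 40, 55],
               index=['one', 'two', 'three', 'four', 'five'])

by_position = s1[0:3]            # positions 0, 1, 2 (end position 3 excluded)
by_label = s1['one':'three']     # labels 'one' to 'three' (end label included)

# Both slices contain the same three elements.
print(by_position.equals(by_label))   # True
```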

Attributes of Series
Attribute Name    Purpose
name              assigns or returns the name of the Series
index.name        assigns or returns the name of the index of the Series
values            returns the values in the Series as an array
size              returns the number of values in the Series object
empty             returns True if the Series is empty, and False otherwise

Program:
import pandas as pd
import numpy as np
s1=pd.Series({'a':np.nan,'b':20,'c':30,'d':40})
print(s1)
s1.name='NIMS'
print(s1)
s1.index.name='Division'
print(s1)
print(s1.size)
print(s1.values)
print(s1.empty)
print(s1.count())
s2=pd.Series(dtype=int)
print(s2)
s2.name='Test'
print(s2)
s2.index.name='Result'
print(s2)
print(s2.size)
print(s2.values)
print(s2.empty)
print(s2.count())
Output:
a NaN
b 20.0
c 30.0
d 40.0
dtype: float64
a NaN
b 20.0
c 30.0
d 40.0
Name: NIMS, dtype: float64
Division
a NaN
b 20.0
c 30.0
d 40.0
Name: NIMS, dtype: float64
4
[nan 20. 30. 40.]
False
3
Series([], dtype: int32)
Series([], Name: Test, dtype: int32)
Series([], Name: Test, dtype: int32)
0
[]
True
0
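
The outputs above show that size and count() can differ for the same Series. A minimal sketch of the difference, using made-up values with two missing entries:

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 20, 30, np.nan])

print(s.size)      # 4 -> size counts every element, including NaN
print(s.count())   # 2 -> count() counts only the non-NaN values
print(s.empty)     # False -> the Series has elements, even if some are NaN
```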

Methods of Series
Method      Explanation
head(n)     Returns the first n members of the series. If the value for n is not passed, then by default n takes 5 and the first five members are displayed.
count()     Returns the number of non-NaN values in the Series.
tail(n)     Returns the last n members of the series. If the value for n is not passed, then by default n takes 5 and the last five members are displayed.

Program:
import pandas as pd
s1=pd.Series([10,20,30,40,50,60,70,80,90])
print(s1.head())
print(s1.tail())
print(s1.head(2))
print(s1.tail(3))

Output:
0 10
1 20
2 30
3 40
4 50
dtype: int64
4 50
5 60
6 70
7 80
8 90
dtype: int64
0 10
1 20
dtype: int64
6 70
7 80
8 90
dtype: int64
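
When a Series has fewer members than the requested n, head() and tail() simply return everything that is available. A short sketch with a three-element Series (sample values only):

```python
import pandas as pd

s = pd.Series([10, 20, 30])

# The default n is 5, but only three members exist, so all are returned.
print(len(s.head()))        # 3
print(s.head(2).tolist())   # [10, 20]
print(s.tail(1).tolist())   # [30]
```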

Mathematical Operations on Series

While performing mathematical operations on series, index matching is implemented: values are
paired by index label, and any label present in only one of the series gives NaN in the result by
default. Basic mathematical operations like addition, subtraction, multiplication, division, etc.,
can be done on two Series; the operation is done on each corresponding pair of elements.
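Index matching means that values are paired by label and not by position. The sketch below uses two made-up Series whose labels are in a different order and only partly overlap.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30], index=['c', 'a', 'd'])

total = s1 + s2
print(total)
# 'a': 1 + 20 = 21 and 'c': 3 + 10 = 13, matched by label.
# 'b' exists only in s1 and 'd' only in s2, so both become NaN.
```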
(A) Addition of two Series
It can be done in two ways. In the first way, the two series are simply added together (e.g. s1+s2).
The second way is used when we do not want to have NaN values in the output. We can use
the Series method add() with the parameter fill_value, which substitutes the specified value for a
missing value before the operation is done, e.g. s1.add(s2,fill_value=10)
(B) Subtraction of two Series
Again, it can be done in two different ways
s1-s2
s1.sub(s2,fill_value=20)
(C) Multiplication of two Series
Again, it can be done in two different ways
s1*s2
s1.mul(s2,fill_value=10)
(D) Division of two Series
Again, it can be done in two different ways
s1/s2
s1.div(s2,fill_value=20)
Program:
import pandas as pd
s1=pd.Series([10,20,30])
s2=pd.Series([5,15,25,35])
print(s1+s2)
print(s1.add(s2,fill_value=40))
print(s1-s2)
print(s1.sub(s2,fill_value=40))
print(s1*s2)
print(s1.mul(s2,fill_value=40))
print(s1/s2)
print(s1.div(s2,fill_value=40))
Output:
0 15.0
1 35.0
2 55.0
3 NaN
dtype: float64
0 15.0
1 35.0
2 55.0
3 75.0
dtype: float64
0 5.0
1 5.0
2 5.0
3 NaN
dtype: float64
0 5.0
1 5.0
2 5.0
3 5.0
dtype: float64
0 50.0
1 300.0
2 750.0
3 NaN
dtype: float64
0 50.0
1 300.0
2 750.0
3 1400.0
dtype: float64
0 2.000000
1 1.333333
2 1.200000
3 NaN
dtype: float64
0 2.000000
1 1.333333
2 1.200000
3 1.142857
dtype: float64
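
One case is not covered by the program above: fill_value replaces a value that is missing in one of the two series, but if the value is missing in both, the result at that label is still NaN. A minimal sketch with made-up data:

```python
import numpy as np
import pandas as pd

s1 = pd.Series([10, np.nan, 30], index=['a', 'b', 'c'])
s2 = pd.Series([1, np.nan, 3, 4], index=['a', 'b', 'c', 'd'])

result = s1.add(s2, fill_value=0)
print(result)
# 'a': 10 + 1 = 11 and 'c': 30 + 3 = 33
# 'd': missing in s1 only, so 0 + 4 = 4
# 'b': missing in both series, so it stays NaN
```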

iloc() and loc()

iloc - iloc[] is used for selecting rows by position-based indexing.
loc - loc[] is used for selecting rows by label-based (row name) indexing.
Both are indexers: they are used with square brackets and are not called like functions.
Program:
import pandas as pd
s1=pd.Series([10,20,30,40,50],index=['a','e','i','o','u'])
print(s1)
print(s1.iloc[1:4]) # select rows at positions 1, 2, 3 (upper limit 4 is excluded)
print(s1.loc['a':'i']) # select rows with labels 'a', 'e', 'i' (upper limit is also included in loc)
Output:
a 10
e 20
i 30
o 40
u 50
dtype: int64
e 20
i 30
o 40
dtype: int64
a 10
e 20
i 30
dtype: int64
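
The difference between the two matters most when the labels are themselves integers. The following sketch uses the integer labels 3, 5 and 7 (sample data only) to show that loc reads the number as a label while iloc reads it as a position.

```python
import pandas as pd

s = pd.Series(['x', 'y', 'z'], index=[3, 5, 7])

print(s.loc[3])              # 'x' -> the element whose label is 3
print(s.iloc[1])             # 'y' -> the element at position 1

print(s.loc[3:5].tolist())   # ['x', 'y'] -> labels 3 to 5, end label included
print(s.iloc[0:2].tolist())  # ['x', 'y'] -> positions 0 and 1, end position excluded
```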
