Ch 1 Python Pandas-I

The document provides an introduction to the Python Pandas library, which is used for data analysis and offers high-performance data structures. It details the advantages of Pandas, the two primary data structures (Data Series and DataFrame), and how to create and manipulate Series objects. Additionally, it covers various methods and operations that can be performed on Series, including accessing, modifying, and filtering data.

Uploaded by

dayabhalala97237

Python Pandas-I

Informatics Practices (065)

Pandas-I

As per CBSE Syllabus 2023-24

8460831302 Page 1

Ch 1: Python Pandas-I

Introduction to Python libraries


Pandas, or Python Pandas, is Python's library for data analysis.
Pandas derives its name from panel data system, which is an econometrics
term for multi-dimensional, structured data sets.
The main author of Pandas is Wes McKinney.
Pandas is an open-source, BSD-licensed library built for the Python
programming language.
Pandas offers high-performance, easy-to-use data structures and data analysis
tools.
You need to install pandas before using it.
On Windows, go to the Start menu, open cmd and type the command:
pip install pandas
To work with pandas in Python, you need to import the pandas library in your
Python program.
You can do this by writing
import pandas
You can also give the library a shorter alias by writing
import pandas as pd

Advantages of Pandas:-
It can read and write data in many different file formats (such as CSV and
Excel) and handles many data types (integer, float, double, etc.).
It can calculate in all the possible ways data is organized, i.e., across rows
and down columns.
It can easily select subsets of data from bulky data sets and even combine
multiple datasets together.
It has functionality to find and fill missing data.
It allows you to apply operations to independent groups within the data.


It supports reshaping of data into different forms.
It supports advanced time-series functionality. (Time series forecasting is the
use of a model to predict future values based on previously observed values.)
It supports visualization by integrating with libraries such as matplotlib and
seaborn.

Pandas Data Structures:-
A data structure is a specialized way of storing and organizing data in a
computer to suit a specific purpose, so that it can be accessed and worked with
in appropriate ways.
Pandas supports two types of data structures:
1. Data Series
2. DataFrame

Difference between Data Series and DataFrame

Topic          Data Series                           DataFrame

Dimensions     1-dimensional                         2-dimensional

Type of data   Homogeneous, i.e., all the elements   Heterogeneous, i.e., a DataFrame
               must be of the same data type in a    object can have elements of
               Series object.                        different data types.

Mutability     Value-mutable, i.e., the values of    Value-mutable, i.e., the values of
               its elements can change.              its elements can change.
               Size-immutable, i.e., the size of a   Size-mutable, i.e., the size of a
               Series object cannot change.          DataFrame object can change. You
                                                     can add/drop elements at any time.
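
The first two rows of the comparison can be checked with a short sketch. The names marks and students and the sample data are made up for illustration only.

```python
import pandas as pd

# A Series holds one column of values plus an index.
marks = pd.Series([25, 26, 28], index=["Ram", "Seeta", "Laxman"])

# A DataFrame holds several columns; each column may have its own data type.
students = pd.DataFrame(
    {"name": ["Ram", "Seeta", "Laxman"], "marks": [25, 26, 28]}
)

print(marks.ndim)       # 1
print(students.ndim)    # 2
print(students.dtypes)  # one data type per column
```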

1.2 Series
Series is a Pandas data structure that represents a one-dimensional array-like
object containing an array of data (of any NumPy data type) and an associated
array of data labels called its index.


There are two components of a Series object:
1. Data
2. Index

Creating Series Object:-


A Series object can be created in the following ways:-
1. Create an empty Series object by calling Series() with no parameters:-
Syntax:-<Series Object> = pandas.Series()
e.g:-
import pandas as pd
s=pd.Series()
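
Some pandas versions print a warning when Series() is called without a data type. A minimal sketch, assuming you want to avoid that warning, passes dtype explicitly and then checks that the object is empty:

```python
import pandas as pd

# dtype is given explicitly so that no default-dtype warning is raised
s = pd.Series(dtype="float64")

print(s.empty)  # True
print(s.size)   # 0
```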

2. Creating non-empty Series objects:-


To create a non-empty Series object, you need to specify the data and,
optionally, the index.
Syntax:-<Series Object> = pandas.Series(data=d,index=inx)
e.g:-
import pandas as pd
s=pd.Series(data=["Ram","Seeta","Laxman"],index=[1,2,3])
Here inx is a sequence of index labels, and d can be any of the following:
• A Python sequence
• A Python dictionary
• An ndarray
• A scalar value

Specify data as Python Sequence


E.g(1):-
import pandas as pd
s=pd.Series(range(5))

print(s)
output:-
0 0
1 1
2 2
3 3
4 4

E.g(2 using list):-


import pandas as pd
s=pd.Series([25,26,28,29,28])
print(s)
output:-
0 25
1 26
2 28
3 29
4 28

E.g(3 using tuple):-


import pandas as pd
s=pd.Series((25,26,28,29,28))
print(s)
output:-
0 25
1 26
2 28
3 29
4 28

E.g(4 using dictionary):- Keys will become the index and values will become the data.


import pandas as pd
s=pd.Series({"Jan":31,"Feb":28,"March":31})

print(s)
output:-
Jan 31
Feb 28
March 31

linspace():- NumPy's linspace() is used to create an evenly spaced sequence of
numbers in a specified interval.
E.g(5 using linspace):-
import pandas as pd
import numpy as np
s=pd.Series(np.linspace(1,11,5))
print(s)
output:-
0 1.0
1 3.5
2 6.0
3 8.5
4 11.0

tile():- NumPy's tile() repeats a list the given number of times.


E.g(6 using tile):-
import pandas as pd
import numpy as np
s=pd.Series(np.tile([2,3],2))
print(s)
output:-
0 2
1 3
2 2

3 3

E.g(7 Specifying same data for all index):-


import pandas as pd
s=pd.Series(data=5000,index=[0,1,2])
print(s)
output:-
0 5000
1 5000
2 5000
E.g(8 Specifying data and index both):-
import pandas as pd
d=["Ram","Seeta","Laxman"]
i=[1,2,3]
s=pd.Series(data=d,index=i)
print(s)
output:-
1 Ram
2 Seeta
3 Laxman

E.g(9 using a mathematical expression):-


import pandas as pd
import numpy as np
d=np.array([25,26,27,28,29])
s=pd.Series(data=d*2)
print(s)
output:-
0 50
1 52
2 54
3 56
4 58

Series object attributes


1. index:- returns the index of the Series
2. values:- returns the values of the Series
3. dtype:- returns the data type of the data in the Series object
4. shape:- returns the shape of the Series as a tuple
5. ndim:- returns the number of dimensions (always 1 for a Series)
6. size:- returns the number of elements in the Series object
7. hasnans:- returns True if there are any NaN values, else returns False
8. empty:- returns True if the Series object is empty, else False
9. name:- returns or assigns the name given to the Series object
E.g(10 using all attributes):-
import pandas as pd
import numpy as np
d=np.array([25,26,27,28,29])
s=pd.Series(data=d*2)
print("Index",s.index)
s.name="new_series"
print("Series name",s.name)
print("values",s.values)
print("dtype",s.dtype)
print("shape",s.shape)
print("ndim",s.ndim)
print("size",s.size)
print("hasnans",s.hasnans)
print("empty",s.empty)
output (the dtype may show as int64 instead of int32, depending on the platform and NumPy version):-
Index RangeIndex(start=0, stop=5, step=1)
Series name new_series
values [50 52 54 56 58]
dtype int32


shape (5,)
ndim 1
size 5
hasnans False
empty False
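
In the example above hasnans is False because no value is missing. A small sketch with made-up data that contains one NaN shows the other case:

```python
import numpy as np
import pandas as pd

# One value is missing, so the Series is stored with a float data type
s = pd.Series([50, np.nan, 54])

print(s.hasnans)  # True
print(s.dtype)    # float64
print(s.size)     # 3 (size counts the NaN as well)
```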

Accessing Series object and its elements:-


You can access an individual element or a slice of a Series object by its
index.
By default the index starts at 0. [You can also give your own index.]
So, if you want to access the 3rd element, you can write
print(s[2])
To do slicing, specify the starting index and the ending index+1, separated
by : (colon).
If you want the 1st to 5th elements of the series, you can write
print(s[0:5])
Series objects also support backward (negative) indexes in slices, just like a
list; e.g. s[-3:] gives the last three elements.
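
A minimal sketch of element access and slicing, using made-up values and the default index. The use of iloc for a single element counted from the end is an addition here, not something taken from the notes above.

```python
import pandas as pd

s = pd.Series([50, 52, 54, 56, 58, 60])

third = s[2]         # element with index 2, i.e. the 3rd element
first_five = s[0:5]  # positions 0 to 4, i.e. the 1st to 5th elements
last_two = s[-2:]    # negative slice: the last two elements
last = s.iloc[-1]    # iloc always counts by position, so -1 is the last element

print(third)                # 54
print(first_five.tolist())  # [50, 52, 54, 56, 58]
print(last_two.tolist())    # [58, 60]
print(last)                 # 60
```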

Modifying elements of series object:-


You can change value of any element by writing,
Syntax:-Seriesobject[index]=new value
E.g:-s[2]=25
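
As a small sketch with made-up values, the same assignment syntax can be tried on a single element and, as an extra illustration, on a slice:

```python
import pandas as pd

s = pd.Series([50, 52, 54, 56, 58])

s[2] = 25    # change the element with index 2
s[0:2] = 0   # change the first two elements in one step

print(s.tolist())  # [0, 0, 25, 56, 58]
```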

Renaming index of series object:-


You can rename the index by writing,
Syntax:-Seriesobject.index=new index
E.g:-s.index=['a','b','c','d','e']
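
A short sketch of renaming the index, with made-up values. The new index must contain exactly as many labels as the series has elements.

```python
import pandas as pd

s = pd.Series([50, 52, 54, 56, 58])

# Five elements, so five new labels are required
s.index = ["a", "b", "c", "d", "e"]

print(s["c"])            # 54
print(s.index.tolist())  # ['a', 'b', 'c', 'd', 'e']
```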

Methods of series:-


head()
The head() function is used to display the starting elements of a series.
You can specify the number of elements as the argument of head().
If you don't specify the argument, by default it displays the first 5
elements.
Syntax:-Seriesobject.head(n)
E.g:-s.head(3)
Output:-
0 50
1 52
2 54

tail()
The tail() function is used to display the ending elements of a series.
You can specify the number of elements as the argument of tail().
If you don't specify the argument, by default it displays the last 5
elements.
Syntax:-Seriesobject.tail(n)
E.g:-s.tail(3)
Output:-
2 54
3 56
4 58
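
The default of 5 elements is easier to see on a longer series. A minimal sketch with ten made-up values:

```python
import pandas as pd

s = pd.Series(range(10, 20))  # ten values: 10, 11, ..., 19

first = s.head()   # no argument, so the first 5 elements
last = s.tail(2)   # the last 2 elements

print(first.tolist())  # [10, 11, 12, 13, 14]
print(last.tolist())   # [18, 19]
```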

count():-
Returns the number of non-NaN values in the Series.
Syntax:-Seriesobject.count()
E.g:-s.count()
Output:-
5
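
The difference between count() and size only shows up when some values are missing. A small sketch with made-up data:

```python
import numpy as np
import pandas as pd

s = pd.Series([50, np.nan, 54, np.nan, 58])

print(s.count())  # 3 (NaN values are not counted)
print(s.size)     # 5 (all elements, including NaN)
```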

Arithmetic operation on series:-


You can perform arithmetic operations such as +, -, *, / on Series objects.
E.g:- Create two Series objects c11 and c12 which store the number of students
in IP and PE of that class. Find the total number of IP and PE students.
import pandas as pd
c11=pd.Series(data=[25,26,27],index=["Sci","Com","Arts"])
c12=pd.Series(data=[25,26,27],index=["Sci","Com","Arts"])
tot=c11+c12
print(tot)

Output:-
Sci 50
Com 52
Arts 54
Before performing an arithmetic operation, make sure both series have the
same indexes; for any index label that is not present in both series, the
result is NaN (Not a Number).
E.g:-
import pandas as pd
c11=pd.Series(data=[25,26,27],index=["Sci","Com","Arts"])
c12=pd.Series(data=[25,26,27],index=["Humanities","Com","Arts"])
tot=c11+c12
print(tot)

Output:-
Arts 54.0
Com 52.0
Humanities NaN
Sci NaN
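
If you would rather treat a missing label as 0 than get NaN, one option is the add() method with its fill_value parameter. This is a sketch of an alternative, not part of the example above:

```python
import pandas as pd

c11 = pd.Series(data=[25, 26, 27], index=["Sci", "Com", "Arts"])
c12 = pd.Series(data=[25, 26, 27], index=["Humanities", "Com", "Arts"])

# A label missing from one series is treated as 0 before adding
tot = c11.add(c12, fill_value=0)

print(tot["Sci"])         # 25.0
print(tot["Humanities"])  # 25.0
print(tot["Com"])         # 52.0
```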

Filtering elements from series:-


You can filter the elements of a series by using relational operators
(<,>,<=,>=,!=,==).
If you write series operator value (e.g:- s>5), the result is a series of
True or False values, one for each element.
Syntax:-
series operator value
e.g:-
s>5
If you want the values themselves, you need to write series[series operator
value] (e.g. s[s>5]); this returns only the elements for which the condition
is True.
Syntax:-
series[series operator value]
e.g:-
s[s>5]
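
A minimal sketch of both forms, using made-up values. The names mask and big are chosen only for illustration.

```python
import pandas as pd

s = pd.Series([3, 8, 5, 12, 1])

mask = s > 5   # series of True/False values
big = s[mask]  # only the elements where the condition is True

print(mask.tolist())       # [False, True, False, True, False]
print(big.tolist())        # [8, 12]
print(big.index.tolist())  # [1, 3] (the original index labels are kept)
```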

Sorting Series values:-


You can sort the elements of a Series object on the basis of its values.
By default the values are sorted in ascending order.
Syntax:-
Series.sort_values([ascending=True|False])
E.g:-
s.sort_values()
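
A short sketch with made-up values shows both sort orders. Note that sort_values() returns a new series and leaves the original unchanged.

```python
import pandas as pd

s = pd.Series([28, 25, 29, 26], index=["a", "b", "c", "d"])

asc = s.sort_values()                  # ascending order (default)
desc = s.sort_values(ascending=False)  # descending order

print(asc.tolist())        # [25, 26, 28, 29]
print(asc.index.tolist())  # ['b', 'd', 'a', 'c'] (labels move with their values)
print(desc.tolist())       # [29, 28, 26, 25]
print(s.tolist())          # [28, 25, 29, 26] (original is unchanged)
```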

Sorting Series index:-


You can sort the elements of a Series object on the basis of its indexes.
Syntax:-
Series.sort_index([ascending=True|False])
E.g:-
s.sort_index()
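
A short sketch with a made-up series whose index labels are out of order:

```python
import pandas as pd

s = pd.Series([28, 25, 29, 26], index=["d", "b", "a", "c"])

by_index = s.sort_index()                      # index in ascending order
by_index_desc = s.sort_index(ascending=False)  # index in descending order

print(by_index.index.tolist())       # ['a', 'b', 'c', 'd']
print(by_index.tolist())             # [29, 25, 26, 28]
print(by_index_desc.index.tolist())  # ['d', 'c', 'b', 'a']
```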
