Session-24 - Jupyter Notebook


3/12/24, 10:13 PM Session-24 - Jupyter Notebook

In [1]: import pandas as pd

In [2]: df = pd.read_csv("Iris.csv")
        df

Out[2]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 1 5.1 3.5 1.4 0.2 Iris-setosa

1 2 4.9 3.0 1.4 0.2 Iris-setosa

2 3 4.7 3.2 1.3 0.2 Iris-setosa

3 4 4.6 3.1 1.5 0.2 Iris-setosa

4 5 5.0 3.6 1.4 0.2 Iris-setosa

... ... ... ... ... ... ...

145 146 6.7 3.0 5.2 2.3 Iris-virginica

146 147 6.3 2.5 5.0 1.9 Iris-virginica

147 148 6.5 3.0 5.2 2.0 Iris-virginica

148 149 6.2 3.4 5.4 2.3 Iris-virginica

149 150 5.9 3.0 5.1 1.8 Iris-virginica

150 rows × 6 columns

In [3]: # Show the first five rows
        df.head()

Out[3]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 1 5.1 3.5 1.4 0.2 Iris-setosa

1 2 4.9 3.0 1.4 0.2 Iris-setosa

2 3 4.7 3.2 1.3 0.2 Iris-setosa

3 4 4.6 3.1 1.5 0.2 Iris-setosa

4 5 5.0 3.6 1.4 0.2 Iris-setosa

In [4]: # Show the last five rows
        df.tail()

Out[4]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

145 146 6.7 3.0 5.2 2.3 Iris-virginica

146 147 6.3 2.5 5.0 1.9 Iris-virginica

147 148 6.5 3.0 5.2 2.0 Iris-virginica

148 149 6.2 3.4 5.4 2.3 Iris-virginica

149 150 5.9 3.0 5.1 1.8 Iris-virginica

localhost:8888/notebooks/Desktop/Techpaathsala/Session-24.ipynb 1/13

In [5]: df.head(1)

Out[5]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 1 5.1 3.5 1.4 0.2 Iris-setosa

In [6]: df.head(2)

Out[6]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 1 5.1 3.5 1.4 0.2 Iris-setosa

1 2 4.9 3.0 1.4 0.2 Iris-setosa

In [7]: df.tail(1)

Out[7]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

149 150 5.9 3.0 5.1 1.8 Iris-virginica

In [8]: df.tail(2)

Out[8]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

148 149 6.2 3.4 5.4 2.3 Iris-virginica

149 150 5.9 3.0 5.1 1.8 Iris-virginica

In [9]: df.tail(8)

Out[9]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

142 143 5.8 2.7 5.1 1.9 Iris-virginica

143 144 6.8 3.2 5.9 2.3 Iris-virginica

144 145 6.7 3.3 5.7 2.5 Iris-virginica

145 146 6.7 3.0 5.2 2.3 Iris-virginica

146 147 6.3 2.5 5.0 1.9 Iris-virginica

147 148 6.5 3.0 5.2 2.0 Iris-virginica

148 149 6.2 3.4 5.4 2.3 Iris-virginica

149 150 5.9 3.0 5.1 1.8 Iris-virginica

In [10]: # Access the column labels
         df.columns

Out[10]: Index(['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',
                'Species'],
               dtype='object')


In [11]: df = pd.read_csv("Iris.csv")
         df

Out[11]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 1 5.1 3.5 1.4 0.2 Iris-setosa

1 2 4.9 3.0 1.4 0.2 Iris-setosa

2 3 4.7 3.2 1.3 0.2 Iris-setosa

3 4 4.6 3.1 1.5 0.2 Iris-setosa

4 5 5.0 3.6 1.4 0.2 Iris-setosa

... ... ... ... ... ... ...

145 146 6.7 3.0 5.2 2.3 Iris-virginica

146 147 6.3 2.5 5.0 1.9 Iris-virginica

147 148 6.5 3.0 5.2 2.0 Iris-virginica

148 149 6.2 3.4 5.4 2.3 Iris-virginica

149 150 5.9 3.0 5.1 1.8 Iris-virginica

150 rows × 6 columns

In [12]: # Rename every column by assigning a new list of labels
         df.columns = ['A', 'B', 'C', 'D', 'E', 'F']
         df

Out[12]:
A B C D E F

0 1 5.1 3.5 1.4 0.2 Iris-setosa

1 2 4.9 3.0 1.4 0.2 Iris-setosa

2 3 4.7 3.2 1.3 0.2 Iris-setosa

3 4 4.6 3.1 1.5 0.2 Iris-setosa

4 5 5.0 3.6 1.4 0.2 Iris-setosa

... ... ... ... ... ... ...

145 146 6.7 3.0 5.2 2.3 Iris-virginica

146 147 6.3 2.5 5.0 1.9 Iris-virginica

147 148 6.5 3.0 5.2 2.0 Iris-virginica

148 149 6.2 3.4 5.4 2.3 Iris-virginica

149 150 5.9 3.0 5.1 1.8 Iris-virginica

150 rows × 6 columns
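
Assigning to df.columns replaces every label at once, so the list must have one entry per column. When only some labels need to change, DataFrame.rename is a possible alternative. The sketch below is illustrative only: df_small and the new label sepal_length are hypothetical stand-ins, not part of the session's Iris.csv.

```python
import pandas as pd

# Small stand-in frame (hypothetical data, not Iris.csv)
df_small = pd.DataFrame({"Id": [1, 2], "SepalLengthCm": [5.1, 4.9]})

# rename() returns a new DataFrame and changes only the labels listed in the mapping
renamed = df_small.rename(columns={"SepalLengthCm": "sepal_length"})

print(list(renamed.columns))   # labels after renaming
print(list(df_small.columns))  # the original frame keeps its labels
```

Because rename returns a new frame, the original labels stay available without reloading the file.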


In [14]: # Transpose: rows become columns and columns become rows
         df.T

Out[14]:
0 1 2 3 4 5 6 7 8 9 ... 14

A 1 2 3 4 5 6 7 8 9 10 ... 14

B 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 ... 6

C 3.5 3.0 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ... 3

D 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ... 5

E 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ... 2

Iris- Iris- Iris- Iris- Iris- Iris- Iris- Iris- Iris- Iris- Iri
F ...
setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa virginic

6 rows × 150 columns

In [15]: # Reload the data to restore the original column names
         df = pd.read_csv("Iris.csv")
         df

Out[15]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 1 5.1 3.5 1.4 0.2 Iris-setosa

1 2 4.9 3.0 1.4 0.2 Iris-setosa

2 3 4.7 3.2 1.3 0.2 Iris-setosa

3 4 4.6 3.1 1.5 0.2 Iris-setosa

4 5 5.0 3.6 1.4 0.2 Iris-setosa

... ... ... ... ... ... ...

145 146 6.7 3.0 5.2 2.3 Iris-virginica

146 147 6.3 2.5 5.0 1.9 Iris-virginica

147 148 6.5 3.0 5.2 2.0 Iris-virginica

148 149 6.2 3.4 5.4 2.3 Iris-virginica

149 150 5.9 3.0 5.1 1.8 Iris-virginica

150 rows × 6 columns

In [16]: df.axes  # both row labels and column labels

Out[16]: [RangeIndex(start=0, stop=150, step=1),
          Index(['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',
                 'Species'],
                dtype='object')]


In [17]: # Row labels only
         df.index

Out[17]: RangeIndex(start=0, stop=150, step=1)

df.columns >> column labels
df.index   >> row labels
df.axes    >> both row labels and column labels
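
The relationship between these three attributes can be checked directly. This is a minimal sketch on a small made-up frame (df_small is hypothetical, used here so the example runs without Iris.csv).

```python
import pandas as pd

# Hypothetical stand-in frame
df_small = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

# axes is a two-item list: [row labels, column labels]
row_labels, column_labels = df_small.axes

print(row_labels.equals(df_small.index))        # True
print(column_labels.equals(df_small.columns))   # True
# shape is just the lengths of the two axes
print(df_small.shape == (len(row_labels), len(column_labels)))  # True
```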

In [18]: # Most frequently used function: dtypes, non-null counts and memory usage
         df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Id 150 non-null int64
1 SepalLengthCm 150 non-null float64
2 SepalWidthCm 150 non-null float64
3 PetalLengthCm 150 non-null float64
4 PetalWidthCm 150 non-null float64
5 Species 150 non-null object
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB

In [19]: # Summary statistics for the numeric columns
         df.describe()

Out[19]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm

count 150.000000 150.000000 150.000000 150.000000 150.000000

mean 75.500000 5.843333 3.054000 3.758667 1.198667

std 43.445368 0.828066 0.433594 1.764420 0.763161

min 1.000000 4.300000 2.000000 1.000000 0.100000

25% 38.250000 5.100000 2.800000 1.600000 0.300000

50% 75.500000 5.800000 3.000000 4.350000 1.300000

75% 112.750000 6.400000 3.300000 5.100000 1.800000

max 150.000000 7.900000 4.400000 6.900000 2.500000
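
The describe() output above covers only the five numeric columns; Species is left out by default. A minimal sketch of how non-numeric columns can be included, using a small hypothetical frame (df_small and its values are made up for illustration):

```python
import pandas as pd

# Hypothetical stand-in frame with one numeric and one text column
df_small = pd.DataFrame({
    "length": [5.1, 4.9, 6.3],
    "species": ["setosa", "setosa", "virginica"],
})

numeric_only = df_small.describe()              # default: numeric columns only
everything = df_small.describe(include="all")   # adds unique/top/freq for text columns

print(list(numeric_only.columns))
print(everything.loc["unique", "species"], everything.loc["top", "species"])
```

With include="all", statistics that do not apply to a column (for example mean for a text column) are shown as NaN.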

In [20]: # shape returns (number of rows, number of columns)
         df.shape

Out[20]: (150, 6)

In [21]: df.shape[0]  # number of rows

Out[21]: 150


In [22]: df.shape[1]  # number of columns

Out[22]: 6

DataFrame:-
In [23]: # 1. Create a DataFrame
         import pandas as pd
         import numpy as np

In [24]: df = pd.DataFrame()  # an empty DataFrame
         df

Out[24]:

A DataFrame can be created from any of the following:
1. List
2. Array
3. Dict
4. CSV
5. Excel
6. Database
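
The sources in this list can be sketched side by side. The example below is a self-contained illustration under stated assumptions: the data, the column names x and y, and the SQLite table name points are all hypothetical, the CSV is read from an in-memory string instead of a file, and the Excel case is left out because pd.read_excel needs an extra engine package (for example openpyxl) and a real workbook.

```python
import io
import sqlite3

import numpy as np
import pandas as pd

# 1-3. List, array and dict
from_list = pd.DataFrame([[1, 2], [3, 4]], columns=["x", "y"])
from_array = pd.DataFrame(np.array([[1, 2], [3, 4]]), columns=["x", "y"])
from_dict = pd.DataFrame({"x": [1, 3], "y": [2, 4]})

# 4. CSV: read_csv accepts a file path or any file-like object
csv_text = "x,y\n1,2\n3,4\n"
from_csv = pd.read_csv(io.StringIO(csv_text))

# 6. Database: an in-memory SQLite table (hypothetical table name "points")
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (x INTEGER, y INTEGER)")
conn.executemany("INSERT INTO points VALUES (?, ?)", [(1, 2), (3, 4)])
from_db = pd.read_sql_query("SELECT x, y FROM points", conn)
conn.close()

# Every route yields the same 2 x 2 table
for frame in (from_list, from_array, from_dict, from_csv, from_db):
    print(frame.values.tolist())
```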

In [27]: # 1. From a list (the list becomes a single row)
         List1 = [2, 3, 4, 7, 6]
         df = pd.DataFrame([List1])
         df

Out[27]:
0 1 2 3 4

0 2 3 4 7 6

In [26]: list1 = [2, 3, 4, 5, 6]
         list2 = [100, 200, 300, 400, 500]
         df = pd.DataFrame([list1, list2])
         df

Out[26]:
0 1 2 3 4

0 2 3 4 5 6

1 100 200 300 400 500

In [28]: # 2. From a NumPy array
         arr1 = np.array([1, 2, 3, 4, 5])
         arr1

Out[28]: array([1, 2, 3, 4, 5])


In [29]: df = pd.DataFrame(arr1)
         df

Out[29]:
0

0 1

1 2

2 3

3 4

4 5

In [30]: # 3. From a dictionary (each key becomes a column)
         dict1 = {'Randint': np.random.randint(10, 20, size=10),
                  'Arange': np.arange(11, 31, 2)}
         dict1

Out[30]: {'Randint': array([17, 18, 19, 16, 10, 10, 12, 16, 14, 19]),
'Arange': array([11, 13, 15, 17, 19, 21, 23, 25, 27, 29])}

In [31]: df = pd.DataFrame(dict1)
         df

Out[31]:
Randint Arange

0 17 11

1 18 13

2 19 15

3 16 17

4 10 19

5 10 21

6 12 23

7 16 25

8 14 27

9 19 29

In [ ]: #    first    last
        # 0  krishna  rai
        # 1  Shubham  Singh
        # 2  rahul    raj

In [32]: data = {'First_name': ['Yash', 'Kimaya', 'Krishna'],
                 'Last_name': ['Patil', 'Churi', 'Rai']}
         data

Out[32]: {'First_name': ['Yash', 'Kimaya', 'Krishna'],
          'Last_name': ['Patil', 'Churi', 'Rai']}


In [33]: df = pd.DataFrame(data)
         df

Out[33]:
First_name Last_name

0 Yash Patil

1 Kimaya Churi

2 Krishna Rai

In [34]: l1 = ["krishna", "shubham", "rahul"]
         l2 = ["rai", "singh", "raj"]
         data = {"FIRST NAME": l1,
                 "LAST NAME": l2}
         df = pd.DataFrame(data)
         df

Out[34]:
FIRST NAME LAST NAME

0 krishna rai

1 shubham singh

2 rahul raj

In [35]: # Passing a list of lists makes each list a row, not a column
         First = ['Krishna', 'shubham', 'rahul']
         Last = ['Rai', 'Singh', 'Raj']
         df = pd.DataFrame([First, Last])
         df

Out[35]:
0 1 2

0 Krishna shubham rahul

1 Rai Singh Raj

In [36]: df = pd.DataFrame([[1, 2], [3, 4]])
         df

Out[36]:
0 1

0 1 2

1 3 4

In [37]: df = pd.DataFrame([[1, 2], [3, 4]], columns=['X', 'Y'], index=['P', 'Q'])
         df

Out[37]:
X Y

P 1 2

Q 3 4


In [38]: df = pd.DataFrame([[1, 2], [3, 4]], columns=['col1', 'col2'], index=['row1', 'row2'])
         df

Out[38]:
col1 col2

row1 1 2

row2 3 4

In [39]: df['col3'] = [100, 200]  # add a column to the DataFrame
         df

Out[39]:
col1 col2 col3

row1 1 2 100

row2 3 4 200

In [40]: # Add rows to a DataFrame
         df1 = pd.DataFrame([[1, 2], [3, 4]], columns=['col1', 'col2'], index=['row1', 'row2'])
         df1

Out[40]:
col1 col2

row1 1 2

row2 3 4

In [43]: df2 = pd.DataFrame([[100, 200], [300, 400]], columns=['col1', 'col2'], index=['row3', 'row4'])
         df2

Out[43]:
col1 col2

row3 100 200

row4 300 400

In [53]: df = pd.concat([df1, df2], axis=0)  # axis=0 stacks the rows of df2 under df1
         df

Out[53]:
col1 col2

row1 1 2

row2 3 4

row3 100 200

row4 300 400


In [54]: df = pd.concat([df1, df2], axis=1)  # axis=1 joins side by side; unmatched row labels give NaN
         df

Out[54]:
col1 col2 col1 col2

row1 1.0 2.0 NaN NaN

row2 3.0 4.0 NaN NaN

row3 NaN NaN 100.0 200.0

row4 NaN NaN 300.0 400.0

In [56]: df2 = pd.DataFrame([[100, 200], [300, 400]], columns=['col1', 'col2'], index=['row1', 'row2'])
         display(df2)
         display(df1)

col1 col2

row1 100 200

row2 300 400

col1 col2

row1 1 2

row2 3 4

In [57]: df = pd.concat([df1, df2], axis=1)
         df

Out[57]:
col1 col2 col1 col2

row1 1 2 100 200

row2 3 4 300 400
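
pd.concat aligns on labels: axis=0 stacks rows, while axis=1 places frames side by side and matches them by row label, filling NaN wherever a label is missing from one frame. The sketch below reproduces both situations from this session in one place. The name df2_aligned is introduced here only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]], columns=["col1", "col2"], index=["row1", "row2"])
df2 = pd.DataFrame([[100, 200], [300, 400]], columns=["col1", "col2"], index=["row3", "row4"])

stacked = pd.concat([df1, df2], axis=0)       # 4 rows, 2 columns
side_by_side = pd.concat([df1, df2], axis=1)  # no shared row labels, so gaps become NaN

# Give the second frame the same row labels as df1 and the gaps disappear
df2_aligned = df2.copy()
df2_aligned.index = ["row1", "row2"]
aligned = pd.concat([df1, df2_aligned], axis=1)

print(stacked.shape, side_by_side.shape, aligned.shape)
print(int(side_by_side.isna().sum().sum()))  # number of NaN cells
```

The NaN cells also explain why the integer columns turn into floats in the misaligned result: NaN is a floating-point value.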

In [65]: # df1
         df2

Out[65]:
col1 col2

row1 100 200

row2 300 400


In [67]: # df1=pd.DataFrame([[10,20],[100,200]])
         # df2=pd.DataFrame([[100,200],[23,56]])
         # df1.append(df2)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_22608\400476097.py in ?()
      1 df1=pd.DataFrame([[10,20],[100,200]])
      2 df2=pd.DataFrame([[100,200],[23,56]])
----> 3 df1.append(df2)

~\anaconda3\Lib\site-packages\pandas\core\generic.py in ?(self, name)
   5985             and name not in self._accessors
   5986             and self._info_axis._can_hold_identifiers_and_holds_name(name)
   5987         ):
   5988             return self[name]
-> 5989         return object.__getattribute__(self, name)

AttributeError: 'DataFrame' object has no attribute 'append'
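
The AttributeError above is expected on recent pandas: DataFrame.append was deprecated and then removed in pandas 2.0. A minimal sketch of the same operation with pd.concat, reusing the two frames from the failing cell (the names combined and renumbered are introduced here for illustration):

```python
import pandas as pd

df1 = pd.DataFrame([[10, 20], [100, 200]])
df2 = pd.DataFrame([[100, 200], [23, 56]])

# Replacement for the removed df1.append(df2)
combined = pd.concat([df1, df2])

# ignore_index=True renumbers the rows instead of keeping each frame's own labels
renumbered = pd.concat([df1, df2], ignore_index=True)

print(list(combined.index))    # [0, 1, 0, 1]
print(list(renumbered.index))  # [0, 1, 2, 3]
```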

In [68]: df = pd.read_csv('Iris.csv')
         df

Out[68]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 1 5.1 3.5 1.4 0.2 Iris-setosa

1 2 4.9 3.0 1.4 0.2 Iris-setosa

2 3 4.7 3.2 1.3 0.2 Iris-setosa

3 4 4.6 3.1 1.5 0.2 Iris-setosa

4 5 5.0 3.6 1.4 0.2 Iris-setosa

... ... ... ... ... ... ...

145 146 6.7 3.0 5.2 2.3 Iris-virginica

146 147 6.3 2.5 5.0 1.9 Iris-virginica

147 148 6.5 3.0 5.2 2.0 Iris-virginica

148 149 6.2 3.4 5.4 2.3 Iris-virginica

149 150 5.9 3.0 5.1 1.8 Iris-virginica

150 rows × 6 columns

Access rows and columns


In [69]: # 1. Slicing
         # Access rows by position (the end of the slice is excluded)
         df[1:10]

Out[69]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

1 2 4.9 3.0 1.4 0.2 Iris-setosa

2 3 4.7 3.2 1.3 0.2 Iris-setosa

3 4 4.6 3.1 1.5 0.2 Iris-setosa

4 5 5.0 3.6 1.4 0.2 Iris-setosa

5 6 5.4 3.9 1.7 0.4 Iris-setosa

6 7 4.6 3.4 1.4 0.3 Iris-setosa

7 8 5.0 3.4 1.5 0.2 Iris-setosa

8 9 4.4 2.9 1.4 0.2 Iris-setosa

9 10 4.9 3.1 1.5 0.1 Iris-setosa

In [91]: # Access columns
         # To select several columns, pass the column names as a list (note the double brackets)
         df[['Id', 'SepalLengthCm', 'Species']]

Out[91]:
Id SepalLengthCm Species

0 1 5.1 Iris-setosa

1 2 4.9 Iris-setosa

2 3 4.7 Iris-setosa

3 4 4.6 Iris-setosa

4 5 5.0 Iris-setosa

... ... ... ...

145 146 6.7 Iris-virginica

146 147 6.3 Iris-virginica

147 148 6.5 Iris-virginica

148 149 6.2 Iris-virginica

149 150 5.9 Iris-virginica

150 rows × 3 columns
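
The bracket forms used in this section behave differently depending on what is passed in. The sketch below illustrates this on a small hypothetical frame (df_small is a four-row stand-in, not the full Iris data).

```python
import pandas as pd

# Hypothetical four-row stand-in for the Iris data
df_small = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "SepalLengthCm": [5.1, 4.9, 4.7, 4.6],
    "Species": ["Iris-setosa"] * 4,
})

rows = df_small[1:3]                    # slice: rows at positions 1 and 2
one_column = df_small["Id"]             # single label: a Series
one_column_frame = df_small[["Id"]]     # list with one label: a DataFrame
two_columns = df_small[["Id", "Species"]]

print(list(rows.index))
print(type(one_column).__name__, type(one_column_frame).__name__)
print(list(two_columns.columns))
```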

1. loc
2. iloc
3. groupby
4. get_group
5. join
6. merge
7. replace
8. apply
9. null value
10. unique value
11. columns_reset
12. inplace
13. insert

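In [ ]: df.isna().sum()  # count missing values in each column
        # Fill missing values with a statistic such as the mean, median or mode
        df.fillna(df.mean(numeric_only=True))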
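
A minimal sketch of the missing-value workflow hinted at in this cell, on a small hypothetical frame (df_small, its column names and its values are made up for illustration). It shows one way to fill gaps with the mean, the median or the mode.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value per column
df_small = pd.DataFrame({
    "width": [1.0, np.nan, 3.0, 3.0],
    "species": ["a", "b", None, "b"],
})

missing_per_column = df_small.isna().sum()

filled_mean = df_small["width"].fillna(df_small["width"].mean())
filled_median = df_small["width"].fillna(df_small["width"].median())
# mode() returns a Series because ties are possible, so take its first entry
filled_mode = df_small["species"].fillna(df_small["species"].mode()[0])

print(missing_per_column.to_dict())
print(filled_median.tolist())
print(filled_mode.tolist())
```

The mean and median only make sense for numeric columns; the mode also works for text columns such as Species.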
