Practical File Python

The document provides code for several practical examples of working with Pandas dataframes using CSV data. It shows how to import a CSV file, view the column names and data types. It demonstrates extracting specific rows and columns using loc() and iloc() methods. Random samples of rows are selected using sample() to view a subset of the data. The document contains code for common operations on CSV data after importing into a Pandas dataframe.

Uploaded by

kaizenpro01

Practical File Don’t Copy this page start from Practical 1 on next page

Practical 1: Create a Series from a List

Date:
Aim : To create a Series in Python Pandas from a List and use predefined Statistical functions
Source Code:

import pandas as pd
lst = []
n = int(input("Enter number of elements : "))
for i in range(0, n):
    t = int(input("Enter "+str(i+1)+" Number "))
    lst.append(t) # adding the element
my_series = pd.Series(lst)
print("Sum of all the elements =",my_series.sum())
print("Largest Value =", my_series.max())
print("Smallest Value =",my_series.min())
print("Mean Value =",my_series.mean())
print("Median =",my_series.median())
print("Standard Deviation =",my_series.std())
print("Describe Series =",my_series.describe())
Output:
Enter number of elements : 5
Enter 1 Number 1
Enter 2 Number 2
Enter 3 Number 3
Enter 4 Number 4
Enter 5 Number 5
Sum of all the elements = 15
Largest Value = 5
Smallest Value = 1
Mean Value = 3.0
Median = 3.0
Standard Deviation = 1.5811388300841898
Describe Series = count 5.000000
mean 3.000000
std 1.581139
min 1.000000
25% 2.000000
50% 3.000000
75% 4.000000
max 5.000000
dtype: float64
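
Note : The Standard Deviation printed above (1.5811...) is the sample standard deviation, because Series.std() divides by n - 1 by default. The short sketch below uses the same five values to compare it with the population standard deviation (ddof=0). The variable names are only for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

sample_std = s.std()            # pandas default is ddof=1 (divides by n - 1)
population_std = s.std(ddof=0)  # divides by n instead

print("Sample std     =", sample_std)
print("Population std =", population_std)
```

The first value matches the output of the practical; the second is smaller because the same sum of squared deviations is divided by 5 instead of 4.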

Practical 2: Create a DataFrame from List

Date:

Aim : To create DataFrame in Python Pandas using a List

Source Code:
import pandas as pd
import numpy as np
lst = []
n = int(input("Enter number of elements : "))
for i in range(0, n):
    t = int(input("Enter "+str(i+1)+" Number "))
    lst.append(t) # adding the element

df = pd.DataFrame(lst,columns=['Values'])
display(df)

Output:
Enter number of elements : 5
Enter 1 Number 10
Enter 2 Number 11
Enter 3 Number 12
Enter 4 Number 13
Enter 5 Number 14

Values
0 10
1 11
2 12
3 13
4 14

Practical 3: Create a DataFrame and Extract Rows and Columns

Date:
Aim : To create DataFrame in Python Pandas and extract Rows and Columns

Source Code:
import pandas as pd
import numpy as np
data = [['tom', 10], ['nick', 15], ['juli', 14],['Suzan',28],['Sam',30],['tom',15]]
df = pd.DataFrame(data, columns = ['Name', 'Age'])
print("First three Elements :\n",df.head(3)) # head() and tail() show 5 rows by default
print("\nExtract Name Column\n",df.Name) # can also use df['Name'] or df.iloc[:,0]
print("\nExtract Age Column\n" ,df.Age) # or df['Age'] or df.iloc[:,1]; for both columns use df.iloc[:,0:2]

# Extracting rows using iloc[], which takes zero-based positions as [rows, cols]

print("\nExtracting Third Row\n",(df.iloc[3])) # position 3 is the row with index label 3 (the fourth row)
print("\nExtracting Third to fifth Row and first and second column\n",df.iloc[3:6,0:2]) # positions 3 to 5; the stop value 6 is excluded
print("\nCount Occurrences of values in Name column\n",df.Name.value_counts())

Output:
First three Elements :
Name Age
0 tom 10
1 nick 15
2 juli 14

Extract Name Column

0 tom
1 nick
2 juli
3 Suzan
4 Sam
5 tom
Name: Name, dtype: object

Extract Age Column

0 10
1 15
2 14
3 28
4 30
5 15
Name: Age, dtype: int64

Extracting Third Row

Name Suzan
Age 28
Name: 3, dtype: object

Extracting Third to fifth Row and first and second column

Name Age
3 Suzan 28
4 Sam 30
5 tom 15

Practical 4: Create an Indexed DataFrame from Dictionary

Date:
Aim : To create an Indexed DataFrame using a dictionary in Python Pandas and extract a row based on user input

Source Code:
import pandas as pd
import numpy as np
data = { 'Name':['Tom','Alex','Suzain','Rayan','Steve'],
'Age':[28,34,29,28,25],
'English':[87,67,54,89,73],
'Hindi':[54,65,34,65,76],
'Maths':[65,54,67,54,75],
'IP': [90,84,94,75,43]}
df = pd.DataFrame(data,index=[1,2,3,4,5])
display(df)
i=int(input("Enter rollno to see the marks : "))
print(df.iloc[i-1:i])

Output:

Name Age English Hindi Maths IP
1 Tom 28 87 54 65 90
2 Alex 34 67 65 54 84
3 Suzain 29 54 34 67 94
4 Rayan 28 89 65 54 75
5 Steve 25 73 76 75 43

Enter rollno to see the marks: 3

Name Age English Hindi Maths IP
3 Suzain 29 54 34 67 94
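
Note : The practical finds the row by position with iloc[i-1:i], which works only because the roll numbers run 1, 2, 3, ... in order. A minimal sketch of the label-based alternative, loc[], is given below. It uses a shortened copy of the data, and the roll number is fixed instead of being read with input(); both are assumptions made to keep the example small.

```python
import pandas as pd

data = {'Name': ['Tom', 'Alex', 'Suzain'],
        'IP': [90, 84, 94]}
df = pd.DataFrame(data, index=[1, 2, 3])   # index labels act as roll numbers

roll = 3                               # fixed here; the practical reads it with input()
by_label = df.loc[[roll]]              # label-based; double brackets keep a DataFrame
by_position = df.iloc[roll-1:roll]     # position-based, as in the practical

print(by_label)
print("Same result :", by_label.equals(by_position))
```

With loc[] the lookup still works if the roll numbers are not consecutive, because it matches the index label and not the row position.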

Practical 5: Creating DataFrames from list of Dictionaries

Date:
Aim : To create two DataFrames from a list of dictionaries in Python Pandas based on subjects taken

Source Code

#row indices, and column indices.

import pandas as pd
import numpy as np
data = [{'English':90, 'Hindi': 95,'IP':99},
{'English': 50, 'Hindi': 40, 'Maths': 92},
{'English': 55, 'Hindi': 70, 'PEd': 70}]

#With the two common columns, selected by their dictionary keys

df1 = pd.DataFrame(data, index=['Suzan', 'Sam','Juli'], columns=['English', 'Hindi'])

#With the columns that differ between students (missing values become NaN)

df2 = pd.DataFrame(data, index=['Suzan', 'Sam','Juli'], columns=['IP','Maths','PEd'])
display('DataFrame of Common Subjects',df1)
display('DataFrame of Different Subjects',df2)

Output:

'DataFrame of Common Subjects'

English Hindi
Suzan 90 95
Sam 50 40
Juli 55 70

'DataFrame of Different Subjects'

IP Maths PEd
Suzan 99.0 NaN NaN
Sam NaN 92.0 NaN
Juli NaN NaN 70.0

Practical 6: Adding and Removing columns from DataFrame

Date:
Aim : To add and remove columns from a DataFrame

Source Code
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d) # df DataFrame object
print(df)
print ("Adding a new column by passing as Series:")
df['three']=pd.Series([10,20,30],index=['a','b','c'])
print( df )

# using del function

print ("Deleting the first column using DEL function:")
del df['one']
print (df)

# using pop function

print ("Deleting another column using POP function:")
df.pop('two')
print (df)

Output:

one two
a 1.0 1
b 2.0 2
c 3.0 3
d NaN 4
Adding a new column by passing as Series:
one two three
a 1.0 1 10.0
b 2.0 2 20.0
c 3.0 3 30.0
d NaN 4 NaN
Deleting the first column using DEL function:
two three
a 1 10.0
b 2 20.0
c 3 30.0
d 4 NaN
Deleting another column using POP function:
three
a 10.0
b 20.0
c 30.0
d NaN
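
Note : del and pop() both change the DataFrame itself. The sketch below, on a small made-up DataFrame, shows that pop() also returns the removed column, and that drop() is a third option which returns a new DataFrame and leaves the original unchanged.

```python
import pandas as pd

df = pd.DataFrame({'one': [1, 2, 3], 'two': [4, 5, 6], 'three': [7, 8, 9]})

removed = df.pop('two')             # removes 'two' in place and returns it as a Series
trimmed = df.drop(columns=['one'])  # returns a new DataFrame; df itself is unchanged

print("Removed column :", removed.tolist())
print("Columns of df :", list(df.columns))
print("Columns of trimmed :", list(trimmed.columns))
```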

Practical 7: Merging two DataFrames

Date:
Aim : To merge two DataFrames together as a single Dataframe

Source Code
import pandas as pd
import numpy as np
data1 = [[1,'tom', 10],[2,'nick', 15], [3,'juli', 14]]
df1 = pd.DataFrame(data1, columns = ['RollNo','Name', 'Age'])
data2 = [[1,98, 100],[2,98, 15], [3,75, 50]]
df2 = pd.DataFrame(data2, columns = ['RollNo','Eng', 'Hin'])

#merging data into merged dataframe

merged = pd.merge(df1,df2, on='RollNo')
display(merged)
t=input("\nEnter the Subject Code For Example 'Eng' for English \t:")
print(t," column data \n",merged[t])

print("\nSum of ",t," column \n",merged[t].sum())

avg=merged[t].mean() # mean of the column (sum divided by count); this is an average mark, not a percentage
print("\nPercentage of ",t," column \n",avg)

Output:

RollNo Name Age Eng Hin
0 1 tom 10 98 100
1 2 nick 15 98 15
2 3 juli 14 75 50

Enter the Subject Code For Example 'Eng' for English :Eng
Eng column data
0 98
1 98
2 75
Name: Eng, dtype: int64

Sum of Eng column

271

Percentage of Eng column

90.33333333333333
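
Note : In the practical every RollNo appears in both DataFrames, so nothing is lost in the merge. The sketch below uses made-up marks in which RollNo 3 has no marks and RollNo 4 has no name, to show how the default inner merge differs from how='outer'.

```python
import pandas as pd

df1 = pd.DataFrame({'RollNo': [1, 2, 3], 'Name': ['tom', 'nick', 'juli']})
df2 = pd.DataFrame({'RollNo': [1, 2, 4], 'Eng': [98, 98, 60]})  # made-up marks

inner = pd.merge(df1, df2, on='RollNo')               # default how='inner': matching RollNo only
outer = pd.merge(df1, df2, on='RollNo', how='outer')  # keeps every RollNo, fills gaps with NaN

print(inner)
print(outer)
print("Mean of Eng (inner) =", inner['Eng'].mean())
```

The inner merge keeps only RollNo 1 and 2, while the outer merge keeps all four roll numbers and puts NaN where a value is missing.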

Practical 8: Importing CSV file data and some important functions

Date:
Aim : To import CSV file data and work on it
Source Code:
import pandas as pd
import numpy as np
data=pd.read_csv(r"c:\emp.csv") # raw string, so the backslash is not read as an escape character
print("Columns in DataFrame :\n",data.columns)
print("\nRange Index :\n",data.index)
print("\nBoth Data and Indexes :\n", data.axes)
print("\nDataType of columns :\n",data.dtypes)
print("\nTotal no of elements:" ,data.size)
print("\nCount of Rows and Columns:",data.shape)
print("\nPrinting data values :\n",data.values)
print("\nChecking if DataFrame is empty :",data.empty)
print("\nChecking dimensions of DataFrame :",data.ndim,"D")
rows=data.sample(n=10) # random sample of 10 rows
display("Random sample of data:",rows)
rows=data.sample(frac=.25) # random sample of 25% of the rows
display("Random sample of 25% data:",rows)

output:
Columns in DataFrame :
Index(['EMPLOYEE_ID', 'FIRST_NAME', 'LAST_NAME', 'EMAIL', 'PHONE_NUMBER',
'HIRE_DATE', 'JOB_ID', 'SALARY', 'COMMISSION_PCT', 'MANAGER_ID',
'DEPARTMENT_ID'],
dtype='object')
Range Index :
RangeIndex(start=0, stop=107, step=1)

Both Data and Indexes :

[RangeIndex(start=0, stop=107, step=1), Index(['EMPLOYEE_ID', 'FIRST_NAME',
'LAST_NAME', 'EMAIL', 'PHONE_NUMBER',
'HIRE_DATE', 'JOB_ID', 'SALARY', 'COMMISSION_PCT', 'MANAGER_ID',
'DEPARTMENT_ID'],
dtype='object')]

DataType of columns :
EMPLOYEE_ID int64
FIRST_NAME object
LAST_NAME object
...
MANAGER_ID float64
DEPARTMENT_ID float64
dtype: object

Total no of elements: 1177

Count of Rows and Columns: (107, 11)

Printing data values :

[[100 'Steven' 'King' ... nan nan 90.0]
[101 'Neena' 'Kochhar' ... nan 100.0 90.0]
...
[205 'Shelley' 'Higgins' ... nan 101.0 110.0]
[206 'William' 'Gietz' ... nan 205.0 110.0]]

Checking if DataFrame is empty : False

Checking dimensions of DataFrame : 2 D

Practical 9: Importing CSV file Extracting Data
Date:
Aim : To import CSV file data and extract data from it using loc[] and iloc[].

Source code:
import pandas as pd
import numpy as np
data=pd.read_csv(r"c:\emp.csv") # raw string keeps the backslash literal
print(data.axes)
print ("Extracting Columns by Column Names :\n",data[['EMPLOYEE_ID','FIRST_NAME','SALARY']])
print ("\nExtracting Columns by Column Numbers :\n",data[data.columns[1:6]])
# loc[] selects by label and includes both ends of a slice, so 1:3 returns rows 1, 2 and 3
print ("\nExtracting Rows (1-3) :\n",data.loc[1:3])
print ("\nExtracting 3 Rows and Columns by Column Names Using loc() :\n",data.loc[1:3,['FIRST_NAME','SALARY','DEPARTMENT_ID']])
print ("\nExtracting 3 Rows and Columns numbers Using loc() :\n",data.loc[1:3,data.columns[1:4]])
print ("\nExtracting 3 Rows and Columns Range Using loc() :\n",data.loc[1:3, 'FIRST_NAME':'SALARY'])
print ("\nExtracting 3 Rows and Columns Range Using loc() :\n",data.loc[1:3,'JOB_ID':])
# iloc[] selects by position and excludes the stop value, so 1:3 returns only rows 1 and 2
print ("\nExtracting 3 Rows and Columns Range Using iloc() :\n",data.iloc[1:3,0:2])
print ("\nExtracting 3 Rows and Columns Range Using iloc() :\n",data.iloc[1:3,1:5])

output:

Extracting Columns by Column Names :

EMPLOYEE_ID FIRST_NAME SALARY
0 100 Steven 30000
1 101 Neena 17000
.. ... ... ...
105 205 Shelley 12000
106 206 William 8300

[107 rows x 3 columns]

Extracting Columns by Column Numbers :

FIRST_NAME LAST_NAME EMAIL PHONE_NUMBER HIRE_DATE
0 Steven King SKING 515.123.4567 17-JUN-87
1 Neena Kochhar NKOCHHAR 515.123.4568 21-SEP-89
.. ... ... ... ... ...
105 Shelley Higgins SHIGGINS 515.123.8080 07-JUN-94
106 William Gietz WGIETZ 515.123.8181 07-JUN-94

[107 rows x 5 columns]

Extracting Rows (1-3) :
EMPLOYEE_ID FIRST_NAME LAST_NAME EMAIL PHONE_NUMBER HIRE_DATE \
1 101 Neena Kochhar NKOCHHAR 515.123.4568 21-SEP-89
2 102 Lex De Haan LDEHAAN 515.123.4569 13-JAN-93
3 103 Alexander Hunold AHUNOLD 590.423.4567 03-JAN-90

JOB_ID SALARY COMMISSION_PCT MANAGER_ID DEPARTMENT_ID

1 AD_VP 17000 NaN 100.0 90.0
2 AD_VP 17000 NaN 100.0 90.0
3 IT_PROG 9000 NaN 102.0 60.0

Extracting 3 Rows and Columns by Column Names Using loc() :

FIRST_NAME SALARY DEPARTMENT_ID
1 Neena 17000 90.0
2 Lex 17000 90.0
3 Alexander 9000 60.0

Extracting 3 Rows and Columns numbers Using loc() :

FIRST_NAME LAST_NAME EMAIL
1 Neena Kochhar NKOCHHAR
2 Lex De Haan LDEHAAN
3 Alexander Hunold AHUNOLD

Extracting 3 Rows and Columns Range Using loc() :

FIRST_NAME LAST_NAME EMAIL PHONE_NUMBER HIRE_DATE JOB_ID SALARY
1 Neena Kochhar NKOCHHAR 515.123.4568 21-SEP-89 AD_VP 17000
2 Lex De Haan LDEHAAN 515.123.4569 13-JAN-93 AD_VP 17000
3 Alexander Hunold AHUNOLD 590.423.4567 03-JAN-90 IT_PROG 9000

Extracting 3 Rows and Columns Range Using loc() :

JOB_ID SALARY COMMISSION_PCT MANAGER_ID DEPARTMENT_ID
1 AD_VP 17000 NaN 100.0 90.0
2 AD_VP 17000 NaN 100.0 90.0
3 IT_PROG 9000 NaN 102.0 60.0

Extracting 3 Rows and Columns Range Using iloc() :

EMPLOYEE_ID FIRST_NAME
1 101 Neena
2 102 Lex

Extracting 3 Rows and Columns Range Using iloc() :

FIRST_NAME LAST_NAME EMAIL PHONE_NUMBER
1 Neena Kochhar NKOCHHAR 515.123.4568
2 Lex De Haan LDEHAAN 515.123.4569

Practical 10: Importing CSV file Modifying data and Saving to CSV file
Date:
Aim : To modify data from a CSV file and write it back to disk
Source code:
import pandas as pd
import numpy as np
data=pd.read_csv(r"e:\emp.csv") # raw string keeps the backslash literal
print ("\nExtracting 3 Rows and Columns Range Using loc() :\n",data.loc[1:3,'FIRST_NAME':'SALARY'])
#modifying dataframe values
#data.FIRST_NAME[1]='Amit' is chained assignment and gives a warning; loc[row, column] sets the cell directly
data.loc[1,'FIRST_NAME']='Amit'
data.loc[1,'LAST_NAME']='Singh'
data.loc[1,'EMAIL']="s.amit18"
data.loc[1,'SALARY']=20000
data['HIRE_DATE']='27-12-1975' #updates every row of the column
data.loc[1,'PHONE_NUMBER']='955.95.83030'
print ("\nExtracting 3 Rows and Columns Range Using loc() :\n",data.loc[1:3,'FIRST_NAME':'SALARY'])

#replacing the whole row with label 2 (a label not yet in the index would add a new row)
data.loc[2]=[102,'Punita','Singh','P.amit18','201.92.0102','21-10-89','AD_VP',30000,.5,100,20]
print ("\nExtracting 3 Rows and Columns Range Using loc() :\n",data.loc[1:3,'EMPLOYEE_ID':])

# saving changes to csv file

data.to_csv(r"e:\emp.csv",index=False) # index=False leaves the row index out of the file

output:
Extracting 3 Rows and Columns Range Using loc() :
FIRST_NAME LAST_NAME EMAIL PHONE_NUMBER HIRE_DATE JOB_ID SALARY
1 Neena Kochhar NKOCHHAR 515.123.4568 21-SEP-89 AD_VP 17000
2 Lex De Haan LDEHAAN 515.123.4569 13-JAN-93 AD_VP 17000
3 Alexander Hunold AHUNOLD 590.423.4567 03-JAN-90 IT_PROG 9000

Extracting 3 Rows and Columns Range Using loc() :

FIRST_NAME LAST_NAME EMAIL PHONE_NUMBER HIRE_DATE JOB_ID SALARY
1 Amit Singh s.amit18 955.95.83030 27-12-1975 AD_VP 20000
2 Lex De Haan LDEHAAN 515.123.4569 27-12-1975 AD_VP 17000
3 Alexander Hunold AHUNOLD 590.423.4567 27-12-1975 IT_PROG 9000

Extracting 3 Rows and Columns Range Using loc() :

EMPLOYEE_ID FIRST_NAME LAST_NAME EMAIL PHONE_NUMBER HIRE_DATE \
1 101 Amit Singh s.amit18 955.95.83030 27-12-1975
2 102 Punita Singh P.amit18 201.92.0102 21-10-89
3 103 Alexander Hunold AHUNOLD 590.423.4567 27-12-1975
JOB_ID SALARY COMMISSION_PCT MANAGER_ID DEPARTMENT_ID
1 AD_VP 20000 NaN 100.0 90.0
2 AD_VP 30000 0.5 100.0 20.0
3 IT_PROG 9000 NaN 102.0 60.0
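
Note : A minimal sketch of the modify-and-save steps that can be tried without the emp.csv file. It reads a tiny made-up CSV from memory with io.StringIO (an assumption for illustration; the practical reads a file on disk), changes cells with loc[row, column], and writes it back with index=False so that the row index is not saved as an extra column.

```python
import io
import pandas as pd

# Made-up in-memory stand-in for emp.csv
csv_text = "EMPLOYEE_ID,FIRST_NAME,SALARY\n100,Steven,30000\n101,Neena,17000\n"
data = pd.read_csv(io.StringIO(csv_text))

data.loc[1, 'FIRST_NAME'] = 'Amit'   # one cell, addressed by row label and column name
data.loc[1, 'SALARY'] = 20000

buffer = io.StringIO()
data.to_csv(buffer, index=False)     # index=False: do not write the row index
reloaded = pd.read_csv(io.StringIO(buffer.getvalue()))

print(reloaded)
```

Without index=False, the saved file gains an unnamed first column holding 0, 1, 2, ... which shows up as an extra column when the file is read again.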

Practical 11: Iterating over rows and columns using iterrows() and items()

Date:
Aim : To iterate over rows and columns using iterrows() and items() (called iteritems() before pandas 2.0)

Source code:

import pandas as pd
import numpy as np
data1 = [[1,'tom', 10],[2,'nick', 15], [3,'juli', 14]]
df1 = pd.DataFrame(data1, columns = ['RollNo','Name', 'Age'])
# iterrows() yields one (index label, row as Series) pair per row
for label, contents in df1.iterrows():
    print ("\nLabel ", label)
    print ("contents:", contents, sep='\n')

# items() yields one (column name, column as Series) pair per column; iteritems() was removed in pandas 2.0
for label, contents in df1.items():
    print ("\nLabel ", label)
    print ("contents:", contents, sep='\n')

output:
Label 0
contents:
RollNo 1
Name tom
Age 10
Name: 0, dtype: object

Label 1
contents:
RollNo 2
Name nick
Age 15
Name: 1, dtype: object

Label 2
contents:
RollNo 3
Name juli
Age 14
Name: 2, dtype: object
Label RollNo
contents:
0 1
1 2
2 3
Name: RollNo, dtype: int64

Label Name
contents:
0 tom
1 nick
2 juli
Name: Name, dtype: object

Label Age
contents:
0 10
1 15
2 14
Name: Age, dtype: int64
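
Note : iterrows() builds a Series for every row, as the output above shows. The sketch below shows itertuples() as another way to loop over rows, where each row is a namedtuple and values are read by column name. It also collects the column names using items(). The variable names total_age and column_names are only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 'tom', 10], [2, 'nick', 15], [3, 'juli', 14]],
                   columns=['RollNo', 'Name', 'Age'])

total_age = 0
for row in df1.itertuples(index=False):   # each row is a namedtuple
    print(row.RollNo, row.Name, row.Age)
    total_age += row.Age

# items() gives (column name, column Series) pairs
column_names = [label for label, contents in df1.items()]
print("Total age :", total_age)
print("Columns :", column_names)
```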

Practical 12: Descriptive statistics with pandas

Date:
Aim : To work with Statistics on pandas DataFrame

Source code:
import pandas as pd
import numpy as np
diSales={2016:{'Qtr1':34500,'Qtr2':56000,'Qtr3':47000,'Qtr4':49000},
2017:{'Qtr1':44900,'Qtr2':46100,'Qtr3':57000,'Qtr4':59000},
2018:{'Qtr1':54500,'Qtr2':51000,'Qtr3':57000,'Qtr4':58000},
2019:{'Qtr1':61000}}
sal_df=pd.DataFrame(diSales)
print ( "Data Frame :\n",sal_df )
print (" min() : \n",sal_df.min())
print (" max() : \n",sal_df.max())

# default is axis=0 (down each column); axis=1 works across each row

print (" min() axis 1 : \n",sal_df.min(axis=1))
print (" max() axis 1 : \n",sal_df.max(axis=1))

print (" mode() : \n",sal_df.mode(axis=0)) #try all for axis=1

print (" mean() : \n",sal_df.mean(axis=0))
print (" median() : \n",sal_df.median(axis=0))
print (" Count() : \n",sal_df.count(axis=0))
print ("Sum() axis=0: \n",sal_df.sum(axis=0))
print ("Quantile() axis=0: \n",sal_df.quantile(q=[.25,.5,.75,1]))
print ("Var() axis=0: \n",sal_df.var(axis=0))
# applying group functions on single column
print(" Min of 2016 : ",sal_df[2016].min())
print(" Max of 2016 : ",sal_df[2016].max())
print(" Sum of 2016 : ",sal_df[2016].sum())

# applying group functions on multiple columns

print(" Sum of 2016 and 2019 : \n",sal_df[[2016,2019]].sum())
print(" Min of 2016 and 2019: \n",sal_df[[2016,2019]].min())
print(" Max of 2016 and 2019: \n",sal_df[[2016,2019]].max())

# applying functions on Rows

print(" Sum of Qtr1 : \n",sal_df.loc['Qtr1'].sum())
print(" Min of Qtr1: \n",sal_df.loc['Qtr1'].min())
print(" Max Qtr1: \n",sal_df.loc['Qtr1'].max())

# multiple rows
print(" Sum of Qtr1 to Qtr3 : \n",sal_df.loc['Qtr1':'Qtr3'].sum())
print(" Min of Qtr1 to Qtr3: \n",sal_df.loc['Qtr1':'Qtr3'].min())
print(" Max Qtr1 to Qtr3: \n",sal_df.loc['Qtr1':'Qtr3'].max())

#applying functions to subset ( few rows and columns )
#the slice below selects Qtr3 to Qtr4 for 2018 to 2019, although the printed labels say Qtr1to3

print(" Sum of Qtr1to3 2018-19 : \n",sal_df.loc['Qtr3':'Qtr4',2018:2019].sum())
print(" Min of Qtr1to3 2018-19: \n",sal_df.loc['Qtr3':'Qtr4',2018:2019].min())
print(" Max Qtr1to3 2018-19: \n",sal_df.loc['Qtr3':'Qtr4',2018:2019].max())

output:
Data Frame :
2016 2017 2018 2019
Qtr1 34500 44900 54500 61000.0
Qtr2 56000 46100 51000 NaN
Qtr3 47000 57000 57000 NaN
Qtr4 49000 59000 58000 NaN

min() :
2016 34500.0
. . .
2019 61000.0
dtype: float64

max() :
2016 56000.0
. . .
2019 61000.0
dtype: float64

min() axis 1 :
Qtr1 34500.0
. . .
Qtr4 49000.0
dtype: float64

max() axis 1 :
Qtr1 61000.0
. . .
Qtr4 59000.0
dtype: float64

mode() :
2016 2017 2018 2019
0 34500 44900 51000 61000.0
. . .
3 56000 59000 58000 NaN

mean() :
2016 46625.0
. . .
2019 61000.0
dtype: float64

median() :
2016 48000.0
. . .
2019 61000.0
dtype: float64

Count() :
2016 4
. . .
2019 1
dtype: int64

Sum() axis=0:
2016 186500.0
. . .
2019 61000.0
dtype: float64

Quantile() axis=0:
2016 2017 2018 2019
0.25 43875.0 45800.0 53625.0 61000.0
. . .
1.00 56000.0 59000.0 58000.0 61000.0

Var() axis=0:
2016 8.022917e+07
. . .
2019 NaN
dtype: float64

Min of 2016 : 34500

Max of 2016 : 56000
Sum of 2016 : 186500
Sum of 2016 and 2019 :
2016 186500.0
2019 61000.0
dtype: float64

Min of 2016 and 2019:

2016 34500.0
2019 61000.0
dtype: float64

Max of 2016 and 2019:

2016 56000.0
2019 61000.0
dtype: float64

Sum of Qtr1 :
194900.0
Min of Qtr1:
34500.0

Max Qtr1:
61000.0

Sum of Qtr1 to Qtr3 :

2016 137500.0
2017 148000.0
2018 162500.0
2019 61000.0
dtype: float64

Min of Qtr1 to Qtr3:

2016 34500.0
2017 44900.0
2018 51000.0
2019 61000.0
dtype: float64

Max Qtr1 to Qtr3:

2016 56000.0
2017 57000.0
2018 57000.0
2019 61000.0
dtype: float64

Sum of Qtr1to3 2018-19 :

2018 115000.0
2019 0.0
dtype: float64

Min of Qtr1to3 2018-19:

2018 57000.0
2019 NaN
dtype: float64
Max Qtr1to3 2018-19:
2018 58000.0
2019 NaN
dtype: float64
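
Note : In the last three results the 2019 column has only NaN for the selected quarters, yet sum() gives 0.0 while min() and max() give NaN. The sketch below reproduces this on a two-value Series. It also shows the min_count parameter of sum(), which is not used in the practical and is included only to show how to get NaN instead of 0.0.

```python
import numpy as np
import pandas as pd

col = pd.Series([np.nan, np.nan], index=['Qtr3', 'Qtr4'])  # like 2019 for Qtr3 and Qtr4

total = col.sum()                     # NaN values are skipped, and an empty sum is 0.0
smallest = col.min()                  # no value left to report, so NaN
total_strict = col.sum(min_count=1)   # needs at least one non-NaN value, else NaN

print(total, smallest, total_strict)
```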

Practical 13: PIVOTING

Date:
Aim : To do Pivoting on a pandas DataFrame using pivot() and pivot_table()

import pandas as pd
import numpy as np
d={ 'Tutor':['Tahira','Gurjyot','Anusha','Jacob','Vankat'],
'Classes':[28,36,41,32,48],
'Country':['USA','UK','Japan','USA','Brazil']}
df=pd.DataFrame(d)
print(df)
df.pivot(index='Country', columns='Tutor',values='Classes')

test=df.pivot(index='Country', columns='Tutor',values='Classes')
print(test)

#pivot_table

import pandas as pd
import numpy as np
d={ 'Tutor':['Tahira','Gurjyot','Anusha','Jacob','Vankat',
'Tahira','Gurjyot','Anusha','Jacob','Vankat',
'Tahira','Gurjyot','Anusha','Jacob','Vankat',
'Tahira','Gurjyot','Anusha','Jacob','Vankat'],
'Classes':[28,36,41,32,40,36,40,36,40,46,24,30,44,40,32,36,32,36,24,38],
'Quarter':[1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4],
'Country':['USA','UK','Japan','USA','Brazil','USA','USA','Japan',
'Brazil','USA','Brazil','USA','UK','Brazil','USA','Japan',
'Japan','Brazil','UK','USA']}
df=pd.DataFrame(d)
print(df)
test=df.pivot_table(index='Tutor', columns='Country',values='Classes')
print(test)

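#sorting: sort_values() returns a new sorted DataFrame and leaves df unchanged,
#so print or assign each result to see it outside a notebook
df.sort_values('Country')
df.sort_values('Tutor')
df.sort_values(['Country','Tutor'])
df.sort_values(['Tutor','Country'])
df.sort_values(['Tutor','Country'], ascending=False)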

output:
Tutor Classes Country
0 Tahira 28 USA
1 Gurjyot 36 UK
2 Anusha 41 Japan
3 Jacob 32 USA
4 Vankat 48 Brazil

Tutor Anusha Gurjyot Jacob Tahira Vankat

Country
Brazil NaN NaN NaN NaN 48.0
Japan 41.0 NaN NaN NaN NaN
UK NaN 36.0 NaN NaN NaN
USA NaN NaN 32.0 28.0 NaN

Tutor Classes Quarter Country

0 Tahira 28 1 USA
1 Gurjyot 36 1 UK
2 Anusha 41 1 Japan
3 Jacob 32 1 USA
4 Vankat 40 1 Brazil
5 Tahira 36 2 USA
6 Gurjyot 40 2 USA
7 Anusha 36 2 Japan
8 Jacob 40 2 Brazil
9 Vankat 46 2 USA
10 Tahira 24 3 Brazil
11 Gurjyot 30 3 USA
12 Anusha 44 3 UK
13 Jacob 40 3 Brazil
14 Vankat 32 3 USA
15 Tahira 36 4 Japan
16 Gurjyot 32 4 Japan
17 Anusha 36 4 Brazil
18 Jacob 24 4 UK
19 Vankat 38 4 USA
Country Brazil Japan UK USA
Tutor
Anusha 36.0 38.5 44.0 NaN
Gurjyot NaN 32.0 36.0 35.000000
Jacob 40.0 NaN 24.0 32.000000
Tahira 24.0 36.0 NaN 32.000000
Vankat 40.0 NaN NaN 38.666667
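
Note : The second DataFrame has repeated Tutor and Country pairs, which is why pivot_table() is used for it. The sketch below, on three made-up rows, shows that pivot() raises an error for such duplicates, while pivot_table() combines them, using the mean by default or another function given through aggfunc.

```python
import pandas as pd

df = pd.DataFrame({'Tutor':   ['Tahira', 'Tahira', 'Jacob'],
                   'Country': ['USA', 'USA', 'UK'],
                   'Classes': [28, 36, 24]})

try:
    df.pivot(index='Tutor', columns='Country', values='Classes')
    pivot_failed = False
except ValueError:          # duplicate (Tutor, Country) pairs cannot be reshaped
    pivot_failed = True

mean_table = df.pivot_table(index='Tutor', columns='Country', values='Classes')
sum_table = df.pivot_table(index='Tutor', columns='Country', values='Classes',
                           aggfunc='sum')

print("pivot() failed :", pivot_failed)
print(mean_table)
print(sum_table)
```

This matches the output above, where for example Tahira in USA shows 32.0, the mean of 28 and 36.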

Practical 14: Histogram

Date:
Aim : To create Histograms on pandas DataFrame

import pandas as pd
import numpy as np
d={ 'Age':[37,28,38,44,53,69,74,53,35,38,66,46,24,45,92,48,51,62,57]}
hage=pd.DataFrame(d)
hage.hist()
hage.hist(column='Age',grid=True,bins=20 )
Output:
array([[<matplotlib.axes._subplots.AxesSubplot object at
0x0000019A7508D888>]],
dtype=object)

Practical 15: User defined Functions

Date:
Aim : To create user defined functions and calling them

Source code:
#user defined Function
def addnum (): #function definition
    a=int(input("Please enter a number"))
    b=int(input("Please enter a number"))
    return(a+b)

c=addnum() #function calling

print("sum = ",c)
print("Sum= ",addnum() )

Output:

Please enter a number5

Please enter a number20
sum = 25
Please enter a number25
Please enter a number22
Sum= 47

Practical 16: Table wise Function Application and lambda function

Date:

Aim : To use pipe(), apply(), applymap() and lambda function

Source Code:

import pandas as pd
import numpy as np
import math
# User defined function
def adder(adder1,adder2):
    return adder1+adder2
#Create a Dictionary of series
d = {'Score_Math':pd.Series([66,57,75,44,31,67]),
'Score_Science':pd.Series([89,87,67,55,47,72])}
df = pd.DataFrame(d)
print ("DataFrame\n",df)
print ("PIPE() \n",df.pipe(adder,2))

print ("On Rows apply(np.mean,axis=1)\n",df.apply(np.mean,axis=1)) # row wise

print ("On Columns apply(np.mean,axis=0) \n",df.apply(np.mean,axis=0)) # column wise

# applymap() applies the function to every element; pandas 2.1 and later name this method DataFrame.map()
print( " LAMBDA \n", df.applymap(lambda x:math.sqrt(x)))

Output:
DataFrame
Score_Math Score_Science
0 66 89
1 57 87
. . . .
5 67 72
PIPE()
Score_Math Score_Science
0 68 91
1 59 89
2 77 69
3 46 57
4 33 49
5 69 74
On Rows apply(np.mean,axis=1)
0 77.5
1 72.0
2 71.0
3 49.5
4 39.0
5 69.5
dtype: float64
On Columns apply(np.mean,axis=0)
Score_Math 56.666667
Score_Science 69.500000
dtype: float64
LAMBDA
Score_Math Score_Science
0 8.124038 9.433981
1 7.549834 9.327379
2 8.660254 8.185353
3 6.633250 7.416198
4 5.567764 6.855655
5 8.185353 8.485281
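
Note : The square roots in the LAMBDA output can also be computed without an element-wise method. The sketch below, on the first three rows of the scores, compares np.sqrt() applied to the whole DataFrame with Series.map() applied to each column. Both are alternatives, not the method used in the practical.

```python
import math
import numpy as np
import pandas as pd

df = pd.DataFrame({'Score_Math': [66, 57, 75], 'Score_Science': [89, 87, 67]})

vectorised = np.sqrt(df)                                 # NumPy function on the whole DataFrame
per_element = df.apply(lambda col: col.map(math.sqrt))   # Series.map() on each column

print(vectorised)
print("Same values :", np.allclose(vectorised, per_element))
```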

Practical 17: Data Visualization (Line Chart)

Date:

Aim : To use line chart plot() for data Visualisation

Source Code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as pl
RollNo=[1,2,3,4,5]
Maths=[20,22,26,28,30]
IP=[21,24,29,16,25]
Science=[26,23,20,26,22]
pl.title("Grade 12 Preboard Exams")
names=['Rayan','Unnati','Khushi','Aryan','Yakshesh'] # a list keeps the names in order; a set has no fixed order
pl.xlabel('Names')
pl.ylabel('Marks')
pl.xticks(RollNo,names)
pl.plot(RollNo,Maths,'r',marker='o',label='Maths')
pl.plot(RollNo,IP,'k',marker='s',label='IP')
pl.plot(RollNo,Science,'b',marker='*',label='Science')
pl.legend()
pl.grid(color='y')
pl.show() # displays the chart when run as a script

Output:

Practical 18: Data Visualization Bar chart

Date:

Aim : To use Barchart for data Visualisation

Source Code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
ItemCode=np.arange(1,6)
SalesJan=[50,60,25,80,60]
SalesFeb=[30,40,35,70,80]
SalesMar=[40,50,45,40,92]
plt.bar(ItemCode-0.2,SalesJan,width=0.2 ,color='red',label="Jan")
plt.bar(ItemCode,SalesFeb,width=0.2, color='blue',label="Feb")
plt.bar(ItemCode+0.2,SalesMar,width=0.2, color='green',label="Mar")
plt.title('Total Sales March')
plt.xlabel('Items')
plt.ylabel('Quantity')
plt.xticks(ItemCode,["Mouse","Printer","Scanner","WebCam","PenTab"],rotation=90)
plt.legend()
plt.grid(True)

output:

Practical 19: Data Visualization Histogram

Date:

Aim : To use histogram for data Visualisation

Source Code:
import numpy as np
import pandas as pd # needed for pd.Series below
import matplotlib.pyplot as pl
marks=[22,25,18,19,11,21,28,30,24,24,23,15,20,27,21,21,13,30,18,25]
pl.hist(marks,edgecolor='r',bins=5,color='blue')
pl.ylabel ('Frequency' )
pl.xlabel ('Bins/Ranges')
pl.title('My Chart')

#bins=5 splits the range from the min value 11 to the max value 30 into 5 equal groups

#each group is about 4 marks wide (19/5 = 3.8)
#range will be roughly 11-14, 15-18, 19-22, 23-26, 27-30
#freq will be 2 3 6 5 4 = 20

x=np.random.randn(1000)
y=np.random.randn(1000)
pl.hist([x,y], bins=10,edgecolor='k',histtype='barstacked')

x=np.random.randn(1000)
y=pl.hist(x,bins=10,edgecolor='b',color='yellow')
a=pd.Series(y[1])
b=pd.Series(y[0])
a.pop(10)
a=a+.25
pl.plot(a,b,'k')

output:
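
Note : The frequencies written in the comments of the source code (2 3 6 5 4) can be checked without drawing the chart. The sketch below uses np.histogram() with the same marks and bins=5; it is a check added for illustration and not part of the practical.

```python
import numpy as np

marks = [22,25,18,19,11,21,28,30,24,24,23,15,20,27,21,21,13,30,18,25]

counts, edges = np.histogram(marks, bins=5)   # same binning rule as pl.hist(marks, bins=5)

print("Frequencies :", counts)
print("Bin edges :", edges)
```

The bin edges come out as 11, 14.8, 18.6, 22.4, 26.2 and 30, so each group is 3.8 marks wide and the last group includes the max value 30.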

Practical 20: Data Visualization BOXPLOT

Date:

Aim : To use boxplot for data Visualisation

Source Code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as pl
x=[2,3,1,4,4,6,8,10,10,3,4]
y=[5,3,1,7,6,8,9,10,8,3,4]
z=pl.boxplot([x,y],patch_artist=True,labels=['LG','OGeneral'])
pl.title('Air-conditioner' )

colors = ['pink', 'red']

for patch, color in zip(z['boxes'], colors):
    patch.set_facecolor(color)

output:
Practical 21: Data Visualization Piechart
Date:

Aim : To use piechart for data Visualisation

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
x=[10,30,27,13,8,12]
fr=['Peach','Banana','Grapes','Oranges','Pineapple','Apple']
co=['pink','yellow','lightgreen','orange','brown','red']
plt.pie(x,labels=fr,colors=co,autopct='%1.0f%%',shadow=True,explode=(0,.1,.1,0,0,0))
plt.show()

output:

Practical 22: Structured Query Language

Date:
