IP Practical File

Uploaded by

3danielretard0
Copyright
© © All Rights Reserved

1. Object Population stores the details of population in four metro cities of India and Object
AvgIncome stores the total average income reported in previous year in each of these metros.
Calculate the per capita income for each of these metro cities.

import pandas as pd
Pop= pd.Series([10927986, 12691836, 4631392, 4328063 ],
index = ['Delhi', 'Mumbai', 'Kolkata', 'Chennai'])
AvgInc= pd.Series([36360927986, 252325355, 4141577878, 34896899], index = ['Delhi',
'Mumbai', 'Kolkata', 'Chennai'])
perCapita = AvgInc / Pop
print ("Population in four metro cities ")
print (Pop)
print ("AvgIncome in four metro cities")
print (AvgInc)
print("Per Capita Income in four metro cities ")
print(perCapita)

Output:
Population in four metro cities
Delhi 10927986
Mumbai 12691836
Kolkata 4631392
Chennai 4328063
dtype: int64
AvgIncome in four metro cities
Delhi 36360927986
Mumbai 252325355
Kolkata 4141577878
Chennai 34896899
dtype: int64
Per Capita Income in four metro cities
Delhi 3327.321977
Mumbai 19.880918
Kolkata 894.240409
Chennai 8.062937
dtype: float64
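
The division above works because both Series carry the same city labels. A minimal sketch of that behaviour, assuming the income Series lists the same cities in a different order (the names pop, avg_inc and per_capita are illustrative only):

```python
import pandas as pd

pop = pd.Series([10927986, 12691836, 4631392, 4328063],
                index=['Delhi', 'Mumbai', 'Kolkata', 'Chennai'])
# Same figures as in the program above, deliberately listed in reverse order.
avg_inc = pd.Series([34896899, 4141577878, 252325355, 36360927986],
                    index=['Chennai', 'Kolkata', 'Mumbai', 'Delhi'])

# Series arithmetic pairs values by index label, not by position.
per_capita = avg_inc / pop
print(per_capita.round(2))
```

Each city is still divided by its own population, so the values agree with the output shown above.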
2.Write a program to create a DataFrame to store weight, age and names of 3 people. Print
DataFrame and its transpose.

import pandas as pd
df = pd.DataFrame({'Weight': [78, 45, 67],
'Name': ['sam','arun', 'ajay'],'Age' : [56, 42,34]})
print('Original Dataframe')
print(df)
print('Transpose:')
print(df.T)

Output:
Original Dataframe
Weight Name Age
0 78 sam 56
1 45 arun 42
2 67 ajay 34
Transpose:
0 1 2
Weight 78 45 67
Name sam arun ajay
Age 56 42 34
3. Consider the following dataframe saleDf:
       Target  Sales
zoneA   56000  58000
zoneB   70000  68000
zoneC   75000  78000
zoneD   60000  61000
Write a program to add a column namely Orders having values 6000, 6700, 6200 and 6000
respectively for the zones A, B, C and D. The program should also add a new row for a new
zone zoneE. Add some dummy values in this row.

import pandas as pd
saleDf=pd.DataFrame({"Target":[56000,70000,75000,60000],"Sales":
[58000,68000,78000,61000]},index= ["zoneA","zoneB","zoneC","zoneD"])
saleDf['Orders'] = [6000, 6700, 6200, 6000]
saleDf.loc['zoneE', :]= [50000, 45000, 5000]
print(saleDf)

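Output (this listing is saleDf before the changes; the program prints it with the Orders column and the zoneE row added):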
Target Sales
zoneA 56000 58000
zoneB 70000 68000
zoneC 75000 78000
zoneD 60000 61000
4. From the dtf5 used above, create another DataFrame that must not contain the column
"Population" and the row Bangalore.

import pandas as pd
# Sample data standing in for dtf5
data = {
    'hospitals': [150, 540, 100, 34],
    'population': [601200, 671100, 621100, 67110]}
df = pd.DataFrame(data, index=['delhi', 'bangalore', 'kolkata', 'chennai'])
print(df)
# drop() returns a new DataFrame, so df itself stays unchanged
df2 = df.drop(columns=['population']).drop(index=['bangalore'])
print(df2)

Output:

           hospitals  population
delhi            150      601200
bangalore        540      671100
kolkata          100      621100
chennai           34       67110
         hospitals
delhi          150
kolkata        100
chennai         34
5. Given a Series that stores the area of some states in km². Write code to find out the
biggest and smallest areas from the given Series. The given Series has been created like this:
Ser1 = pd.Series([34567, 890, 450, 67892, 34677, 78902, 256711, 678291, 637632, 25723,
2367, 11789, 345, 256517])

import pandas as pd
ser1 = pd.Series( [34567, 890, 450, 67892, 34677, 78902,256711, 678291, 637632, 25723,
2367, 11789, 345, 256517])
print("Top 3 biggest areas are:")
print(ser1.sort_values().tail(3))
print("3 smallest areas are :")
print(ser1.sort_values().head(3))

Output:

Top 3 biggest areas are:


6 256711
8 637632
7 678291
dtype: int64
3 smallest areas are :
12 345
2 450
1 890
dtype: int64
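
The same result can be reached without sorting the whole Series. A short sketch using max(), min(), nlargest() and nsmallest() on the same data (the names top3 and bottom3 are illustrative):

```python
import pandas as pd

ser1 = pd.Series([34567, 890, 450, 67892, 34677, 78902, 256711, 678291,
                  637632, 25723, 2367, 11789, 345, 256517])

# Single biggest and smallest area, with the index position of each.
print("Biggest area:", ser1.max(), "at index", ser1.idxmax())
print("Smallest area:", ser1.min(), "at index", ser1.idxmin())

# Three biggest (largest first) and three smallest (smallest first).
top3 = ser1.nlargest(3)
bottom3 = ser1.nsmallest(3)
print(top3)
print(bottom3)
```

Note that nlargest() lists the biggest value first, whereas sort_values().tail(3) lists it last.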
6.Write a program to create data series and then change the indexes of the Series object in
any random order.

import pandas as pd
import numpy as np
s1 = pd.Series(data = [10, 20, 30, 40, 50], index = ['a', 'b', 'c', 'd', 'e'])
print("Original Data Series:")
print(s1)
s1= s1.reindex(index = ['b', 'c', 'd','a', 'e'])
print ("Data Series after changing the order of index:")
print(s1)

Output:

Original Data Series:


a 10
b 20
c 30
d 40
e 50
dtype: int64
Data Series after changing the order of index:
b 20
c 30
d 40
a 10
e 50
dtype: int64
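
The program above fixes the new order by hand. If the order should really be random, one possible approach is sample(frac=1), sketched below; random_state is set only so that the run can be repeated, and the name shuffled is illustrative:

```python
import pandas as pd

s1 = pd.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])

# sample(frac=1) returns every element exactly once, in shuffled order.
# Each value keeps its own label; only the order of the rows changes.
shuffled = s1.sample(frac=1, random_state=1)
print(shuffled)
```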
7. Given a Series object s4. Write a program to change the values at its 2nd row (index 1)
and 3rd row to 8000.

Program code:
import pandas as pd
s4= pd.Series([67000,56000,50000,52000])
print("Original Series object s4:")
print(s4)
s4[1:3]=8000
print("Series object s4 after changing value:")
print(s4)

Output:
Original Series object s4:
0 67000
1 56000
2 50000
3 52000
dtype: int64
Series object s4 after changing value:
0 67000
1 8000
2 8000
3 52000
dtype: int64
8. Given a Series object s5. Write a program to calculate the cubes of the Series values.

import pandas as pd
s= pd.Series([5,7,9])
print("series object s")
print(s)
print("Cubes of s values")
print(s**3)

Output:
series object s
0 5
1 7
2 9
dtype: int64
Cubes of s values
0 125
1 343
2 729
dtype: int64
9.Write a program to print the DataFrame df, one column at a time.

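import pandas as pd
data = {'Name': ["Ram", "Pam", "Sam"], 'Marks': [70, 95, 80]}
df = pd.DataFrame(data, index = ['Rno.1', 'Rno.2', 'Rno.3'])
# items() yields (column name, column Series) pairs;
# the older iteritems() was removed in pandas 2.0
for name, column in df.items():
    print(column)
    print("---------------------")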

Output:
Rno.1 Ram
Rno.2 Pam
Rno.3 Sam
Name: Name, dtype: object
---------------------
Rno.1 70
Rno.2 95
Rno.3 80
Name: Marks, dtype: int64
10.Given a DataFrame dtf6.
Hospitals Schools
Delhi 267 7636
Mumbai 425 9776
Kolkata 375 2524
Chennai 274 1625
Write a program to display top two rows' values of 'Schools' column and last 3 values of
'Hospitals' column.

import pandas as pd
# Column names must match the attribute names used below (Schools, Hospitals)
dtf6 = pd.DataFrame({"Hospitals": [267, 425, 375, 274],
                     "Schools": [7636, 9776, 2524, 1625]},
                    index=["Delhi", "Mumbai", "Kolkata", "Chennai"])
print(dtf6.Schools.head(2))
print(dtf6.Hospitals.tail(3))

Output:
Delhi 7636
Mumbai 9776
Name: Schools, dtype: int64
Mumbai 425
Kolkata 375
Chennai 274
Name: Hospitals, dtype: int64
11.Given Dataframe df
Name Sex Position City Age Projects Budget
0 Rabita F Manager Bangalore 30 13 48
1 Evan M Programmer New delhi 27 17 13
2 Jia F Manager Chennai 32 16 32
3 Lalit M Manager Mumbai 40 20 21
4 Jaspreet M Programmer Chennai 28 21 17
5 suji F Programmer Bangalore 32 14 10
Write a program to print only the Name, Age and Position for all rows.

import pandas as pd
import numpy as np
df=pd.DataFrame({"Name":["Rabita","Evan","Jia","Lalit","Jaspreet","suji"],"Sex":
['F','M','F','M','M','F'],"Position":
['Manager','Programmer','Manager','Manager','Programmer','Programmer'],"City":
['Bangalore','New delhi','Chennai','Mumbai','Chennai','Bangalore'],"Age":
[30,27,32,40,28,32],"Projects":[13,17,16,20,21,14],"Budget":[48,13,32,21,17,10]})
for i, row in df.iterrows():
    print(row['Name'], '\t', row["Age"], '\t', row['Position'])
Output:
Rabita 30 Manager
Evan 27 Programmer
Jia 32 Manager
Lalit 40 Manager
Jaspreet 28 Programmer
suji 32 Programmer
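
Looping with iterrows() is one way to do this. A shorter alternative is to select the three columns with a list of names. The sketch below uses only the first three rows of df to stay brief; the name subset is illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Rabita", "Evan", "Jia"],
    "Sex": ["F", "M", "F"],
    "Position": ["Manager", "Programmer", "Manager"],
    "Age": [30, 27, 32],
})

# A list of column names inside [] returns a new DataFrame with just those columns.
subset = df[["Name", "Age", "Position"]]
print(subset.to_string(index=False))
```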
12. Given a series nfib that contains reversed Fibonacci numbers followed by Fibonacci numbers,
as shown below:
[0, -1, -1, -2, -3, -5, -8, -13, -21, -34, 0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
Plot nfib as a line chart with markers.

Program code:
import matplotlib.pyplot as plt
import numpy as np
n= [0, -1, -1, -2, -3, -5, -8, -13, -21, -34, 0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
plt.plot(range(-10, 10), n, 'mo', markersize= 3, markeredgecolor = 'b',linestyle = 'solid')
plt.show()
Output:
13.Consider the reference table 3.1. Write a program to plot a bar chart from the
Medals won by Australia. In the same chart, plot medals won by India too.

import matplotlib.pyplot as plt
Info = ['Gold', 'Silver', 'Bronze', 'Total']
Aus = [80, 59, 59, 198]
Ind = [26, 20, 20, 66]
# India's bars are drawn over Australia's; the labels feed the legend
plt.bar(Info, Aus, label='Australia')
plt.bar(Info, Ind, label='India')
plt.xlabel("Medal type")
plt.ylabel("Australia, India Medal count")
plt.legend()
plt.show()

Output:
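
In the program above both sets of bars share the same x positions, so one country's bars sit in front of the other's. A possible variant that places the bars side by side is sketched here; the bar width of 0.4 and the names x, width, aus_bars and ind_bars are assumptions made for the sketch:

```python
import numpy as np
import matplotlib.pyplot as plt

info = ['Gold', 'Silver', 'Bronze', 'Total']
aus = [80, 59, 59, 198]
ind = [26, 20, 20, 66]

x = np.arange(len(info))   # one numeric position per medal type
width = 0.4                # assumed bar width

fig, ax = plt.subplots()
# Shift each country's bars half a width left or right of the tick.
aus_bars = ax.bar(x - width / 2, aus, width, label='Australia')
ind_bars = ax.bar(x + width / 2, ind, width, label='India')
ax.set_xticks(x)
ax.set_xticklabels(info)
ax.set_xlabel("Medal type")
ax.set_ylabel("Medal count")
ax.legend()
# plt.show()  # uncomment to display the chart
```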
15. A survey gathers the height and weight of 100 participants and records the participants'
ages as:
Ages = [1, 1,2,3,5,7,8,9,10, 10,11,13,13,15, 16,17,18, 19,20, 21, 21,23, 24,24, 24, 25,25,
25,25,26, 26, 26,27,27, 27, 27,27, 29, 30, 30, 30, 30, 31,33, 34, 34, 34, 35, 36, 36, 37, 37,
37,38, 38, 39, 40,40, 41, 41,42, 43,45,45,46,46, 46, 47,48,48,49, 50, 51,51, 52, 52, 53, 54,
55,56,57,58,60, 61,63,65,66,68, 70, 72,74, 75,77,81, 83, 84,87,89,90,91]
Write a program to plot a histogram from above data with 20 bins

import matplotlib.pyplot as plt


age= [1, 1,2,3,5,7,8,9,10, 10,11,13,13, 15, 16,17,18, 19,20, 21,21,23,24,24,24, 25,25,25,
25,26,26, 26,27,27,27,27,27,29,30,30,30, 30, 31, 33, 34, 34, 34,35, 36, 36, 37, 37, 37, 38, 38,
39, 40,40,41, 41,42, 43,45,45,46,46, 46, 47,48, 48, 49,50, 51,51, 52, 52, 53, 54,
55,56,57,58,60, 61,63,65,66,68,70,72,74, 75,77,81,83,84,87,89,90,91]
plt.hist(age, bins=20)
plt.title ("Histogram")
plt.show()
Output:
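
To see the numbers behind the histogram, the same binning can be computed with numpy.histogram, which returns the count in each bin and the bin edges. A small sketch on the same ages (the names counts and edges are illustrative):

```python
import numpy as np

age = [1, 1, 2, 3, 5, 7, 8, 9, 10, 10, 11, 13, 13, 15, 16, 17, 18, 19, 20, 21, 21, 23,
       24, 24, 24, 25, 25, 25, 25, 26, 26, 26, 27, 27, 27, 27, 27, 29, 30, 30, 30, 30,
       31, 33, 34, 34, 34, 35, 36, 36, 37, 37, 37, 38, 38, 39, 40, 40, 41, 41, 42, 43,
       45, 45, 46, 46, 46, 47, 48, 48, 49, 50, 51, 51, 52, 52, 53, 54, 55, 56, 57, 58,
       60, 61, 63, 65, 66, 68, 70, 72, 74, 75, 77, 81, 83, 84, 87, 89, 90, 91]

# 20 equal-width bins between the smallest and the largest age.
counts, edges = np.histogram(age, bins=20)
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {count}")
```

With ages from 1 to 91, each of the 20 bins is 4.5 years wide.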
16. Write a Python program to draw line charts from the given financial data of ABC Co. for
5 days in the form of a DataFrame namely fdf as shown below:

    Day1   Day2   Day3   Day4   Day5
0  74.25  56.03  59.30  69.00  89.65
1  76.06  68.71  72.07  78.47  79.65
2  69.50  62.89  77.65  65.53  80.75
3  72.55  56.42  66.46  76.85  85.08

import matplotlib.pyplot as plt


import numpy as np
import pandas as pd
d = pd.DataFrame({"day1":[74.25,76.06,69.50,72.55],"day2":
[56.03,68.71,62.89,56.42],"day3":[59.30,72.07,77.65,66.46],"day4":
[69.00,78.47,65.53,76.85],"day5":[89.65,79.65,80.75,85.08]})
x=np.arange(4)
plt.plot(x,d["day1"],label='day1')
plt.plot(x,d["day2"],label='day2')
plt.plot(x,d["day3"],label='day3')
plt.plot(x,d["day4"],label='day4')
plt.plot(x,d["day5"],label='day5')
plt.legend(loc='upper center')
plt.show()
Output:
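
The five plt.plot() calls can also be replaced by the DataFrame's own plot() method, which draws one line per column. A brief sketch, assuming the DataFrame is named fdf as in the question:

```python
import pandas as pd
import matplotlib.pyplot as plt

fdf = pd.DataFrame({
    "Day1": [74.25, 76.06, 69.50, 72.55],
    "Day2": [56.03, 68.71, 62.89, 56.42],
    "Day3": [59.30, 72.07, 77.65, 66.46],
    "Day4": [69.00, 78.47, 65.53, 76.85],
    "Day5": [89.65, 79.65, 80.75, 85.08],
})

# plot() uses the row index for the x axis and draws each column as its own line.
ax = fdf.plot(kind="line")
ax.legend(loc="upper center")
# plt.show()  # uncomment to display the chart
```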
17.Prof Awasthi is doing some research in the field of Environment. For some plotting
purposes, he has generated some Data as:

mu = 100
sigma = 15
x = mu + sigma * numpy.random.randn(10000)
y = mu + 30 * np.random.randn(10000)
Write a program to plot this data on a cumulative bar-stacked horizontal histogram with
both x and y.

import numpy as np
import matplotlib.pyplot as plt
mu = 100
sigma =15
x= mu + sigma*np.random.randn(10000)
y=mu +30*np.random.randn(10000)
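# orientation='horizontal' gives the horizontal histogram the question asks for
plt.hist([x,y], bins = 100, histtype='barstacked', cumulative=True, orientation='horizontal')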
plt.title('Histogram')
plt.show()

Output:
18. Write a program to create a horizontal bar chart from two data sequences as given
below:
means = [20, 35, 30, 35, 27], stds= [2, 3, 4, 1, 2]
Make sure to show legends.

import matplotlib.pyplot as plt


import numpy as np
means=[20, 35, 30, 35, 27]
stds=[2, 3, 4, 1, 2]
indx =np.arange(len(means))
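# height=0.25 matches the 0.25 offset, so the two sets of bars do not overlap
plt.barh(indx, means, height=0.25, color='y', label='mean')
plt.barh(indx + 0.25, stds, height=0.25, color='blue', label='std')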
plt.legend()
plt.show()

Output:
19. Write a program to read from a CSV file Employee.csv and create a dataframe from it
but dataframe should not use file's column header rather should use own column
numbers as 0, 1, 2 and so on.

import pandas as pd
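# header=None makes pandas number the columns 0, 1, 2, ...
# If the first line of the file is a header row, also pass skiprows=1
# so that the header is not read as a data row.
df = pd.read_csv('Employee.csv', header=None)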
print(df)
Output:

0 1 2 3
0 1001 trupti manager 5663665
1 1002 sam manger 46436
2 1003 pam ca 3634666
3 1004 arun clerk 252363
4 1005 shreya clerk 25346666
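
The effect of header=None depends on whether the file itself starts with a header line. The sketch below shows both cases on a small in-memory file; the file contents and the names csv_text, with_header_row and data_only are made up for illustration:

```python
import io
import pandas as pd

# Hypothetical file contents: one header line followed by two records.
csv_text = (
    "EmpID,EmpName,Designation,Salary\n"
    "1001,trupti,manager,53665\n"
    "1002,sam,clerk,46436\n"
)

# header=None alone: the header line is read as the first data row.
with_header_row = pd.read_csv(io.StringIO(csv_text), header=None)
# header=None with skiprows=1: the header line is skipped.
data_only = pd.read_csv(io.StringIO(csv_text), header=None, skiprows=1)

print(with_header_row)
print(data_only)
```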
20.Write a program to read from a CSV file Employee.csv and create a dataframe from it
but dataframe should not use file's column header rather should use own column
headings as EmpID, EmpName, Designation and Salary. Also print the maximum salary
given to an employee

import pandas as pd
df = pd.read_csv('Employee.csv', header=None,
                 names=['EmpID', 'EmpName', 'Designation', 'Salary'])

# Displaying the dataframe
print(df)

# Finding and printing the maximum salary
max_salary = df['Salary'].max()
print("The maximum salary is:", max_salary)

Output:

EmpID EmpName Designation Salary


0 1001 trupti manager 53665
1 1002 sam manger 46436
2 1003 pam ca 3634666
3 1004 arun clerk 252363
4 1005 shreya clerk 25346666

The maximum salary is: 25346666
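
max() gives only the salary figure. If the employee who receives it is also needed, idxmax() can be used to locate the row. A sketch on an in-memory copy of the records shown above; the header line in csv_text is an assumption about the file, and the name top is illustrative:

```python
import io
import pandas as pd

# In-memory stand-in for Employee.csv; the header line is assumed.
csv_text = (
    "ID,Name,Post,Pay\n"
    "1001,trupti,manager,53665\n"
    "1002,sam,manger,46436\n"
    "1003,pam,ca,3634666\n"
    "1004,arun,clerk,252363\n"
    "1005,shreya,clerk,25346666\n"
)

df = pd.read_csv(io.StringIO(csv_text), header=None, skiprows=1,
                 names=['EmpID', 'EmpName', 'Designation', 'Salary'])

# idxmax() returns the row label of the highest salary; loc fetches that row.
top = df.loc[df['Salary'].idxmax()]
print(top['EmpName'], "earns the maximum salary:", top['Salary'])
```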


21. Write a program to read from sport.csv and plot its competitions column against sport
column in the form of a line chart.

import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('C:/Users/Arjun/OneDrive/Documents/sport.csv')
plt.plot(df['Sport'], df['Competitions'], marker='o')
plt.title('Competitions vs Sport')
plt.xlabel('Sport')
plt.ylabel('Competitions')
plt.xticks(rotation=45)
plt.show()

Output:
22. Previous examples 9 and 10 created CSV files without NaN values. Write a program to
store the data of the allDf dataframe in a CSV file, with NaN values stored as Null and
'~' as the separator.
import pandas as pd
data = {
'Name': ['Alice', 'Bob', 'Charlie', None],
'Age': [25, 30, None, 22],
'Salary': [50000, None, 60000, 45000]
}
allDf = pd.DataFrame(data)
allDf.to_csv('output_tilde.csv', sep='~', na_rep='Null', index=False)
print("CSV file with '~' separator has been created as 'output_tilde.csv'")

Output:
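
To check what ends up in the file, the same call can be made without a file name, in which case to_csv() returns the CSV text. The sketch below prints that text and reads it back; the names text and back are illustrative:

```python
import io
import pandas as pd

allDf = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', None],
    'Age': [25, 30, None, 22],
    'Salary': [50000, None, 60000, 45000],
})

# With no path given, to_csv() returns the CSV text instead of writing a file.
text = allDf.to_csv(sep='~', na_rep='Null', index=False)
print(text)

# Naming 'Null' in na_values makes sure it is read back as a missing value.
back = pd.read_csv(io.StringIO(text), sep='~', na_values=['Null'])
print(back.isna().sum())
```

Age and Salary are written with a decimal point (for example 30.0) because a column that holds a missing value is stored as float.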
23.Write modified program of example 10. Take marks from user and fetch those records
which have marks more than input marks.

import pandas as pd
import mysql.connector as sqltor
mycon = sqltor.connect(host = "localhost", user = "root", passwd = "MyPass", database =
"test")
if mycon.is_connected():
    mks = float(input("Enter marks :"))
    qry = "Select * from student where marks > %s ; " % (mks,)
    mdf = pd.read_sql( qry, mycon)
    print("Student details with marks >", mks)
    print(mdf)
else:
    print("MySQL Connection problem")

Output:
Enter marks : 75
Student details with marks > 75.0
RollNo Name Marks Grade Section Project
0 103 Simran 81.2 A B Evaluated
1 106 Arsiya 91.6 A+ B submitted
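
The query above is built with string formatting. A commonly recommended alternative is to pass the value as a query parameter. The sketch below shows the idea with Python's built-in sqlite3 module standing in for MySQL, so that it runs without a server; the table rows are sample values, and sqlite3 uses ? as its placeholder whereas mysql.connector uses %s:

```python
import sqlite3
import pandas as pd

# In-memory SQLite database used only as a stand-in for the MySQL 'test' database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (RollNo INTEGER, Name TEXT, Marks REAL)")
con.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(102, "George", 71.2), (103, "Simran", 81.2), (106, "Arsiya", 91.6)],
)

mks = 75.0  # in the program above this comes from float(input(...))

# The value travels separately from the SQL text, as a parameter.
qry = "SELECT * FROM student WHERE Marks > ? ORDER BY RollNo"
mdf = pd.read_sql_query(qry, con, params=(mks,))
print(mdf)
con.close()
```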
24. Write a program in SQL to take the marks range from the user i.e., lower and upper
limit of the marks range and fetch those records from the student table having marks in
this range.
import pandas as pd
import mysql.connector as sqltor
mycon = sqltor.connect(host="localhost", user="root", passwd="MyPass", database="test")
if mycon.is_connected():
    lmks = float(input("Enter lower limit of marks range: "))
    hmks = float(input("Enter higher limit of marks range: "))
    qry = "Select * from student where marks between %s and %s;" % (lmks, hmks)
    mdf = pd.read_sql(qry, mycon)
    print("Student details with marks in the range (", lmks, "-", hmks, "): ")
    print(mdf)
else:
    print("MySQL Connection problem")

Output:
Enter lower limit of marks range: 50
Enter higher limit of marks range: 75
Student details with marks in the range ( 50.0 - 75.0 ): 
RollNo Name Marks Grade Section Project
0 102 George 71.2 B A Submitted
1 104 Ali 61.2 B C Assigned
2 105 Kushal 51.6 C C Evaluated
25. Write a program in SQL to bring those students' details from the student table of
MySQL database `test` whose names begin with a specific letter. Take the letter as user
input.
import pandas as pd
import mysql.connector as sqltor
mycon = sqltor.connect(host="localhost", user="root", passwd="MyPass", database="test")
if mycon.is_connected():
    letter = input("Enter beginning letter for the names: ")
    # The pattern is passed as a parameter, so the user's input
    # is never pasted into the SQL text itself
    qry = "Select * from student where name like %s;"
    mdf = pd.read_sql(qry, mycon, params=(letter + '%',))
    print("Student details with names beginning with", letter, "are:")
    print(mdf)
else:
    print("MySQL Connection problem")
Output:
Student details with names beginning with A are:
RollNo Name Marks Grade Section Project
0 101 Aliya 85.5 A A pending
1 104 Arjun 78.3 B C Evaluated
2 106 Arsiya 91.6 A+ B submitted
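
Because the letter is typed in by the user, it is safer to keep it out of the SQL text. The sketch below illustrates this with a LIKE pattern passed as a parameter, again using sqlite3 as a stand-in for MySQL; the sample rows and the "tricky" input are made up for illustration:

```python
import sqlite3
import pandas as pd

# In-memory SQLite database used only as a stand-in for the MySQL 'test' database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (RollNo INTEGER, Name TEXT, Marks REAL)")
con.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(101, "Aliya", 85.5), (102, "George", 71.2),
     (104, "Arjun", 78.3), (106, "Arsiya", 91.6)],
)

qry = "SELECT * FROM student WHERE Name LIKE ? ORDER BY RollNo"

# Normal input: the % wildcard is added to the parameter, not to the SQL text.
letter = "A"
mdf = pd.read_sql_query(qry, con, params=(letter + "%",))
print(mdf)

# Input containing quotes is treated as plain text and simply matches nothing.
tricky = "A' OR '1'='1"
none_found = pd.read_sql_query(qry, con, params=(tricky + "%",))
print(len(none_found), "rows for the tricky input")
con.close()
```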
26. Write a program to write the dataframe allDf given below in sales table of test
database on MySQL. If the table exists, then the records should overwrite the table

import pandas as pd
from sqlalchemy import create_engine
import pymysql

# Creating the allDf dataframe


data = {
'Name': ['Purv', 'Paschim', 'Kendriya', 'Dakshin', 'Uttar', 'Rural'],
'Product': ['Oven', 'AC', 'AC', 'Oven', 'TV', 'Tubewell'],
'Target': [56000.0, 70000.0, 75000.0, 60000.0, None, None],
'Sales': [58000.0, 68000.0, 78000.0, 61000.0, None, None]
}
index_labels = ['zoneA', 'zoneB', 'zoneC', 'zoneD', 'zoneE', 'zoneF']
allDf = pd.DataFrame(data, index=index_labels)
engine = create_engine('mysql+pymysql://root:MyPass@localhost/test')
mycon = engine.connect()
allDf.to_sql('sales', mycon, index=True, if_exists='replace')
mycon.close()

print("Dataframe successfully written to 'sales' table.")

Output:
27. Write a program to write only the top 4 rows of the dataframe allDf used in previous
example, in sales2 of test database on MySQL. If the table exists, then the records should
get appended to the table.
import pandas as pd
from sqlalchemy import create_engine
import pymysql

# Creating connection to database


engine = create_engine('mysql+pymysql://root:MyPass@localhost/test')
mycon = engine.connect()
data = {
'Name': ['Purv', 'Paschim', 'Kendriya', 'Dakshin', 'Uttar', 'Rural'],
'Product': ['Oven', 'AC', 'AC', 'Oven', 'TV', 'Tubewell'],
'Target': [56000.0, 70000.0, 75000.0, 60000.0, None, None],
'Sales': [58000.0, 68000.0, 78000.0, 61000.0, None, None]
}
allDf = pd.DataFrame(data)
allDf.head(4).to_sql('sales2', mycon, index=False, if_exists='append')
mycon.close()
Output:
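
The difference between if_exists='replace' and if_exists='append' can be checked by running each call twice and counting the rows. The sketch below does this with an in-memory sqlite3 database as a stand-in for MySQL, so the connection details differ from the programs above; the names n_sales and n_sales2 are illustrative:

```python
import sqlite3
import pandas as pd

allDf = pd.DataFrame({
    'Name': ['Purv', 'Paschim', 'Kendriya', 'Dakshin', 'Uttar', 'Rural'],
    'Product': ['Oven', 'AC', 'AC', 'Oven', 'TV', 'Tubewell'],
    'Target': [56000.0, 70000.0, 75000.0, 60000.0, None, None],
    'Sales': [58000.0, 68000.0, 78000.0, 61000.0, None, None],
})

# In-memory SQLite database used only as a stand-in for the MySQL 'test' database.
con = sqlite3.connect(":memory:")

# 'replace' drops and recreates the table each time: still 6 rows after two runs.
allDf.to_sql('sales', con, index=False, if_exists='replace')
allDf.to_sql('sales', con, index=False, if_exists='replace')

# 'append' adds to whatever is already there: 4 + 4 = 8 rows after two runs.
allDf.head(4).to_sql('sales2', con, index=False, if_exists='append')
allDf.head(4).to_sql('sales2', con, index=False, if_exists='append')

n_sales = int(pd.read_sql_query("SELECT COUNT(*) AS n FROM sales", con)['n'][0])
n_sales2 = int(pd.read_sql_query("SELECT COUNT(*) AS n FROM sales2", con)['n'][0])
print("sales:", n_sales, "rows; sales2:", n_sales2, "rows")
con.close()
```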
