Final 1

The document contains a table of contents listing 27 topics related to Python, Pandas, and data analysis. Some topics include creating Series objects from lists, modifying Series values, calculating totals from Series, plotting charts from Series and DataFrames, and SQL queries.

Uploaded by

Mahil Pawar
Copyright
© All Rights Reserved

Table of Contents -

S.No Topic Name


1 A Python list, namely section stores the section names…

2 Sequences section and contril store the section names…

3 Consider the two series objects s11 and s12 that you…

4 Consider the Series object s13 that stores…

5 Number of students in classes 11 and 12 in three streams…

6 Given a Series that stores the area of some states in km²

7 Find out the areas that are more than 50000 km²

8 Write a program to create a dataframe from a list

9 Write a program to create a dataframe from a list

10 Consider two series objects staff and salaries that

11 Given a DataFrame namely aid that stores the aid by NGOs

12 From the dtf5 used above, create another DataFrame

13 Marks is a list that stores marks of a student in 10 unit tests.

14 Tanushree is doing some research.

15 First 10 terms of a Fibonacci series are stored in a list namely fib:

16 Write a program to plot a bar chart from the medals won by Australia. In the same chart,
plot medals won by India too.

17 Write a program to plot a bar chart from the medals won by the top four countries.

18 Write a program to create a horizontal bar chart from two data sequences

19 Prof Awasthi is doing some research in the field of Environment.

20 Write a program to read from a CSV file Employee.csv

21 Consider the dataframe allDf as shown below:

22 QUERIES

23 TABLE GYM
24 TABLE SALESMAN

25 Give the output of following SQL statement:

26 TABLE SOFT DRINK

27 TABLE GARMENTS
1. A Python list, namely section stores the section names ('A', 'B', 'C', 'D') of class
12 in your school. Another list contri stores the contribution made by these
students to a charity fund endorsed by the school. Write code to create a Series
object that stores the contribution amount as the values and the section names as
the indexes.

import pandas as pd
section = ['A', 'B', 'C', 'D']
contri = [6700, 5600, 5000, 5200]
s11 = pd.Series (data = contri, index = section)
print (s11)

Output -
A 6700
B 5600
C 5000
D 5200
dtype: int64
2. Sequences section and contril store the section names ('A', 'B', 'C', 'D', 'E') and
the contributions made by them (6700, 5600, 5000, 5200, nil), respectively, for a charity.
Your school has decided to donate as much as each section has contributed,
i.e., the donation will be doubled.

Write code to create a Series object that stores the contribution amount as the
values and the section names as the indexes with datatype as float32.

import pandas as pd
import numpy as np
section = ['A', 'B', 'C', 'D', 'E']
# section E has made no contribution, so it is stored as NaN
contril = np.array([6700, 5600, 5000, 5200, np.nan])
s12 = pd.Series(data = contril, index = section, dtype = np.float32)
print(s12)

Output -

A 6700.0
B 5600.0
C 5000.0
D 5200.0
E NaN
dtype: float32
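
The statement of problem 2 says that the donation will be doubled. A minimal sketch of that step, assuming the doubled amounts are wanted as a separate Series (the name total is illustrative and not part of the original solution):

```python
import numpy as np
import pandas as pd

section = ['A', 'B', 'C', 'D', 'E']
contril = np.array([6700, 5600, 5000, 5200, np.nan])
s12 = pd.Series(data=contril, index=section, dtype=np.float32)

# Multiplying a Series by a scalar applies to every element;
# NaN stays NaN, so section E remains missing.
total = s12 * 2
print(total)
```

The index and the float32 datatype carry over to the result unchanged.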
3. Consider the two Series objects s11 and s12 that you created in problems 1 and 2
respectively. Print the attributes of both these objects in a report form as shown below.

import pandas as pd
# statements here to create objects s11 and s12 from the previous problems

print("Attribute name :\t Object s11 \t Object s12")
print("-------------- \t ---------- \t ----------")
print("Data type (.dtype) :\t", s11.dtype, '\t\t', s12.dtype)
print("Shape (.shape) :\t", s11.shape, '\t\t', s12.shape)
print("No. of bytes (.nbytes) :\t", s11.nbytes, '\t\t', s12.nbytes)
print("No. of dimensions (.ndim) :\t", s11.ndim, '\t\t', s12.ndim)
# Series.itemsize was removed in pandas 1.0; the dtype still exposes it
print("Item size (.dtype.itemsize) :\t", s11.dtype.itemsize, '\t\t', s12.dtype.itemsize)
print("Has NaNs? (.hasnans) :\t", s11.hasnans, '\t\t', s12.hasnans)
print("Empty? (.empty) :\t", s11.empty, '\t\t', s12.empty)

Output -
4. Consider the Series object s13 that stores the contribution of each section, as
shown below:

A 6700
B 5600
C 5000
D 5200

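Write code to modify the amount of section 'A' to 7600 and of sections 'C' and 'D'
to 7000. Print the changed object.

import pandas as pd
# Series object s13 as shown above
s13 = pd.Series([6700, 5600, 5000, 5200], index = ['A', 'B', 'C', 'D'])
s13['A'] = 7600
s13['C':'D'] = 7000 # a label slice includes both end points
print("Series object after modifying amounts:")
print(s13)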

Output -
5. The number of students in classes 11 and 12 in three streams ('Science', 'Commerce'
and 'Humanities') is stored in two Series objects c11 and c12. Write code to find the
total number of students in classes 11 and 12, stream wise.

SOLUTION

import pandas as pd
# creating Series objects
c11 = pd.Series(data = [30, 40, 50], index = ['Science', 'Commerce', 'Humanities'])
c12 = pd.Series(data = [37, 44, 45], index = ['Science', 'Commerce', 'Humanities'])
# adding two objects to get total no. of students
print("Total no. of students")
print(c11 + c12) # Series objects arithmetic

Output -
Total no. of students
Science 67
Commerce 84
Humanities 95
dtype: int64
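
The addition in problem 5 works because both Series carry the same index labels. As a hedged illustration of what happens otherwise, the sketch below changes c12 to a hypothetical set of streams (the 'Vocational' stream and its count are made up for the example) and compares plain + with add() and fill_value:

```python
import pandas as pd

c11 = pd.Series([30, 40, 50], index=['Science', 'Commerce', 'Humanities'])
# Hypothetical variation: class 12 has a Vocational stream but no Humanities
c12 = pd.Series([37, 44, 20], index=['Science', 'Commerce', 'Vocational'])

# Plain + matches values label by label; unmatched labels give NaN
print(c11 + c12)

# add() with fill_value treats a missing label as 0 instead
total = c11.add(c12, fill_value=0)
print(total)
```

Values are matched by label rather than by position, so the order of the labels in the two Series does not matter.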
6. Given a Series that stores the area of some states in km². Write code to find out
the biggest and smallest three areas from the given Series. Given series has been
created like this:

import pandas as pd
Ser1 = pd.Series( [34567, 890, 450, 67892, 34677, 78902, 256711, 678291,
637632, 25723, 2367, 11789, 345, 256517])
print("Top 3 biggest areas are :")
print(Ser1.sort_values().tail(3))
print("3 smallest areas are :")
print(Ser1.sort_values().head(3))

Output -
Top 3 biggest areas are :
6 256711
8 637632
7 678291
dtype: int64
3 smallest areas are :
12 345
2 450
1 890
dtype: int64
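
An alternative to sorting the whole Series is to use the nlargest() and nsmallest() methods of a pandas Series. This is only a sketch of another route to the same values, not the approach used in the solution above; note that nlargest() lists the biggest value first, so the order differs from the sorted output.

```python
import pandas as pd

Ser1 = pd.Series([34567, 890, 450, 67892, 34677, 78902, 256711, 678291,
                  637632, 25723, 2367, 11789, 345, 256517])

# nlargest/nsmallest return the n extreme values, most extreme first
biggest = Ser1.nlargest(3)
smallest = Ser1.nsmallest(3)
print("Top 3 biggest areas are :")
print(biggest)
print("3 smallest areas are :")
print(smallest)
```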
7. Find out the areas that are more than 50000 km²
import pandas as pd
Ser1 = pd.Series([34567, 890, 450, 67892, 34677, 78902, 256711, 678291,
637632, 25723, 2367, 11789, 345, 256517])
print(Ser1[Ser1 > 50000])

Output :-

3 67892
5 78902
6 256711
7 678291
8 637632
13 256517
dtype: int64
8. Write a program to create a dataframe from a list containing dictionaries of the
sales performance of four zonal offices. Zone names should be the row labels.

import pandas as pd
zoneA = {'Target' : 56000, 'Sales' : 58000}
zoneB = {'Target' : 70000, 'Sales' : 68000}
zoneC = {'Target' : 75000, 'Sales' : 78000}
zoneD = {'Target' : 60000, 'Sales' : 61000}
sales = [zoneA, zoneB, zoneC, zoneD]
saleDf = pd.DataFrame(sales,
index = ['zoneA', 'zoneB', 'zoneC', 'zoneD'])
print(saleDf)

Output -

Target Sales
zoneA 56000 58000
zoneB 70000 68000
zoneC 75000 78000
zoneD 60000 61000
9. Write a program to create a dataframe from a list containing 2 lists, each
containing Target and actual Sales figures of four zonal offices. Give appropriate
row labels.

import pandas as pd
Target = [56000, 70000, 75000, 60000]
Sales = [58000, 68000, 78000, 61000]
ZoneSales = [Target, Sales]
zsaleDf = pd.DataFrame(ZoneSales, columns = ['ZoneA', 'ZoneB', 'ZoneC', 'ZoneD'],
index = ['Target', 'Sales'])
print(zsaleDf)

Output -

ZoneA ZoneB ZoneC ZoneD
Target 56000 70000 75000 60000
Sales 58000 68000 78000 61000
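
Problem 8 uses the zones as row labels, whereas the frame in problem 9 has the zones as columns. If the same orientation is wanted, one option is to transpose the frame with .T. This is a sketch and not part of the original solution; the name byZone is illustrative.

```python
import pandas as pd

Target = [56000, 70000, 75000, 60000]
Sales = [58000, 68000, 78000, 61000]
zsaleDf = pd.DataFrame([Target, Sales],
                       columns=['ZoneA', 'ZoneB', 'ZoneC', 'ZoneD'],
                       index=['Target', 'Sales'])

# .T swaps rows and columns, giving one row per zone
byZone = zsaleDf.T
print(byZone)
```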
10. Consider two series objects staff and salaries that store the number of people in
various office branches and salaries distributed in these branches, respectively.
Write a program to create another Series object that stores average salary per
branch and then create a DataFrame object from these Series objects.

import pandas as pd
import numpy as np
staff=pd.Series([20,36,44])
salaries=pd.Series([27900,396800,563000])
avg=salaries/staff
org= {'people': staff, 'Amount': salaries, 'Average': avg}
dtf5= pd.DataFrame(org)
print(dtf5)

Output -
11. Given a DataFrame namely aid that stores the aid by NGOs for different states,
write a program to display the aid for:

(i) Books and Uniform only (ii) Shoes only

import pandas as pd
# DataFrame aid created or loaded
print("Aid for books and uniform:")
print(aid[['Books', 'Uniform']])
print("Aid for shoes: ")
print(aid.Shoes)

Output -

Aid for books and uniform:
Books Uniform
Andhra 6189 610
Odisha 8208 508
M.P. 6149 611
U.P. 6157 457
Aid for shoes:
Andhra 8810
Odisha 6798
M.P. 9611
U.P. 6457
Name: Shoes, dtype: int64


12. From the dtf5 used above, create another DataFrame that does not contain
the column 'Population' and the row 'Bangalore'.

import pandas as pd
# DataFrame dtf5 created or loaded
dtf6 = dtf5.copy() # work on a copy so that dtf5 stays unchanged
del dtf6['Population']
dtf6 = dtf6.drop(['Bangalore'])
print(dtf6)

Output -

Hospitals Schools
Delhi 189.0 7916.0
Mumbai 208.0 8508.0
Kolkata 149.0 7226.0
Chennai 157.0 7617.0
13. Marks is a list that stores marks of a student in 10 unit tests. Write a program to
plot the student's performance in these 10 unit tests.

import matplotlib.pyplot as plt
week = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Marks = [12, 10, 10, 15, 17, 25, 12, 22, 35, 40]
plt.plot(week, Marks)
plt.xlabel('Week')
plt.ylabel('Unit Test marks')
plt.show()
14. Tanushree is doing some research. She has stored a line of Pascal's triangle numbers
as ar2, as shown below:

ar2 = [1,7,21,35,35,21,7,1]

She wants to plot the sine (numpy.sin()), cosine (numpy.cos()) and tangent
(numpy.tan()) values for the same array (ar2).

She wants cyan color for the sine plot line, red color for the cosine plot line and black color for
the tangent plot line.

Also, the tangent line should be dashed.

Write a program to accomplish all this.

SOLUTION -

import matplotlib.pyplot as plt
import numpy as np
ar2 = [1, 7, 21, 35, 35, 21, 7, 1]
# calculating sin(), cos() and tan() values
s2 = np.sin(ar2)
c2 = np.cos(ar2)
t2 = np.tan(ar2)
# plotting line chart
plt.figure(figsize = (15, 7))
plt.plot(ar2, s2, 'c')
plt.plot(ar2, c2, 'r')
plt.plot(ar2, t2, 'k', linestyle = "dashed")
# Set the x axis label of the current axis.
plt.xlabel('Array values')
# Set the y axis label of the current axis.
plt.ylabel('Sine, Cosine and Tangent Values')
plt.show()

Output -
15. First 10 terms of a Fibonacci series are stored in a list namely fib:

fib= [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

Write a program to plot Fibonacci terms and their square-roots with two separate lines on
the same plot.

The Fibonacci series should be plotted as a cyan line with 'o' markers having size as 5 and
edge-color as red.

The square-root series should be plotted as a black line with '+' markers having size as 7
and edge-color as red.

SOLUTION -

import matplotlib.pyplot as plt
import numpy as np
fib= [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
sqfib= np.sqrt(fib)
plt.figure(figsize = (10, 7))
plt.plot(range(1,11), fib, 'co', markersize=5, linestyle = "solid", markeredgecolor='r')
plt.plot(range(1,11), sqfib, 'k+', markersize=7, linestyle = "solid", markeredgecolor='r')
plt.show()
Reference table 3.1.

Country       Gold  Silver  Bronze  Total
Australia       80      59      59    198
England         45      45      46    136
India           26      20      20     66
Canada          15      40      27     82
New Zealand     15      16      15     46
South Africa    13      11      13     37
Wales           10      12      14     36
Scotland         9      13      22     44
Nigeria          9       9       6     24
Cyprus           8       1       5     14
16. Consider the reference table 3.1.

Write a program to plot a bar chart from the medals won by Australia. In the same
chart, plot medals won by India too.

SOLUTION

import matplotlib.pyplot as plt
Info = ['Gold', 'Silver', 'Bronze', 'Total']
Australia = [80, 59, 59, 198]
India = [26, 20, 20, 66]
plt.bar(Info, Australia)
plt.bar(Info, India)
plt.xlabel("Medal type")
plt.ylabel("Australia, India Medal count")
plt.show()

OUTPUT -
17. Consider the reference table 3.1. Write a program to plot a bar chart from the
medals won by the top four countries. Make sure that bars are separately visible.

import matplotlib.pyplot as plt
import numpy as np
plt.figure(figsize = (10, 7))
Info = ['Gold', 'Silver', 'Bronze', 'Total']
Australia = [80, 59, 59, 198]
England = [45, 45, 46, 136]
India = [26, 20, 20, 66]
Canada = [15, 40, 27, 82]
X = np.arange(len(Info))
# shift each country's bars by one bar width so that they do not overlap
plt.bar(X, Australia, width = .15)
plt.bar(X + 0.15, England, width = .15)
plt.bar(X + 0.30, India, width = .15)
plt.bar(X + 0.45, Canada, width = .15)
plt.xticks(X + 0.225, Info) # medal names under the middle of each group
plt.show()
18. Write a program to create a horizontal bar chart from two data sequences as
given below:

means = [20, 35, 30, 35, 27]
stds = [2, 3, 4, 1, 2]

Make sure to show legends.

import matplotlib.pyplot as plt
import numpy as np
means = [20, 35, 30, 35, 27]
stds = [2, 3, 4, 1, 2]
indx = np.arange(len(means))
plt.barh(indx, means, color = 'cyan', label = 'means')
plt.barh(indx + 0.25, stds, color = 'olive', label = 'stds')
plt.legend()
plt.show()
OUTPUT -
19. Prof Awasthi is doing some research in the field of Environment. For some
plotting purposes, he has generated some data as:

import numpy as np
import matplotlib.pyplot as plt
mu = 100
sigma = 15
x = mu + sigma * np.random.randn(10000)
y = mu + 30* np.random.randn(10000)
plt.hist([x,y], bins = 100, histtype = 'barstacked')
plt.title('Research data Histogram')
plt.show()
20. Write a program to read from a CSV file Employee.csv and create a dataframe
from it. The dataframe should not use the file's column headers; it should use its
own column headings EmpID, EmpName, Designation and Salary. Also print the
maximum salary given to an employee.

Employee.csv
Empno, Name, Designation, Salary
1001, Trupti, Manager, 56000
1002, Raziya, Manager, 55900
1003, Simran, Analyst, 35000
1004, Silviya, Clerk, 25000
1005, Suji, PR Officer, 31000

SOLUTION

import pandas as pd
# skiprows = 1 skips the file's own header row; names supplies the new headings
edf = pd.read_csv("c:\\pywork\\Employee.csv",
names = ['EmpID', 'EmpName', 'Designation', 'Salary'], skiprows = 1)
print(edf)
print("Maximum salary is", edf.Salary.max())

Output -

EmpID EmpName Designation Salary
0 1001 Trupti Manager 56000
1 1002 Raziya Manager 55900
2 1003 Simran Analyst 35000
3 1004 Silviya Clerk 25000
4 1005 Suji PR Officer 31000

Maximum salary is 56000
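
To try the read_csv() call from problem 20 without a file on disk, the same rows can be held in a string. The sketch below assumes that approach; io.StringIO and the skipinitialspace argument are additions for the example and are not part of the original solution.

```python
import io
import pandas as pd

# Same rows as Employee.csv, held in memory so no file path is needed
csv_text = """Empno, Name, Designation, Salary
1001, Trupti, Manager, 56000
1002, Raziya, Manager, 55900
1003, Simran, Analyst, 35000
1004, Silviya, Clerk, 25000
1005, Suji, PR Officer, 31000
"""

edf = pd.read_csv(io.StringIO(csv_text),
                  names=['EmpID', 'EmpName', 'Designation', 'Salary'],
                  skiprows=1,             # skip the file's own header row
                  skipinitialspace=True)  # drop the space after each comma
print(edf)
print("Maximum salary is", edf.Salary.max())
```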


21. Consider the dataframe allDf as shown below. Write a program to save it as the CSV file all.csv.

       Name      Product   Target   Sales
zoneA  Purv      Oven      56000.0  58000.0
zoneB  Paschim   AC        70000.0  68000.0
zoneC  Kendriya  AC        75000.0  78000.0
zoneD  Dakshin   Oven      60000.0  61000.0
zoneE  Uttar     TV        NaN      NaN
zoneF  Rural     Tubewell  NaN      NaN

import pandas as pd
# DataFrame allDf created or loaded
allDf.to_csv("c:\\pywork\\all.csv")

Output (all.csv file)

Csv file -
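
The contents written by to_csv() can be inspected without opening the file. The sketch below rebuilds allDf from the table in problem 21 and calls to_csv() with no path, in which case pandas returns the CSV text as a string (the name csv_text is illustrative):

```python
import numpy as np
import pandas as pd

allDf = pd.DataFrame(
    {'Name': ['Purv', 'Paschim', 'Kendriya', 'Dakshin', 'Uttar', 'Rural'],
     'Product': ['Oven', 'AC', 'AC', 'Oven', 'TV', 'Tubewell'],
     'Target': [56000.0, 70000.0, 75000.0, 60000.0, np.nan, np.nan],
     'Sales': [58000.0, 68000.0, 78000.0, 61000.0, np.nan, np.nan]},
    index=['zoneA', 'zoneB', 'zoneC', 'zoneD', 'zoneE', 'zoneF'])

# With no path argument, to_csv returns the CSV text instead of writing a file
csv_text = allDf.to_csv()
print(csv_text)
```

The row labels go into an unnamed first column, and NaN values are written as empty fields by default.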
QUERIES -

Q1. Display all the records (all column) from table empl.

SOLUTION:

Select * from empl;

Q2. Display EmpNo and Ename of all employees from table empl.

SOLUTION:

select empno, ename
from empl;

Q3. Display Ename, Sal and Sal added with comm from table empl.

SOLUTION:

select ename, sal, sal+comm


from empl;

Q4. Write a query to display employee name, salary and department number who are not
getting commission from table empl.

SOLUTION:

select ename, sal, deptno


from empl
where comm is NULL;

Q5. Write a query to display employee number, name, salary and salary*12 as Annual
Salary for employees whose commission is not NULL from table empl.

SOLUTION:

select empno, ename, sal, sal*12 "Annual Salary"
from empl
where comm is NOT NULL;

Q6. List all department numbers in table empl.

SOLUTION:

select deptno
from empl;

Q7. List all unique department numbers in table empl.

SOLUTION:

select distinct deptno


from empl;

Q8. List details of all clerks who have not been assigned departments as yet.

SOLUTION:

select *
from empl
where job='clerk'
and deptno IS NULL;

Q9. List the details of those employees who have four lettered names.

SOLUTION:

select *
from empl
where ename like '____';

Q10. List the details of all employees whose annual salary is between 25000-40000.

SOLUTION:

select *
from empl
where sal*12 between 25000 and 40000;

Q11. How many job types are offered to employees?


SOLUTION:

SELECT COUNT(DISTINCT job)
from empl;

Q12. List the details of employees who earn more commission than their salaries.

SOLUTION:

select *
from empl
where comm>sal;

Q13. Write a query to display the name, job title and salary of employees who do not have a
manager.

SOLUTION:

select ename, job, sal from empl where mgr is null;

Q14. Write a query to display the name of employee whose name contains 'A' as third
alphabet.

SOLUTION:

select ename from empl where ename like '__A%';

Q15. Write a query to display the name of employee whose name contains 'T' as a last
alphabet.

SOLUTION:

select ename from empl where ename like '%T';

Q16. Write a query to display the name of employee who is having 'L' as any alphabet of the
name.

SOLUTION:

select ename from empl where ename like '%L%';
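
The LIKE patterns in Q9 and Q14 to Q16 rely on two wildcards: _ matches exactly one character and % matches any run of characters, including none. The sketch below checks the four patterns from Python. It uses the standard sqlite3 module as a stand-in for the database used in this file, and the employee names are hypothetical because the contents of empl are not shown here.

```python
import sqlite3

# Hypothetical names for illustration; the original empl data is not shown
names = ['ANYA', 'BINA', 'MOHIT', 'SCOTT', 'SHARAD', 'AMAL']

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE empl (ename TEXT)')
con.executemany('INSERT INTO empl VALUES (?)', [(n,) for n in names])

def matching(pattern):
    """Return the names that satisfy ename LIKE pattern, sorted."""
    rows = con.execute(
        'SELECT ename FROM empl WHERE ename LIKE ? ORDER BY ename', (pattern,))
    return [r[0] for r in rows]

four_letters = matching('____')   # exactly four characters
third_is_a = matching('__A%')     # any two characters, then A
ends_with_t = matching('%T')      # T as the last character
contains_l = matching('%L%')      # L anywhere in the name
print(four_letters, third_is_a, ends_with_t, contains_l)
```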
TABLE GYM

Q1. Write a query to create table GYM.

SOLUTION:

create table gym


(ICODE CHAR(5),
INAME VARCHAR(20),
PRICE INT,
BRANDNAME VARCHAR(20));

Q2. To display the names of all the items whose name start with "A".

SOLUTION:

select INAME from gym where INAME like 'A%';

Q3. To display the ICODES and INAMES of all items whose Brandname is Reliable or
Coscore.

SOLUTION:

select icode, iname from gym
where brandname in ("reliable", "coscore");

Q4. To change the brandname to "Fit Trend India" of the item whose ICODE is "G101".

SOLUTION:

update gym
set BRANDNAME="Fit Trend India"
where ICODE = "G101";
Q5. Add a new row for new item in gym with the details

"G107", "Vibro exerciser", 21000, "GTCFitness"

SOLUTION:

insert into gym


values ('G107', 'Vibro Exerciser',21000, 'GTCFitness');
Q1. Consider a table SALESMAN with the following data:

SALESMAN

SNO  SNAME          SALARY  BONUS  DATE OF JOIN
A01  Beena Mehta     30000  45.23  29-10-2019
A02  K. L. Sahay     50000  25.34  13-03-2018
B03  Nisha Thakkar   30000  35.0   18-03-2017
B04  Leela Yadav     80000  NULL   31-12-2018
C05  Gautam Gola     20000  NULL   23-01-1989
C06  Trapti Garg     70000  12.37  15-06-1987
D07  Neena Sharma    50000  27.89  18-03-1999

Write SQL queries using SQL functions to perform the following operations:

a) To display the salesmen's names and bonuses after rounding them off to zero
decimal places.

b) To display the position of the occurrence of the string 'ta' in the salesmen's
names.

c) To display four characters from the salesmen's names starting from the
second character.

d) To display the month name for the date of joining of the salesmen.

e) To display the name of the weekday for the date of joining of the salesmen.
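
No answers are given above for operations (a) to (e). In MySQL these are usually written with functions such as ROUND(), INSTR(), SUBSTR(), MONTHNAME() and DAYNAME(). As a cross-check from the Python side, the sketch below performs the same five operations with pandas on the SALESMAN data. The column name DOJ and the name report are illustrative, and the handling of exact halves when rounding may differ between pandas and a SQL engine.

```python
import pandas as pd

salesman = pd.DataFrame({
    'SNAME': ['Beena Mehta', 'K. L. Sahay', 'Nisha Thakkar', 'Leela Yadav',
              'Gautam Gola', 'Trapti Garg', 'Neena Sharma'],
    'BONUS': [45.23, 25.34, 35.0, None, None, 12.37, 27.89],
    'DOJ': ['29-10-2019', '13-03-2018', '18-03-2017', '31-12-2018',
            '23-01-1989', '15-06-1987', '18-03-1999'],
})
# Dates in the table are day-month-year
doj = pd.to_datetime(salesman['DOJ'], format='%d-%m-%Y')

report = pd.DataFrame({
    'name': salesman['SNAME'],
    'bonus_rounded': salesman['BONUS'].round(0),     # (a) NULL bonus stays NaN
    'pos_ta': salesman['SNAME'].str.find('ta') + 1,  # (b) 1-based, 0 if absent
    'chars_2_to_5': salesman['SNAME'].str[1:5],      # (c) four chars from the 2nd
    'month': doj.dt.month_name(),                    # (d) month name
    'weekday': doj.dt.day_name(),                    # (e) weekday name
})
print(report)
```

Here str.find() is case-sensitive, which is why 1 is added only to shift the zero-based position and not to change which names match.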
Given the following table Student1:

No.  Name     Stipend  Stream      AvgMark  Grade  Class
1    Karan    400.00   Medical     78.5     B      12B
2    Divakar  450.00   Commerce    89.2     A      11C
3    Divya    300.00   Commerce    68.6     C      12C
4    Arun     350.00   Humanities  73.1     B      12C
5    Sabina   500.00   Nonmedical  90.6     A      11A
6    John     400.00   Medical     75.4     B      12B
7    Robert   250.00   Humanities  61.4     C      11A
8    Rubina   450.00   Nonmedical  88.5     A      12A
9    Vikas    500.00   Nonmedical  92.0     A      12A
10   Mohan    300.00   Commerce    67.5     C      12C

Give the output of following SQL statement:

(i) SELECT TRUNCATE(AvgMark) FROM Student1 WHERE AvgMark < 75;
(ii) SELECT ROUND(AvgMark) FROM Student1 WHERE Grade = 'B';
(iii) SELECT CONCAT (Name, Stream) FROM Student1 WHERE Class = '12A';
(iv) SELECT RIGHT (Stream, 2) FROM Student1;

(i) It will return an error because TRUNCATE() needs a second argument giving the number of decimal places.

(ii) Output:-

ROUND(AvgMark)
79
73
75

(iii) Output:-

CONCAT (Name, Stream)
RubinaNonmedical
VikasNonmedical

(iv) Output:-

RIGHT (Stream, 2)
al
ce
ce
es
al
al
es
al
al
ce
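
The outputs of statements (ii) to (iv) can be reproduced from Python. The sketch below uses the standard sqlite3 module as a stand-in for the database used in this file, so two spellings are adapted: the || operator is used for concatenation and SUBSTR(Stream, -2) is used for the last two characters. Statement (i) is left out because it is expected to fail. SQLite returns the rounded marks as floating-point values such as 79.0.

```python
import sqlite3

rows = [
    (1, 'Karan', 'Medical', 78.5, 'B', '12B'),
    (2, 'Divakar', 'Commerce', 89.2, 'A', '11C'),
    (3, 'Divya', 'Commerce', 68.6, 'C', '12C'),
    (4, 'Arun', 'Humanities', 73.1, 'B', '12C'),
    (5, 'Sabina', 'Nonmedical', 90.6, 'A', '11A'),
    (6, 'John', 'Medical', 75.4, 'B', '12B'),
    (7, 'Robert', 'Humanities', 61.4, 'C', '11A'),
    (8, 'Rubina', 'Nonmedical', 88.5, 'A', '12A'),
    (9, 'Vikas', 'Nonmedical', 92.0, 'A', '12A'),
    (10, 'Mohan', 'Commerce', 67.5, 'C', '12C'),
]
con = sqlite3.connect(':memory:')
# Only the columns needed by statements (ii) to (iv) are created
con.execute('CREATE TABLE Student1 (No INTEGER, Name TEXT, Stream TEXT, '
            'AvgMark REAL, Grade TEXT, Class TEXT)')
con.executemany('INSERT INTO Student1 VALUES (?, ?, ?, ?, ?, ?)', rows)

def column(sql):
    """Run a one-column query and return its values as a list."""
    return [r[0] for r in con.execute(sql)]

rounded = column("SELECT ROUND(AvgMark) FROM Student1 "
                 "WHERE Grade = 'B' ORDER BY No")
joined = column("SELECT Name || Stream FROM Student1 "
                "WHERE Class = '12A' ORDER BY No")
last_two = column("SELECT SUBSTR(Stream, -2) FROM Student1 ORDER BY No")
print(rounded, joined, last_two)
```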
Q1. Consider the following table named "SOFTDRINK". Write commands of SQL for (i) to
(iv).

Table: SOFTDRINK

(i) To display names and drink codes of those drinks that have more than 120 calories.
(ii) To display drink codes, names and calories of all drinks, in descending order of calories.
(iii) To display names and price of drinks that have price in the range 12 to 18 (both 12 and
18 included).
(iv) Increase the price of all drinks in the given table by 10%.

1. To display names and drink codes of those drinks that have more than 120 calories.

Ans. SELECT DNAME, DRINKCODE FROM SOFTDRINK
WHERE CALORIES>120;

2. To display drink codes, names and calories of all drinks, in descending order of calories.

Ans. SELECT DRINKCODE, DNAME, CALORIES FROM SOFTDRINK
ORDER BY CALORIES DESC;

3. To display names and price of drinks that have price in the range of 12 to 18 (both 12 and 18 included).

Ans. SELECT DNAME, PRICE FROM SOFTDRINK
WHERE PRICE BETWEEN 12 AND 18;

4. Increase the price of all drinks in the given table by 10%.

Ans. UPDATE SOFTDRINK
SET PRICE = PRICE+0.10*PRICE;
Q2. Consider the following table named “GARMENTS”.

Table: GARMENTS

1. To display the names of those garments that are available in 'XL' size.

Ans. SELECT GNAME
FROM GARMENTS
WHERE SIZE='XL';

2. To display codes and names of those garments that have their names starting with 'Ladies'.

Ans. SELECT GCODE, GNAME
FROM GARMENTS
WHERE GNAME LIKE 'Ladies%';

3. To display garment names, codes and prices of those garments that have price in the range 1000.00 to 1500.00 (both 1000.00 and 1500.00 included).

Ans. SELECT GNAME, GCODE, PRICE
FROM GARMENTS
WHERE PRICE BETWEEN 1000.00 AND 1500.00;

4. To change the colour of the garment with code 116 to 'orange'.

Ans. UPDATE GARMENTS
SET COLOUR='orange'
WHERE GCODE = 116;
