Ip - Capsule
Ans:
0 1
1 2
2 2
3 7
4 Sachin
dtype: object
0 1
1 2
2 2
dtype: object
2 Write a program in Python to find the maximum value over the index axis in a data frame.
Ans:
# importing pandas as pd
import pandas as pd
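A minimal sketch of such a program, assuming a small illustrative data frame (the column names and values below are hypothetical, not taken from the question paper):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({"A": [12, 4, 5], "B": [7, 2, 54], "C": [20, 16, 11]})

# axis=0 walks down the index, so the result holds one maximum per column
col_max = df.max(axis=0)
print(col_max)
```

Passing axis=1 instead would give one maximum per row.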
3 What is the purpose of each of the following statements?
1. df.columns
2. df.iloc[ : , :-5]
3. df[2:8]
4. df[ :]
5. df.iloc[ : -4 , : ]
Ans:
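1. It displays the column labels of the dataframe.
2. It displays all rows and all columns except the last 5 columns.
3. It displays all columns for the rows at positions 2 to 7.
4. It displays the entire dataframe with all rows and columns.
5. It displays all rows except the last 4 rows, with all columns.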
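The five statements can be tried on a small frame. This is only a sketch: the 10-row, 7-column frame and its column names c0 to c6 are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical frame: 10 rows, 7 columns named c0..c6, arbitrary values
df = pd.DataFrame({f"c{j}": list(range(j, j + 10)) for j in range(7)})

print(df.columns)        # column labels
a = df.iloc[:, :-5]      # all rows, every column except the last 5
b = df[2:8]              # rows at positions 2 to 7, all columns
c = df[:]                # the whole frame
d = df.iloc[:-4, :]      # every row except the last 4, all columns

print(a.shape, b.shape, c.shape, d.shape)
```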
4 Write a Python program to sort the following data in ascending order
of Age.
Name Age Designation
Sanjeev 37 Manager
Keshav 42 Clerk
Rahul 38 Accountant
Ans:
import pandas as pd
name=pd.Series(['Sanjeev','Keshav','Rahul'])
age=pd.Series([37,42,38])
designation=pd.Series(['Manager','Clerk','Accountant'])
d1={'Name':name,'Age':age,'Designation':designation}
df=pd.DataFrame(d1)
print(df)
df1=df.sort_values(by='Age')
print(df1)
Ans:
import pandas as pd
name=pd.Series(['Sanjeev','Keshav','Rahul'])
age=pd.Series([37,42,38])
designation=pd.Series(['Manager','Clerk','Accountant'])
d1={'Name':name,'Age':age,'Designation':designation}
df=pd.DataFrame(d1)
print(df)
df2=df.sort_values(by='Name',ascending=False)
print(df2)
6 Which of the following can be data in pandas?
1. A Python dictionary
2. An ndarray
3. A scalar value
4. All of the above
Ans:
4. All of the above
Ans:
3. Value,size
Ans:
1. True
Ans:
4. None
Ans:
1. Dataframe
11 What will be the output of df.iloc[3:7,3:6]?
Ans:
It will display the rows at positions 3 to 6 and the columns at positions 3 to 5
of the dataframe 'df'.
12 How do you select the rows where age is missing?
1. df[df['age'].isnull]
2. df[df['age']==NaN]
3. df[df['age']==0]
4. None
Ans:
'BidPrice':[13,12,7,10,17,15],
'Runs':[1000,2400,900,200,3600,3700]}
df=pd.DataFrame(d)
print(df)
print(df.iloc[:2,:])
print(df.iloc[-3:,:])
14 Write a command to find the most expensive player.
Ans:
print(df[df['BidPrice']==df['BidPrice'].max()])
15 Write a command to sort all players according to BidPrice.
Ans:
print(df.sort_values(by='BidPrice'))
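Both commands can be checked on a small frame. In this sketch the BidPrice and Runs values are the ones listed above, while the Name column uses hypothetical placeholders (P1 to P6) because the player names are not available here.

```python
import pandas as pd

d = {"Name": ["P1", "P2", "P3", "P4", "P5", "P6"],   # hypothetical placeholder names
     "BidPrice": [13, 12, 7, 10, 17, 15],
     "Runs": [1000, 2400, 900, 200, 3600, 3700]}
df = pd.DataFrame(d)

# Row(s) whose BidPrice equals the column maximum
top = df[df["BidPrice"] == df["BidPrice"].max()]
print(top)

# Players in ascending order of BidPrice
by_price = df.sort_values(by="BidPrice")
print(by_price)
```

Filtering on equality with max() keeps every row that ties for the highest bid, which idxmax() would not.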
1. Mathematician
2. Statistician
3. Software Programmer
4. All of the above
Ans:
4. All of the above
18 What is the built-in database used for Python?
1. Mysql
2. Pysqlite
3. Sqlite3
4. Pysqln
Ans:
3. Sqlite3
19 How can you drop columns in python that contain NaN?
Ans:
df1.dropna(axis=1)
20 How can you drop all rows that contain NaN?
Ans:
df1.dropna(axis=0)
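The difference between the two axis values can be seen on a small frame. This is a sketch only; the frame df1 below is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with NaN in columns x and z
df1 = pd.DataFrame({"x": [1.0, np.nan, 3.0],
                    "y": [4, 5, 6],
                    "z": [7.0, 8.0, np.nan]})

no_nan_cols = df1.dropna(axis=1)   # drops columns x and z, keeps y
no_nan_rows = df1.dropna(axis=0)   # drops rows 1 and 2, keeps row 0

print(no_nan_cols)
print(no_nan_rows)
```

dropna() returns a new frame; df1 itself is unchanged unless the result is assigned back.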
21 A Series is ______ array, which is labelled and ______ type.
Ans:
Ans:
4 All
Ans:
4.6
25 How many rows will the resultant data frame have?
import pandas as pd
df1=pd.DataFrame({'key':['a','b','c','d'], 'value':[1,2,3,4]})
df2=pd.DataFrame({'key':['a','b','e','b'], 'value':[5,6,7,8]})
df3=df1.merge(df2, on='key', how='inner')
1. 3
2. 4
3. 5
4. 6
Ans:
1. 3
26 How many rows will the resultant data frame have?
import pandas as pd
df1=pd.DataFrame({'key':['a','b','c','d'], 'value':[1,2,3,4]})
df2=pd.DataFrame({'key':['a','b','e','b'], 'value':[5,6,7,8]})
df3=df1.merge(df2, on='key', how='right')
1. 3
2. 4
3. 5
4. 6
Ans:
2. 4
27 How many rows will the resultant data frame have?
import pandas as pd
df1=pd.DataFrame({'key':['a','b','c','d'], 'value':[1,2,3,4]})
df2=pd.DataFrame({'key':['a','b','e','b'], 'value':[5,6,7,8]})
df3=df1.merge(df2, on='key', how='left')
1. 3
2. 4
3. 5
4. 6
Ans:
3. 5
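The three row counts can be confirmed by running the merges. A short sketch using the same df1 and df2 (the counts dictionary is just a helper for this illustration):

```python
import pandas as pd

df1 = pd.DataFrame({"key": ["a", "b", "c", "d"], "value": [1, 2, 3, 4]})
df2 = pd.DataFrame({"key": ["a", "b", "e", "b"], "value": [5, 6, 7, 8]})

# Number of rows produced by each join type
counts = {how: len(df1.merge(df2, on="key", how=how))
          for how in ("inner", "right", "left")}
print(counts)
```

The key 'b' appears twice in df2, so the single 'b' row of df1 matches two rows. That is why the inner join gives 3 rows rather than 2, and the left join gives 5 rather than 4.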
Ans:
pop()
29 A ______ is an interactive way to quickly summarize large amounts of data.
Ans:
Pivoting
30 The ______ method is used to rename the existing indexes in a data frame.
Ans:
rename
31 The ______ attribute can prevent the sort_values() method from creating
a new data frame.
Ans:
inplace
32 Write a program in Python to calculate the sum of marks in the CS subject in
the given dataset-
'CS':[45,55,78,95,99,97], 'IP':[87,89,98,94,78,77]
Ans:
import pandas as pd
d1={'CS':[45,55,78,95,99,97], 'IP':[87,89,98,94,78,77]}
df=pd.DataFrame(d1)
print(df['CS'].sum())
33 Write a Python program to create a data frame with headings (CS and IP) from
the list given below-
[[79,92],[86,96],[85,91],[80,99]]
Ans:
import pandas as pd
l=[[79,92],[86,96],[85,91],[80,99]]
df=pd.DataFrame(l,columns=['CS','IP'])
print(df)
34 How can you find the total number of rows and columns in a data frame?
Ans:
df.shape
35 MaxTemp MinTemp City RainFall
45 30 Delhi 25.6
34 24 Guwahati 41.5
48 34 Chennai 36.8
32 22 Bangluru 40.2
44 29 Mumbai 38.5
39 37 Jaipur 24.9
Ans:
print(df.sum(axis=0))
36 Based on the above data frame df, Write a command to compute mean of
column MaxTemp.
Ans:
print(df['MaxTemp'].mean())
37 Based on the above data frame df, Write a command to compute average
MinTemp, RainFall for first 4 rows.
Ans:
df[['MinTemp', 'RainFall']][:4].mean()
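To try the commands from questions 36 and 37, the table can be typed in as a data frame. A minimal sketch, assuming the column names are spelled exactly as in the table header (MaxTemp, MinTemp, City, RainFall):

```python
import pandas as pd

df = pd.DataFrame({
    "MaxTemp": [45, 34, 48, 32, 44, 39],
    "MinTemp": [30, 24, 34, 22, 29, 37],
    "City": ["Delhi", "Guwahati", "Chennai", "Bangluru", "Mumbai", "Jaipur"],
    "RainFall": [25.6, 41.5, 36.8, 40.2, 38.5, 24.9],
})

# Mean of one column
max_mean = df["MaxTemp"].mean()
print(max_mean)

# Mean of two columns over the first 4 rows only
first4 = df[["MinTemp", "RainFall"]][:4].mean()
print(first4)
```

Column labels are case-sensitive, so 'Rainfall' would raise a KeyError on this frame.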
38 Which method is used to read data from a MySQL database into a data
frame?
Ans:
read_sql_query()
Ans:
execute()
40 What will be the output of following code?
import pandas as pd
df = pd.DataFrame([45,50,41,56], index = [True, False, True, False])
print(df.iloc[True])
Ans:
It will raise a TypeError with a message like "Cannot index by location index with a
non-integer key", because iloc accepts only integer positions.
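The behaviour can be reproduced with a try/except block. This is only a sketch; the variable raised is a helper introduced for the illustration.

```python
import pandas as pd

df = pd.DataFrame([45, 50, 41, 56], index=[True, False, True, False])

try:
    df.iloc[True]            # True is a bool, not an integer position
    raised = None
except TypeError as exc:
    raised = str(exc)

print(raised)

# An integer position works regardless of the index labels
first_value = df.iloc[0, 0]
print(first_value)
```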
Two functions for pivoting are: pivot() and pivot_table()
48. Write Python code to create a dataframe with appropriate headings from the
list given below:
['S101', 'Amy', 70], ['S102', 'Risha', 69], ['S104', 'Susan', 75], ['S105', 'George', 82]
Ans import pandas as pd
L=[['S101','Amy',70], ['S102','Risha',69], ['S104','Susan',75], ['S105','George',82]]
df=pd.DataFrame(L,index=[1,2,3,4],columns=['ID','Name','Points'])
print(df)
49. Consider the following dataframe, and answer the questions given below:
import pandas as pd
df = pd.DataFrame({"Quarter1":[2000, 4000, 5000, 4400, 10000],
"Quarter2":[5800, 2500, 5400, 3000, 2900],
"Quarter3":[20000, 16000, 7000, 3600, 8200],
"Quarter4":[1400, 3700, 1700, 2000, 6000]})
Write the code to find mean value from above dataframe df over the index and
column axis. (Skip NaN value)
Ans print(df.mean(axis=0,skipna=True))
print(df.mean(axis=1,skipna=True))
50. Use sum() function to find the sum of all the values over the index axis.
Ans print(df.sum(axis=0))
51. Find the median of the dataframe df.
Ans print(df.median())
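The answers to questions 49 to 51 can be run together on the quarterly data. In this sketch the result names (col_mean, row_mean, col_sum, col_median) are introduced only for the illustration.

```python
import pandas as pd

df = pd.DataFrame({"Quarter1": [2000, 4000, 5000, 4400, 10000],
                   "Quarter2": [5800, 2500, 5400, 3000, 2900],
                   "Quarter3": [20000, 16000, 7000, 3600, 8200],
                   "Quarter4": [1400, 3700, 1700, 2000, 6000]})

col_mean = df.mean(axis=0, skipna=True)   # over the index: one mean per quarter
row_mean = df.mean(axis=1, skipna=True)   # over the columns: one mean per row
col_sum = df.sum(axis=0)                  # one total per quarter
col_median = df.median()                  # one median per quarter (axis=0 by default)

print(col_mean)
print(row_mean)
print(col_sum)
print(col_median)
```

axis=0 collapses the rows and leaves one value per column; axis=1 collapses the columns and leaves one value per row.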
52. Find the output of the following code:
import pandas as pd
data = [{'a': 10, 'b': 20},{'a': 6, 'b': 32, 'c': 22}]
df1 = pd.DataFrame(data,columns=['a','b'])
df2 = pd.DataFrame(data,columns=['a','b1'])
print(df1)
print(df2)
Ans
    a   b
0  10  20
1   6  32
    a  b1
0  10 NaN
1   6 NaN
53.
54. To add dataframes df1 and df2.
Ans print(df1.add(df2))
56. To change index label of df1 from 0 to zero and from 1 to one.
Ans df1=df1.rename(index={0:'zero',1:'one'})
58. For the given code fill in the blanks so that we get the desired output with
maximum value for Quantity and Average Value for Cost:
import pandas as pd
import numpy as np
d={'Product':['Apple','Pear','Banana','Grapes'],'Quantity':[100,150,200,250],
'Cost':[1000,1500,1200,900]}
df = pd.DataFrame(d)
df1 = ______
print(df1)
Quantity 250.0
Cost 1150.0
dtype: float64
Ans
df1=pd.Series([df['Quantity'].max(),df['Cost'].mean()],index=['Quantity','Cost'])
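The desired output ends with a dtype line and has no column header, which is how a Series prints. A sketch of two ways to produce it, assuming either is acceptable for the blank (df1_alt is a name introduced here for comparison):

```python
import pandas as pd

d = {"Product": ["Apple", "Pear", "Banana", "Grapes"],
     "Quantity": [100, 150, 200, 250],
     "Cost": [1000, 1500, 1200, 900]}
df = pd.DataFrame(d)

# Option 1: build the Series by hand
df1 = pd.Series([df["Quantity"].max(), df["Cost"].mean()],
                index=["Quantity", "Cost"])
print(df1)

# Option 2: let agg() apply a different function to each column
df1_alt = df.agg({"Quantity": "max", "Cost": "mean"})
print(df1_alt)
```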
Ans import pandas as pd
df1=pd.DataFrame({'Icecream':['Vanila','ButterScotch','Caramel'] ,
'Cookies':['Goodday','Britannia', 'Oreo']})
df2=pd.DataFrame({'Chocolate':['DairyMilk','Kitkat'],
'Icecream':['Vanila','ButterScotch'],'Cookies':['Hide and Seek','Britannia']})
df2.reindex_like(df1)
print(df2)
Chocolate Icecream Cookies
0 DairyMilk Vanila Hide and Seek
1 Kitkat ButterScotch Britannia
Ans print(df1.add(df2))
64. To sort df1 by Second column in descending order.
Ans df1=df1.sort_values(by='Second',ascending=False)
Ans df2=df2.rename(index={0:'a',1:'b',2:'c',3:'d'})
66. To display those rows in df1 where value of third column is more than 45.
Ans print(df1[df1['Third']>45])
72. Write a command to display the names of students who have qualified.
Ans print(df[df['qualify']=='yes'].name)
73. Consider the following DataFrame df and answer the questions given below:
Ans df=df.rename(index={0:'Zero',1:'One',2:'Two',3:'Three'})
74. Write command to add one more row to the data frame with data [5,12,33,3]
75.
Emp_ID Name Dept Salary Status
100 Kabir IT 34000 Regular
110 Rishav Finance 28500 Regular
120 Seema IT 13500 Contract
130 David IT 41000 Regular
140 Ruchi HRD 17000 Contract
Consider the above Data frame as df.
Write a Python Code to calculate the average salary of the Regular employees
and the Contract employees separately.
Ans print(df.groupby('Status')['Salary'].mean())
76. Write a Python Code to print the dataframe in the descending order of Salary.
Ans df=df.sort_values(by='Salary',ascending=False)
print(df)
77. Write a Python Code to update the Salary of all Contract employees to Rs
19000
Ans df.loc[df.Status=='Contract','Salary']=19000
78. Write a Python Code to display the maximum salary of the “Contract” staff.
Ans print(df[df['Status']=='Contract']['Salary'].max())
Ans print(df.iloc[3:4,:])
81. Write a Python Code to display the maximum salary of all employees in the
‘IT’ department.
Ans print(df[df.Dept=='IT']['Salary'].max())
82. Write a Python Code to delete the 1st and the last record.
Ans df=df.drop([0,4])
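The employee questions can be tried on the table from question 75. A minimal sketch, assuming the default integer index 0 to 4; the names updated and trimmed are introduced here so that the original df stays unchanged.

```python
import pandas as pd

df = pd.DataFrame({
    "Emp_ID": [100, 110, 120, 130, 140],
    "Name": ["Kabir", "Rishav", "Seema", "David", "Ruchi"],
    "Dept": ["IT", "Finance", "IT", "IT", "HRD"],
    "Salary": [34000, 28500, 13500, 41000, 17000],
    "Status": ["Regular", "Regular", "Contract", "Regular", "Contract"],
})

# Average salary per Status; selecting Salary first avoids averaging text columns
avg_by_status = df.groupby("Status")["Salary"].mean()
print(avg_by_status)

# Maximum salary among Contract staff and within the IT department
max_contract = df.loc[df["Status"] == "Contract", "Salary"].max()
max_it = df.loc[df["Dept"] == "IT", "Salary"].max()
print(max_contract, max_it)

# Update Contract salaries with .loc, which avoids chained assignment
updated = df.copy()
updated.loc[updated["Status"] == "Contract", "Salary"] = 19000

# Delete the first and last records by index label
trimmed = df.drop([0, 4])
print(trimmed)
```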
Ans print(df[df>50].count().sum())
85. Write Python Code to count the number of even numbers and number of odd
numbers in the dataframe.
Ans print('No of Even Numbers:',df[df%2==0].count().sum())
print('No of Odd Numbers:',df[df%2==1].count().sum())
Ans print(df.groupby('State')['Sales'].sum())
Ans print(df.groupby('employee')['Sales'].sum())
89. Write Python Program to find average sales on both employee and state wise.
Ans print(df.groupby(['employee','State'])['Sales'].mean())
90. Write Python Program to find mean,median and minimum sale statewise.
Ans print(df.groupby('State')['Sales'].mean())
print(df.groupby('State')['Sales'].median())
print(df.groupby('State')['Sales'].min())
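A short sketch of these grouped statistics. The sales data frame for these questions is not reproduced here, so every value below is hypothetical; only the column names employee, State and Sales come from the answers above.

```python
import pandas as pd

# Entirely hypothetical data, for illustration only
df = pd.DataFrame({
    "employee": ["Asha", "Asha", "Ravi", "Ravi"],
    "State": ["Delhi", "Goa", "Delhi", "Delhi"],
    "Sales": [100, 200, 300, 500],
})

# Average sales for each (employee, State) pair
by_both = df.groupby(["employee", "State"])["Sales"].mean()
print(by_both)

# Mean, median and minimum sale per State in a single call
state_stats = df.groupby("State")["Sales"].agg(["mean", "median", "min"])
print(state_stats)
```

Selecting the Sales column before aggregating keeps text columns such as employee out of the calculation.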