Python MCQs
1. Scenario: You need to load Departments.csv into a DataFrame df1 and Tools.csv into df2. Which library
must be imported first?
o b) import numpy as np
o c) import pandas as pd Answer: c) import pandas as pd
2. Scenario: After loading df1 and df2 using pd.read_csv(), you want to verify the first row of df1. Which
command achieves this?
o a) print(df1.first())
o b) print(df1.loc[1])
o c) print(df1.head(1)) Answer: c) (df1.first() requires a date offset, and df1.loc[1] selects the second row under the default integer index)
3. Scenario: Imagine Tools.csv has 10 columns and 500 rows (excluding the header). After loading it into df2,
what would df2.shape return?
o a) (10, 500)
o b) (500, 10)
o c) (501, 10) Answer: b) (500, 10) (rows first, then columns; the header row is not counted)
4. Scenario: If Departments.csv failed to load correctly due to an incorrect file path, what type of error would
Python typically raise?
o a) TypeError
o b) ValueError
o c) FileNotFoundError Answer: c) FileNotFoundError
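The loading steps behind Q1-Q4 can be sketched as follows. This is a minimal, self-contained illustration: the two tiny CSVs are invented stand-ins written to a temporary directory, since the real Departments.csv and Tools.csv are not available here.

```python
import os
import tempfile
import pandas as pd

# Invented stand-in files; the assignment uses the real CSVs instead.
data_dir = tempfile.mkdtemp()
dep_path = os.path.join(data_dir, 'Departments.csv')
tool_path = os.path.join(data_dir, 'Tools.csv')
pd.DataFrame({'Abb': ['HR', 'ED'],
              'Department': ['Health', 'Education']}).to_csv(dep_path, index=False)
pd.DataFrame({'Abb': ['HR', 'ED', 'ED'],
              'Tool': ['T1', 'T2', 'T3']}).to_csv(tool_path, index=False)

try:
    df1 = pd.read_csv(dep_path)   # Q1/Q2: pandas and read_csv
    df2 = pd.read_csv(tool_path)
except FileNotFoundError as err:  # Q4: a wrong path raises this
    raise SystemExit(f'Check the file path: {err}')

print(df1.head(1))   # Q2: verify the first row
print(df2.shape)     # Q3: (rows, columns), header excluded
```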
5. Scenario: You need to merge df1 (Departments) and df2 (Tools) keeping all rows from df2. Which
pd.merge parameter is crucial for this?
o a) how='inner'
o c) how='outer', on='Abb' Answer: c) (an outer merge keeps all rows of df2; how='right', with df2 as the second argument, would keep exactly df2's rows)
6. Scenario: After the merge, merged_df.isnull().sum() reports 50 for the 'Department' column and 0 for every other column. What does this output indicate?
o d) There are 50 missing (NaN) values specifically in the 'Department' column. Answer: d)
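A small sketch of how unmatched merge keys produce the NaNs described in Q5-Q6. The frames are invented; the key column 'Abb' follows the scenario.

```python
import pandas as pd

df1 = pd.DataFrame({'Abb': ['HR', 'ED'],
                    'Department': ['Health', 'Education']})
df2 = pd.DataFrame({'Abb': ['HR', 'ED', 'XX'],
                    'Tool': ['T1', 'T2', 'T3']})

# Keep every row of df2: put df2 on the left and merge with how='left'.
merged_df = pd.merge(df2, df1, how='left', on='Abb')

# Rows of df2 whose 'Abb' has no match in df1 get NaN in 'Department'.
missing = merged_df['Department'].isnull().sum()
print(missing)  # 1 in this toy frame; 50 in the Q6 scenario
```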
7. Scenario: You need to create a Python dictionary dept_map where keys are 'Abb' and values are
'Department' from df1 to help fill missing values. Assuming 'Abb' is unique in df1, which code snippet
works?
o a) dept_map = df1.groupby('Abb')['Department'].to_dict()
o c) dept_map = df1.set_index('Abb')['Department'].to_dict() Answer: c) (option a fails because a GroupBy object has no .to_dict() method)
8. Scenario: To fill missing 'Department' values in merged_df using the dept_map created previously, which is
the most appropriate Pandas method?
o a) merged_df['Department'].fillna(merged_df['Abb'].apply(lambda x: dept_map.get(x)),
inplace=True)
o b) merged_df['Department'] = merged_df['Department'].replace(np.nan,
merged_df['Abb'].map(dept_map))
o c) merged_df['Department'].fillna(merged_df['Abb'].map(dept_map), inplace=True)
o d) merged_df.loc[merged_df['Department'].isnull(), 'Department'] =
merged_df['Abb'].map(dept_map) Answer: c)
merged_df['Department'].fillna(merged_df['Abb'].map(dept_map), inplace=True) (option d also works; a reaches the same result through a less direct apply; b uses replace, which is not the standard way to fill NaN from a mapping).
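Q7 and Q8 together form one fill-the-gaps step, sketched below with an invented frame. Note that plain assignment is preferred over inplace=True in recent pandas.

```python
import pandas as pd

df1 = pd.DataFrame({'Abb': ['HR', 'ED'],
                    'Department': ['Health', 'Education']})
merged_df = pd.DataFrame({'Abb': ['HR', 'ED', 'ED'],
                          'Department': ['Health', None, 'Education']})

# Q7: unique 'Abb' -> 'Department' lookup (option c)
dept_map = df1.set_index('Abb')['Department'].to_dict()

# Q8: map the key column through the dict, then fill only the gaps.
merged_df['Department'] = merged_df['Department'].fillna(
    merged_df['Abb'].map(dept_map))

print(merged_df['Department'].isnull().sum())
```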
9. Scenario: After filling missing values, you want to confirm there are no NaNs left in the entire DataFrame
merged_df. Which command returns True if there are absolutely no missing values?
o a) merged_df.isnull().sum().sum() == 0
o b) merged_df.notnull().all().all()
o c) merged_df.isna().any().any() == False Answer: all three expressions evaluate to True when merged_df has no missing values; a) is the most common idiom.
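The three checks in Q9 are equivalent, which a quick sketch on a small NaN-free frame (invented data) confirms:

```python
import pandas as pd

clean = pd.DataFrame({'Abb': ['HR', 'ED'], 'Tool': ['T1', 'T2']})

checks = [
    clean.isnull().sum().sum() == 0,   # total NaN count is zero
    clean.notnull().all().all(),       # every cell is non-null
    not clean.isna().any().any(),      # no column has any NaN
]
print(all(checks))
```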
10. Scenario: You need to save the cleaned merged DataFrame to a CSV file, excluding the DataFrame index.
Which parameter in to_csv() achieves this?
o a) index=False
o b) header=False
o c) save_index=False Answer: a) index=False (save_index is not a real to_csv parameter)
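A round-trip sketch of Q10, writing to a temporary file (the '12345-' CRN prefix is just the scenario's naming convention): with index=False, no 'Unnamed: 0' column appears on reload.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({'Abb': ['HR', 'ED'], 'Tool': ['T1', 'T2']})

# index=False keeps the row index out of the file (Q10).
path = os.path.join(tempfile.mkdtemp(), '12345-Cleaned.csv')
df.to_csv(path, index=False)

reloaded = pd.read_csv(path)
print(list(reloaded.columns))
```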
11. Scenario: You need to remove duplicate rows based only on the combination of 'Abb' and 'Tool' columns in
merged_df. Which command is correct?
o a) unique_df = merged_df.drop_duplicates()
o b) unique_df = merged_df.drop_duplicates(subset=['Abb', 'Tool']) Answer: b) (a compares entire rows rather than only the two key columns)
12. Scenario: merged_df has shape (1000, 12). After running unique_df =
merged_df.drop_duplicates(subset=['Abb', 'Tool']), unique_df.shape is (950, 12). How many rows were
identified as duplicates and removed?
o a) 950
o b) 12
o c) 1000
o d) 50 Answer: d) 50 (1000 - 950 = 50 rows were dropped)
13. Scenario: When removing duplicates using drop_duplicates(subset=['Abb', 'Tool']), which duplicate row is
kept by default?
o d) No rows are kept if duplicates exist. Answer: b) The first occurring row (controlled by the keep
parameter, which defaults to 'first').
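Q11-Q13 can be demonstrated on one invented frame: duplicates are judged on the ('Abb', 'Tool') pair only, and keep='first' (the default) retains the first occurrence.

```python
import pandas as pd

merged_df = pd.DataFrame({
    'Abb':  ['HR', 'HR', 'ED', 'ED'],
    'Tool': ['T1', 'T1', 'T2', 'T2'],
    'Note': ['first', 'second', 'first', 'second'],
})

# Q11/Q13: de-duplicate on the subset; the first row of each pair survives.
unique_df = merged_df.drop_duplicates(subset=['Abb', 'Tool'])

# Q12: the shape difference quantifies the removed rows.
print(merged_df.shape, unique_df.shape)
```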
14. Scenario: You save the unique_df to a CSV named '12345-Unique.csv', where 12345 is your CRN. Which
pandas function call achieves this?
o a) unique_df.save_csv('12345-Unique.csv', index=False)
o b) unique_df.to_excel('12345-Unique.csv', index=False)
o c) unique_df.to_csv('12345-Unique.csv', index=False) Answer: c)
15. Scenario: You need to find the department with the highest variety (count of unique values) of 'Analysis'
types using the unique_df. Which code snippet finds the count of unique analysis types per department?
o a) unique_df.groupby('Department')['Analysis'].count()
o b) unique_df.groupby('Department')['Analysis'].value_counts()
o c) unique_df.groupby('Department')['Analysis'].nunique()
o d) unique_df['Department'].nunique() Answer: c)
unique_df.groupby('Department')['Analysis'].nunique()
16. Scenario: Following the previous question, how do you get the name of the department with the maximum
unique count? Let the result of the previous step be stored in a Series analysis_variety.
o a) analysis_variety.max()
o b) analysis_variety.idxmax()
o c) analysis_variety.sort_values(ascending=False).index[0] Answer: b) analysis_variety.idxmax() (c also works but is less direct; a returns the maximum count, not the department name)
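Q15 and Q16 chain together as below, on an invented frame. idxmax() returns the index label (the department name) holding the maximum, not the maximum value itself.

```python
import pandas as pd

unique_df = pd.DataFrame({
    'Department': ['Health', 'Health', 'Health', 'Education'],
    'Analysis':   ['Descriptive', 'Predictive', 'Descriptive', 'Descriptive'],
})

# Q15: count of distinct analysis types per department
analysis_variety = unique_df.groupby('Department')['Analysis'].nunique()

# Q16: the label of the maximum, i.e. the department name
print(analysis_variety.idxmax())
```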
17. Scenario: Task (d)(ii) asks for the "percentage of updating of each tool". Assuming the 'Updated' column
contains Boolean values (True/False) or 1/0, how could you calculate the percentage of entries for each
unique tool name that are marked as 'Updated' (True/1)?
o a) unique_df.groupby('Tool')['Updated'].mean() * 100
o b) unique_df['Updated'].value_counts(normalize=True) * 100
o c) unique_df.groupby('Tool')['Updated'].sum() / unique_df.groupby('Tool')['Updated'].count() *
100
o d) Both a and c Answer: d) Both a and c (mean() on boolean/1-0 data calculates the proportion of
True/1s).
18. Scenario: If a specific tool 'ToolX' appears 10 times in unique_df, and 3 of these entries have Updated ==
True, what would unique_df.groupby('Tool')['Updated'].mean().loc['ToolX'] return?
o a) 3
o b) 0.3
o c) 30
o d) 7 Answer: b) 0.3 (The mean of [True, True, True, False, False, False, False, False, False, False]
treated as [1, 1, 1, 0, 0, 0, 0, 0, 0, 0] is 3/10 = 0.3).
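The 'ToolX' arithmetic from Q17-Q18, reproduced on an invented frame: the mean of a boolean column is the fraction of True values, so mean() * 100 is the percentage updated per tool.

```python
import pandas as pd

unique_df = pd.DataFrame({
    'Tool':    ['ToolX'] * 10,
    'Updated': [True] * 3 + [False] * 7,
})

# 3 True out of 10 entries -> mean 0.3 -> 30 percent
pct_updated = unique_df.groupby('Tool')['Updated'].mean() * 100
print(pct_updated.loc['ToolX'])
```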
19. Scenario: To create a bar chart showing the count of tools per department, you first need to calculate these
counts. Which code prepares the data counts_per_dept for plotting?
o a) counts_per_dept = unique_df.groupby('Department')['Tool'].nunique()
o b) counts_per_dept = unique_df['Department'].value_counts()
o c) counts_per_dept = unique_df.groupby('Department').size()
o d) Both b and c Answer: d) Both b and c (value_counts() on the Department column or grouping by
Department and using size() or count() will give the number of rows/tool entries per department).
20. Scenario: You have the counts_per_dept Series. Which command using pandas plotting interface generates
the required vertical bar chart?
o a) counts_per_dept.plot(kind='pie')
o b) counts_per_dept.plot(kind='bar')
o c) counts_per_dept.plot(kind='line') Answer: b) counts_per_dept.plot(kind='bar')
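The two equivalent count computations from Q19 can be checked directly (invented frame). The plotting call itself needs matplotlib, so it is shown only in a comment here.

```python
import pandas as pd

unique_df = pd.DataFrame({
    'Department': ['Health', 'Health', 'Education'],
    'Tool':       ['T1', 'T2', 'T3'],
})

# Q19: both forms count rows (tool entries) per department.
counts_per_dept = unique_df['Department'].value_counts()
same = unique_df.groupby('Department').size()

# Q20: counts_per_dept.plot(kind='bar') would draw the vertical
# bar chart (requires matplotlib).
print(counts_per_dept.sort_index().equals(same.sort_index()))
```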
21. Scenario: For the pie chart of the 'Analysis' column distribution, what data does the size of each slice
represent?
o d) The number of tools performing that analysis type. Answer: c) The relative frequency
(percentage) of each analysis type in the dataset.
22. Scenario: Which code generates the data needed for the 'Analysis' pie chart?
o a) analysis_counts = unique_df.groupby('Analysis').size()
o b) analysis_counts = unique_df['Analysis'].value_counts()
o c) analysis_counts = unique_df['Analysis'].unique() Answer: b) (a produces the same counts, just not sorted by frequency; c returns only the labels, with no counts)
23. Scenario: Task (e)(iii) requires a bar plot showing the number of tools marked as "Updated". If you
interpret this as comparing the total count of 'Updated' entries vs 'Not Updated' entries, what data source
(Series) would you plot?
o a) unique_df['Updated'].value_counts()
o b) unique_df.groupby('Updated').size()
o c) unique_df.groupby('Tool')['Updated'].sum() Answer: a) (b is equivalent; c gives per-tool totals rather than the overall Updated vs Not Updated comparison).
24. Scenario: With the result stored in updated_counts, which command draws it as a bar plot?
o a) updated_counts.plot(kind='pie')
o b) updated_counts.plot(kind='barh')
o c) updated_counts.plot(kind='line') Answer: b) (kind='barh' gives a horizontal bar plot; kind='bar' would give a vertical one)
25. Scenario: If the 'Date' column was loaded as strings instead of datetime objects, which Pandas function is
used to convert it correctly?
o a) pd.to_datetime(unique_df['Date'])
o b) unique_df['Date'].astype('datetime64[ns]')
o c) pd.convert_dtypes(unique_df['Date']) Answer: a) pd.to_datetime(unique_df['Date']) (b also works for clean data, but to_datetime is the standard, more flexible parser; c is not a valid call)
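Q25 and the later year-filtering question (Q44) fit in one sketch with invented dates: convert once with pd.to_datetime, then use the .dt accessor.

```python
import pandas as pd

unique_df = pd.DataFrame({'Date': ['2020-03-01', '2018-07-15', '2020-11-30']})

# Q25: parse strings into datetime64 values.
unique_df['Date'] = pd.to_datetime(unique_df['Date'])

# Q44: once converted, .dt.year enables year-based filtering.
n_2020 = (unique_df['Date'].dt.year == 2020).sum()
print(n_2020)
```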
26. Scenario: You want to calculate the standard deviation of the number of tools used per department. Which
sequence of operations is needed?
o b) Group by 'Department', count 'Tool' (size()), then apply .std() to the resulting Series.
o d) Use unique_df.describe() and find the 'std' row for 'Department'. Answer: b) Group by
'Department', count 'Tool' (size()), then apply .std() to the resulting Series.
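The two-step sequence from Q26, on an invented frame: collapse to one count per department, then take the standard deviation of that Series (pandas uses the sample std, ddof=1, by default).

```python
import pandas as pd

unique_df = pd.DataFrame({
    'Department': ['Health'] * 4 + ['Education'] * 2,
    'Tool':       ['T1', 'T2', 'T3', 'T4', 'T5', 'T6'],
})

# Step 1: tools per department (Education: 2, Health: 4)
tools_count = unique_df.groupby('Department').size()

# Step 2: spread of those counts across departments
spread = tools_count.std()
print(spread)
```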
27. Scenario: Suppose you want to find if there's a correlation between the number of tools a department uses
and the variety of analysis types it employs. Which correlation method in Pandas would be suitable after
calculating these two series (tools_count and analysis_variety)?
o a) tools_count.corr(analysis_variety)
o c) np.correlate(tools_count, analysis_variety)
o d) Both a and b provide the correlation coefficient between the two measures. Answer: d) Both a
and b provide the correlation coefficient between the two measures.
28. Scenario: Which NumPy function could be used to efficiently check if any value in the 'Updated' column
(once converted to boolean) is True?
o a) np.sum(unique_df['Updated'])
o b) np.any(unique_df['Updated'])
o c) np.all(unique_df['Updated']) Answer: b) np.any() (a returns the count of True values; c checks whether every value is True)
29. Scenario: Imagine you want to select all rows from unique_df where the 'Tool desc' column contains the
word "AI". Which Pandas string method is appropriate?
o a) unique_df[unique_df['Tool desc'].contains('AI')]
o b) unique_df[unique_df['Tool desc'].str.contains('AI')] Answer: b) (string methods live under the .str accessor; plain .contains() does not exist on a Series)
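A sketch of Q29's substring filter on invented descriptions; na=False guards against missing descriptions.

```python
import pandas as pd

unique_df = pd.DataFrame({
    'Tool':      ['Analyzer', 'Mapper'],
    'Tool desc': ['AI-based analysis tool', 'plots city maps'],
})

# Boolean mask of rows whose description mentions "AI"
ai_rows = unique_df[unique_df['Tool desc'].str.contains('AI', na=False)]
print(len(ai_rows))
```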
30. Scenario: To calculate the median 'first used' Date for tools within each 'Analysis' type, you would group
by 'Analysis' and then apply which aggregation function to the 'Date' column (assuming it's datetime)?
o a) .mean()
o b) .count()
o c) .median()
o d) .mode() Answer: c) .median()
31. Scenario: If you create a new column 'Years Since First Use' based on the 'Date' column (datetime) and the
current date (pd.Timestamp.now()), which expression calculates this approximately?
o a) (pd.Timestamp.now() - unique_df['Date']).dt.years
o b) (pd.Timestamp.now().year - unique_df['Date'].dt.year)
o d) Both b and c provide valid ways to estimate years (b is simpler integer difference, c is more
precise float). Answer: d) Both b and c provide valid ways to estimate years (b is simpler integer
difference, c is more precise float).
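Both estimates from Q31, side by side on invented dates. A fixed timestamp stands in for pd.Timestamp.now() so the output is reproducible; in the assignment you would use now() directly. Note that option a's .dt.years does not exist on a timedelta.

```python
import pandas as pd

unique_df = pd.DataFrame({'Date': pd.to_datetime(['2015-06-01', '2020-01-01'])})
now = pd.Timestamp('2025-01-01')  # stand-in for pd.Timestamp.now()

# Option b: simple integer year difference
years_simple = now.year - unique_df['Date'].dt.year

# The float estimate: elapsed days divided by the mean year length
years_float = (now - unique_df['Date']).dt.days / 365.25

print(list(years_simple))
```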
32. Scenario: How does applying a plain Python function row by row (e.g. .apply(..., axis=1) or an explicit loop) compare with vectorized operations?
o d) Primarily used for merging DataFrames. Answer: b) Less efficient than vectorized operations in Pandas/NumPy.
33. Scenario: If df1 had 10 departments and df2 had tools used by only 8 of these departments, what would be
the result of an inner merge on 'Abb'?
o d) An error because not all keys match. Answer: b) Rows corresponding to only the 8 departments
present in df2.
34. Scenario: Which Python data structure is returned by df1.set_index('Abb')['Department'] before calling
.to_dict()?
o a) A NumPy array
o b) A Python list
o c) A Pandas DataFrame
o d) A Pandas Series Answer: d) (selecting a single column yields a Series; .to_dict() then uses its index as the keys)
35. Scenario: To find tools used only by the 'Education' department, which approach is most direct using
pandas filtering and grouping?
36. Scenario: If the 'Updated' column contained strings like 'Yes', 'No', 'YES', 'no', what's the first step using
pandas string methods before mapping to Boolean?
o a) unique_df['Updated'].str.upper()
o b) unique_df['Updated'].str.lower()
o c) unique_df['Updated'].str.capitalize() Answer: b) (a works equally well; the point is to normalize case before mapping, e.g. {'yes': True, 'no': False})
37. Scenario: Displaying the .shape before and after removing duplicates directly quantifies:
o b) The number of rows removed due to duplication based on the specified columns.
o d) The change in memory usage of the DataFrame. Answer: b) The number of rows removed due to
duplication based on the specified columns.
38. Scenario: Which library is most commonly used alongside Pandas for numerical operations and underpins
many Pandas functionalities?
o a) Matplotlib
o b) SciPy
o c) NumPy Answer: c) NumPy
39. Scenario: If you were asked to build a function that takes a department name as input and returns a list of
unique tools used by that department from unique_df, which code structure would be appropriate?
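A minimal sketch of the function Q39 asks for, with an invented frame and a hypothetical helper name tools_for_department:

```python
import pandas as pd

unique_df = pd.DataFrame({
    'Department': ['Health', 'Health', 'Education'],
    'Tool':       ['T1', 'T1', 'T2'],
})

def tools_for_department(df, department):
    """Return the sorted unique tools used by one department."""
    mask = df['Department'] == department
    return sorted(df.loc[mask, 'Tool'].unique())

print(tools_for_department(unique_df, 'Health'))
```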
40. Scenario: To quickly get summary statistics (count, mean, std, min, max, quartiles) for numerical columns
potentially present in unique_df (if any existed), which Pandas method is used?
o a) .info()
o b) .describe()
o c) .head() Answer: b) .describe()
42. Scenario: You want to filter unique_df to show only the tools used by the 'Health' department OR the
'Education' department. Which code works?
o b) unique_df[unique_df['Department'].isin(['Health', 'Education'])] Answer: b) (isin() tests membership against a list of values)
43. Scenario: After finding the department with the highest variety of 'Analysis' types (let's say its name is
stored in dept_max_variety), how would you filter unique_df to show only the rows corresponding to this
specific department?
o a) unique_df[unique_df['Department'] == dept_max_variety]
o d) unique_df.groupby('Department').get_group(dept_max_variety) Answer: a) (d also works but is indirect for a simple filter)
44. Scenario: Imagine the 'Date' column is already converted to datetime objects. How would you find the
number of tools first used specifically in the year 2020?
o a) unique_df[unique_df['Date'].dt.year == 2020].shape[0]
o c) sum(unique_df['Date'].dt.year == 2020)
o d) All of the above (with a note about potential KeyError for b) Answer: d) All of the above (with a
note about potential KeyError for b) - a and c are generally safer.
45. Scenario: You want to count how many unique tools have a description ('Tool desc') longer than 50
characters. Which is the correct approach? Answer: unique_df.loc[unique_df['Tool desc'].str.len() > 50, 'Tool'].nunique()
47. Scenario: You need to create a Series showing the most frequent 'Analysis' type used by each department.
Which combination of Pandas methods is most suitable?
o c) unique_df.groupby('Department')['Analysis'].describe()['top']
o d) Both a (with handling for ties) and c provide a way to get the most frequent type per department.
Answer: d) Both a (with handling for ties) and c provide a way to get the most frequent type per
department. (c is often simpler if only one mode is needed).
48. Scenario: What pandas command would select all columns except for 'Tool desc' and 'Output' from
unique_df?
o b) unique_df.select(lambda col: col not in ['Tool desc', 'Output'], axis=1) (select is not standard) Answer: unique_df.drop(columns=['Tool desc', 'Output'])
49. Scenario: If you wanted to see if any 'Tool' name appears within its own 'Tool desc' column (e.g., tool
'Analyzer' is mentioned in its description), how might you check this for the first 10 rows? (Requires
combining columns row-wise)
o a) unique_df.head(10).apply(lambda row: row['Tool'] in row['Tool desc'], axis=1)
o b) unique_df['Tool'].head(10).isin(unique_df['Tool desc'].head(10)) (Checks if Tool name equals
description)
o c) [unique_df['Tool'][i] in unique_df['Tool desc'][i] for i in range(10)] (Assuming default integer
index)
o d) Both a and c (a is more robust to index changes) Answer: d) Both a and c (a is more robust to
index changes)
50. Scenario: You want to replace the 'Analysis' type 'Descriptive' with 'Summary' and 'Predictive' with
'Forecast' only in the 'Analysis' column of unique_df. Which command works best?
o a) unique_df['Analysis'].replace({'Descriptive': 'Summary', 'Predictive': 'Forecast'}, inplace=True)
o b) unique_df['Analysis'].map({'Descriptive': 'Summary', 'Predictive': 'Forecast'}) (Map replaces non-
matching with NaN unless default is set)
o c) unique_df.replace({'Analysis': {'Descriptive': 'Summary', 'Predictive': 'Forecast'}}, inplace=True)
o d) Both a and c Answer: d) Both a and c (a targets the specific column, c targets the value within the
specified column dictionary).
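Q50's two valid forms share one property worth seeing: values with no mapping are left untouched by replace, unlike .map(), which turns them into NaN. A sketch with an invented extra category:

```python
import pandas as pd

unique_df = pd.DataFrame(
    {'Analysis': ['Descriptive', 'Predictive', 'Diagnostic']})

# Option c: a nested dict scopes the replacements to one column;
# 'Diagnostic' has no mapping and survives unchanged.
unique_df = unique_df.replace(
    {'Analysis': {'Descriptive': 'Summary', 'Predictive': 'Forecast'}})

print(list(unique_df['Analysis']))
```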
Python Practical MCQs — Al-Beruni City Case Study Context
MCQ 1
Which is the correct way to import the Pandas library for this assignment?
A) import panda as pd
B) import pandas as pd
C) import pandas.dataframe
D) from pandas import *
Answer: B
MCQ 2
If the file Departments.csv is located in the working directory, which code will load it into DataFrame df1?
A) df1 = pd.readfile('Departments.csv')
B) df1 = pd.read_csv('Departments.csv')
C) df1 = pd.load_csv('Departments.csv')
D) df1 = pd.readfile_csv('Departments.csv')
Answer: B
MCQ 3
After loading Tools.csv into df2, which command displays only the first row?
A) df2.head()
B) df2.loc[0]
C) df2.iloc[0]
D) df2.head(1)
Answer: D
MCQ 4
Which Python library is essential for numerical analysis and array operations in this assignment?
A) Matplotlib
B) Pandas
C) NumPy
D) Seaborn
Answer: C
MCQ 5
Which of the following is the correct command to display column names of df1?
A) df1.column_names()
B) df1.columns
C) df1.column()
D) df1.col_names
Answer: B
MCQ 6
To merge df1 and df2 so that all rows of df2 appear in the merged data:
Answer: C
MCQ 7
Which command counts the missing values in each column of df?
A) df.isnull().sum()
B) df.isnull.count()
C) df.checknull()
D) df.isna()
Answer: A
MCQ 8
If 'Department' column has missing values and we have a dictionary mapping, which method should be used to fill
it?
A) fillna()
B) map().fillna()
C) replace()
D) dropna()
Answer: B
MCQ 9
After filling missing values, which command ensures there are no missing values?
A) df.isna().sum()
B) df.isnull().sum() == 0
C) df.dropna()
D) df.notnull().sum()
Answer: B
MCQ 10
Which command saves df to a CSV file without writing the row index?
A) df.to_csv('filename.csv')
B) df.to_csv('filename.csv', index=True)
C) df.save_csv('filename.csv')
D) df.to_csv('filename.csv', index=False)
Answer: D
MCQ 11
Answer: A
MCQ 12
Which command returns the number of rows and columns of df?
A) df.shape()
B) df.size()
C) df.shape
D) df.count()
Answer: C
MCQ 13
Which command saves the de-duplicated DataFrame as 'CRN-Unique.csv' without the index?
A) df.to_csv('CRN-Unique.csv', index=False)
B) df.save('CRN-Unique.csv')
C) df.save_csv('CRN-Unique.csv')
D) df.to_csv('CRN_Unique.csv')
Answer: A
MCQ 14
To find the department with the highest variety of 'Analysis' types:
A) df['Analysis'].value_counts()
B) df.groupby('Department')['Analysis'].nunique().idxmax()
C) df.groupby('Analysis')['Department'].count()
D) df['Department'].nunique()
Answer: B
MCQ 15
Answer: A
MCQ 16
Which command correctly draws a bar chart of the number of tools per department?
A) df['Department'].plot(kind='bar')
B) df['Department'].value_counts().plot(kind='bar')
C) sns.barplot(x='Department', y='Tool Name', data=df)
D) plt.bar(df['Department'], df['Tool Name'])
Answer: B
MCQ 17
Which command draws a pie chart of the 'Analysis' column distribution?
A) df['Analysis'].value_counts().plot.pie()
B) plt.pie(df['Analysis'])
C) df['Analysis'].plot(kind='pie')
D) sns.pieplot(df['Analysis'])
Answer: A
MCQ 18
Which command plots the count of records for each value of the 'Updated' column?
A) sns.countplot(x='Updated', data=df)
B) sns.barplot(x='Updated', y='Tool Name', data=df)
C) df['Updated'].plot(kind='bar')
D) plt.bar('Updated', 'Tool Name')
Answer: A
MCQ 19
Which Matplotlib command rotates the x-axis tick labels by 45 degrees?
A) plt.xticks(rotation=45)
B) plt.xlabels(45)
C) plt.xlabel(rotation=45)
D) plt.rotate(45)
Answer: A
MCQ 20
Which pie-chart parameter displays the percentage value on each slice?
A) autopct='%1.1f%%'
B) data_label='inside'
C) plt.labels(inside=True)
D) df.plot.pie(labels='inside')
Answer: A
MCQ 21
To find total number of tools used per department after removing duplicates:
A) df['Department'].value_counts()
B) df.groupby('Department')['Tool Name'].count()
C) df.groupby('Tool Name')['Department'].count()
D) df['Tool Name'].count()
Answer: B
MCQ 22
Which command computes the mean of the 'Population' column?
A) df['Population'].mean()
B) df.mean('Population')
C) np.mean(df['Population'])
D) Both A and C
Answer: D
MCQ 23
Answer: D
MCQ 24
Which method computes the pairwise correlation between the numerical columns of df?
A) df.corr()
B) df.correlation()
C) df.cov()
D) df.group_corr()
Answer: A
MCQ 25
Which command selects the rows where the 'Updated' column equals 'No'?
A) df[df['Updated'] == 'No']
B) df.loc[df['Updated'] == 'No']
C) df.query("Updated == 'No'")
D) All of the above
Answer: D
MCQ 26
Answer: D
MCQ 27
Answer: D
MCQ 28
Which command computes the median of the 'Population' column?
A) df['Population'].median()
B) df.median('Population')
C) np.median(df['Population'])
D) Both A and C
Answer: D
MCQ 29
Which command removes leading and trailing whitespace from a string column?
A) df['Column'] = df['Column'].str.strip()
B) df['Column'] = df['Column'].strip()
C) df['Column'] = df.strip('Column')
D) df['Column'].remove_whitespace()
Answer: A
MCQ 30
Which command fills missing values with the string 'Unknown'?
A) df.fillna('Unknown')
B) df.replace(np.nan, 'Unknown')
C) df['Column'] = df['Column'].fillna('Unknown')
D) All of the above
Answer: D
MCQ 31
Which pandas function returns basic statistics like count, mean, std, min, max?
A) df.stats()
B) df.describe()
C) df.summary()
D) df.explain()
Answer: B
MCQ 32
Which command resets the DataFrame index in place and discards the old index?
A) df.reset_index()
B) df.reset_index(drop=True, inplace=True)
C) df.index_reset()
D) df.drop_index()
Answer: B
MCQ 33
Which chart type is most appropriate for showing the distribution of a continuous numerical variable?
A) Line Chart
B) Histogram
C) Pie Chart
D) Bar Chart
Answer: B
MCQ 34
Answer: A
MCQ 35
Which Python library provides the Apriori algorithm for market basket analysis?
A) pandas
B) numpy
C) mlxtend
D) seaborn
Answer: C
MCQ 36
Which technique is used to predict future values from historical, time-ordered data?
A) Regression
B) Clustering
C) Classification
D) Time Series Forecasting
Answer: D
MCQ 37
Answer: B
MCQ 38
Which command shows the column data types and non-null counts of df?
A) df.info()
B) df.summary()
C) df.describe()
D) df.columns
Answer: A
MCQ 39
Answer: C
MCQ 40
Which command computes the total 'Population' for each department?
A) df.groupby('Department')['Population'].sum()
B) df.sum('Population').groupby('Department')
C) df.group('Department').sum('Population')
D) groupby(df['Department'])
Answer: A
MCQ 41
Answer: A
MCQ 42
Which technique is used to assign records to predefined categories?
A) Clustering
B) Classification
C) Time Series
D) PCA
Answer: B
MCQ 43
Which command drops only the rows in which every value is missing?
A) df.dropna(how='all')
B) df.dropna(all=True)
C) df.drop_allna()
D) df.remove_blank_rows()
Answer: A
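The how='all' behavior is easy to verify on a small invented frame: a row with some values present survives, while a fully empty row is dropped.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, np.nan, np.nan],
    'B': [2, np.nan, 5],
})

# how='all' drops only rows where *every* value is missing;
# the row with a single NaN in 'A' is kept.
df = df.dropna(how='all')
print(df.shape)
```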
MCQ 44
Answer: A
MCQ 45
Which command computes the variance of the 'Population' column?
A) df['Population'].var()
B) np.var(df['Population'])
C) df.var('Population')
D) Both A and B
Answer: D
MCQ 46
MCQ 47
A) df.tail()
B) df.head()
C) df[-5:]
D) Both A and C
Answer: D
MCQ 48
For a supply-chain optimization model in Python, which technique is typically used?
A) Linear Programming
B) Clustering
C) Regression
D) Market Basket
Answer: A
MCQ 49
Which command shows the data type of each column in df?
A) df.dtypes
B) df.types()
C) df.datatypes()
D) df.columns.dtypes
Answer: A
MCQ 50
Which command correctly converts a string 'Date' column to datetime?
A) pd.to_datetime(df['Date'])
B) df['Date'].astype('datetime')
C) df['Date'].convert('datetime')
D) Both A and B
Answer: A