Python MCQs

The document contains multiple-choice questions (MCQs) on Python data analysis with Pandas, covering data loading, merging, handling missing values, removing duplicates, data analysis, visualization, and advanced scenarios. Each question presents a practical scenario with options, and the correct answer is provided for each. The document serves as a study guide for anyone looking to strengthen their data analysis skills with Python and Pandas.

Al-Beruni City: Python Data Analysis MCQs (Practical Scenarios)

Section 1: Data Loading and Initial Inspection

1. Scenario: You need to load Departments.csv into a DataFrame df1 and Tools.csv into df2. Which library
must be imported first?

o a) import matplotlib.pyplot as plt

o b) import numpy as np

o c) import pandas as pd

o d) import csv Answer: c) import pandas as pd

2. Scenario: After loading df1 and df2 using pd.read_csv(), you want to verify the first row of df1. Which
command achieves this?

o a) print(df1.first())

o b) print(df1.loc[1])

o c) print(df1.head(1))

o d) print(df1.iloc[1]) Answer: c) print(df1.head(1))

3. Scenario: Imagine Tools.csv has 10 columns and 500 rows (excluding the header). After loading it into df2,
what would df2.shape return?

o a) (10, 500)

o b) (500, 10)

o c) (501, 10)

o d) (500, 11) Answer: b) (500, 10)

4. Scenario: If Departments.csv failed to load correctly due to an incorrect file path, what type of error would
Python typically raise?

o a) TypeError

o b) ValueError

o c) FileNotFoundError

o d) KeyError Answer: c) FileNotFoundError
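
A minimal sketch tying Questions 1-4 together, assuming Departments.csv and Tools.csv sit in the working directory:

import pandas as pd

try:
    df1 = pd.read_csv('Departments.csv')  # departments table
    df2 = pd.read_csv('Tools.csv')        # tools table
except FileNotFoundError as err:
    print(f'Bad path: {err}')             # raised when the file path is wrong
else:
    print(df1.head(1))                    # first data row of df1
    print(df2.shape)                      # (rows, columns), header row excluded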

Section 2: Merging and Missing Values

5. Scenario: You need to merge df1 (Departments) and df2 (Tools) keeping all rows from df2. Which
pd.merge parameter is crucial for this?

o a) how='inner'

o b) how='left', left_on='Abb', right_on='Abb' (assuming df1 is left)

o c) how='outer', on='Abb'

o d) how='right', left_on='Abb', right_on='Abb' (assuming df1 is left) Answer: d) how='right', left_on='Abb', right_on='Abb' (keeps all rows from the right DataFrame, df2)
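
A sketch of the right merge from Question 5, assuming both frames carry the shared key column 'Abb':

merged_df = pd.merge(df1, df2, how='right', left_on='Abb', right_on='Abb')
# equivalent here, since the key name matches in both frames:
# merged_df = pd.merge(df1, df2, how='right', on='Abb')
print(len(merged_df) >= len(df2))  # True: every row of df2 is retained
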
6. Scenario: After merging, you run merged_df.isnull().sum(). The output shows Department 50. What does
this indicate?

o a) The 'Department' column has 50 unique values.

o b) 50 rows have missing values across all columns.

o c) The 'Department' column contains the string "50" in some rows.

o d) There are 50 missing (NaN) values specifically in the 'Department' column. Answer: d) There
are 50 missing (NaN) values specifically in the 'Department' column.

7. Scenario: You need to create a Python dictionary dept_map where keys are 'Abb' and values are
'Department' from df1 to help fill missing values. Assuming 'Abb' is unique in df1, which code snippet
works?

o a) dept_map = df1.groupby('Abb')['Department'].to_dict()

o b) dept_map = dict(df1[['Abb', 'Department']].values)

o c) dept_map = df1.set_index('Abb')['Department'].to_dict()

o d) dept_map = {row['Abb']: row['Department'] for index, row in df1.iterrows()} Answer: c) dept_map = df1.set_index('Abb')['Department'].to_dict() (Option 'd' also works but iterrows() is generally slower than vectorized methods like 'c'. Option 'b' also builds the dictionary, since dict() accepts an iterable of key/value pairs. Option 'a' is invalid: a GroupBy object has no to_dict() method.)

8. Scenario: To fill missing 'Department' values in merged_df using the dept_map created previously, which is
the most appropriate Pandas method?

o a) merged_df['Department'].fillna(merged_df['Abb'].apply(lambda x: dept_map.get(x)),
inplace=True)

o b) merged_df['Department'] = merged_df['Department'].replace(np.nan,
merged_df['Abb'].map(dept_map))

o c) merged_df['Department'].fillna(merged_df['Abb'].map(dept_map), inplace=True)

o d) merged_df.loc[merged_df['Department'].isnull(), 'Department'] = merged_df['Abb'].map(dept_map) Answer: c) merged_df['Department'].fillna(merged_df['Abb'].map(dept_map), inplace=True) (Option 'd' also works; 'a' uses the less direct apply(); 'b' uses replace(), which is not the standard way to fill NaNs from a map.)
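
Putting Questions 7 and 8 together, a hedged sketch that builds the map and fills the gaps (plain assignment is used instead of inplace=True, which modern pandas discourages on a single column):

dept_map = df1.set_index('Abb')['Department'].to_dict()
merged_df['Department'] = merged_df['Department'].fillna(merged_df['Abb'].map(dept_map))
# the check from Question 9; holds only if every 'Abb' in merged_df appears in dept_map
print(merged_df['Department'].isnull().sum() == 0)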

9. Scenario: After filling missing values, you want to confirm there are no NaNs left in the entire DataFrame
merged_df. Which command returns True if there are absolutely no missing values?

o a) merged_df.isnull().sum().sum() == 0

o b) merged_df.notnull().all().all()

o c) merged_df.isna().any().any() == False

o d) All of the above Answer: d) All of the above

10. Scenario: You need to save the cleaned merged DataFrame to a CSV file, excluding the DataFrame index.
Which parameter in to_csv() achieves this?

o a) index=False

o b) header=False
o c) save_index=False

o d) no_index=True Answer: a) index=False

Section 3: Removing Duplicates

11. Scenario: You need to remove duplicate rows based only on the combination of 'Abb' and 'Tool' columns in
merged_df. Which command is correct?

o a) unique_df = merged_df.drop_duplicates()

o b) unique_df = merged_df.drop_duplicates(subset=['Abb', 'Tool'])

o c) unique_df = merged_df.remove_duplicates(on=['Abb', 'Tool'])

o d) unique_df = merged_df[~merged_df.duplicated(subset=['Abb', 'Tool'])] Answer: b) unique_df = merged_df.drop_duplicates(subset=['Abb', 'Tool']) (Option 'd' also works, but 'b' is more direct.)

12. Scenario: merged_df has shape (1000, 12). After running unique_df =
merged_df.drop_duplicates(subset=['Abb', 'Tool']), unique_df.shape is (950, 12). How many rows were
identified as duplicates and removed?

o a) 950

o b) 12

o c) 1000

o d) 50 Answer: d) 50 (1000 - 950)

13. Scenario: When removing duplicates using drop_duplicates(subset=['Abb', 'Tool']), which duplicate row is
kept by default?

o a) The last occurring row.

o b) The first occurring row.

o c) A randomly selected row.

o d) No rows are kept if duplicates exist. Answer: b) The first occurring row (controlled by the keep
parameter, which defaults to 'first').

14. Scenario: You save the unique_df to a CSV named '12345-Unique.csv', where 12345 is your CRN. Which
pandas function call achieves this?

o a) unique_df.save_csv('12345-Unique.csv', index=False)

o b) unique_df.to_excel('12345-Unique.csv', index=False)

o c) unique_df.to_csv('12345-Unique.csv', index=False)

o d) pd.write_csv(unique_df, '12345-Unique.csv', index=False) Answer: c) unique_df.to_csv('12345-Unique.csv', index=False)
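
A short sketch covering Questions 11-14: deduplicate on the two key columns, quantify the removed rows, and save without the index (12345 stands in for your CRN):

before_rows = merged_df.shape[0]
unique_df = merged_df.drop_duplicates(subset=['Abb', 'Tool'], keep='first')
print(before_rows - unique_df.shape[0], 'duplicate rows removed')
unique_df.to_csv('12345-Unique.csv', index=False)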

Section 4: Data Analysis

15. Scenario: You need to find the department with the highest variety (count of unique values) of 'Analysis'
types using the unique_df. Which code snippet finds the count of unique analysis types per department?

o a) unique_df.groupby('Department')['Analysis'].count()
o b) unique_df.groupby('Department')['Analysis'].value_counts()

o c) unique_df.groupby('Department')['Analysis'].nunique()

o d) unique_df['Department'].nunique() Answer: c)
unique_df.groupby('Department')['Analysis'].nunique()

16. Scenario: Following the previous question, how do you get the name of the department with the maximum
unique count? Let the result of the previous step be stored in a Series analysis_variety.

o a) analysis_variety.max()

o b) analysis_variety.idxmax()

o c) analysis_variety.sort_values(ascending=False).index[0]

o d) Both b and c Answer: d) Both b and c
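
A sketch for Questions 15-16, assuming unique_df has the 'Department' and 'Analysis' columns described above:

analysis_variety = unique_df.groupby('Department')['Analysis'].nunique()
dept_max_variety = analysis_variety.idxmax()  # label of the maximum entry
# equivalent: analysis_variety.sort_values(ascending=False).index[0]
print(dept_max_variety, analysis_variety.max())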

17. Scenario: Task (d)(ii) asks for the "percentage of updating of each tool". Assuming the 'Updated' column
contains Boolean values (True/False) or 1/0, how could you calculate the percentage of entries for each
unique tool name that are marked as 'Updated' (True/1)?

o a) unique_df.groupby('Tool')['Updated'].mean() * 100

o b) unique_df['Updated'].value_counts(normalize=True) * 100

o c) unique_df.groupby('Tool')['Updated'].sum() / unique_df.groupby('Tool')['Updated'].count() *
100

o d) Both a and c Answer: d) Both a and c (mean() on boolean/1-0 data calculates the proportion of
True/1s).

18. Scenario: If a specific tool 'ToolX' appears 10 times in unique_df, and 3 of these entries have Updated ==
True, what would unique_df.groupby('Tool')['Updated'].mean().loc['ToolX'] return?

o a) 3

o b) 0.3

o c) 30

o d) 7 Answer: b) 0.3 (The mean of [True, True, True, False, False, False, False, False, False, False]
treated as [1, 1, 1, 0, 0, 0, 0, 0, 0, 0] is 3/10 = 0.3).
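
A sketch of the update-percentage calculation from Questions 17-18, assuming 'Updated' holds booleans or 1/0:

pct_updated = unique_df.groupby('Tool')['Updated'].mean() * 100
print(pct_updated.loc['ToolX'])  # 30.0 for the hypothetical 3-out-of-10 case above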

Section 5: Data Visualization

19. Scenario: To create a bar chart showing the count of tools per department, you first need to calculate these
counts. Which code prepares the data counts_per_dept for plotting?

o a) counts_per_dept = unique_df.groupby('Department')['Tool'].nunique()

o b) counts_per_dept = unique_df['Department'].value_counts()

o c) counts_per_dept = unique_df.groupby('Department').size()

o d) Both b and c Answer: d) Both b and c (value_counts() on the Department column or grouping by
Department and using size() or count() will give the number of rows/tool entries per department).

20. Scenario: You have the counts_per_dept Series. Which command using pandas plotting interface generates
the required vertical bar chart?
o a) counts_per_dept.plot(kind='pie')

o b) counts_per_dept.plot(kind='bar')

o c) counts_per_dept.plot(kind='line')

o d) counts_per_dept.plot.barh() Answer: b) counts_per_dept.plot(kind='bar')

21. Scenario: For the pie chart of the 'Analysis' column distribution, what data does the size of each slice
represent?

o a) The number of departments using that analysis type.

o b) The average usage date of that analysis type.

o c) The relative frequency (percentage) of each analysis type in the dataset.

o d) The number of tools performing that analysis type. Answer: c) The relative frequency
(percentage) of each analysis type in the dataset.

22. Scenario: Which code generates the data needed for the 'Analysis' pie chart?

o a) analysis_counts = unique_df.groupby('Analysis').size()

o b) analysis_counts = unique_df['Analysis'].value_counts()

o c) analysis_counts = unique_df['Analysis'].unique()

o d) Both a and b Answer: d) Both a and b

23. Scenario: Task (e)(iii) requires a bar plot showing the number of tools marked as "Updated". If you
interpret this as comparing the total count of 'Updated' entries vs 'Not Updated' entries, what data source
(Series) would you plot?

o a) unique_df['Updated'].value_counts()

o b) unique_df.groupby('Updated').size()

o c) unique_df.groupby('Tool')['Updated'].sum()

o d) Both a and b Answer: d) Both a and b

24. Scenario: Given the data updated_counts = unique_df['Updated'].value_counts(), which command generates the bar plot comparing the counts of True/False (or Yes/No) values?

o a) updated_counts.plot(kind='pie')

o b) updated_counts.plot(kind='barh')

o c) updated_counts.plot(kind='line')

o d) updated_counts.plot(kind='bar') Answer: d) updated_counts.plot(kind='bar')
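
A plotting sketch for this section, assuming the columns described above; matplotlib renders the pandas plot calls:

import matplotlib.pyplot as plt

counts_per_dept = unique_df['Department'].value_counts()
counts_per_dept.plot(kind='bar')                     # vertical bars: tools per department
plt.tight_layout(); plt.show()

analysis_counts = unique_df['Analysis'].value_counts()
analysis_counts.plot(kind='pie', autopct='%1.1f%%')  # slice size = relative frequency
plt.ylabel(''); plt.show()

updated_counts = unique_df['Updated'].value_counts()
updated_counts.plot(kind='bar')                      # counts of Updated vs Not Updated
plt.show()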

Section 6: Python/Pandas Concepts & Advanced Scenarios

25. Scenario: If the 'Date' column was loaded as strings instead of datetime objects, which Pandas function is
used to convert it correctly?

o a) pd.to_datetime(unique_df['Date'])

o b) unique_df['Date'].astype('datetime64[ns]')
o c) pd.convert_dtypes(unique_df['Date'])

o d) Both a and b Answer: d) Both a and b

26. Scenario: You want to calculate the standard deviation of the number of tools used per department. Which
sequence of operations is needed?

o a) Calculate value_counts() on 'Department', then apply .std().

o b) Group by 'Department', count 'Tool' (size()), then apply .std() to the resulting Series.

o c) Calculate standard deviation directly on the 'Tool' column.

o d) Use unique_df.describe() and find the 'std' row for 'Department'. Answer: b) Group by
'Department', count 'Tool' (size()), then apply .std() to the resulting Series.

27. Scenario: Suppose you want to find if there's a correlation between the number of tools a department uses
and the variety of analysis types it employs. Which correlation method in Pandas would be suitable after
calculating these two series (tools_count and analysis_variety)?

o a) tools_count.corr(analysis_variety)

o b) pd.DataFrame({'tools': tools_count, 'variety': analysis_variety}).corr()

o c) np.correlate(tools_count, analysis_variety)

o d) Both a and b provide the correlation coefficient between the two measures. Answer: d) Both a
and b provide the correlation coefficient between the two measures.

28. Scenario: Which NumPy function could be used to efficiently check if any value in the 'Updated' column
(once converted to boolean) is True?

o a) np.sum(unique_df['Updated'])

o b) np.any(unique_df['Updated'])

o c) np.all(unique_df['Updated'])

o d) np.mean(unique_df['Updated']) Answer: b) np.any(unique_df['Updated'])

29. Scenario: Imagine you want to select all rows from unique_df where the 'Tool desc' column contains the
word "AI". Which Pandas string method is appropriate?

o a) unique_df[unique_df['Tool desc'].contains('AI')]

o b) unique_df[unique_df['Tool desc'].str.contains('AI')]

o c) unique_df[unique_df['Tool desc'].find('AI') != -1]

o d) unique_df[unique_df['Tool desc'].match('AI')] Answer: b) unique_df[unique_df['Tool desc'].str.contains('AI')]

30. Scenario: To calculate the median 'first used' Date for tools within each 'Analysis' type, you would group
by 'Analysis' and then apply which aggregation function to the 'Date' column (assuming it's datetime)?

o a) .mean()

o b) .count()

o c) .median()
o d) .mode() Answer: c) .median()

31. Scenario: If you create a new column 'Years Since First Use' based on the 'Date' column (datetime) and the
current date (pd.Timestamp.now()), which expression calculates this approximately?

o a) (pd.Timestamp.now() - unique_df['Date']).dt.years

o b) (pd.Timestamp.now().year - unique_df['Date'].dt.year)

o c) (pd.Timestamp.now() - unique_df['Date']) / np.timedelta64(1, 'Y')

o d) Both b and c provide valid ways to estimate years (b is simpler integer difference, c is more
precise float). Answer: d) Both b and c provide valid ways to estimate years (b is simpler integer
difference, c is more precise float).
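
A sketch of the two estimates from Question 31, assuming 'Date' is already datetime (convert with pd.to_datetime first if not):

import numpy as np

# simple integer difference of calendar years
unique_df['Years Since First Use'] = pd.Timestamp.now().year - unique_df['Date'].dt.year
# more precise float, dividing the timedelta by an average-length year
years_float = (pd.Timestamp.now() - unique_df['Date']) / np.timedelta64(1, 'Y')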

32. Scenario: Applying a function row-by-row using .apply(..., axis=1) is generally:

o a) More efficient than vectorized operations in Pandas/NumPy.

o b) Less efficient than vectorized operations in Pandas/NumPy.

o c) The only way to perform complex row-wise calculations.

o d) Primarily used for merging DataFrames. Answer: b) Less efficient than vectorized operations in
Pandas/NumPy.

33. Scenario: If df1 had 10 departments and df2 had tools used by only 8 of these departments, what would be
the result of an inner merge on 'Abb'?

o a) Rows corresponding to all 10 departments.

o b) Rows corresponding to only the 8 departments present in df2.

o c) Rows corresponding to the 2 departments only present in df1.

o d) An error because not all keys match. Answer: b) Rows corresponding to only the 8 departments
present in df2.

34. Scenario: Which Python data structure is returned by df1.set_index('Abb')['Department'] before calling
.to_dict()?

o a) A NumPy array

o b) A Python list

o c) A Pandas DataFrame

o d) A Pandas Series Answer: d) A Pandas Series

35. Scenario: To find tools used only by the 'Education' department, which approach is most direct using
pandas filtering and grouping?

o a) unique_df.groupby('Tool').filter(lambda x: (x['Department'] == 'Education').all() and len(x['Department'].unique()) == 1)

o b) tool_counts = unique_df.groupby('Tool')['Department'].nunique(); single_dept_tools = tool_counts[tool_counts == 1].index; unique_df[(unique_df['Tool'].isin(single_dept_tools)) & (unique_df['Department'] == 'Education')]

o c) unique_df[unique_df['Department'] == 'Education']['Tool'].unique() (This gets tools used by Education, not only by Education.)

o d) Both a and b achieve the goal (b is often more readable). Answer: d) Both a and b achieve the goal (b is often more readable).
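
A runnable sketch of option b, which narrows to tools whose only user is the 'Education' department:

dept_per_tool = unique_df.groupby('Tool')['Department'].nunique()
single_dept_tools = dept_per_tool[dept_per_tool == 1].index
education_only = unique_df[
    unique_df['Tool'].isin(single_dept_tools)
    & (unique_df['Department'] == 'Education')
]['Tool'].unique()
print(education_only)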

36. Scenario: If the 'Updated' column contained strings like 'Yes', 'No', 'YES', 'no', what's the first step using
pandas string methods before mapping to Boolean?

o a) unique_df['Updated'].str.upper()

o b) unique_df['Updated'].str.lower()

o c) unique_df['Updated'].str.capitalize()

o d) unique_df['Updated'].str.strip() Answer: b) unique_df['Updated'].str.lower() (or .str.upper(); the point is to standardize case before mapping).

37. Scenario: Displaying the .shape before and after removing duplicates directly quantifies:

o a) The number of missing values handled.

o b) The number of rows removed due to duplication based on the specified columns.

o c) The number of columns remaining after cleaning.

o d) The change in memory usage of the DataFrame. Answer: b) The number of rows removed due to
duplication based on the specified columns.

38. Scenario: Which library is most commonly used alongside Pandas for numerical operations and underpins
many Pandas functionalities?

o a) Matplotlib

o b) SciPy

o c) NumPy

o d) Scikit-learn Answer: c) NumPy

39. Scenario: If you were asked to build a function that takes a department name as input and returns a list of
unique tools used by that department from unique_df, which code structure would be appropriate?

o a) def get_tools(dept_name): return unique_df[unique_df['Department'] == dept_name]['Tool'].tolist()

o b) def get_tools(dept_name): return unique_df[unique_df['Department'] == dept_name]['Tool'].unique().tolist()

o c) def get_tools(dept_name): return unique_df.groupby('Department')['Tool'].unique().loc[dept_name].tolist()

o d) Both b and c Answer: d) Both b and c

40. Scenario: To quickly get summary statistics (count, mean, std, min, max, quartiles) for numerical columns
potentially present in unique_df (if any existed), which Pandas method is used?

o a) .info()

o b) .describe()

o c) .head()

o d) .corr() Answer: b) .describe()


41. Scenario: You want to add a column IsHealthDept which is True if Department is 'Health' and False
otherwise. Which is a correct way?

o a) unique_df['IsHealthDept'] = unique_df['Department'] == 'Health'

o b) unique_df['IsHealthDept'] = unique_df['Department'].apply(lambda x: True if x == 'Health' else False)

o c) unique_df['IsHealthDept'] = np.where(unique_df['Department'] == 'Health', True, False)

o d) All of the above Answer: d) All of the above

42. Scenario: You want to filter unique_df to show only the tools used by the 'Health' department OR the
'Education' department. Which code works?

o a) unique_df[(unique_df['Department'] == 'Health') & (unique_df['Department'] == 'Education')]

o b) unique_df[unique_df['Department'].isin(['Health', 'Education'])]

o c) unique_df.query("Department == 'Health' | Department == 'Education'")

o d) Both b and c Answer: d) Both b and c

43. Scenario: After finding the department with the highest variety of 'Analysis' types (let's say its name is
stored in dept_max_variety), how would you filter unique_df to show only the rows corresponding to this
specific department?

o a) unique_df[unique_df['Department'] == dept_max_variety]

o b) unique_df.loc[dept_max_variety] (Incorrect indexing for this)

o c) unique_df.filter(like=dept_max_variety, axis=0) (Incorrect use of filter)

o d) unique_df.groupby('Department').get_group(dept_max_variety)

o e) Both a and d Answer: e) Both a and d

44. Scenario: Imagine the 'Date' column is already converted to datetime objects. How would you find the
number of tools first used specifically in the year 2020?

o a) unique_df[unique_df['Date'].dt.year == 2020].shape[0]

o b) unique_df['Date'].dt.year.value_counts().loc[2020] (Might raise KeyError if no tools from 2020)

o c) sum(unique_df['Date'].dt.year == 2020)

o d) All of the above (with a note about potential KeyError for b) Answer: d) All of the above (with a
note about potential KeyError for b) - a and c are generally safer.

45. Scenario: You want to count how many unique tools have a description ('Tool desc') longer than 50
characters. Which is the correct approach?

o a) unique_df[unique_df['Tool desc'].str.len() > 50]['Tool'].count() (Counts rows, not unique tools)

o b) unique_df[unique_df['Tool desc'].str.len() > 50]['Tool'].nunique()

o c) sum(unique_df['Tool desc'].str.len() > 50) (Counts rows)

o d) len(unique_df[unique_df['Tool desc'].str.len() > 50]) (Counts rows) Answer: b) unique_df[unique_df['Tool desc'].str.len() > 50]['Tool'].nunique()

46. Scenario: How would you calculate the total number of tool entries that are both for the 'Public Safety'
department and have the 'Analysis' type 'Predictive'?

o a) unique_df.query("Department == 'Public Safety' and Analysis == 'Predictive'").shape[0]

o b) len(unique_df[(unique_df['Department'] == 'Public Safety') & (unique_df['Analysis'] == 'Predictive')])

o c) sum((unique_df['Department'] == 'Public Safety') & (unique_df['Analysis'] == 'Predictive'))

o d) All of the above Answer: d) All of the above

47. Scenario: You need to create a Series showing the most frequent 'Analysis' type used by each department.
Which combination of Pandas methods is most suitable?

o a) unique_df.groupby('Department')['Analysis'].apply(lambda s: s.mode()).reset_index(level=1, drop=True) (Mode can return multiple values if tied, so it needs handling; note that GroupBy has no direct .mode() method, hence the apply())

o b) unique_df.groupby('Department')['Analysis'].value_counts().idxmax() (This finds the single most frequent department/analysis combination overall, not one per department)

o c) unique_df.groupby('Department')['Analysis'].describe()['top']

o d) Both a (with handling for ties) and c provide a way to get the most frequent type per department. Answer: d) Both a (with handling for ties) and c provide a way to get the most frequent type per department (c is often simpler if only one mode is needed).
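
Two hedged ways to realize Question 47, assuming ties are resolved by taking the first mode:

# option c: describe() on an object column exposes the modal value as 'top'
most_common = unique_df.groupby('Department')['Analysis'].describe()['top']
# explicit alternative that makes the tie-breaking visible
most_common = unique_df.groupby('Department')['Analysis'].agg(lambda s: s.mode().iloc[0])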

48. Scenario: What pandas command would select all columns except for 'Tool desc' and 'Output' from
unique_df?

o a) unique_df.drop(columns=['Tool desc', 'Output'])

o b) unique_df.select(lambda col: col not in ['Tool desc', 'Output'], axis=1) (Select is not standard)

o c) unique_df.loc[:, ~unique_df.columns.isin(['Tool desc', 'Output'])]

o d) Both a and c Answer: d) Both a and c

49. Scenario: If you wanted to see if any 'Tool' name appears within its own 'Tool desc' column (e.g., tool
'Analyzer' is mentioned in its description), how might you check this for the first 10 rows? (Requires
combining columns row-wise)
o a) unique_df.head(10).apply(lambda row: row['Tool'] in row['Tool desc'], axis=1)

o b) unique_df['Tool'].head(10).isin(unique_df['Tool desc'].head(10)) (Checks whether the Tool name equals the description, not substring containment)

o c) [unique_df['Tool'][i] in unique_df['Tool desc'][i] for i in range(10)] (Assumes a default integer index)

o d) Both a and c (a is more robust to index changes) Answer: d) Both a and c (a is more robust to index changes)

50. Scenario: You want to replace the 'Analysis' type 'Descriptive' with 'Summary' and 'Predictive' with 'Forecast' only in the 'Analysis' column of unique_df. Which command works best?

o a) unique_df['Analysis'].replace({'Descriptive': 'Summary', 'Predictive': 'Forecast'}, inplace=True)

o b) unique_df['Analysis'].map({'Descriptive': 'Summary', 'Predictive': 'Forecast'}) (map() replaces non-matching values with NaN unless a default is supplied)

o c) unique_df.replace({'Analysis': {'Descriptive': 'Summary', 'Predictive': 'Forecast'}}, inplace=True)

o d) Both a and c Answer: d) Both a and c (a targets the column's Series directly; c scopes the mapping to the 'Analysis' column via a nested dictionary).
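
A sketch of the two equivalent replacements from Question 50; unlike map(), replace() leaves unmapped values untouched:

unique_df['Analysis'] = unique_df['Analysis'].replace(
    {'Descriptive': 'Summary', 'Predictive': 'Forecast'}
)
# or, scoped through a per-column dictionary on the whole frame:
# unique_df = unique_df.replace({'Analysis': {'Descriptive': 'Summary', 'Predictive': 'Forecast'}})
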
Python Practical MCQs — Al-Beruni City Case Study Context

Data Loading & Library Import MCQs

MCQ 1

Which command is correct to import Pandas library for data manipulation?

A) import panda as pd
B) import pandas as pd
C) import pandas.dataframe
D) from pandas import *

Answer: B

MCQ 2

If the file Departments.csv is located in the working directory, which code will load it into DataFrame df1?

A) df1 = pd.readfile('Departments.csv')
B) df1 = pd.read_csv('Departments.csv')
C) df1 = pd.load_csv('Departments.csv')
D) df1 = pd.readfile_csv('Departments.csv')

Answer: B

MCQ 3

To load Tools.csv file in df2 and view the first row only:

A) df2.head()
B) df2.loc[0]
C) df2.iloc[0]
D) df2.head(1)

Answer: D

MCQ 4

Which Python library is essential for numerical analysis and array operations in this assignment?

A) Matplotlib
B) Pandas
C) NumPy
D) Seaborn

Answer: C

MCQ 5

Which of the following is the correct command to display column names of df1?
A) df1.column_names()
B) df1.columns
C) df1.column()
D) df1.col_names

Answer: B

Data Cleaning & Merging MCQs

MCQ 6

To merge df1 and df2 such that all rows of df2 must appear in merged data:

A) pd.merge(df1, df2, how="inner")
B) pd.merge(df1, df2, how="outer")
C) pd.merge(df1, df2, how="right")
D) pd.merge(df1, df2, how="left")

Answer: C

MCQ 7

To check total missing values in each column of a dataframe:

A) df.isnull().sum()
B) df.isnull.count()
C) df.checknull()
D) df.isna()

Answer: A

MCQ 8

If 'Department' column has missing values and we have a dictionary mapping, which method should be used to fill
it?

A) fillna()
B) map().fillna()
C) replace()
D) dropna()

Answer: B

MCQ 9

After filling missing values, which command ensures there are no missing values?

A) df.isna().sum()
B) df.isnull().sum() == 0
C) df.dropna()
D) df.notnull().sum()
Answer: B

MCQ 10

To save updated dataframe to a new csv file without index:

A) df.to_csv('filename.csv')
B) df.to_csv('filename.csv', index=True)
C) df.save_csv('filename.csv')
D) df.to_csv('filename.csv', index=False)

Answer: D

Duplicate Handling MCQs

MCQ 11

Command to remove duplicates based on 'Abb' and 'Tool Name' fields:

A) df.drop_duplicates(['Abb', 'Tool Name'], inplace=True)
B) df.drop_duplicates()
C) df.unique()
D) df.dropna()

Answer: A

MCQ 12

To check the shape of dataframe:

A) df.shape()
B) df.size()
C) df.shape
D) df.count()

Answer: C

MCQ 13

To save the dataframe after removing duplicates as per assignment requirement:

A) df.to_csv('CRN-Unique.csv', index=False)
B) df.save('CRN-Unique.csv')
C) df.save_csv('CRN-Unique.csv')
D) df.to_csv('CRN_Unique.csv')

Answer: A

Data Analysis MCQs

MCQ 14
To find the department with the highest variety of 'Analysis' types:

A) df['Analysis'].value_counts()
B) df.groupby('Department')['Analysis'].nunique().idxmax()
C) df.groupby('Analysis')['Department'].count()
D) df['Department'].nunique()

Answer: B

MCQ 15

To calculate % of tools updated:

A) (df['Updated']=='Yes').sum() / len(df) * 100
B) df['Updated'].mean()
C) df['Updated'].value_counts() / df['Updated'].sum()
D) df['Updated'].count() / df['Updated'].sum()

Answer: A
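
A sketch of the winning formula, assuming the 'Updated' column stores 'Yes'/'No' strings as option A implies:

pct = (df['Updated'] == 'Yes').sum() / len(df) * 100
print(f'{pct:.1f}% of tool entries are marked as updated')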

Data Visualization MCQs

MCQ 16

To plot the count of tools per department using Matplotlib:

A) df['Department'].plot(kind='bar')
B) df['Department'].value_counts().plot(kind='bar')
C) sns.barplot(x='Department', y='Tool Name', data=df)
D) plt.bar(df['Department'], df['Tool Name'])

Answer: B

MCQ 17

To generate a pie chart for the 'Analysis' column:

A) df['Analysis'].value_counts().plot.pie()
B) plt.pie(df['Analysis'])
C) df['Analysis'].plot(kind='pie')
D) sns.pieplot(df['Analysis'])

Answer: A

MCQ 18

To create a bar plot for tools marked as "Updated" using Seaborn:

A) sns.countplot(x='Updated', data=df)
B) sns.barplot(x='Updated', y='Tool Name', data=df)
C) df['Updated'].plot(kind='bar')
D) plt.bar('Updated', 'Tool Name')
Answer: A

MCQ 19

To rotate x-axis labels by 45 degrees for better readability:

A) plt.xticks(rotation=45)
B) plt.xlabels(45)
C) plt.xlabel(rotation=45)
D) plt.rotate(45)

Answer: A

MCQ 20

To add values inside the slices of a pie chart in Matplotlib:

A) autopct='%1.1f%%'
B) data_label='inside'
C) plt.labels(inside=True)
D) df.plot.pie(labels='inside')

Answer: A
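
A sketch combining MCQs 17, 19, and 20: a labelled pie chart and a bar chart with rotated tick labels, assuming df carries the columns named above:

import matplotlib.pyplot as plt

df['Analysis'].value_counts().plot.pie(autopct='%1.1f%%')  # percentages inside the slices
plt.ylabel(''); plt.show()

df['Department'].value_counts().plot(kind='bar')
plt.xticks(rotation=45)  # tilt x-axis labels for readability
plt.tight_layout(); plt.show()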

MCQ 21

To find total number of tools used per department after removing duplicates:

A) df['Department'].value_counts()
B) df.groupby('Department')['Tool Name'].count()
C) df.groupby('Tool Name')['Department'].count()
D) df['Tool Name'].count()

Answer: B

MCQ 22

To calculate mean of numerical column 'Population':

A) df['Population'].mean()
B) df.mean('Population')
C) np.mean(df['Population'])
D) Both A and C

Answer: D

MCQ 23

To calculate the standard deviation of a numerical column:

A) df['Population'].std()
B) df.std('Population')
C) np.std(df['Population'])
D) Both A and C

Answer: D

MCQ 24

To check correlation between numerical columns in dataframe:

A) df.corr()
B) df.correlation()
C) df.cov()
D) df.group_corr()

Answer: A

MCQ 25

To apply filter and show records where 'Updated' is 'No':

A) df[df['Updated'] == 'No']
B) df.loc[df['Updated'] == 'No']
C) df.query("Updated == 'No'")
D) All of the above

Answer: D

MCQ 26

In an ETL process, which is part of "Extract" in Python?

A) Reading a CSV using pandas
B) Using an API to fetch data
C) SQL query execution
D) All of the above

Answer: D

MCQ 27

In "Transform" step of ETL, you perform:

A) Handling Missing Values
B) Data Cleaning
C) Column Renaming
D) All of the above

Answer: D
MCQ 28

To calculate median of numerical column:

A) df['Population'].median()
B) df.median('Population')
C) np.median(df['Population'])
D) Both A and C

Answer: D

MCQ 29

To remove unwanted whitespaces from string columns:

A) df['Column'] = df['Column'].str.strip()
B) df['Column'] = df['Column'].strip()
C) df['Column'] = df.strip('Column')
D) df['Column'].remove_whitespace()

Answer: A

MCQ 30

To replace missing values with 'Unknown':

A) df.fillna('Unknown')
B) df.replace(np.nan, 'Unknown')
C) df['Column'] = df['Column'].fillna('Unknown')
D) All of the above

Answer: D

MCQ 31

Which pandas function returns basic statistics like count, mean, std, min, max?

A) df.stats()
B) df.describe()
C) df.summary()
D) df.explain()

Answer: B

MCQ 32

To reset index after dropping rows:

A) df.reset_index()
B) df.reset_index(drop=True, inplace=True)
C) df.index_reset()
D) df.drop_index()
Answer: B

MCQ 33

Which visualization is best to show distribution of numerical column?

A) Line Chart
B) Histogram
C) Pie Chart
D) Bar Chart

Answer: B

MCQ 34

To export only selected columns to CSV:

A) df[['col1', 'col2']].to_csv('output.csv', index=False)
B) df.select(['col1','col2']).to_csv('output.csv')
C) df[['col1','col2']].save('output.csv')
D) df['col1','col2'].export_csv()

Answer: A

MCQ 35

In Market Basket Analysis using Python, which library is commonly used?

A) pandas
B) numpy
C) mlxtend
D) seaborn

Answer: C

MCQ 36

For predictive maintenance model, which technique is preferred?

A) Regression
B) Clustering
C) Classification
D) Time Series Forecasting

Answer: D

MCQ 37

Which is the correct way to create a new calculated column in a dataframe?

A) df.create_column()
B) df['New_Column'] = ...
C) df.new_column()
D) df.add_column()

Answer: B

MCQ 38

Which method provides the highest-level summary of a dataframe?

A) df.info()
B) df.summary()
C) df.describe()
D) df.columns

Answer: A

MCQ 39

In scatter plot to show correlation between two numerical columns:

A) sns.scatterplot(x='col1', y='col2', data=df)
B) plt.scatter(df['col1'], df['col2'])
C) Both A and B
D) df.plot.scatter('col1','col2')

Answer: C

MCQ 40

To group data by Department and calculate sum of 'Population':

A) df.groupby('Department')['Population'].sum()
B) df.sum('Population').groupby('Department')
C) df.group('Department').sum('Population')
D) groupby(df['Department'])

Answer: A

MCQ 41

To create a bar chart using Seaborn:

A) sns.barplot(x='Department', y='Population', data=df)
B) sns.histplot(x='Department', y='Population', data=df)
C) sns.countplot(x='Department', data=df)
D) plt.bar(x='Department', height='Population')

Answer: A
MCQ 42

In fraud detection model using Python, which technique is most suitable?

A) Clustering
B) Classification
C) Time Series
D) PCA

Answer: B

MCQ 43

To drop rows where all values are NaN:

A) df.dropna(how='all')
B) df.dropna(all=True)
C) df.drop_allna()
D) df.remove_blank_rows()

Answer: A

MCQ 44

To create pivot table in pandas:

A) df.pivot_table(index='Department', values='Population', aggfunc='sum')
B) df.pivot('Department', 'Population')
C) df.create_pivot('Department')
D) df.group_pivot()

Answer: A
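
A sketch of the pivot from MCQ 44, assuming a numeric 'Population' column as in MCQ 22; it matches the groupby sum from MCQ 40 but returns a DataFrame:

pivot = df.pivot_table(index='Department', values='Population', aggfunc='sum')
print(pivot.equals(df.groupby('Department')[['Population']].sum()))  # True when dtypes and ordering match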

MCQ 45

To calculate variance of numerical column:

A) df['Population'].var()
B) np.var(df['Population'])
C) df.var('Population')
D) Both A and B

Answer: D

MCQ 46

To filter dataframe where Population > 1000:

A) df[df['Population'] > 1000]
B) df.loc[df['Population'] > 1000]
C) df.query("Population > 1000")
D) All of the above
Answer: D

MCQ 47

To show last 5 rows of dataframe:

A) df.tail()
B) df.head()
C) df[-5:]
D) Both A and C

Answer: D

MCQ 48

For supply chain optimization model using Python, which technique is used?

A) Linear Programming
B) Clustering
C) Regression
D) Market Basket

Answer: A

MCQ 49

To check datatype of all columns:

A) df.dtypes
B) df.types()
C) df.datatypes()
D) df.columns.dtypes

Answer: A

MCQ 50

To convert a column to datetime format:

A) pd.to_datetime(df['Date'])
B) df['Date'].astype('datetime')
C) df['Date'].convert('datetime')
D) Both A and B

Answer: A
