Python Scenario Based Interview QA
1. Scenario: You receive a CSV file with missing values, inconsistent casing, and duplicate
rows. How would you clean it?
Answer:
import pandas as pd
df = pd.read_csv('data.csv')
# Remove duplicates
df = df.drop_duplicates()
# Standardize casing in the Name column
df['Name'] = df['Name'].str.title()
# Forward-fill missing values from the previous row
df = df.ffill()
2. Scenario: You have a sales dataset with columns: Date, Product, and Revenue. How would
you find the top 3 products with the highest average monthly revenue?
Answer:
df['Date'] = pd.to_datetime(df['Date'])
df['Month'] = df['Date'].dt.to_period('M')
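# Total revenue per product per month, then average those monthly totals per product
monthly_avg = df.groupby(['Product', 'Month'])['Revenue'].sum().reset_index()
top_products = monthly_avg.groupby('Product')['Revenue'].mean().sort_values(ascending=False).head(3)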
3. Scenario: How would you prepare customer data with demographic info and activity logs
for a churn prediction model?
Answer:
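One possible approach, as a minimal sketch: aggregate the activity logs to one row per customer, join them to the demographic table, fill the gaps, and encode categorical columns. The column names (customer_id, age, plan, churned, event_time), the snapshot date, and the -1 sentinel for customers with no activity are all assumptions made for illustration, not part of the original scenario.

```python
import pandas as pd


def build_churn_features(customers, logs, snapshot_date):
    """Combine demographics with per-customer activity aggregates."""
    logs = logs.copy()
    logs['event_time'] = pd.to_datetime(logs['event_time'])
    snapshot = pd.Timestamp(snapshot_date)

    # Summarize the activity log into one row per customer
    activity = logs.groupby('customer_id').agg(
        event_count=('event_time', 'count'),
        last_event=('event_time', 'max'),
    ).reset_index()
    activity['days_since_last_event'] = (snapshot - activity['last_event']).dt.days
    activity = activity.drop(columns='last_event')

    # A left join keeps customers that have no logged activity
    features = customers.merge(activity, on='customer_id', how='left')
    features['event_count'] = features['event_count'].fillna(0).astype(int)
    # Assumption: -1 marks customers with no recorded activity
    features['days_since_last_event'] = (
        features['days_since_last_event'].fillna(-1).astype(int)
    )

    # Impute missing ages with the median and one-hot encode the plan column
    features['age'] = features['age'].fillna(features['age'].median())
    features = pd.get_dummies(features, columns=['plan'])
    return features


# Tiny made-up tables, for illustration only
customers = pd.DataFrame({
    'customer_id': [1, 2, 3],
    'age': [34, None, 51],
    'plan': ['basic', 'premium', 'basic'],
    'churned': [0, 1, 0],
})
logs = pd.DataFrame({
    'customer_id': [1, 1, 3],
    'event_time': ['2024-01-05', '2024-01-20', '2024-01-10'],
})
features = build_churn_features(customers, logs, '2024-01-31')
```

The resulting table has one row per customer, with the churned column as the target and the remaining columns as candidate model inputs. In practice the activity features should only use events from before the date on which churn is measured, to avoid leaking the outcome into the inputs.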
4. Scenario: You suspect some products have incorrect prices in a dataset. How would you
detect and remove them?
Answer:
Q1 = df['Price'].quantile(0.25)
Q3 = df['Price'].quantile(0.75)
IQR = Q3 - Q1
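# Flag prices more than 1.5 * IQR outside the quartiles as suspect
outliers = df[(df['Price'] < Q1 - 1.5 * IQR) | (df['Price'] > Q3 + 1.5 * IQR)]
df = df[~df.index.isin(outliers.index)]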
5. Scenario: You have two datasets: users.csv and transactions.csv. How would you
find the total spending per user?
Answer:
users = pd.read_csv('users.csv')
transactions = pd.read_csv('transactions.csv')
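# Join each transaction to its user on the shared key
merged = pd.merge(users, transactions, on='user_id')
spending = merged.groupby('user_id')['amount'].sum()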
6. Scenario: You have daily temperature data. How would you visualize trends and seasonal
patterns?
Answer:
import matplotlib.pyplot as plt
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)
plt.figure(figsize=(10,5))
plt.plot(df['Temperature'])
plt.xlabel('Date')
plt.ylabel('Temperature')
plt.show()
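The raw line plot shows every daily fluctuation, which can hide the longer-term shape. A rolling mean and a month-by-month average are two simple ways to bring out trend and seasonality. The sketch below uses a synthetic, noise-free sine series in place of real data; the 30-day window and the column name Rolling_30d are illustrative choices.

```python
import numpy as np
import pandas as pd

# Synthetic daily series for illustration only: a yearly sine cycle, no noise
dates = pd.date_range('2022-01-01', '2023-12-31', freq='D')
day_of_year = dates.dayofyear.to_numpy()
temps = 15 + 10 * np.sin(2 * np.pi * (day_of_year - 105) / 365.25)
df = pd.DataFrame({'Temperature': temps}, index=dates)

# A 30-day centered rolling mean smooths day-to-day variation to expose the trend
df['Rolling_30d'] = df['Temperature'].rolling(window=30, center=True).mean()

# Averaging by calendar month across years gives a simple seasonal profile
seasonal_profile = df.groupby(df.index.month)['Temperature'].mean()
```

Both df['Rolling_30d'] and seasonal_profile can be passed to plt.plot() in the same way as the raw temperatures.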
7. Scenario: You have a column 'Country' with many categories. How would you prepare this
column for a machine learning model?
Answer:
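Two common options for a high-cardinality column are to group rare categories before one-hot encoding, or to replace each category with its relative frequency. The following is a minimal sketch of both, using a small made-up DataFrame; the cutoff top_n and the column names Country_Grouped and Country_Freq are hypothetical.

```python
import pandas as pd

# Made-up sample data, for illustration only
df = pd.DataFrame({'Country': ['US', 'US', 'US', 'IN', 'IN', 'DE', 'FR', 'BR']})

# Keep the most frequent categories and bucket the rest as 'Other'
top_n = 2  # hypothetical cutoff; tune to the dataset
top_countries = df['Country'].value_counts().nlargest(top_n).index
df['Country_Grouped'] = df['Country'].where(df['Country'].isin(top_countries), 'Other')

# One-hot encode the reduced set of categories
encoded = pd.get_dummies(df['Country_Grouped'], prefix='Country')

# Alternative: frequency encoding keeps a single numeric column
df['Country_Freq'] = df['Country'].map(df['Country'].value_counts(normalize=True))
```

Grouping keeps the number of one-hot columns small, while frequency encoding avoids adding columns at all but maps countries with equal counts to the same value. Which one works better depends on the model and the data.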
8. Scenario: Your dataset has a column 'Join_Date'. What features can you extract from it?
Answer:
df['Join_Date'] = pd.to_datetime(df['Join_Date'])
df['Year'] = df['Join_Date'].dt.year
df['Month'] = df['Join_Date'].dt.month
df['Weekday'] = df['Join_Date'].dt.day_name()
df['Join_Quarter'] = df['Join_Date'].dt.quarter