0% found this document useful (0 votes)
8 views3 pages

Python Scenario Based Interview QA

The document presents various scenario-based Python interview questions tailored for freshers applying for data analysis roles. Each scenario includes a specific data-related challenge, such as cleaning data, analyzing sales, preparing datasets for churn prediction, identifying outliers, merging datasets, visualizing trends, and preparing categorical data for machine learning. The document provides sample code snippets and methodologies to address these scenarios effectively.

Uploaded by

gauri pingat
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
8 views3 pages

Python Scenario Based Interview QA

The document presents various scenario-based Python interview questions tailored for freshers applying for data analysis roles. Each scenario includes a specific data-related challenge, such as cleaning data, analyzing sales, preparing datasets for churn prediction, identifying outliers, merging datasets, visualizing trends, and preparing categorical data for machine learning. The document provides sample code snippets and methodologies to address these scenarios effectively.

Uploaded by

gauri pingat
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 3

Scenario-Based Python Interview Questions for Data Analysis Roles (Freshers)

1. Scenario: You receive a CSV file with missing values, inconsistent casing, and duplicate

rows. How would you clean this data using Python?

Answer:

import pandas as pd

df = pd.read_csv('data.csv')

# Remove duplicates

df = df.drop_duplicates()

# Standardize casing (e.g., for a 'Name' column)

df['Name'] = df['Name'].str.title()

# Handle missing values

df = df.fillna(method='ffill')

2. Scenario: You have a sales dataset with columns: Date, Product, and Revenue. How would

you find the top 3 products with the highest average monthly revenue?

Answer:

df['Date'] = pd.to_datetime(df['Date'])

df['Month'] = df['Date'].dt.to_period('M')

monthly_avg = df.groupby(['Month', 'Product'])['Revenue'].mean().reset_index()

top_products =

monthly_avg.groupby('Product')['Revenue'].mean().sort_values(ascending=False).head(3)

3. Scenario: How would you prepare customer data with demographic info and activity logs
for a churn prediction model?

Answer:

- Handle missing values

- Convert categorical features to numeric using pd.get_dummies()

- Normalize/scale numerical features

- Merge datasets if activity logs are separate

- Label churn (e.g., Churn = 1 if customer left, else 0)

4. Scenario: You suspect some products have incorrect prices in a dataset. How would you

identify and handle outliers?

Answer:

Q1 = df['Price'].quantile(0.25)

Q3 = df['Price'].quantile(0.75)

IQR = Q3 - Q1

outliers = df[(df['Price'] < Q1 - 1.5*IQR) | (df['Price'] > Q3 + 1.5*IQR)]

df = df[~df.index.isin(outliers.index)]

5. Scenario: You have two datasets: users.csv and transactions.csv. How would you

combine them to analyze total spending per user?

Answer:

users = pd.read_csv('users.csv')

transactions = pd.read_csv('transactions.csv')

merged = pd.merge(users, transactions, on='user_id')

spending = merged.groupby('user_id')['amount'].sum()

6. Scenario: You have daily temperature data. How would you visualize trends and seasonal

patterns?
Answer:

import matplotlib.pyplot as plt

df['Date'] = pd.to_datetime(df['Date'])

df.set_index('Date', inplace=True)

plt.figure(figsize=(10,5))

plt.plot(df['Temperature'])

plt.title('Daily Temperature Trends')

plt.xlabel('Date')

plt.ylabel('Temperature')

plt.show()

7. Scenario: You have a column 'Country' with many categories. How would you prepare this

for machine learning?

Answer:

# Use OneHotEncoder or pd.get_dummies

df = pd.get_dummies(df, columns=['Country'], drop_first=True)

8. Scenario: Your dataset has a column 'Join_Date'. What features can you extract from it?

Answer:

df['Join_Date'] = pd.to_datetime(df['Join_Date'])

df['Year'] = df['Join_Date'].dt.year

df['Month'] = df['Join_Date'].dt.month

df['Weekday'] = df['Join_Date'].dt.day_name()

df['Join_Quarter'] = df['Join_Date'].dt.quarter

You might also like