0% found this document useful (0 votes)
6 views

Python_Questions__BA

The document provides a series of Python interview questions and answers tailored for a Business Analyst role, focusing on data manipulation using Pandas. Key topics include loading and filtering CSV files, data aggregation, handling missing values, and performing SQL-like operations in Pandas. It also covers automation of reporting processes and analyzing user behavior through Python scripts.

Uploaded by

shiva gautam
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views

Python_Questions__BA

The document provides a series of Python interview questions and answers tailored for a Business Analyst role, focusing on data manipulation using Pandas. Key topics include loading and filtering CSV files, data aggregation, handling missing values, and performing SQL-like operations in Pandas. It also covers automation of reporting processes and analyzing user behavior through Python scripts.

Uploaded by

shiva gautam
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 5

Python Interview Questions for Business Analyst Role (CarInfo)

Q: How do you load a large CSV file using Python and filter rows based on a condition?

A: Use pandas:

'''python

import pandas as pd

df = pd.read_csv('data.csv')

filtered = df[df['column_name'] > 100]

'''

Q: What's the difference between .loc[] and .iloc[] in Pandas?

A: '.loc[]' is label-based indexing; '.iloc[]' is integer-position based. Example:

'''python

df.loc[1] # by label/index name

df.iloc[1] # by position

'''

Q: How would you group data by a column and calculate aggregate metrics like sum or mean?

A: '''python

df.groupby('category_column')['value_column'].sum()

df.groupby('category_column')['value_column'].mean()

'''

Q: How can you merge two dataframes in Pandas? What types of joins are available?

A: Use 'pd.merge()'. Joins: inner, outer, left, right.

'''python

pd.merge(df1, df2, on='id', how='left')

'''
Q: Explain how you would create a new column based on conditions from other columns.

A: '''python

df['new_col'] = df['col1'].apply(lambda x: 'High' if x > 100 else 'Low')

'''

Q: Can you pivot and unpivot a table using Pandas? When would you do that?

A: 'pivot_table()' reshapes data; use 'melt()' to unpivot.

'''python

pd.pivot_table(df, values='val', index='row', columns='col')

df.melt(id_vars=['id'])

'''

Q: How do you handle missing, duplicate, or outlier values in a dataset?

A: '''python

df.dropna(), df.fillna(0), df.duplicated(), df.drop_duplicates()

# Outliers: use IQR or Z-score

'''

Q: How would you convert a column from string to datetime, and extract parts like day, month?

A: '''python

df['date'] = pd.to_datetime(df['date_str'])

df['day'] = df['date'].dt.day

'''

Q: How can you detect and remove outliers using IQR or Z-score in Python?

A: '''python

Q1 = df['col'].quantile(0.25)

Q3 = df['col'].quantile(0.75)

IQR = Q3 - Q1
filtered = df[(df['col'] >= Q1 - 1.5 * IQR) & (df['col'] <= Q3 + 1.5 * IQR)]

'''

Q: You're asked to replicate a SQL GROUP BY + HAVING clause in Pandas. How would you do it?

A: '''python

grouped = df.groupby('group_col').agg({'val_col': 'sum'})

filtered = grouped[grouped['val_col'] > 100]

'''

Q: How can you perform a window function (like moving average) in Pandas?

A: '''python

df['moving_avg'] = df['value'].rolling(window=3).mean()

'''

Q: Have you ever automated a reporting process in Python? How did you schedule or trigger it?

A: Use 'schedule', 'APScheduler', or cron jobs. Write a script and schedule it with cron (Linux) or Task

Scheduler (Windows).

Q: How would you connect to a SQL database and run a query using Python?

A: '''python

import mysql.connector

conn = mysql.connector.connect(...)

cursor = conn.cursor()

cursor.execute('SELECT * FROM table')

'''

Q: How would you send automated email reports or alert messages from a Python script?

A: Use 'smtplib' or libraries like 'yagmail'. Example:

'''python
import smtplib

# setup SMTP and send

'''

Q: Write a function to calculate the percentage change between two values.

A: '''python

def percent_change(old, new):

return ((new - old) / old) * 100

'''

Q: How do you read a JSON file and extract specific fields?

A: '''python

import json

with open('file.json') as f:

data = json.load(f)

print(data['key'])

'''

Q: What are Python list comprehensions? Give an example.

A: A concise way to create lists:

'''python

squares = [x**2 for x in range(10)]

'''

Q: A dashboard shows drop in user renewals. How will you use Python to investigate the reason?

A: Analyze churn data, compare previous periods, check feature usage, demographics, and user activity.

Q: You notice a spike in app installs but no rise in purchases. What Python scripts would you run to

analyze user behavior?


A: Cohort analysis, funnel drop-off analysis, session tracking, and check conversion ratios.

Q: You need to identify which car brand's insurance sales dropped in the last month. How will you do

this using Python + SQL?

A: Query monthly sales by brand using SQL, load into Pandas, and compare month-over-month sales using

grouping and diff().

You might also like