
SQL Python PowerBI Questions and Answers


SQL, Python, and Power BI Interview Questions & Answers


SQL Questions

1. Trend Analysis in Customer Orders


Query:
SELECT CustomerID, AVG(OrderValue) AS AverageOrderValue
FROM Orders
WHERE OrderDate >= DATEADD(MONTH, -6, GETDATE())
GROUP BY CustomerID;

2. Customer Retention Insights


Query:
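SELECT CustomerID, COUNT(SubscriptionID) AS RenewalCount
FROM Subscriptions
-- Rolling 12-month window; DATEDIFF(YEAR, ...) = 1 would instead match
-- only subscriptions started in the previous calendar year.
WHERE StartDate >= DATEADD(YEAR, -1, GETDATE())
GROUP BY CustomerID
HAVING COUNT(SubscriptionID) > 2;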

3. Churn Rate Calculation


Query:
WITH UserActivityRecent AS (
    SELECT UserID, MAX(ActivityDate) AS LastActivityDate
    FROM UserActivity
    GROUP BY UserID
)
SELECT MONTH(LastActivityDate) AS Month,
       -- Multiply by 1.0 so the ratio is not truncated by integer division
       COUNT(UserID) * 1.0 / (SELECT COUNT(*) FROM UserActivityRecent) AS ChurnRate
FROM UserActivityRecent
WHERE DATEDIFF(DAY, LastActivityDate, GETDATE()) > 30
GROUP BY MONTH(LastActivityDate);
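
A ratio of two integer counts is integer arithmetic in SQL Server, so it truncates toward zero unless one operand is made decimal. The sketch below reproduces the effect with Python's built-in sqlite3 module, which also divides integers this way. The Churned column and the four rows are made-up illustration data, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserActivityRecent (UserID INTEGER, Churned INTEGER)")
# Hypothetical data: 1 churned user out of 4.
conn.executemany(
    "INSERT INTO UserActivityRecent VALUES (?, ?)",
    [(1, 1), (2, 0), (3, 0), (4, 0)],
)

# First column: integer / integer. Second column: forced to floating point.
truncated, exact = conn.execute(
    """
    SELECT SUM(Churned) / COUNT(*),
           SUM(Churned) * 1.0 / COUNT(*)
    FROM UserActivityRecent
    """
).fetchone()
conn.close()

print(truncated, exact)
```

The first value loses the fraction entirely, which is why a rate computed from counts is usually multiplied by 1.0 or cast before dividing.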

4. Data Partitioning
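Partitioning improves query performance by dividing a large table into smaller, more
manageable pieces based on a key column, typically a date or a categorical value. Queries
that filter on the partition key can then skip partitions that cannot contain matching rows
(partition pruning). Example (MySQL syntax; the year boundaries are illustrative):
CREATE TABLE Orders (
    OrderID INT,
    OrderDate DATE,
    OrderValue DECIMAL(10, 2)
)
PARTITION BY RANGE (YEAR(OrderDate)) (
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025),
    PARTITION pmax VALUES LESS THAN MAXVALUE
);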
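
The idea behind partitioning can be sketched in plain Python: rows are bucketed by year, and a query for one year reads only that bucket. This is a conceptual analogy only, not how a database engine stores partitions; the order rows and the total_for_year helper are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (OrderID, OrderDate, OrderValue) rows.
orders = [
    (1, date(2023, 3, 1), 120.00),
    (2, date(2023, 11, 15), 80.50),
    (3, date(2024, 1, 9), 42.00),
    (4, date(2024, 6, 30), 310.25),
]

# "Partition" the rows by the year of the order date.
partitions = defaultdict(list)
for order in orders:
    partitions[order[1].year].append(order)

def total_for_year(year):
    """Sum order values for one year, touching only that year's bucket."""
    rows = partitions.get(year, [])
    return sum(value for _, _, value in rows), len(rows)

total_2024, rows_scanned = total_for_year(2024)
print(total_2024, rows_scanned)
```

Only two of the four rows are read to answer the 2024 query, which is the saving that partition pruning aims for on much larger tables.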

5. Recursive Queries
Query:
WITH EmployeeHierarchy AS (
    SELECT EmployeeID, ManagerID, Department
    FROM Employees
    WHERE ManagerID IS NULL
    UNION ALL
    SELECT e.EmployeeID, e.ManagerID, e.Department
    FROM Employees e
    INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
)
SELECT * FROM EmployeeHierarchy;
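
To see the recursion at work, the same pattern can be run against an in-memory SQLite database from Python. This is a sketch under a few assumptions: SQLite spells the clause WITH RECURSIVE, the four employee rows are invented, and the Level column is an addition that records how many steps each employee is from the top.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (EmployeeID INTEGER, ManagerID INTEGER, Department TEXT)"
)
# Hypothetical org chart: 1 is the root, 2 and 3 report to 1, 4 reports to 3.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [(1, None, "Exec"), (2, 1, "Sales"), (3, 1, "Engineering"), (4, 3, "Engineering")],
)

rows = conn.execute(
    """
    WITH RECURSIVE EmployeeHierarchy AS (
        -- Anchor member: employees with no manager
        SELECT EmployeeID, ManagerID, Department, 0 AS Level
        FROM Employees
        WHERE ManagerID IS NULL
        UNION ALL
        -- Recursive member: direct reports of rows found so far
        SELECT e.EmployeeID, e.ManagerID, e.Department, eh.Level + 1
        FROM Employees e
        INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
    )
    SELECT EmployeeID, Level FROM EmployeeHierarchy ORDER BY EmployeeID
    """
).fetchall()
conn.close()

print(rows)
```

The anchor member seeds the result with the root, and each pass of the recursive member adds the next layer of reports until no new rows are found.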

Python Questions

1. Data Cleaning Pipeline


Python function to clean a dataset:
import pandas as pd

def clean_data(df):
    # Remove duplicate rows
    df = df.drop_duplicates()
    # Forward-fill missing values (fillna(method='ffill') is deprecated in recent pandas)
    df = df.ffill()
    # Standardize column names: lowercase, spaces to underscores
    df.columns = df.columns.str.lower().str.replace(' ', '_')
    return df
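
A short usage sketch may help show what each step does. The function is repeated so the example runs on its own (using df.ffill() for the forward fill), and the small DataFrame is invented for illustration.

```python
import pandas as pd

def clean_data(df):
    df = df.drop_duplicates()          # drop exact duplicate rows
    df = df.ffill()                    # forward-fill missing values
    df.columns = df.columns.str.lower().str.replace(" ", "_")
    return df

# Hypothetical input: one duplicate row and one missing value.
raw = pd.DataFrame({
    "Customer Name": ["Ann", "Ann", "Bob", "Cy"],
    "Order Value": [10.0, 10.0, None, 30.0],
})

cleaned = clean_data(raw)
print(cleaned)
```

Note that forward-filling copies the previous row's value, so Bob's missing order value is taken from Ann's row. That is only sensible when row order carries meaning, such as in a time series.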

2. Natural Language Processing (NLP)


Script to analyze sentiment of customer reviews using TextBlob:
from textblob import TextBlob
import pandas as pd

def analyze_sentiment(review):
    blob = TextBlob(review)
    return blob.sentiment.polarity

df = pd.read_csv('customer_reviews.csv')
df['sentiment'] = df['review'].apply(analyze_sentiment)

3. Data Sampling
To create a stratified sample ensuring key category proportions are maintained:
import pandas as pd

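def stratified_sample(df, stratify_column, sample_size):
    # Give each group a share of sample_size proportional to its size;
    # int() truncates, so the total can fall slightly short of sample_size.
    return df.groupby(stratify_column, group_keys=False).apply(
        lambda x: x.sample(int(sample_size * len(x) / len(df)))
    )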
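
The proportional allocation can be checked on a small example. The sketch below uses the same per-group formula but iterates over the groups and concatenates the pieces instead of calling apply, which avoids the deprecation warning newer pandas versions raise when apply operates on the grouping column. The seed parameter, the segment column, and the 80/20 data are assumptions made for illustration.

```python
import pandas as pd

def stratified_sample(df, stratify_column, sample_size, seed=0):
    """Sample each group in proportion to its share of df (seed is a hypothetical extra)."""
    parts = [
        group.sample(n=int(sample_size * len(group) / len(df)), random_state=seed)
        for _, group in df.groupby(stratify_column)
    ]
    return pd.concat(parts)

# Hypothetical data: 80 rows in segment A, 20 rows in segment B.
df = pd.DataFrame({
    "segment": ["A"] * 80 + ["B"] * 20,
    "value": range(100),
})

sample = stratified_sample(df, "segment", sample_size=10)
print(sample["segment"].value_counts())
```

With a 10-row sample, segment A contributes 8 rows and segment B contributes 2, matching the 80/20 split of the full dataset.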

4. Parallel Processing
Python program for parallel processing using multiprocessing:
import multiprocessing

def process_data(data):
    return data * 2

if __name__ == '__main__':
    data = [1, 2, 3, 4, 5]
    # Distribute the inputs across 4 worker processes
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(process_data, data)
    print(results)

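Advantages: For CPU-bound work, parallel processing reduces overall execution time by
spreading tasks across multiple CPU cores. Each worker is a separate process, so the tasks
are not serialized by Python's global interpreter lock.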

5. Database Interaction
Python script to connect to MySQL and save data into Excel:
import mysql.connector
import pandas as pd

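def fetch_data_from_db():
    # Placeholder credentials; replace with real connection settings
    conn = mysql.connector.connect(
        host='localhost',
        user='root',
        password='password',
        database='mydb'
    )
    try:
        query = 'SELECT * FROM table_name'
        df = pd.read_sql(query, conn)
        # Writing .xlsx requires an Excel engine such as openpyxl
        df.to_excel('output.xlsx', index=False)
    finally:
        # Close the connection even if the query or export fails
        conn.close()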

Power BI Questions

1. Integration with External Tools


External tools like DAX Studio or Tabular Editor can enhance Power BI development by
providing additional functionality for querying, optimizing models, and performing
advanced DAX debugging. They can be used to fine-tune performance and gain deeper
insights into the data model.

2. Parameterized Reports
To create parameterized reports in Power BI, define parameters that filter data based on
user input. Query parameters (created under Manage Parameters in Power Query) can drive
source or query filters, and parameter values can also feed DAX measures, so the same report
can be customized by inputs such as date ranges or regions.

3. Power BI vs. Tableau


Power BI is generally more affordable and integrates well with other Microsoft tools,
making it ideal for organizations heavily using the Microsoft ecosystem. Tableau offers
more advanced visualization options, but Power BI is often recommended for more cost-
effective and integrated enterprise environments.

4. Aggregations in Power BI Models


Aggregations in Power BI allow large datasets to be summarized for faster reporting. By
pre-aggregating data into a smaller summary table, Power BI can answer most queries from
that table instead of scanning the detail rows, improving performance in reports that need
quick responses.
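
The effect of pre-aggregation can be illustrated outside Power BI with pandas. This is an analogy only, not Power BI's aggregation feature or API; the detail rows and column names are hypothetical.

```python
import pandas as pd

# Hypothetical detail table: one row per order.
detail = pd.DataFrame({
    "order_date": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-14"]
    ),
    "region": ["East", "East", "West", "East"],
    "order_value": [100.0, 50.0, 70.0, 30.0],
})

# Pre-aggregate to one row per month and region.
agg = (
    detail.assign(month=detail["order_date"].dt.to_period("M"))
    .groupby(["month", "region"], as_index=False)["order_value"]
    .sum()
)

# The same regional total can be answered from either table.
total_from_detail = detail.loc[detail["region"] == "East", "order_value"].sum()
total_from_agg = agg.loc[agg["region"] == "East", "order_value"].sum()
print(len(detail), len(agg), total_from_detail, total_from_agg)
```

The summary table has fewer rows than the detail table yet returns the same total for questions at month and region grain. Questions at a finer grain, such as individual orders, still need the detail rows.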

5. Gateway Configuration
To configure an On-Premises Data Gateway in Power BI, download and install the gateway
on a server, sign in and register it with your Power BI account, and then add your
on-premises data sources to it. Troubleshooting connectivity issues usually starts with
checking network and firewall settings; if those are correct, reconfiguring the gateway
may be needed.
