Recently Asked Data Analyst Interview Questions
1. How would you use SQL to retrieve data, Python to clean it, and Power BI to
visualize it in a dashboard?
• SQL: Use SQL to extract relevant data from the database, writing queries to retrieve
specific columns, filter data, and apply aggregations as needed.
• Python: Use libraries like Pandas to clean the data: handle missing values,
standardize formats, and enforce consistent data types. Automate repetitive
cleaning tasks with custom scripts.
• Power BI: Load the cleaned data into Power BI, create calculated measures and
columns, and design an interactive dashboard using visuals like line charts, bar
graphs, and KPIs.
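The SQL and Python steps above can be sketched roughly as follows. This is a minimal illustration, not a definitive pipeline: the in-memory SQLite database, the orders table, its columns, and the orders_clean.csv file name are all assumptions made for the example. Power BI would then load the resulting CSV through its Text/CSV connector.

```python
import sqlite3
import pandas as pd

# Hypothetical in-memory database standing in for the real source system.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, region TEXT, order_date TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, ' north ', '2024-01-05', 120.0),
        (2, 'South',   '2024-01-06', NULL),
        (3, 'NORTH',   '2024-02-10', 80.0),
        (4, 'south',   '2023-12-31', 50.0);
""")

# SQL step: select only the needed columns and filter rows at the source.
query = """
    SELECT order_id, region, order_date, amount
    FROM orders
    WHERE order_date >= '2024-01-01'
    ORDER BY order_id
"""
raw = pd.read_sql_query(query, conn)
conn.close()

# Python step: standardize formats and handle missing values.
clean = raw.copy()
clean["region"] = clean["region"].str.strip().str.title()
clean["order_date"] = pd.to_datetime(clean["order_date"])
clean["amount"] = clean["amount"].fillna(0.0)

# Hand-off step: write a flat file for Power BI to import (assumed file name).
clean.to_csv("orders_clean.csv", index=False)
print(clean)
```

Filling missing amounts with 0 is only one option; whether to impute, flag, or drop such rows depends on what a missing value means in the business context.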
2. Suppose you have a dataset in Excel. How would you analyze and visualize
the data in Power BI using SQL as a data source?
• Connect Power BI to the SQL database as the primary source and use SQL queries
to pull the data.
• Import the Excel dataset into Power BI as a second source. Relate it to the SQL
tables on a shared key, or append it in Power Query when both have the same columns.
• Create a data model and use Power BI’s DAX formulas for deeper analysis. Visualize
the data through dynamic dashboards, incorporating slicers for interactivity.
3. How would you handle data from multiple sources (SQL, Excel, CSV files) to
prepare a unified dataset for analysis?
• Extract data from SQL using queries, load Excel files and CSVs using tools like
Python (Pandas) or Power Query in Power BI.
• Clean, transform, and consolidate the datasets by standardizing column names,
handling missing values, and merging them using joins or unions.
• Load the unified dataset into Power BI and create a robust data model for analysis.
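A rough Pandas sketch of the consolidation step described above. The table, column names, and sample values are hypothetical, and small in-memory stand-ins replace the real SQL database, CSV file, and Excel workbook so the example runs on its own.

```python
import io
import sqlite3
import pandas as pd

# Source 1: a SQL table (hypothetical in-memory SQLite database).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (customer_id INTEGER, amount REAL);
    INSERT INTO sales VALUES (1, 100.0), (2, 250.0);
""")
sql_df = pd.read_sql_query("SELECT customer_id, amount FROM sales", conn)
conn.close()

# Source 2: a CSV export with differently named columns
# (inline text stands in for a file on disk).
csv_text = "Customer ID,Amount\n3,75.5\n4,\n"
csv_df = pd.read_csv(io.StringIO(csv_text))

# Source 3: an Excel sheet would normally be read with pd.read_excel();
# a small DataFrame stands in here so the sketch needs no file.
excel_df = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4],
    "Region": ["North", "South", "East", "West"],
})

def standardize_columns(df):
    """Lower-case column names and replace spaces with underscores."""
    return df.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))

csv_df = standardize_columns(csv_df)
excel_df = standardize_columns(excel_df).rename(columns={"customerid": "customer_id"})

# Union: stack rows from sources that share a schema, then fill missing amounts.
sales = pd.concat([sql_df, csv_df], ignore_index=True)
sales["amount"] = sales["amount"].fillna(0.0)

# Join: attach customer attributes from the lookup table.
unified = sales.merge(excel_df, on="customer_id", how="left")
print(unified)
```

A left join keeps every sales row even when the lookup table has no match, which makes unmatched keys visible as missing regions instead of silently dropping them.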
4. How would you calculate the total sales per region using SQL, create a
Power BI dashboard, and visualize the trends over time using Python?
• SQL: Write a query to calculate total sales per region using a GROUP BY clause.
• Power BI: Load the SQL output into Power BI, create measures for total sales, and
use line or area charts to visualize trends.
• Python: Use libraries like Matplotlib or Seaborn for trend analysis, applying
statistical techniques like rolling averages or growth rates.
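The GROUP BY query and the trend calculations mentioned above might look like this. It is a sketch under assumptions: the sales table, its columns, and the figures are invented for illustration, and SQLite stands in for whatever database is actually used.

```python
import sqlite3
import pandas as pd

# Hypothetical sales table in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, sale_month TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('North', '2024-01', 100.0), ('North', '2024-02', 120.0),
        ('North', '2024-03', 150.0), ('South', '2024-01', 200.0),
        ('South', '2024-02', 180.0), ('South', '2024-03', 220.0);
""")

# Total sales per region.
totals = pd.read_sql_query(
    """
    SELECT region, SUM(amount) AS total_sales
    FROM sales
    GROUP BY region
    ORDER BY region
    """,
    conn,
)

# Total sales per month, the input for the trend analysis.
monthly = pd.read_sql_query(
    """
    SELECT sale_month, SUM(amount) AS total_sales
    FROM sales
    GROUP BY sale_month
    ORDER BY sale_month
    """,
    conn,
)
conn.close()

# Two-month rolling average and month-over-month growth rate.
monthly["rolling_avg_2m"] = monthly["total_sales"].rolling(window=2).mean()
monthly["growth_rate"] = monthly["total_sales"].pct_change()

print(totals)
print(monthly)
```

With Matplotlib installed, monthly.plot(x="sale_month", y=["total_sales", "rolling_avg_2m"]) would draw the raw series next to its smoothed version.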
5. Given a dataset in Excel, how would you identify the top 5 customers based
on sales, and then visualize the results in Power BI?
• Import the Excel dataset into Power BI or Python. Use DAX formulas or Pandas to
calculate sales per customer.
• Identify the top 5 customers using TOPN in Power BI or the Pandas .nlargest()
method.
• Visualize results with bar charts or pie charts in Power BI, labeling the top
customers for clarity.
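On the Pandas side, the top-5 calculation described above could be sketched like this. The customer and sales column names and the sample rows are hypothetical; in practice the DataFrame would come from pd.read_excel() on the actual workbook.

```python
import pandas as pd

# Hypothetical rows standing in for the Excel dataset.
orders = pd.DataFrame({
    "customer": ["A", "B", "C", "D", "E", "F", "A", "C", "F"],
    "sales":    [100, 250, 75, 310, 50, 125, 200, 25, 400],
})

# Aggregate sales per customer, then keep the five largest totals.
sales_per_customer = orders.groupby("customer")["sales"].sum()
top5 = sales_per_customer.nlargest(5)

# Flatten to a two-column table that is easy to export or chart.
top5_df = top5.reset_index()
print(top5_df)
```

Aggregating before ranking matters here: ranking individual order rows would return the five largest orders, not the five largest customers.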
6. How would you use SQL to identify and remove duplicates, clean missing
values in Python, and then create a dynamic report in Power BI?
• SQL: Use ROW_NUMBER() with PARTITION BY to rank the rows within each duplicate
group and keep only the latest one, or SELECT DISTINCT to drop rows that are exact copies.
• Python: Handle missing values with Pandas, applying methods like .fillna() for
imputation or .dropna() to remove incomplete rows.
• Power BI: Load the cleaned dataset into Power BI, design calculated measures, and
create an interactive dashboard with visuals and slicers.
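A minimal sketch of the ROW_NUMBER() and missing-value steps above. It assumes a hypothetical customers table with an updated_at column that defines which duplicate is "latest", and it uses SQLite, whose window functions require version 3.25 or newer.

```python
import sqlite3
import pandas as pd

# Hypothetical table containing a duplicated customer and some missing values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, email TEXT, updated_at TEXT, sales REAL);
    INSERT INTO customers VALUES
        (1, 'a@example.com', '2024-01-01', 100.0),
        (1, 'a@example.com', '2024-03-01', NULL),
        (2, 'b@example.com', '2024-02-01', 50.0),
        (3, NULL,            '2024-02-15', 75.0);
""")

# Number the rows within each customer_id, newest first, and keep row 1 only.
dedup_query = """
    SELECT customer_id, email, updated_at, sales
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY customer_id
                   ORDER BY updated_at DESC
               ) AS rn
        FROM customers
    ) AS ranked
    WHERE rn = 1
    ORDER BY customer_id
"""
latest = pd.read_sql_query(dedup_query, conn)
conn.close()

# Impute missing sales with 0, then drop rows that still lack an email.
latest["sales"] = latest["sales"].fillna(0.0)
latest = latest.dropna(subset=["email"])
print(latest)
```

Filtering on rn > 1 instead of rn = 1 lists the rows that would be discarded, which is a useful check before deleting anything.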
• Python:
    import pandas as pd

    # Load and clean dataset
    df = pd.read_csv("sales_data.csv")
    df = df.drop_duplicates()
    df = df.fillna({"Sales": 0})

    # Save cleaned data
    df.to_csv("cleaned_data.csv", index=False)
• SQL: Load the cleaned data into the database and use GROUP BY queries to
aggregate sales by region or product.
• Power BI: Import the aggregated data and create visualizations like clustered bar
charts or KPI cards.
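The load-and-aggregate step might be sketched as follows, assuming SQLite as the target database. The Region and Product columns and the sample rows are hypothetical; only the Sales column appears in the script above.

```python
import sqlite3
import pandas as pd

# Hypothetical cleaned rows standing in for the cleaned CSV file.
cleaned = pd.DataFrame({
    "Region":  ["North", "North", "South", "South", "South"],
    "Product": ["Widget", "Gadget", "Widget", "Widget", "Gadget"],
    "Sales":   [100.0, 0.0, 250.0, 50.0, 75.0],
})

# Load the cleaned data into a database table (in-memory SQLite here).
conn = sqlite3.connect(":memory:")
cleaned.to_sql("sales", conn, index=False, if_exists="replace")

# Aggregate sales by region with GROUP BY.
by_region = pd.read_sql_query(
    """
    SELECT Region, SUM(Sales) AS total_sales, COUNT(*) AS order_count
    FROM sales
    GROUP BY Region
    ORDER BY Region
    """,
    conn,
)
conn.close()
print(by_region)
```

Grouping by Product, or by both Region and Product, only requires changing the SELECT and GROUP BY column lists.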
9. How would you use Power BI to perform a time series analysis on sales
data, pulling the data from a SQL database, and calculate key performance
metrics using Python?
10. How would you handle time-based data in SQL, visualize trends in Power
BI, and perform statistical analysis using Python to calculate growth rates?
• SQL: Use date functions such as YEAR(), MONTH(), or DATEPART() (SQL Server syntax;
other databases offer equivalents such as EXTRACT()) to aggregate time-based data by period.
• Power BI: Import SQL output and use visuals like area or line charts to show trends.
Use slicers to filter by year or month.
• Python: Calculate growth rates with pct_change() in Pandas, and fit a trend with
regression, for example curve_fit() from scipy.optimize. Use Matplotlib to visualize the analysis.
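A short sketch of the growth-rate calculation above, using invented monthly figures that grow about 10% per month. For the regression step it substitutes NumPy's polyfit() for curve_fit() to keep the example dependency-light; that is a swap made for illustration, not the method named in the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly sales that grow 10% per month.
monthly = pd.DataFrame({
    "month": pd.date_range("2024-01-01", periods=6, freq="MS"),
    "sales": [100.0, 110.0, 121.0, 133.1, 146.41, 161.051],
})

# Month-over-month growth rate; the first row has no prior month, so it is NaN.
monthly["mom_growth"] = monthly["sales"].pct_change()

# Compound monthly growth rate across the whole period.
periods = len(monthly) - 1
cmgr = (monthly["sales"].iloc[-1] / monthly["sales"].iloc[0]) ** (1 / periods) - 1

# Linear trend fitted by least squares (polyfit returns the slope first for deg=1).
x = np.arange(len(monthly))
slope, intercept = np.polyfit(x, monthly["sales"].to_numpy(), deg=1)
monthly["trend"] = intercept + slope * x

print(monthly)
print(f"Compound monthly growth rate: {cmgr:.2%}")
```

A straight line is a rough fit for compounding growth like this; fitting the logarithm of sales, or an exponential model through curve_fit(), would match the shape more closely.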
Each answer demonstrates a structured and professional approach, combining technical skills and
domain knowledge for the data analyst role.