Walmart Data Analyst Interview Experience
This question tests your understanding of Python data structures such as sets and dictionaries:
• Sets: Store only unique elements, so converting a list to a set removes duplicates.
• Dictionaries: Key-value pairs store the number of occurrences of each unique element
efficiently.
Using these, you can identify the unique elements and count their occurrences.
Code
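# Sample list
data = [1, 2, 2, 3, 4, 4, 4, 5]
# A set keeps only the unique elements
unique_values = set(data)
# A dictionary maps each element to the number of times it appears
occurrences = {}
for value in data:
    occurrences[value] = occurrences.get(value, 0) + 1
print("Occurrences:", occurrences)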
Output:
Occurrences: {1: 1, 2: 2, 3: 1, 4: 3, 5: 1}
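The same counting can also be done with the standard library's collections.Counter, which interviewers often accept as the idiomatic alternative. A minimal sketch using the same sample list (the variable names are only illustrative):

```python
from collections import Counter

# Same sample list as in the example above
data = [1, 2, 2, 3, 4, 4, 4, 5]

# Counter builds the element -> count mapping in a single pass
occurrences = Counter(data)

# The keys of the Counter are the unique elements
unique_values = sorted(occurrences)

print("Unique values:", unique_values)
print("Occurrences:", dict(occurrences))
print("Most common:", occurrences.most_common(1))
```

Counter is a dict subclass, so it can be used anywhere a plain dictionary of counts is expected.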
• Grouping and Aggregation: Use groupby() to group data by products and calculate
total sales using aggregation functions like sum().
Code
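import pandas as pd
# Sample datasets (values are illustrative; 'product_id' as the join key is an assumption)
products = pd.DataFrame({
    'product_id': [1, 2, 3, 4],
    'product_name': ['A', 'B', 'C', 'D'],
    'promotion_valid': [True, True, False, True]
})
sales = pd.DataFrame({
    'product_id': [1, 1, 2, 3, 4],
    'sales': [100, 100, 150, 300, 50]
})
# Join sales to products, keep rows with a valid promotion, then total sales per product
merged_data = sales.merge(products, on='product_id')
valid_promotions = merged_data[merged_data['promotion_valid']]
total_sales = valid_promotions.groupby('product_name')['sales'].sum()
print(total_sales)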
Output:
product_name
A    200
B    150
D     50
Name: sales, dtype: int64
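To make explicit what the boolean filter, groupby() and sum() steps compute, here is a plain-Python sketch of the same logic. The rows are hypothetical and were chosen only so that the totals match the output shown above.

```python
from collections import defaultdict

# Hypothetical merged rows: (product_name, promotion_valid, sales)
rows = [
    ("A", True, 100),
    ("A", True, 100),
    ("B", True, 150),
    ("C", False, 300),
    ("D", True, 50),
]

total_sales = defaultdict(int)
for product_name, promotion_valid, amount in rows:
    if promotion_valid:                      # plays the role of the boolean mask
        total_sales[product_name] += amount  # plays the role of groupby(...).sum()

print(dict(total_sales))
```

Product C never appears in the result because its promotion is not valid, which is why it is also absent from the pandas output.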
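• Lists: Ordered, mutable, allows duplicates. Used for general-purpose sequences that change over time.
• Tuples: Ordered, immutable, allows duplicates. Used for fixed collections of items.
• Sets: Unordered, mutable, no duplicates. Ideal for membership tests and unique
element extraction.
• Dictionaries: Mutable key-value pairs with unique keys; insertion order is preserved since
Python 3.7. Excellent for fast lookups and association of data.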
Code
# List
my_list = [1, 2, 3, 3]
print("List:", my_list)
# Tuple
my_tuple = (1, 2, 3, 3)
print("Tuple:", my_tuple)
# Set
my_set = {1, 2, 3, 3}
print("Set:", my_set)
# Dictionary
my_dict = {'a': 1, 'b': 2, 'c': 3}
print("Dictionary:", my_dict)
Output:
List: [1, 2, 3, 3]
Tuple: (1, 2, 3, 3)
Set: {1, 2, 3}
Dictionary: {'a': 1, 'b': 2, 'c': 3}
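The printed values alone do not show the behavioural differences (mutability, duplicate handling, lookups), so here is a short sketch that exercises each property. The helper variable names are only illustrative.

```python
my_list = [1, 2, 3, 3]
my_tuple = (1, 2, 3, 3)
my_set = {1, 2, 3, 3}
my_dict = {'a': 1, 'b': 2, 'c': 3}

# Lists are mutable: in-place changes are allowed
my_list.append(4)

# Tuples are immutable: item assignment raises TypeError
try:
    my_tuple[0] = 99
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True

# Sets drop duplicates and support fast membership tests
set_size = len(my_set)
has_two = 2 in my_set

# Dictionaries look values up by key
value_b = my_dict['b']

print(my_list, tuple_is_immutable, set_size, has_two, value_b)
```

In an interview it is usually enough to name one operation per structure that the others do not support or handle differently.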
POWER BI
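• Import Mode:
o Data is loaded into Power BI’s in-memory model, so queries run against the imported copy.
o The report becomes static and doesn’t reflect real-time changes in the
source unless refreshed.
• DirectQuery Mode:
o Data stays in the source system, and queries are sent to fetch data as
needed.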
• Slicers:
o Interactive visuals that allow users to filter data directly on the dashboard.
o Example: A slicer for "Year" allows selecting specific years to filter all linked
visuals.
• Visual-Level Filters:
o Filters applied to specific visuals rather than the entire page or report.
o Set in the Filters pane rather than on the report canvas, so they are less visible to
end-users, but they give control over what data is displayed in a specific visual.
o Example: A filter applied to a bar chart to display only sales > $10,000.
Impact:
Slicers enhance user interactivity, allowing dynamic filtering, while visual-level filters
provide static control for specific visuals.
• RLS restricts data access based on roles, ensuring that users or groups see only the
data they are authorized to view.
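• Implementation Steps:
1. Define roles in Power BI Desktop: Use DAX expressions to filter data based on
user criteria (e.g., [Region] = "North").
2. Publish the report and assign users or groups to each role in the Power BI Service.
Example:
To restrict regional managers to their own region's data, create one role per region with a
static DAX filter:
[Region] = "North"
For a single dynamic role, compare a column holding each manager's sign-in name (here a
hypothetical ManagerEmail column) with the current user:
[ManagerEmail] = USERPRINCIPALNAME()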
• Paginated Reports:
o Data is displayed across multiple pages, with precise control over layout.
• When to Use:
o When you need formatted, printable outputs that may span multiple pages.
Example: A paginated report would be ideal for generating monthly sales invoices for a
large number of customers.
SQL
1. Find the Second-Highest Salary in a Department
Theoretical Explanation
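• ROW_NUMBER(): Assigns a unique sequential number to each row within a
partition of data, so tied salaries still receive different numbers.
• DENSE_RANK(): Assigns ranks to rows in a partition, but ties receive the same rank.
There are no gaps in ranks, which makes it the safer choice when salaries can tie.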
To find the second-highest salary in each department, partition data by department_id and
order salaries in descending order, then filter for rank = 2.
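Query
WITH RankedSalaries AS (
    SELECT
        department_id,
        employee_id,
        salary,
        DENSE_RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rank
    FROM employees
)
SELECT department_id, employee_id, salary
FROM RankedSalaries
WHERE salary_rank = 2;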
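To see why the choice between ROW_NUMBER() and DENSE_RANK() matters when salaries tie, the sketch below runs both against a small in-memory SQLite table from Python. The sample rows are made up for illustration, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER, department_id INTEGER, salary INTEGER)"
)
# Hypothetical data: employees 1 and 2 tie for the top salary in department 10
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, 10, 900), (2, 10, 900), (3, 10, 700), (4, 20, 500), (5, 20, 400)],
)

query = """
SELECT employee_id,
       ROW_NUMBER() OVER (PARTITION BY department_id
                          ORDER BY salary DESC, employee_id) AS rn,
       DENSE_RANK() OVER (PARTITION BY department_id
                          ORDER BY salary DESC) AS dr
FROM employees
ORDER BY employee_id
"""
rows = conn.execute(query).fetchall()

# Employees picked as "second" by each function
second_by_row_number = [emp for emp, rn, dr in rows if rn == 2]
second_by_dense_rank = [emp for emp, rn, dr in rows if dr == 2]

print("ROW_NUMBER = 2:", second_by_row_number)
print("DENSE_RANK = 2:", second_by_dense_rank)
```

With the tie in department 10, ROW_NUMBER() picks employee 2, who actually earns the top salary, while DENSE_RANK() picks employee 3, who earns the second-highest distinct salary.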
Query
SELECT
user_id,
transaction_date,
COUNT(*) AS total_transactions
FROM transactions
GROUP BY user_id, transaction_date;
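Because the select list mixes plain columns with COUNT(*), the query has to group by user_id and transaction_date. A small sketch that tries this out on an in-memory SQLite table from Python; the sample rows and the ORDER BY are assumptions added only to make the result easy to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (user_id INTEGER, transaction_date TEXT)")
# Hypothetical data: user 1 has two transactions on the same day
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(1, "2024-01-01"), (1, "2024-01-01"), (1, "2024-01-02"), (2, "2024-01-01")],
)

rows = conn.execute(
    """
    SELECT user_id, transaction_date, COUNT(*) AS total_transactions
    FROM transactions
    GROUP BY user_id, transaction_date
    ORDER BY user_id, transaction_date
    """
).fetchall()

for row in rows:
    print(row)
```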
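This involves:
1. Joining the projects table with the employees table to calculate the number of
employees per project.
2. Dividing each project's budget by its employee count, returning 0 for projects with
no employees to avoid division by zero.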
Assume Tables
• projects(project_id, budget)
• employees(employee_id, project_id)
Query
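WITH ProjectEmployeeCount AS (
    SELECT
        p.project_id,
        p.budget,
        COUNT(e.employee_id) AS total_employees
    FROM projects p
    -- LEFT JOIN keeps projects that have no employees
    LEFT JOIN employees e ON e.project_id = p.project_id
    GROUP BY p.project_id, p.budget
),
BudgetRatio AS (
    SELECT
        project_id,
        budget,
        total_employees,
        CASE
            WHEN total_employees > 0 THEN budget * 1.0 / total_employees
            ELSE 0
        END AS budget_per_employee
    FROM ProjectEmployeeCount
)
SELECT project_id, budget, total_employees, budget_per_employee
FROM BudgetRatio;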
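The budget-per-employee logic can be checked end to end with an in-memory SQLite database from Python. This is a sketch under assumptions: the sample rows are invented, the LEFT JOIN and GROUP BY are my reading of the intended query, and the two CTEs are collapsed into one for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (project_id INTEGER, budget REAL);
CREATE TABLE employees (employee_id INTEGER, project_id INTEGER);
-- Hypothetical data: project 3 has no employees
INSERT INTO projects VALUES (1, 1000), (2, 900), (3, 500);
INSERT INTO employees VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 2);
""")

query = """
WITH ProjectEmployeeCount AS (
    SELECT p.project_id, p.budget, COUNT(e.employee_id) AS total_employees
    FROM projects p
    LEFT JOIN employees e ON e.project_id = p.project_id
    GROUP BY p.project_id, p.budget
)
SELECT project_id,
       total_employees,
       CASE WHEN total_employees > 0 THEN budget / total_employees
            ELSE 0
       END AS budget_per_employee
FROM ProjectEmployeeCount
ORDER BY project_id
"""
rows = conn.execute(query).fetchall()

for project_id, total_employees, budget_per_employee in rows:
    print(project_id, total_employees, budget_per_employee)
```

Project 3 still appears in the result with zero employees and a ratio of 0, which is the case the CASE expression guards against; an inner join would silently drop it.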