📘 Study Note: Data Analysis Learning Path (Excel + SQL)
Part 1: Excel for Data Analysis
1.1 Introduction to Excel
What is Excel & Why It’s Important in Data Analysis
Excel Interface & Navigation
Best Practices for Structuring Data
1.2 Core Excel Functions
Basic Arithmetic & Logical Functions (SUM, IF, AND, OR)
Text Functions (LEFT, RIGHT, CONCATENATE, TEXT)
Date & Time Functions
1.3 Data Cleaning Techniques
Removing Duplicates
Text-to-Columns
Find & Replace
Data Validation
Dealing with Missing Data
1.4 Data Analysis with Excel
Sorting & Filtering
Conditional Formatting
Lookup Functions (VLOOKUP, HLOOKUP, XLOOKUP, INDEX-MATCH)
Named Ranges
1.5 Pivot Tables & Charts
Creating Pivot Tables
Slicers & Timelines
Grouping and Aggregating Data
Creating Visual Charts (Bar, Pie, Line, etc.)
1.6 Advanced Excel Features
Power Query (Getting & Transforming Data)
Power Pivot & Data Models
Introduction to DAX
What-If Analysis (Goal Seek, Scenario Manager)
1.7 Excel Tips, Tricks, and Shortcuts
Keyboard Shortcuts
Dynamic Ranges
Error Handling with Functions (IFERROR, ISNA)
Part 2: SQL for Data Analysis
2.1 Introduction to SQL
What is SQL & Its Role in Data Analysis
Database Concepts (Tables, Schemas, Relationships)
SQL Syntax Overview
2.2 Basic SQL Queries
SELECT, FROM, WHERE
Filtering with Logical Operators
Sorting with ORDER BY
2.3 Working with Multiple Tables
JOIN Types (INNER, LEFT, RIGHT, FULL)
Aliasing & Subqueries
Using UNION, INTERSECT, EXCEPT
2.4 Data Aggregation & Grouping
GROUP BY & HAVING
Aggregate Functions (COUNT, SUM, AVG, MAX, MIN)
Window Functions Overview (ROW_NUMBER, RANK, PARTITION BY)
2.5 Data Cleaning in SQL
Handling NULLs
Data Type Conversion
String Manipulation (TRIM, SUBSTR, REPLACE)
Removing Duplicates with DISTINCT
2.6 Advanced SQL Techniques
CTEs (Common Table Expressions)
Temporary Tables
Nested Queries
Using CASE for Conditional Logic
2.7 SQL Optimization Basics
Indexing
Query Execution Plans
Best Practices for Writing Efficient Queries
🟦 1.1 Introduction to Excel
🔹 What is Excel & Why It’s Important in Data Analysis
Excel is a spreadsheet application developed by Microsoft that allows users to store, organize, and
analyze data using rows and columns. It’s been the OG sidekick of analysts, accountants, and data
nerds for decades.
Why Excel matters in data analysis:
Accessibility: Almost everyone has access to Excel. No need for server setups or
complicated installations.
Ease of Use: Drag, drop, click. You can analyze and visualize without writing a single line
of code.
Versatility: You can do everything from basic budgeting to complex data modeling.
Integrations: Works well with other Microsoft tools and even connects with databases and
APIs via Power Query.
💡 Think of Excel as the Swiss Army knife of data tools—basic enough for daily use, powerful enough
for serious data crunching.
🔹 Excel Interface & Navigation
Before we start writing formulas like a mad scientist, let’s look around the lab:
Key Elements of the Interface:
Ribbon: The top menu with tabs like Home, Insert, Formulas, Data, etc. Each tab contains
tools grouped by functionality.
Worksheet Tabs: Each sheet at the bottom is like a new page in your workbook. You can
rename, color-code, and rearrange them.
Columns and Rows: Columns are labeled A-Z, then AA, AB, etc.; rows are numbered. The
intersection is a cell (like B2).
Formula Bar: Where you see or write the formula or value of the selected cell.
Name Box: Shows the cell address or lets you name a range (e.g., SalesData).
Quick Access Toolbar: Customizable mini toolbar for frequently used actions (save, undo,
redo, etc.).
Navigation Tips:
Ctrl + Arrow Keys: Jump to the edge of data ranges.
Ctrl + Space / Shift + Space: Select column/row.
Use F2 to quickly edit a selected cell.
🔍 Learning to navigate like a pro saves HOURS in the long run.
🔹 Best Practices for Structuring Data
How you structure your data can make or break your analysis. Excel doesn’t judge—but future-you
will.
Do:
Use clear headers in the first row.
Keep one data type per column (e.g., don’t mix dates and text).
Avoid blank rows and columns within your dataset.
Maintain a tabular format: one record per row.
Don’t:
Merge cells inside data tables (they’re evil during sorting/filtering).
Use formatting alone (e.g., colors or bold text) to signify meaning. Use helper columns or
notes.
Hardcode calculations directly into your dataset. Use formulas in separate columns.
Pro Tip: Use Excel Tables (Ctrl + T) to:
Automatically apply formatting
Make formulas dynamic
Enable easier sorting/filtering
📐 Neat structure = cleaner analysis = fewer headaches.
Alright! Buckle up, we're diving into the fun zone of formulas—where Excel goes from "digital
notebook" to "data wizard." 🧙♂️
🟦 1.2 Core Excel Functions
Understanding functions is like learning spells in Hogwarts—each one does something magical to
your data. Here’s a breakdown of the key categories and what each function brings to your analysis
arsenal.
🔹 Basic Arithmetic & Logical Functions
These are your bread-and-butter for crunching numbers and making decisions.
➤ SUM, AVERAGE, MIN, MAX
Purpose: Simple math on data ranges.
Examples:
o =SUM(A1:A10) – Adds all values from A1 to A10.
o =AVERAGE(B2:B6) – Calculates mean of values in B2 to B6.
➤ IF
Purpose: Logical branching (think “if this, then that”).
Example:
=IF(C2>100, "High", "Low") – Returns “High” if C2 > 100, else “Low”.
➤ AND, OR, NOT
Purpose: Combine multiple logical checks.
Examples:
o =AND(A1>10, B1<5) – True if both conditions met.
o =OR(A1="Yes", A1="Y") – True if either condition is met.
o =NOT(A1=0) – True if A1 is not 0.
🧠 These functions let you write “rules” for your data. Super useful in dashboards or cleaning tasks.
🔹 Text Functions
Text data isn’t always clean. These functions help you slice, dice, and fix it.
➤ LEFT, RIGHT, MID
Purpose: Extract parts of a text string.
Examples:
o =LEFT(A1, 3) – Gets the first 3 characters from cell A1.
o =RIGHT(A1, 4) – Gets the last 4 characters.
o =MID(A1, 2, 3) – Starts from 2nd character and grabs next 3.
➤ CONCATENATE / TEXTJOIN
Purpose: Combine multiple text values.
Examples:
o =CONCATENATE(A1, " ", B1) → A1 & B1 with space.
o =TEXTJOIN(", ", TRUE, A1:A5) – Joins all non-empty cells with commas.
➤ TEXT
Purpose: Format numbers or dates as text.
Example:
=TEXT(A1, "MM/DD/YYYY") – Converts a date into a readable format.
🔹 Date & Time Functions
Time to master... time. (Yes, Excel does time travel.)
➤ TODAY(), NOW()
Purpose: Get the current date/time.
Examples:
o =TODAY() – Returns today’s date (refreshes whenever the workbook recalculates, so it stays current).
o =NOW() – Includes current time too.
➤ DATEDIF, EDATE, EOMONTH
Examples:
o =DATEDIF(A1, B1, "D") – Difference in days between dates.
o =EDATE(A1, 3) – Adds 3 months to a date.
o =EOMONTH(A1, 0) – Returns the last day of the month.
➤ YEAR, MONTH, DAY, WEEKDAY
Purpose: Pull components from dates.
Example:
=WEEKDAY(A1) – Tells you which day of the week it is (Sunday = 1).
⏳ These are key in trend analysis, billing cycles, timelines, etc.
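To make the date arithmetic concrete, here is a rough Python sketch of what DATEDIF (with "D"), EDATE, EOMONTH, and WEEKDAY compute. The helper names (days_between, edate, eomonth, excel_weekday) and the invoice date are made up for illustration; this is an approximation of the behavior described above, not Excel's actual implementation.

```python
import calendar
from datetime import date


def days_between(start: date, end: date) -> int:
    """Roughly =DATEDIF(start, end, "D"): whole days from start to end."""
    return (end - start).days


def edate(d: date, months: int) -> date:
    """Roughly =EDATE(d, months): shift by whole months, clamping the day."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def eomonth(d: date, months: int) -> date:
    """Roughly =EOMONTH(d, months): last day of the month, `months` away."""
    shifted = edate(d.replace(day=1), months)
    last_day = calendar.monthrange(shifted.year, shifted.month)[1]
    return shifted.replace(day=last_day)


def excel_weekday(d: date) -> int:
    """Roughly =WEEKDAY(d) with the default numbering: Sunday = 1 ... Saturday = 7."""
    return d.isoweekday() % 7 + 1


# Hypothetical invoice dated 31 Jan 2024 (a Wednesday in a leap year)
invoice_date = date(2024, 1, 31)
due_date = edate(invoice_date, 1)                   # one month later, clamped to Feb 29
days_to_due = days_between(invoice_date, due_date)  # days between the two dates
month_end = eomonth(invoice_date, 0)                # last day of the invoice month
weekday_number = excel_weekday(invoice_date)        # Wednesday in Sunday = 1 numbering
```

Note how adding one month to January 31 lands on the last day of February rather than spilling into March; that clamping is the part people most often get wrong when doing month math by hand.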
🧩 Summary: When to Use What
Task | Use This Function(s)
Sum or average values | SUM, AVERAGE
Apply logic to data | IF, AND, OR, NOT
Clean up text fields | LEFT, RIGHT, CONCATENATE
Work with dates | TODAY(), DATEDIF, YEAR
Format numbers or text | TEXT, TEXTJOIN
Let’s sweep through the data jungle and come out squeaky clean! 🧼
Welcome to Excel’s backstage crew—cleaning the data before it hits the spotlight.
🟦 1.3 Data Cleaning Techniques
“Garbage in, garbage out” is the golden rule in data analysis. You can’t build solid insights on a
wonky foundation, and this section is all about turning chaos into clarity.
🔹 Removing Duplicates
Sometimes Excel gets a little too enthusiastic and gives you double (or triple) the data. Time to
declutter.
➤ How to:
Select your data range.
Go to Data tab → Click Remove Duplicates.
Choose the columns that should be unique (e.g., ID, Email).
💡 Pro Tip:
Always keep a backup sheet before removing duplicates. Excel deletes them without mercy.
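If it helps to see the logic behind Remove Duplicates, here is a small Python sketch of the same idea: keep the first row for each combination of key columns and drop later repeats. The function name remove_duplicates and the sample contacts are hypothetical, and the comparison here is exact-match, which may not mirror every detail of Excel's behavior.

```python
def remove_duplicates(rows, key_columns):
    """Keep the first row for each unique combination of key_columns."""
    seen = set()
    unique_rows = []
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical sample data: the third row repeats ID + Email from the first
contacts = [
    {"ID": 1, "Email": "ana@example.com", "City": "Pune"},
    {"ID": 2, "Email": "raj@example.com", "City": "Delhi"},
    {"ID": 1, "Email": "ana@example.com", "City": "Mumbai"},
]

# The original list is left untouched, which plays the role of the backup sheet
deduped = remove_duplicates(contacts, ["ID", "Email"])
```

The choice of key columns matters: with ID and Email as the key, the Mumbai row is treated as a duplicate even though its City differs.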
🔹 Text-to-Columns
Got values crammed into a single column like a data sardine can? Split them up.
➤ Use Case:
A full name like “John Smith” or an address like “New York, NY” in one cell.
➤ How to:
Select the column → Go to Data tab → Click Text to Columns.
Choose Delimited (for commas, spaces, etc.) or Fixed Width (if spacing is consistent).
🔍 It’s like Excel surgery—precision splitting!
🔹 Find & Replace
Perfect for hunting down errors, unwanted characters, or standardizing values.
➤ Example Use Cases:
Replace “N/A” with blank.
Remove extra spaces or symbols.
Standardize “USA” vs “U.S.A.” vs “United States”.
➤ How to:
Use Ctrl + H or go to Home tab → Find & Select → Replace.
You can even use wildcards like * and ? for advanced searches.
🔹 Data Validation
Keep junk out before it ever enters your clean data sanctuary.
➤ Purpose:
Restrict inputs to valid entries only (e.g., numbers between 1–100, specific categories,
dates only).
➤ How to:
Select the range → Data tab → Data Validation.
Choose type (Whole Number, List, Date, etc.).
Optional: Add Input Message and Error Alert.
💡 Example:
Create a dropdown list of allowed values with List validation:
=Data!A2:A10 or manually enter values separated by commas.
🔹 Dealing with Missing Data
Blank cells? Missing dates? Incomplete values? Excel has a toolkit for that too.
➤ Common Strategies:
Highlight blanks with Conditional Formatting.
Use formulas to fill gaps:
o =IF(ISBLANK(A1), "Unknown", A1)
o =IFERROR(A1/B1, 0) – Avoids #DIV/0!
Fill down/up missing data manually or with Power Query.
➤ Bonus Tip:
Use Go To Special (F5 → Special → Blanks) to select and fill in gaps.
💬 Missing data isn’t scary—it’s just shy. You just need to know how to talk to it.
🧹 Summary: Excel’s Cleaning Arsenal
Problem | Fix It With
Duplicate records | Remove Duplicates
Data jammed in one column | Text to Columns
Messy or inconsistent text | Find & Replace
Wrong or invalid entries | Data Validation
Blank or missing data | IF/ISBLANK/IFERROR formulas
Alright, now that our data is clean and presentable (it's basically ready for a LinkedIn profile pic 😎),
it’s time to analyze the heck out of it!
🟦 1.4 Data Analysis with Excel
This is where Excel shows off—transforming rows and columns into insights, patterns, and "a-ha!"
moments.
🔹 Sorting & Filtering
➤ Sorting
Used to arrange data in meaningful order—alphabetically, numerically, by date, or even custom
lists.
How to:
Select your data → Go to Data tab → Use Sort A-Z, Z-A, or Custom Sort.
You can sort on multiple levels (e.g., sort by Region, then by Sales within each Region).
➤ Filtering
Great for zooming in on the data you care about.
How to:
Go to Data tab → Click Filter.
Use dropdowns in header row to:
o Select specific items
o Filter by conditions (e.g., “greater than 100”)
o Use text filters or date filters
🧃 Think of filters as the juice extractor—you only keep the good stuff.
🔹 Conditional Formatting
Make important values pop without lifting a finger (okay, maybe one finger).
➤ Use Cases:
Highlight top/bottom performers
Show duplicates
Visualize trends (data bars, color scales, icons)
➤ How to:
Select range → Go to Home tab → Click Conditional Formatting
Choose:
o Highlight Cell Rules (e.g., Greater Than, Text Contains)
o Data Bars
o Color Scales
o Icon Sets
💡 Tip:
Use custom formulas in Conditional Formatting for next-level logic.
Example:
=AND(A1>100, B1="East") to highlight big sales in the East region.
🔹 Lookup Functions
Your data has relatives in other sheets? Time for Excel's version of matchmaking.
➤ VLOOKUP (Vertical Lookup)
Classic. Looks up a value in the leftmost column and returns a value in the same row
from a specified column.
=VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])
➤ HLOOKUP (Horizontal Lookup)
Same idea but looks across rows.
➤ INDEX-MATCH Combo
Power move. More flexible than VLOOKUP.
=INDEX(column_to_return, MATCH(lookup_value, column_to_search, 0))
Works in any direction and doesn’t break if columns shift.
➤ XLOOKUP (Excel 365+)
The newer, shinier sibling that replaces VLOOKUP & HLOOKUP.
=XLOOKUP(lookup_value, lookup_array, return_array, [if_not_found], [match_mode], [search_mode])
🎯 Lookups are how you "connect the dots" across messy tables.
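The INDEX-MATCH pattern is easier to trust once you see it as two plain steps: find the position, then read the value at that position. Below is a minimal Python sketch of that idea. The function index_match, its if_not_found argument, and the product data are all hypothetical, and the fallback value is borrowed from XLOOKUP's optional argument rather than from INDEX-MATCH itself.

```python
def index_match(lookup_value, column_to_search, column_to_return, if_not_found=None):
    """Mimic =INDEX(column_to_return, MATCH(lookup_value, column_to_search, 0))."""
    try:
        # MATCH(..., 0): position of the first exact match
        position = column_to_search.index(lookup_value)
    except ValueError:
        # No match: Excel would show #N/A; this sketch returns a fallback instead
        return if_not_found
    # INDEX: value at the same position in the return column
    return column_to_return[position]


# Hypothetical sample data: the column we return is independent of the
# column we search, so it does not matter which one sits on the left
product_names = ["Laptop", "Phone", "Monitor"]
product_ids = ["P-101", "P-102", "P-103"]

found = index_match("P-102", product_ids, product_names)
missing = index_match("P-999", product_ids, product_names, if_not_found="Not found")
```

Because the search column and the return column are passed separately, nothing breaks when columns are reordered, which is the flexibility the notes credit to INDEX-MATCH over VLOOKUP.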
🔹 Named Ranges
Giving a range a name makes formulas cleaner, readable, and less error-prone.
➤ How to:
Select a range → Click in the Name Box (left of the formula bar) → Type a name like
SalesData
Now you can use =SUM(SalesData) instead of =SUM(A2:A100)
💡 Uses:
Cleaner formulas
Reusable across sheets
Easier debugging and documentation
🧠 Summary: The Data Analysis Toolkit
Task | Tool or Function
Arrange or sift data | Sort & Filter
Highlight important values | Conditional Formatting
Find related data across sheets | Lookup functions
Create dynamic formulas | Named Ranges
Nice! Let’s step into Excel’s command center—where raw data transforms into dashboards and
insights with zero drama.
🟦 1.5 Pivot Tables & Charts
Pivot Tables are like the Sherlock Holmes of Excel—smart, curious, and always uncovering patterns
you didn’t know existed. Pair them with charts? Now you’ve got storytelling power. 📈
🔹 Creating Pivot Tables
Pivot Tables let you summarize, analyze, explore, and present data in just a few clicks. No
formulas needed!
➤ How to:
1. Select your data range (preferably in a table).
2. Go to Insert tab → Click Pivot Table.
3. Choose where to place it (new sheet is usually cleaner).
4. Drag fields into:
o Rows – What you want to group by (e.g., Region, Product)
o Columns – Optional; creates a matrix
o Values – What you want to measure (e.g., Sales, Count)
o Filters – Add filtering power at the top
💡 Example:
You have a sales table with Region, Product, and Revenue.
Drag Region to Rows, Product to Columns, and Revenue to Values = Boom! Instant
breakdown.
📊 Pivot Tables = analysis with elegance.
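Under the hood, the Region / Product / Revenue example is just "group, then sum." Here is a short Python sketch of that summarization, assuming a tiny made-up sales list; it only illustrates the arithmetic a Pivot Table does for you, not how Excel implements it.

```python
from collections import defaultdict

# Hypothetical sales records: (Region, Product, Revenue)
sales = [
    ("East", "Laptop", 1200),
    ("East", "Phone", 800),
    ("West", "Laptop", 1500),
    ("East", "Laptop", 300),
]

# Rows = Region, Columns = Product, Values = SUM of Revenue
totals = defaultdict(lambda: defaultdict(int))
for region, product, revenue in sales:
    totals[region][product] += revenue

# Plain nested dict: {region: {product: total_revenue}}
pivot_table = {region: dict(products) for region, products in totals.items()}

# Row totals, like the Grand Total column of a Pivot Table
region_totals = {region: sum(products.values()) for region, products in pivot_table.items()}
```

Each (Region, Product) pair becomes one cell of the pivot, and pairs with no sales (West / Phone here) simply have no entry, which Excel shows as a blank cell.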
🔹 Slicers & Timelines
Tired of dropdown filters? Let your data breathe with visual filters.
➤ Slicers
Add clickable filters to your pivot tables.
Go to Insert → Slicer → Select fields like Region or Category.
Click a button → Pivot instantly updates.
➤ Timelines
Ideal for date-based data.
Insert → Timeline → Choose a Date field.
Scroll through months, quarters, or years visually.
🧁 Slicers are like cupcakes—colorful, clickable, and everyone loves them.
🔹 Grouping and Aggregating Data
Need to group sales by month or ages into buckets? Excel’s got your back.
➤ Grouping:
Right-click any Row Label (like a date) → Group
Options:
o Dates → Group by Months, Quarters, Years
o Numbers → Create ranges (e.g., Age 0–10, 11–20)
➤ Aggregating:
Default aggregation is SUM for numeric fields (COUNT for text fields), but you can change it:
o Click a value → Value Field Settings
o Choose: SUM, COUNT, AVERAGE, MAX, MIN, etc.
🧠 Grouping = smarter summaries with less effort.
🔹 Creating Visual Charts
Okay, tables are nice. But charts tell the story at a glance.
➤ Common Chart Types:
Bar & Column Charts: Compare categories (e.g., Sales by Region)
Line Charts: Show trends over time
Pie Charts: Show parts of a whole (use sparingly—no one likes a pie chart army)
Combo Charts: Mix line + bar for layered insight
➤ How to:
Select your data or Pivot Table.
Go to Insert tab → Choose a chart type.
Use the Chart Design tab to customize styles, titles, labels, etc.
➤ Pro Tips:
Always label your axes and legends clearly.
Avoid 3D charts unless you're pitching to a 1997 PowerPoint crowd.
Use sparklines (tiny, in-cell charts) for quick visual cues in tables.
🧠 Summary: The Visual Power-Up
Tool | Purpose
Pivot Table | Summarize and explore large datasets
Slicers & Timelines | Add interactivity and easy filtering
Grouping | Create logical buckets (dates, ranges)
Pivot Charts | Visualize patterns and trends
Sparklines | Mini visuals inside cells
Welcome to the Excel Deep Web—where things get powerful, nerdy, and weirdly fun. You’ve
mastered the basics, now let’s enter the Advanced Excel Dimension 🔮
🟦 1.6 Advanced Excel Features
This is the part where Excel goes from spreadsheet to supercomputer. We’re talking automation,
advanced data modeling, and features that practically run your reports while you nap.
🔹 Power Query (Get & Transform)
Power Query is like Excel’s data butler—it gets your data, cleans it up, and serves it piping hot
every time.
➤ What It Does:
Import data from Excel files, CSVs, databases, web, and more
Clean, reshape, and transform data (without writing formulas)
Automate repetitive data-cleaning steps
➤ How to Use:
Go to Data tab → Get Data
Choose your source (Excel workbook, Web, SQL Server, etc.)
Power Query Editor opens—this is where the magic happens.
🔄 Common Tasks:
Remove columns
Split text
Unpivot columns
Fill missing data
Merge tables
Filter rows
🧽 Once set up, you just click “Refresh” and all steps re-run. Excel on autopilot.
🔹 Power Pivot & Data Modeling
Imagine Excel on steroids—with relational tables, no VLOOKUPs, and insane performance.
➤ What is Power Pivot?
An add-in that lets you:
Load massive data models
Create relationships between tables (like SQL joins)
Write advanced DAX (Data Analysis Expressions) formulas
➤ How to Use:
Enable Power Pivot (File → Options → Add-ins → Manage COM Add-ins)
Load data via Power Query or directly into the Power Pivot model
Define relationships between tables (like connecting Sales to Customers)
💥 Benefits:
No need for flattening data
Lightning-fast calculations
Combine data from multiple sources cleanly
🧠 You’re not just using Excel now—you’re building a mini database inside Excel.
🔹 Introduction to DAX (Data Analysis Expressions)
DAX is Power Pivot’s brain. It’s like Excel formulas... but deeper, smarter, and more expressive.
➤ Common DAX Functions:
CALCULATE() – Modify filters and context
SUMX() – Row-by-row calculations across tables
RELATED() – Pull in fields from related tables
ALL() – Ignore filters in a calculation
➤ Example:
Total Sales := SUM(Sales[Amount])
High Value Sales := CALCULATE([Total Sales], Sales[Amount] > 1000)
💡 Use Cases:
Advanced KPIs (e.g., YoY growth, moving averages)
Dynamic metrics with slicers
Time intelligence (e.g., “last quarter vs this quarter”)
🔄 DAX doesn’t just add—it thinks.
🔹 Dynamic Arrays & New Functions (Excel 365+)
Excel 365 brought some juicy new features that make formulas explode with efficiency.
➤ New Functions:
FILTER() – Filters a range by criteria.
SORT(), UNIQUE() – Do what they say, and they do it well.
SEQUENCE() – Generate number sequences.
XLOOKUP() – The VLOOKUP-killer.
➤ Example:
=UNIQUE(FILTER(A2:A100, B2:B100="Yes"))
Gives you a unique list of names where column B = Yes.
🧪 These functions eliminate helper columns and make your formulas tight and powerful.
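For readers who think in code, the UNIQUE(FILTER(...)) combo reads as "keep the rows that pass the test, then drop repeats while preserving order." A minimal Python sketch with made-up names and flags follows; the exact-match comparison on "Yes" is a simplification and may not match Excel's text comparison in every case.

```python
# Hypothetical columns: A holds names, B holds a Yes/No flag
names = ["Asha", "Ben", "Asha", "Chen", "Ben", "Dina"]
flags = ["Yes", "No", "Yes", "Yes", "Yes", "No"]

# FILTER(A2:A100, B2:B100="Yes"): keep names whose flag is "Yes"
filtered = [name for name, flag in zip(names, flags) if flag == "Yes"]

# UNIQUE(...): drop repeats, keeping the first occurrence of each name
unique_names = list(dict.fromkeys(filtered))
```

The inner step runs first, just like the nested formula: filtering happens before de-duplication, so a name only appears if at least one of its rows is flagged "Yes".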
🧠 Summary: Excel’s Elite Arsenal
Feature | What It’s Great For
Power Query | Data import, cleanup, and automation
Power Pivot | Relational data models, no need for VLOOKUP
DAX | Custom KPIs, metrics, and calculations
Dynamic Arrays | Cleaner, flexible, spillable formulas
XLOOKUP & Friends | Modern, readable replacements for old formulas
Up next? We make a big shift:
🎯 SQL time—where we leave spreadsheets behind and talk to databases directly. You ready to
unlock Part 2: SQL for Data Analysis?
Heck yes! 💻✨ Time to trade in our spreadsheet cells for some SQL spells. Welcome to…
🟥 Part 2: SQL for Data Analysis
SQL is the language of data. Not a language—the language. You use it to query, filter, join, and
summon insights from the vast lands of databases.
Let’s first set up your roadmap—the Table of Contents for SQL mastery:
📘 SQL Learning Path – Table of Contents
🔹 2.1 Introduction to SQL
What is SQL and why analysts need it
Common SQL dialects (MySQL, PostgreSQL, SQL Server, SQLite)
Basic syntax overview
🔹 2.2 SELECT Statements & Filtering
SELECT, FROM, WHERE
Logical operators (AND, OR, NOT)
IN, BETWEEN, LIKE, IS NULL
🔹 2.3 Sorting & Limiting Results
ORDER BY
LIMIT and TOP clauses
🔹 2.4 Working with Aggregate Functions
COUNT, SUM, AVG, MIN, MAX
GROUP BY and HAVING
🔹 2.5 Joins and Combining Data
INNER JOIN
LEFT JOIN, RIGHT JOIN
FULL OUTER JOIN
SELF JOIN, CROSS JOIN
🔹 2.6 Subqueries and CTEs (Common Table Expressions)
Scalar subqueries
IN and EXISTS subqueries
WITH clauses and recursive CTEs
🔹 2.7 Window Functions
ROW_NUMBER, RANK, DENSE_RANK
LEAD, LAG
PARTITION BY and ORDER BY inside OVER()
🔹 2.8 Data Cleaning with SQL
Handling NULLs
Using CASE statements
Type conversions (CAST, CONVERT)
Removing duplicates with DISTINCT and ROW_NUMBER()
🔹 2.9 Creating & Managing Tables (Optional for Analysts)
CREATE, ALTER, DROP
INSERT, UPDATE, DELETE
Basic constraints (PK, FK, NOT NULL)
🔹 2.10 SQL Best Practices & Optimization
Writing readable SQL
Using aliases and comments
Understanding query execution order
Index basics (for performance)
Let’s start rolling with the first one.
Boom! Let’s light the torches and enter the temple of SQL.
Here comes the first chapter of your SQL scroll…
🟥 2.1 Introduction to SQL
✨ “Structured Query Language: The only time it’s OK to yell at a database and expect useful
answers.”
🔹 What is SQL?
SQL (Structured Query Language) is a standardized language used to interact with relational
databases.
With SQL, you can:
Query data (SELECT)
Filter and sort data
Combine multiple tables (JOIN)
Clean and transform raw data
Create and manage databases and tables
💡 Real-World Use Case:
You're an analyst with a sales database.
Want to know which product sold the most last month? SQL.
Want to compare sales across regions? SQL.
Want to pretend you’re in The Matrix while querying data? SQL.
🔹 Why Data Analysts Need SQL
SQL is foundational in almost every data analysis role. Here’s why:
Skill | Role in Analysis
Retrieve only needed data | Avoids downloading massive files
Filter & group data | Helps find trends and patterns
Combine tables | Joins product, sales, and customer data
Build dashboards | SQL powers tools like Power BI/Tableau
Automate reports | Scheduled queries pull fresh data daily
⚠️ Being an Excel pro is cool. Being a SQL+Excel ninja? Hired.
🔹 Common SQL Dialects
SQL is like English: the grammar is mostly the same, but there are accents and slang. Each RDBMS
(Relational Database Management System) has its own "flavor."
Dialect | Used In | Notes
MySQL | Web apps, WordPress | Popular, lightweight, fast
PostgreSQL | Analytics-heavy setups | Powerful, open-source, supports CTEs
SQL Server | Microsoft ecosystems | Friendly with Excel, Power BI
SQLite | Apps, mobile databases | File-based, simple, portable
Oracle SQL | Enterprise environments | Complex, very powerful
🧪 Most SQL knowledge transfers between dialects with minor tweaks.
🔹 SQL Syntax Basics (Anatomy of a Query)
Here’s your first “Hello, World!” in SQL:
SELECT first_name, last_name
FROM employees
WHERE department = 'Sales';
Let’s break it down:
Part | What It Does
SELECT | What columns you want to see
FROM | What table you’re pulling from
WHERE | Optional filter to narrow down rows
📌 Key Notes:
SQL keywords are not case-sensitive (SELECT = select)
But table and column names may be case-sensitive in some databases
Each command ends with a ; (except in some tools that don’t require it)
🔹 Bonus: Comments in SQL
SQL lets you write comments (which are ignored by the database) to make your queries readable:
-- This gets all employees in the Sales department
SELECT * FROM employees
WHERE department = 'Sales';
🧙 Think of comments as breadcrumbs for your future self or your confused coworkers.
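If you want to try the query without installing a database server, one option is Python's built-in sqlite3 module with an in-memory database. The sketch below assumes the SQLite dialect and a tiny made-up employees table; other engines may differ in small ways. The keywords are deliberately written in mixed case to show that keyword case does not matter.

```python
import sqlite3

# In-memory SQLite database: nothing is written to disk
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, department TEXT)")

# Hypothetical sample rows
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Asha", "Rao", "Sales"),
        ("Ben", "Cole", "HR"),
        ("Chen", "Li", "Sales"),
    ],
)

# Same query as above, with a comment and mixed-case keywords
query = (
    "-- This gets all employees in the Sales department\n"
    "select first_name, last_name\n"
    "FROM employees\n"
    "where department = 'Sales';"
)
sales_team = conn.execute(query).fetchall()  # list of (first_name, last_name) tuples
```

Each row comes back as a tuple in the same column order as the SELECT list, and the comment line is ignored by the database.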
🧠 Summary: What You Just Unlocked
Concept | Why It Matters
SQL Overview | Know what you’re working with
Analyst Use Cases | Understand SQL’s real-world power
Dialects | Get comfy with MySQL/Postgres/SQLite
Syntax Structure | Start reading and writing basic queries
Comments | Make your SQL human-friendly
Absolutely! Let’s go full throttle now—examples, analogies, and real-world spice included.
Time to flex some SQL muscles with…
🟥 2.2 SELECT Statements & Filtering
✨ “Give me the data… but only the good bits.”
The SELECT statement is your primary weapon in SQL—like a Google search bar for databases.
Filtering? That’s your magnifying glass. 🔍
🔹 SELECT, FROM, WHERE — The Holy Trinity of SQL
🧠 Basic Syntax:
SELECT column1, column2
FROM table_name
WHERE condition;
🔍 Example 1: Get names and salaries of employees in the Marketing department
SELECT first_name, last_name, salary
FROM employees
WHERE department = 'Marketing';
This will return only the people in the Marketing team.
🔹 Filtering with Operators: =, >, <, <=, >=, <>
⚔️ Example 2: Who earns more than ₹70,000?
SELECT first_name, last_name, salary
FROM employees
WHERE salary > 70000;
<> means "not equal to" (same as != in some SQL flavors).
💡 Combo Example:
SELECT *
FROM employees
WHERE department = 'Sales' AND salary < 50000;
Gets underpaid salespeople you probably need to talk to HR about. 😅
🔹 Logical Operators: AND, OR, NOT
AND = All conditions must be true
OR = At least one condition must be true
NOT = Opposite of what you specify
Example 3: Get all Sales or Marketing employees
SELECT first_name, department
FROM employees
WHERE department = 'Sales' OR department = 'Marketing';
Example 4: Get employees not in HR
SELECT first_name, department
FROM employees
WHERE NOT department = 'HR';
🔹 IN, BETWEEN, LIKE, IS NULL
🔘 IN – Great for lists
SELECT first_name, department
FROM employees
WHERE department IN ('Sales', 'Marketing', 'Support');
🎯 BETWEEN – For ranges
SELECT first_name, salary
FROM employees
WHERE salary BETWEEN 40000 AND 60000;
🔎 LIKE – For wildcards (% = any characters, _ = single character)
SELECT first_name
FROM employees
WHERE first_name LIKE 'A%'; -- names starting with A
LIKE '%son' → ends with "son"
LIKE '_a%' → second letter is "a"
🫥 IS NULL / IS NOT NULL
SELECT first_name, manager_id
FROM employees
WHERE manager_id IS NULL;
Gets folks without managers—maybe they are the managers. Or just forgot to fill that field.
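To see these filters side by side, here is a runnable sketch using Python's built-in sqlite3 module. The table contents and the names() helper are made up for illustration, and the behavior shown is SQLite's; in particular, whether LIKE is case-sensitive varies between engines. One row has a NULL department on purpose, to show a common surprise with NOT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (first_name TEXT, department TEXT, salary INTEGER, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        ("Asha", "Sales", 45000, 1),
        ("Alan", "Marketing", 72000, 1),
        ("Jason", "Support", 58000, 2),
        ("Mara", "HR", 40000, None),   # no manager on record
        ("Dave", None, 61000, 2),      # department is missing (NULL)
    ],
)


def names(where_clause):
    """Return first names matching a fixed, trusted WHERE clause, sorted A-Z."""
    sql = f"SELECT first_name FROM employees WHERE {where_clause} ORDER BY first_name"
    return [row[0] for row in conn.execute(sql)]


in_list = names("department IN ('Sales', 'Marketing', 'Support')")
in_range = names("salary BETWEEN 40000 AND 60000")   # both endpoints are included
starts_with_a = names("first_name LIKE 'A%'")
ends_with_son = names("first_name LIKE '%son'")
second_letter_a = names("first_name LIKE '_a%'")
no_manager = names("manager_id IS NULL")
not_hr = names("NOT department = 'HR'")              # rows with NULL department drop out too
```

Two details worth noticing: BETWEEN includes both boundary values (Mara at exactly 40000 is returned), and NOT department = 'HR' silently skips Dave, because a comparison against NULL is neither true nor false. Add OR department IS NULL if those rows should be kept.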
🔂 Real-World Analogy Time
SQL Concept | Real-Life Equivalent
SELECT * | "Show me the whole buffet"
WHERE | "But only if it's vegetarian"
AND | "…and under ₹500"
IN | "Choose from these 3 dishes only"
LIKE | "Anything that starts with 'Paneer'"
IS NULL | "Where no dessert is selected yet"
🧠 Summary Cheat Sheet
Clause | Purpose
SELECT | What data to show
FROM | Where to get it
WHERE | Filter it down
AND/OR | Combine filter conditions
IN | Check if value is in a list
LIKE | Match patterns using wildcards
IS NULL | Find missing data
Wanna jump into 2.3 Sorting & Limiting Results next and learn how to cherry-pick the top rows
or order your results like a boss? 📊🍒
Let’s gooo! 🔥 Now that you can find the data, it’s time to sort it, rank it, and grab only what
matters—like a data connoisseur. 🧑🍳
🟥 2.3 Sorting & Limiting Results
✨ “You can have all the rows you want… but not in that chaotic order.”
🔹 ORDER BY – The Organizer-in-Chief
Used to sort your query results by one or more columns.
🧠 Basic Syntax:
SELECT column1, column2
FROM table_name
ORDER BY column1 [ASC|DESC];
ASC = Ascending (default)
DESC = Descending
📌 Example 1: Sort employees by salary (lowest to highest)
SELECT first_name, salary
FROM employees
ORDER BY salary ASC;
🔁 Example 2: Now flip it—highest paid first
SELECT first_name, salary
FROM employees
ORDER BY salary DESC;
🎯 Example 3: Sort by multiple columns
SELECT first_name, department, salary
FROM employees
ORDER BY department ASC, salary DESC;
Within each department, top earners show up first.
🔹 LIMIT – The Bouncer of Your Data Party
Use LIMIT to control how many rows are returned.
Example 4: Show only the first 5 employees
SELECT *
FROM employees
LIMIT 5;
Useful for previewing data or testing queries without grabbing the whole galaxy.
🔹 Bonus for SQL Server users:
SQL Server doesn’t use LIMIT. Instead, it uses TOP.
🎩 SQL Server Version:
SELECT TOP 5 *
FROM employees;
Yeah, Microsoft likes to do its own thing sometimes. 🙄
🔹 LIMIT with ORDER BY – Like "Top N" Queries
🥇 Example 5: Top 3 highest-paid employees
SELECT first_name, salary
FROM employees
ORDER BY salary DESC
LIMIT 3;
SQL Pro Tip: Always use ORDER BY before LIMIT—otherwise, you’re just grabbing any 3 rows, not
the most relevant ones.
🔹 OFFSET – Skip the first N rows (for pagination)
Use this when you're building things like "Page 2" of results.
📄 Example 6: Skip the first 10, then show next 5
SELECT first_name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5 OFFSET 10;
Think of it like saying: "Start at row 11, then give me 5 results."
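Pagination is mostly arithmetic: the offset is (page number - 1) × page size. The sketch below wires that into a small helper using Python's sqlite3 module. The page() function and the six sample employees are hypothetical, and the LIMIT ... OFFSET syntax shown is the SQLite/MySQL/PostgreSQL form, not SQL Server's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [
        ("Asha", 45000),
        ("Ben", 82000),
        ("Chen", 61000),
        ("Dina", 97000),
        ("Eli", 53000),
        ("Farah", 70000),
    ],
)

# Top-N: sort first, then cap the row count
top_three = [
    row[0]
    for row in conn.execute(
        "SELECT first_name FROM employees ORDER BY salary DESC LIMIT 3"
    )
]


def page(page_number, page_size):
    """Return one page of names, highest salary first (pages start at 1)."""
    offset = (page_number - 1) * page_size
    rows = conn.execute(
        "SELECT first_name FROM employees ORDER BY salary DESC LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return [row[0] for row in rows]


first_page = page(1, 2)
second_page = page(2, 2)
past_the_end = page(4, 2)  # offset beyond the last row gives an empty page
```

The ORDER BY is what makes the pages stable: without it, the database is free to return rows in any order, and "page 2" could overlap with "page 1."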
🧠 Summary Snapshot
Clause | What It Does
ORDER BY | Sorts results by one or more columns
ASC / DESC | Sets sort direction
LIMIT | Caps number of rows (MySQL/Postgres)
TOP | SQL Server equivalent of LIMIT
OFFSET | Skips a set number of rows
So far, you can:
Ask the data a question (SELECT)
Filter it down (WHERE)
Sort it (ORDER BY)
And focus on the cream of the crop (LIMIT)
Ready to crunch some numbers next? 🧮
Next up: 2.4 Aggregate Functions – COUNT, SUM, GROUP BY, and more magic!
Ohhh yes! 🪄 Time to become a number wizard.
You’re now entering the realm of Aggregate Functions—where entire rows bow down and
become one tidy result.
🟥 2.4 Working with Aggregate Functions
✨ “When one row just isn’t enough… roll them all up into glorious summaries.”
Aggregate functions calculate a single value from a group of rows—like totals, averages, or
counts. These are crucial for reporting, dashboards, and those sweet one-liner insights.
🔹 The Big 5 Aggregate Functions
Function | What It Does
COUNT() | Counts rows or values
SUM() | Adds up numeric values
AVG() | Calculates the average
MIN() | Finds the lowest value
MAX() | Finds the highest value
🧮 Examples of Each:
📌 Example 1: Total number of employees
SELECT COUNT(*) AS total_employees
FROM employees;
COUNT(*) counts all rows, including duplicates and NULLs.
📌 Example 2: Total salary payout
SELECT SUM(salary) AS total_payroll
FROM employees;
📌 Example 3: Average salary
SELECT AVG(salary) AS avg_salary
FROM employees;
📌 Example 4: What’s the biggest paycheck?
SELECT MAX(salary) AS highest_salary
FROM employees;
📌 Example 5: What’s the smallest paycheck?
SELECT MIN(salary) AS lowest_salary
FROM employees;
🔹 GROUP BY – The Real MVP
Want to calculate aggregates per category? That’s GROUP BY magic.
📊 Example 6: Average salary per department
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;
Each group = one department. SQL computes the average for each.
🔹 HAVING – WHERE's cooler cousin for groups
WHERE filters rows before grouping
HAVING filters groups after aggregation
🔍 Example 7: Show only departments where avg salary > 60,000
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 60000;
🔹 Combo Example: Count employees per department, but only if more than 5 employees
SELECT department, COUNT(*) AS headcount
FROM employees
GROUP BY department
HAVING COUNT(*) > 5;
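The WHERE-versus-HAVING distinction is easiest to see when both appear in one query. Here is a runnable sketch with Python's sqlite3 module and a small made-up employees table (the salary thresholds are picked only to make the result easy to check by hand).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [
        ("Sales", 45000),
        ("Sales", 55000),
        ("Sales", 50000),
        ("Marketing", 72000),
        ("Marketing", 68000),
        ("HR", 40000),
    ],
)

# HAVING filters whole groups after AVG() has been computed
high_avg_departments = conn.execute(
    """
    SELECT department, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department
    HAVING AVG(salary) > 60000
    """
).fetchall()

# WHERE removes rows first, then GROUP BY counts what is left,
# then HAVING keeps only groups with at least 2 remaining rows
well_paid_headcount = conn.execute(
    """
    SELECT department, COUNT(*) AS headcount
    FROM employees
    WHERE salary >= 50000
    GROUP BY department
    HAVING COUNT(*) >= 2
    ORDER BY department
    """
).fetchall()
```

In the second query, Sales has three employees but only two pass the WHERE filter, so its headcount is 2, not 3. HR disappears entirely because its single row is filtered out before any grouping happens.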
🤯 Real-World Analogy
Concept | Like…
COUNT(*) | Headcount at a wedding
SUM() | Total bill at a restaurant
AVG() | Class average in math exam
GROUP BY | Grouping people by table no.
HAVING | "Only tables with >5 people"
🧠 Summary Cheat Sheet
Clause | Purpose
COUNT() | Number of rows
SUM() | Total of a numeric column
AVG() | Average of a numeric column
GROUP BY | Splits rows into groups for analysis
HAVING | Filters groups based on aggregate results
You now have the power to summarize data like a reporting master.
Next up is the lifeblood of all data work—💥 2.5 Joins and Combining Data. Wanna dive in?
Yessir! It’s time to connect the dots. 🧩
Because real-world data? It's never in one nice, neat table—it's scattered. You, brave analyst,
must JOIN it.
🟥 2.5 Joins and Combining Data
✨ “Because no table is an island.”
Joins are how you bring together data from multiple tables. This is the real deal—where SQL starts
feeling like data Legos.
🔹 Why Do We Need Joins?
Say you have:
A customers table with names and customer IDs
An orders table with order details but only the customer ID
To know who placed which order, you gotta JOIN them.
This is where SQL shines brighter than an Excel VLOOKUP on caffeine.
🧷 Types of Joins (with Example Case)
Let’s say you have two tables:
🧾 customers
customer_id | name
1 | Alice
2 | Bob
3 | Charlie
📦 orders
order_id | customer_id | product
101 | 1 | Laptop
102 | 2 | Phone
103 | 4 | Monitor
🔸 1. INNER JOIN – "Only where there’s a match"
SELECT customers.name, orders.product
FROM customers
INNER JOIN orders ON customers.customer_id = orders.customer_id;
✅ Returns:
name | product
Alice | Laptop
Bob | Phone
Ignores Charlie (no order) and order 103 (no matching customer).
Like inviting only friends who RSVP'd “yes.”
🔸 2. LEFT JOIN – "All from the left, matches from the right"
SELECT customers.name, orders.product
FROM customers
LEFT JOIN orders ON customers.customer_id = orders.customer_id;
✅ Returns:
name | product
Alice | Laptop
Bob | Phone
Charlie | NULL
Keeps all customers—even if they haven’t ordered yet.
Like sending invites to everyone—even if they ghosted.
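To see the INNER vs. LEFT difference on the sample tables above, here is a small sketch using Python's built-in sqlite3 module (an assumption of convenience; any SQL engine behaves the same way for these two joins). The last query is a common follow-up, not something from the text above: a LEFT JOIN filtered on a NULL right side finds customers with no orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, product TEXT);
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie');
    INSERT INTO orders VALUES (101, 1, 'Laptop'), (102, 2, 'Phone'), (103, 4, 'Monitor');
    """
)

join_sql = """
    SELECT c.name, o.product
    FROM customers c
    {kind} JOIN orders o ON c.customer_id = o.customer_id
    ORDER BY c.customer_id
"""
inner_rows = conn.execute(join_sql.format(kind="INNER")).fetchall()
left_rows = conn.execute(join_sql.format(kind="LEFT")).fetchall()

# Anti-join: keep only left rows that found no match on the right.
no_orders = conn.execute(
    """
    SELECT c.name
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    WHERE o.order_id IS NULL
    """
).fetchall()

print(inner_rows)  # Alice and Bob only
print(left_rows)   # Charlie appears with None (SQL NULL) as the product
print(no_orders)   # just Charlie
```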
🔸 3. RIGHT JOIN – "All from the right, matches from the left"
SELECT customers.name, orders.product
FROM customers
RIGHT JOIN orders ON customers.customer_id = orders.customer_id;
✅ Returns:
name | product
Alice | Laptop
Bob | Phone
NULL | Monitor
Keeps all orders—even if the customer record is missing.
“Someone placed this order… we just don’t know who. 😬”
🔸 4. FULL OUTER JOIN – "Everyone’s invited!"
SELECT customers.name, orders.product
FROM customers
FULL OUTER JOIN orders ON customers.customer_id = orders.customer_id;
✅ Returns:
name | product
Alice | Laptop
Bob | Phone
Charlie | NULL
NULL | Monitor
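The complete picture. Perfect for audits and sanity checks.
Note: MySQL has no FULL OUTER JOIN. There, combine a LEFT JOIN and a RIGHT JOIN with UNION to get the same result.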
🔸 5. SELF JOIN – When a table needs to date itself 😏
Used to compare rows in the same table.
Example: Find employees and their managers (both in employees table)
SELECT e.name AS employee, m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id;
This links each employee to their manager by joining the table to itself.
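A quick way to see why the self join above uses LEFT JOIN rather than INNER JOIN: the person at the top has no manager row to match. This is a sketch with Python's sqlite3 module; the four employees are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER, name TEXT, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        (1, "Dana", None),  # hypothetical top-level manager, no manager_id
        (2, "Evan", 1),
        (3, "Fay", 1),
        (4, "Gus", 2),
    ],
)

self_join = """
    SELECT e.name AS employee, m.name AS manager
    FROM employees e
    {kind} JOIN employees m ON e.manager_id = m.employee_id
    ORDER BY e.employee_id
"""
with_left = conn.execute(self_join.format(kind="LEFT")).fetchall()
with_inner = conn.execute(self_join.format(kind="INNER")).fetchall()

print(with_left)   # Dana is kept, with None as her manager
print(with_inner)  # Dana disappears because nothing matches her NULL manager_id
```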
🔸 6. CROSS JOIN – All combinations (careful!)
SELECT a.name, b.product
FROM customers a
CROSS JOIN orders b;
Multiplies every row in one table with every row in the other.
Alice gets all products. Bob gets all products. Madness ensues. 😵
🧠 Summary Table
JOIN Type | What It Does
INNER JOIN | Only matching rows in both tables
LEFT JOIN | All rows from left + matches from right
RIGHT JOIN | All rows from right + matches from left
FULL OUTER JOIN | All rows from both, matched where possible
SELF JOIN | Join a table to itself
CROSS JOIN | Every combination of rows
You're now officially a Data Alchemist—turning scattered tables into golden insights. 🧪✨
Ready to dive into 2.6 Subqueries and CTEs? Think SQL inside SQL—Inception, but for analysts.
Oh yesss! 🌀 Welcome to SQL Inception: Queries within queries—where things get a little meta and
a lot powerful. 😎
This is where the pros hang out. Let’s lift the hood.
🟥 2.6 Subqueries and CTEs (Common Table Expressions)
✨ “When a regular query just isn’t clever enough.”
Both subqueries and CTEs help you build complex logic step-by-step, keeping your main query
clean and readable.
🔹 1. Subqueries (Nested Queries)
🧠 A query inside parentheses used as a filter, column, or table.
📌 Example 1: Get employees earning more than the average salary
SELECT first_name, salary
FROM employees
WHERE salary > (
SELECT AVG(salary)
FROM employees
);
That inner query (SELECT AVG(salary)) runs first, then the outer query uses its result.
🔸 Types of Subqueries:
A. Scalar Subquery – Returns a single value
Used in WHERE, SELECT, or even SET.
SELECT name, salary,
(SELECT AVG(salary) FROM employees) AS avg_salary
FROM employees;
B. Table Subquery – Returns a table-like result
SELECT *
FROM (
SELECT name, salary FROM employees WHERE department = 'Sales'
) AS sales_employees;
C. Correlated Subquery – Depends on outer query
SELECT e1.name, e1.salary
FROM employees e1
WHERE salary > (
SELECT AVG(e2.salary)
FROM employees e2
WHERE e1.department = e2.department
);
Compares each employee’s salary to the average of their department. Think of it as a row-by-row
comparison.
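The correlated subquery can be traced on a handful of rows. The sketch below runs it with Python's sqlite3 module on made-up employees (Sales averages 60000, IT averages 90000), so only people strictly above their own department's average should come back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ana", "Sales", 50000),   # hypothetical sample rows
        ("Ben", "Sales", 70000),
        ("Cy", "IT", 80000),
        ("Dee", "IT", 100000),
        ("Eli", "IT", 90000),      # exactly at the IT average, so not "above"
    ],
)

above_dept_avg = conn.execute(
    """
    SELECT e1.name, e1.salary
    FROM employees e1
    WHERE e1.salary > (
        SELECT AVG(e2.salary)
        FROM employees e2
        WHERE e2.department = e1.department  -- re-evaluated for each outer row
    )
    ORDER BY e1.name
    """
).fetchall()
print(above_dept_avg)
```

Note that Cy (80000) would pass a company-wide average check (78000) but fails the per-department one, which is the whole point of correlating the inner query.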
🔹 2. CTEs (Common Table Expressions)
💡 Use WITH to define a temporary result set that you can refer to later in your query. Great for
clean, readable logic.
📌 Syntax:
WITH temp_table AS (
    SELECT department, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department
)
SELECT *
FROM temp_table
WHERE avg_salary > 60000;
Same result as a subquery—but much easier to read, reuse, and debug.
🔸 Example: Rank employees by salary in each department
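WITH ranked_employees AS (
    SELECT name, department, salary,
        RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
    FROM employees
)
SELECT *
FROM ranked_employees
WHERE salary_rank = 1;
(The alias is salary_rank rather than rank because RANK is a reserved word in some databases, such as MySQL 8.)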
🔥 Find top earners per department. CTEs + window functions = 🔥🔥🔥
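One detail worth trying out: RANK() gives tied salaries the same rank, so filtering on rank 1 can return more than one person per department. A minimal sketch with Python's sqlite3 module follows (window functions need SQLite 3.25 or newer; the employees are hypothetical, and the rank column is named salary_rank here to stay clear of reserved words).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ana", "Sales", 70000),   # Ana and Ben tie for the top Sales salary
        ("Ben", "Sales", 70000),
        ("Cy", "IT", 80000),
        ("Dee", "IT", 100000),
    ],
)

top_earners = conn.execute(
    """
    WITH ranked_employees AS (
        SELECT name, department, salary,
               RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
        FROM employees
    )
    SELECT name, department, salary
    FROM ranked_employees
    WHERE salary_rank = 1
    ORDER BY department, name
    """
).fetchall()
print(top_earners)
```

If exactly one row per department is required, ROW_NUMBER() with a tie-breaking ORDER BY column is the usual substitute.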
🧠 When to Use What?
Use Case | Go With...
One-off logic, simple filter | Subquery
Reusable or multi-step logic | CTE
Complex ranking / partitioning | CTE
Needs readability & maintainability | CTE
🎯 Real-Life Analogy
SQL Feature | Feels Like…
Subquery | Asking a side question before your main answer
CTE | Giving your helper a nickname and reusing them
Correlated | Question that changes based on who you ask
🔍 TL;DR Summary
Feature | Used For | Syntax Highlight
Subquery | One-time inline queries | Inside (), used in WHERE, SELECT
CTE (WITH) | Cleaner multi-step queries | Starts with WITH
Correlated | Row-by-row dynamic filtering | Uses outer query columns inside
You’re now fluent in subqueries and CTEs.
Ready for the final boss of SQL basics? 💥
Next up: 2.7 Data Types, Constraints & Table Design – the foundation of good databases. Want
to roll into it?
Let’s finish strong! 💪
Because even the fanciest query is useless if the data structure is a hot mess.
Welcome to SQL architecture school—where we build databases right, from the ground up. 🧱
🟥 2.7 Data Types, Constraints & Table Design
✨ “Structure your data like a palace, not a tent.”
This part is all about designing strong tables: choosing the correct data types, applying the
right constraints, and avoiding future chaos.
Let’s get nerdy (in a clean, efficient, beautiful way). 🤓
🔹 1. Data Types – What kind of data goes in each column?
Category | Data Type | Use For
Numbers | INT, DECIMAL, FLOAT | Whole numbers, money, decimals
Text | CHAR, VARCHAR, TEXT | Names, comments, strings
Date & Time | DATE, TIME, DATETIME, TIMESTAMP | Tracking events, logs
Boolean | BOOLEAN or TINYINT(1) | True/False, Yes/No
Miscellaneous | BLOB, JSON, ENUM | Files, JSON data, status enums
📌 Example:
CREATE TABLE products (
product_id INT PRIMARY KEY,
name VARCHAR(100),
price DECIMAL(10,2),
available BOOLEAN,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
🔹 2. Constraints – Rules to keep your data honest
🧷 Key Types:
Constraint | What It Does
PRIMARY KEY | Unique & required ID for each row
FOREIGN KEY | Ensures relationship with another table
UNIQUE | No duplicate values in a column
NOT NULL | Field must have a value
DEFAULT | Auto-fill if value isn’t provided
CHECK | Restrict values based on condition
📌 Example with constraints:
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE,
salary DECIMAL(8,2) CHECK (salary > 0),
department_id INT,
FOREIGN KEY (department_id) REFERENCES departments(department_id)
);
Enforces:
Unique email ✅
No null names ✅
Positive salary ✅
Valid department ID ✅
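The four guarantees listed above can be exercised directly. The sketch below is an approximation in SQLite via Python's sqlite3 module, not the exact DDL from the example: SQLite types are looser (TEXT and NUMERIC stand in for VARCHAR and DECIMAL), foreign keys must be switched on per connection, and the departments table and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        email TEXT UNIQUE,
        salary NUMERIC CHECK (salary > 0),
        department_id INTEGER,
        FOREIGN KEY (department_id) REFERENCES departments(department_id)
    );
    INSERT INTO departments VALUES (10, 'Sales');
    INSERT INTO employees VALUES (1, 'Alice', 'alice@example.com', 65000, 10);
    """
)

# Each row below breaks exactly one rule.
bad_rows = {
    "duplicate email": (2, "Bob", "alice@example.com", 50000, 10),
    "missing name": (3, None, "c@example.com", 50000, 10),
    "non-positive salary": (4, "Dan", "d@example.com", -5, 10),
    "unknown department": (5, "Eve", "e@example.com", 50000, 99),
}
rejected = []
for label, row in bad_rows.items():
    try:
        conn.execute("INSERT INTO employees VALUES (?, ?, ?, ?, ?)", row)
    except sqlite3.IntegrityError as exc:
        rejected.append(label)
        print(f"{label}: rejected ({exc})")

employee_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(employee_count)  # only the valid Alice row was stored
```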
🔹 3. Keys – Identifiers and Connectors
🔑 PRIMARY KEY
Uniquely identifies a row
Required in every well-designed table
🔗 FOREIGN KEY
Connects one table to another
Enforces referential integrity
🔹 4. Table Design Best Practices
Principle | Explanation
Use proper data types | Don’t store numbers as text, dates as strings
Normalize your tables | Split into logical, related tables
Index key columns | Speeds up search & joins
Avoid redundancy | Don’t store the same info twice
Think about scalability | Don’t just solve today’s problem; plan ahead
🧠 Real-World Analogy
SQL Concept | Real Life Equivalent
Data type | The kind of folder you use
Primary key | File ID on a folder
Foreign key | Cross-reference to another file
Constraint | Rules like "don’t fold this doc"
Normalization | Organizing your closet properly
🧾 Mini Challenge: Spot the Problem
Which of these is a BAD design choice?
CREATE TABLE sales (
customer_name TEXT,
product_id TEXT,
sale_date VARCHAR(10),
price TEXT
);
🛑 Problems:
sale_date should be DATE
price should be a number, not TEXT
No keys = no structure
No constraints = no guardrails
➡️This is a recipe for a data disaster!
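One concrete symptom of storing price as TEXT: comparisons become alphabetical, so '900' sorts above '2500'. The sketch below shows this in SQLite through Python's sqlite3 module; the two single-column tables are hypothetical stand-ins, and other engines differ in detail but compare text the same character-by-character way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_text (price TEXT)")    # the problematic design
conn.execute("CREATE TABLE sales_num (price NUMERIC)")  # a numeric column instead

prices = [("900",), ("2500",), ("1800",)]
for table in ("sales_text", "sales_num"):
    conn.executemany(f"INSERT INTO {table} VALUES (?)", prices)

# Text comparison goes character by character: '9' > '2', so '900' "wins".
text_max = conn.execute("SELECT MAX(price) FROM sales_text").fetchone()[0]
text_sorted = [r[0] for r in conn.execute("SELECT price FROM sales_text ORDER BY price")]

# The numeric column compares by value.
num_max = conn.execute("SELECT MAX(price) FROM sales_num").fetchone()[0]

print(text_max, text_sorted)
print(num_max)
```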
🎯 TL;DR Summary
Part | What It Does
Data types | Define the kind of data per column
Constraints | Set rules to protect your data
Keys | Identify and relate your data
Table design | Structure that ensures performance
With this, you now know how to:
Query like a wizard 🧙
Join like a data architect
Aggregate like a boss 💼
And design databases that don’t implode 💣
🔥 Final stretch: Want to move on to Part 3: Real-World Projects and Case Studies next?
Let’s gooo! 🏁
You've mastered the tools. Now it's time to use them in battle. ⚔️
Welcome to…
🟩 Part 3: Real-World Projects and Case Studies
✨ “Theory is nice, but can it handle messy real-world data at 3AM before a deadline?”
This section puts your SQL + Excel powers to the test in practical, project-style scenarios, just
like what you’d see in analytics jobs or freelance gigs.
📚 Table of Contents for Part 3
🔸 3.1 E-Commerce Sales Dashboard
Data cleaning in Excel
Pivot Tables + VLOOKUP
SQL joins for customer/order matching
Monthly revenue trend using GROUP BY
🔸 3.2 HR Analytics Dashboard
Employee data analysis in Excel
Using RANK() to find top performers in SQL
Attrition analysis with subqueries
Salary distribution using Excel charts
🔸 3.3 Marketing Campaign Performance
Combining campaign, spend, and sales data
ROI calculation in Excel
SQL to filter active vs. inactive campaigns
A/B testing summary using conditional logic
🔸 3.4 Inventory Management System
Stock level checks with conditional formatting in Excel
SQL queries to find low-stock/high-sales items
Daily inventory delta tracking
Forecasting trends in Excel
🔸 3.5 Customer Segmentation
Using Excel for K-means clustering (or basic rule-based segmentation)
SQL for customer lifetime value (CLV) using SUM() + GROUP BY
Tagging customers as gold/silver/bronze
Visualizing it all in Excel dashboards
🔸 3.6 End-to-End Data Project Case Study
Business question ➝ Data extraction ➝ Cleaning ➝ Analysis ➝ Reporting
Combining Excel + SQL from start to finish
Creating a PDF/report/presentation for stakeholders
Each of these will walk through:
✅ Problem Statement
📦 Datasets (structure & sample data)
🔧 Tools Used (Excel + SQL breakdown)
🧠 Step-by-step solutions
📊 Final output / dashboard / report idea
So, wanna start with 3.1 E-Commerce Sales Dashboard and build it like a real analyst?
Ohhh yes! 🛒💸 Let’s build that E-Commerce Sales Dashboard like a data-driven beast.
This is classic territory: orders, customers, revenue, and insights that make stakeholders say
“Wait… you did this in Excel and SQL?”
🟨 3.1 E-Commerce Sales Dashboard
✅ Problem Statement:
“We want to understand sales performance by month, region, and product category. Also, which
customers are buying the most, and which products are underperforming?”
You’ll be the data detective here.
📦 Dataset Structure
Let’s assume you have 3 main tables:
🧾 customers
customer_id | name | region
1 | Alice | North
2 | Bob | West
3 | Charlie | South
📦 orders
order_id | customer_id | order_date | total_amount
101 | 1 | 2024-11-01 | 2500
102 | 2 | 2024-11-05 | 1800
103 | 1 | 2024-11-07 | 900
order_items
item_id | order_id | product_name | category | quantity | price
1 | 101 | Laptop | Electronics | 1 | 2500
2 | 102 | Smartphone | Electronics | 2 | 900
3 | 103 | Mouse | Accessories | 3 | 300
🔧 Tools Breakdown
Task | Tool
Data cleaning, dashboard build | Excel
Joins, grouping, filters | SQL
Visualization | Excel Charts & PivotTables
🧠 Step-by-Step Breakdown
🔹 Step 1: Load and Clean the Data (Excel)
Remove duplicates
Convert dates into proper format
Use TRIM() to clean any rogue white spaces
Ensure price × quantity = total (cross-check)
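The price × quantity cross-check can also be done once the data is in a database. Below is a rough sketch with Python's sqlite3 module using the sample orders and order_items rows; order 104 and its Keyboard item are hypothetical additions, included only so the check has a mismatch to find.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER,
                         order_date TEXT, total_amount INTEGER);
    CREATE TABLE order_items (item_id INTEGER, order_id INTEGER, product_name TEXT,
                              category TEXT, quantity INTEGER, price INTEGER);
    INSERT INTO orders VALUES
        (101, 1, '2024-11-01', 2500),
        (102, 2, '2024-11-05', 1800),
        (103, 1, '2024-11-07', 900),
        (104, 3, '2024-11-09', 500);   -- hypothetical order with a wrong total
    INSERT INTO order_items VALUES
        (1, 101, 'Laptop', 'Electronics', 1, 2500),
        (2, 102, 'Smartphone', 'Electronics', 2, 900),
        (3, 103, 'Mouse', 'Accessories', 3, 300),
        (4, 104, 'Keyboard', 'Accessories', 1, 450);  -- hypothetical item
    """
)

# Orders whose line items do not add up to the recorded total.
mismatches = conn.execute(
    """
    SELECT o.order_id, o.total_amount, SUM(i.quantity * i.price) AS items_total
    FROM orders o
    JOIN order_items i ON i.order_id = o.order_id
    GROUP BY o.order_id, o.total_amount
    HAVING SUM(i.quantity * i.price) <> o.total_amount
    """
).fetchall()
print(mismatches)
```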
🔹 Step 2: Analyze Monthly Revenue (SQL)
SELECT
DATE_FORMAT(order_date, '%Y-%m') AS order_month,
SUM(total_amount) AS monthly_revenue
FROM orders
GROUP BY order_month
ORDER BY order_month;
🎯 Output:
order_month | monthly_revenue
2024-11 | 5200
Now plot this in Excel as a line chart to visualize trends 📈
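DATE_FORMAT is MySQL syntax, so the monthly query needs a small change elsewhere. As an illustration, here is the same grouping in SQLite via Python's sqlite3 module, where strftime plays that role; the December order (104) is a hypothetical extra row added so that more than one month shows up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, "
    "order_date TEXT, total_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (101, 1, "2024-11-01", 2500),
        (102, 2, "2024-11-05", 1800),
        (103, 1, "2024-11-07", 900),
        (104, 2, "2024-12-02", 1200),  # hypothetical row, not in the sample data
    ],
)

monthly_revenue = conn.execute(
    """
    SELECT strftime('%Y-%m', order_date) AS order_month,  -- SQLite's DATE_FORMAT analogue
           SUM(total_amount) AS monthly_revenue
    FROM orders
    GROUP BY order_month
    ORDER BY order_month
    """
).fetchall()
print(monthly_revenue)
```

This relies on order_date being stored as ISO-style YYYY-MM-DD text, which is another reason to fix date formats during cleaning.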
🔹 Step 3: Top Customers by Spend
SELECT c.name, SUM(o.total_amount) AS total_spent
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.name
ORDER BY total_spent DESC
LIMIT 5;
🎯 Make this a bar chart: Top 5 customers by total spend.
🔹 Step 4: Revenue by Region
SELECT c.region, SUM(o.total_amount) AS region_revenue
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.region
ORDER BY region_revenue DESC;
➡️Excel donut chart or map visual (if you're fancy).
🔹 Step 5: Product Category Sales (via order_items)
SELECT category, SUM(quantity * price) AS category_sales
FROM order_items
GROUP BY category
ORDER BY category_sales DESC;
🎯 See which category dominates—use a stacked column chart or treemap.
🔹 Step 6: Highlight Underperforming Products
SELECT product_name, SUM(quantity) AS total_units_sold
FROM order_items
GROUP BY product_name
HAVING SUM(quantity) < 5
ORDER BY total_units_sold ASC;
➡️Flag these in Excel with conditional formatting (maybe red text? 🚨).
🎁 Bonus: Excel Dashboard Elements
Feature | Excel Technique Used
Dynamic date slicer | Timeline + PivotTable
KPI summary | SUMIFS, COUNTIFS, AVERAGEIF
Region-wise buttons | Slicers
Visuals | Line + Bar + Pie + Cards
Add a title, logos, filters, and boom—you’ve got yourself a client-ready dashboard. 🎨📊
📊 Final Output Ideas
Revenue trend over time
Top customers table
Sales by category and region
Highlight of poor-performing products
Clean, interactive Excel dashboard
You’ve just simulated a real e-commerce analyst task.
Ready to roll into 3.2: HR Analytics Dashboard next and show HR who the real data wizard is? 🧙
Let's go, HR data—you're about to get analyzed! 📊💥
Welcome to the HR Analytics Dashboard, where we turn employee data into stories of
productivity, attrition, and paychecks that finally make sense.
🟨 3.2 HR Analytics Dashboard
✅ Problem Statement:
“We need a dashboard that tracks employee performance, attrition trends, and salary distribution.
Also, identify top performers and departments with high turnover.”
Get ready to flex with Excel AND SQL on a people-powered mission! 🧑💼⚙️
📦 Dataset Structure
👥 employees
employee_id | name | department | join_date | salary | status
1 | Alice | Sales | 2022-01-15 | 65000 | Active
2 | Bob | Marketing | 2021-05-20 | 72000 | Resigned
3 | Charlie | HR | 2023-03-01 | 55000 | Active
📈 performance_reviews
review_id | employee_id | review_date | score
101 | 1 | 2023-12-01 | 4.5
102 | 2 | 2023-12-01 | 3.2
103 | 3 | 2023-12-01 | 4.8
🧠 Step-by-Step Breakdown
🔹 Step 1: Clean & Prep in Excel
Convert join_date to proper date format
Use DATEDIF() to calculate tenure
Create a column: Status = IF(status="Resigned", "Left", "Active")
Remove duplicates and blank rows
🔹 Step 2: Average Salary by Department (SQL)
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
ORDER BY avg_salary DESC;
🎯 Visualize in Excel as a horizontal bar chart
🔹 Step 3: Top Performers
SELECT e.name, e.department, p.score
FROM employees e
JOIN performance_reviews p ON e.employee_id = p.employee_id
WHERE p.score >= 4.5
ORDER BY p.score DESC;
➡️Create a leaderboard table in Excel: Name | Dept | Score
🔹 Step 4: Attrition Rate by Department
SELECT
department,
COUNT(CASE WHEN status = 'Resigned' THEN 1 END) * 1.0 /
COUNT(*) AS attrition_rate
FROM employees
GROUP BY department;
🎯 Show as a heatmap or conditional color bar in Excel
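To check the attrition formula by hand, here is a small run in SQLite through Python's sqlite3 module. The first three rows follow the sample employees table; Dina is a hypothetical fourth employee added so that one department has a rate other than 0 or 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER, name TEXT, department TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "Alice", "Sales", "Active"),
        (2, "Bob", "Marketing", "Resigned"),
        (3, "Charlie", "HR", "Active"),
        (4, "Dina", "Sales", "Resigned"),  # hypothetical extra row
    ],
)

attrition = conn.execute(
    """
    SELECT department,
           COUNT(CASE WHEN status = 'Resigned' THEN 1 END) * 1.0 / COUNT(*) AS attrition_rate
    FROM employees
    GROUP BY department
    ORDER BY department
    """
).fetchall()
print(attrition)
```

COUNT skips the NULL that CASE returns for non-resigned rows, and multiplying by 1.0 forces decimal division so the rate is not truncated to 0.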
🔹 Step 5: Tenure Distribution
Use Excel formula:
=DATEDIF(join_date, TODAY(), "Y")
Then create a histogram in Excel:
0–1 yrs
1–3 yrs
3–5 yrs
5+ yrs
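For anyone prototyping the tenure step outside Excel, the sketch below mirrors DATEDIF(..., "Y") in Python. Two assumptions are mine rather than the text's: the buckets are treated as half-open (exactly 1 year falls in "1-3 yrs"), and a fixed as-of date of 2024-12-31 replaces TODAY() so the output is reproducible.

```python
from datetime import date


def full_years(start: date, as_of: date) -> int:
    """Completed years between two dates, like DATEDIF(start, as_of, "Y")."""
    before_anniversary = (as_of.month, as_of.day) < (start.month, start.day)
    return as_of.year - start.year - int(before_anniversary)


def tenure_bucket(years: int) -> str:
    """Map completed years to a histogram bucket (half-open ranges, an assumption)."""
    if years < 1:
        return "0-1 yrs"
    if years < 3:
        return "1-3 yrs"
    if years < 5:
        return "3-5 yrs"
    return "5+ yrs"


AS_OF = date(2024, 12, 31)  # hypothetical reporting date standing in for TODAY()
join_dates = {
    "Alice": date(2022, 1, 15),
    "Bob": date(2021, 5, 20),
    "Charlie": date(2023, 3, 1),
}
buckets = {name: tenure_bucket(full_years(d, AS_OF)) for name, d in join_dates.items()}
print(buckets)
```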
🔹 Step 6: Salary Band Segmentation
In Excel, use IFS() to bucket:
=IFS(
salary < 40000, "Low",
salary < 70000, "Mid",
TRUE, "High"
)
Make a pie chart of Low / Mid / High salary bands
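The same banding logic, written as a short Python sketch for comparison. Like IFS, the checks run top to bottom and the first match wins, so 40000 lands in "Mid" and 70000 in "High". The salaries are the three from the sample employees table.

```python
from collections import Counter


def salary_band(salary: float) -> str:
    """First matching condition wins, mirroring the order of the IFS arguments."""
    if salary < 40000:
        return "Low"
    if salary < 70000:
        return "Mid"
    return "High"


salaries = {"Alice": 65000, "Bob": 72000, "Charlie": 55000}
bands = {name: salary_band(s) for name, s in salaries.items()}
band_counts = Counter(bands.values())  # the numbers a pie chart would plot

print(bands)
print(band_counts)
```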
🔹 Step 7: Performance vs. Salary
SELECT e.name, e.salary, p.score
FROM employees e
JOIN performance_reviews p ON e.employee_id = p.employee_id;
Plot a scatter plot in Excel:
X-axis: Performance Score
Y-axis: Salary
➡️See if higher performers are paid fairly 👀
🧰 Bonus Excel Formulas
Goal | Formula
Years at company | =DATEDIF(join_date, TODAY(), "Y")
Count of resigned employees | =COUNTIF(status_range, "Resigned")
Top N performers | Use LARGE() + INDEX/MATCH
Attrition % per department | =COUNTIFS(dept, "X", status, "Resigned")/COUNTIF(dept, "X")
📊 Final Dashboard Components
Section | Chart Type
Avg Salary by Dept | Bar Chart
Attrition by Dept | Conditional Format Table
Performance Leaders | Ranked Table
Tenure Distribution | Histogram
Salary Band Pie Chart | Donut or Pie
Salary vs Score | Scatter Plot
Add slicers for:
Department
Status
Year of joining
Make it clean, simple, and boss-worthy. 🧼✨
With this dashboard, you can walk into any HR meeting and drop insights like a mic 🎤
Wanna keep the train rolling into 3.3: Marketing Campaign Performance?
Boom! 💣 Time to hit the marketing team with data so good, it’ll make their click-through rates
jealous.
Welcome to…
🟨 3.3 Marketing Campaign Performance Analysis
“We ran campaigns across channels—emails, ads, social media—but we don’t know which ones
worked. Help us figure it out.”
No problem, boss. We’re about to:
Tie campaigns to conversions,
Track spend vs. revenue,
And expose which campaigns are heroes… and which are just burning money. 🔥
📦 Dataset Overview
📣 campaigns
campaign_id | name | channel | start_date | end_date | budget
1 | Spring Sale | Email | 2024-03-01 | 2024-03-31 | 5000
2 | Summer Boost | Facebook Ads | 2024-06-01 | 2024-06-30 | 8000
📊 campaign_results
result_id | campaign_id | impressions | clicks | conversions | revenue
101 | 1 | 10000 | 1200 | 150 | 7500
102 | 2 | 25000 | 2200 | 300 | 14500
🧠 Step-by-Step Analysis
🔹 Step 1: Clean in Excel
Check for nulls and zeros in impressions/clicks
Convert start_date and end_date to date format
Calculate campaign duration:
=end_date - start_date
🔹 Step 2: Calculate Key Metrics (SQL)
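SELECT
    c.name,
    c.channel,
    r.impressions,
    r.clicks,
    r.conversions,
    r.revenue,
    c.budget,
    ROUND((r.clicks * 1.0 / NULLIF(r.impressions, 0)) * 100, 2) AS ctr,
    ROUND((r.conversions * 1.0 / NULLIF(r.clicks, 0)) * 100, 2) AS conversion_rate,
    ROUND((r.revenue - c.budget), 2) AS net_profit,
    ROUND((r.revenue - c.budget) * 1.0 / NULLIF(c.budget, 0), 2) AS roi
FROM campaigns c
JOIN campaign_results r ON c.campaign_id = r.campaign_id;
🎯 Metrics you’ll get:
CTR = Click-Through Rate (clicks / impressions, as a percentage)
Conversion Rate (conversions / clicks, as a percentage)
Net Profit (revenue - budget)
ROI (Return on Investment) = net profit / budget, so it turns negative when a campaign loses money
NULLIF(x, 0) turns a zero denominator into NULL, so the result is NULL rather than an error in engines that raise on division by zero.
🔹 Step 3: Rank Campaigns by ROI
SELECT
    c.name,
    ROUND((r.revenue - c.budget) * 1.0 / NULLIF(c.budget, 0), 2) AS roi
FROM campaigns c
JOIN campaign_results r ON c.campaign_id = r.campaign_id
ORDER BY roi DESC;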
📈 Use this in Excel for a bar chart or leaderboard table.
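The Step 2 and Step 3 numbers can be cross-checked by hand on the two sample campaigns. In this Python sketch, ROI is taken as net profit divided by budget, which is the usual definition; revenue divided by budget is reported separately as revenue_per_spend (a name chosen here) so the two are not confused. The helper names are hypothetical.

```python
campaigns = [
    {"name": "Spring Sale", "budget": 5000, "impressions": 10000,
     "clicks": 1200, "conversions": 150, "revenue": 7500},
    {"name": "Summer Boost", "budget": 8000, "impressions": 25000,
     "clicks": 2200, "conversions": 300, "revenue": 14500},
]


def pct(numerator, denominator):
    """Percentage rounded to 2 places, or None when the denominator is zero."""
    return round(100 * numerator / denominator, 2) if denominator else None


def campaign_metrics(c):
    net_profit = c["revenue"] - c["budget"]
    has_budget = bool(c["budget"])
    return {
        "name": c["name"],
        "ctr": pct(c["clicks"], c["impressions"]),
        "conversion_rate": pct(c["conversions"], c["clicks"]),
        "net_profit": net_profit,
        "roi": round(net_profit / c["budget"], 2) if has_budget else None,
        "revenue_per_spend": round(c["revenue"] / c["budget"], 2) if has_budget else None,
    }


metrics = [campaign_metrics(c) for c in campaigns]
ranked = sorted(metrics, key=lambda m: m["roi"], reverse=True)  # Step 3 ordering
for m in ranked:
    print(m)
```

On this data Summer Boost ranks first: it has the lower CTR (8.8% vs. 12%) but the higher ROI (0.81 vs. 0.5).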
🔹 Step 4: Active vs. Inactive Campaigns
SELECT name,
CASE
WHEN CURDATE() BETWEEN start_date AND end_date THEN 'Active'
ELSE 'Ended'
END AS status
FROM campaigns;
Use a pivot table in Excel to show:
Count of Active vs. Ended
Channel-wise status
Bonus: Add Excel slicer to filter by status.
🔹 Step 5: Spend vs. Revenue
In Excel:
Combine budget and revenue for each campaign
Use clustered column chart to compare
🔹 Step 6: A/B Test Result Summary (if applicable)
Let’s say two versions of an email campaign (A & B):
📋 ab_tests
test_id | variant | impressions | clicks | conversions
1 | A | 5000 | 800 | 90
2 | B | 5000 | 1000 | 130
Calculate:
CTR = Clicks / Impressions
Conversion Rate = Conversions / Clicks
In Excel:
=clicks / impressions
=conversions / clicks
📊 Use a side-by-side bar chart to compare A vs B.
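Worked through for the two variants above, the formulas give the following. This is a plain Python sketch of the arithmetic only; it says nothing about statistical significance, which a real A/B test would also need to assess.

```python
ab_tests = {
    "A": {"impressions": 5000, "clicks": 800, "conversions": 90},
    "B": {"impressions": 5000, "clicks": 1000, "conversions": 130},
}

ab_summary = {}
for variant, t in ab_tests.items():
    ab_summary[variant] = {
        # CTR = clicks / impressions, shown as a percentage
        "ctr_pct": round(100 * t["clicks"] / t["impressions"], 2),
        # Conversion rate = conversions / clicks, shown as a percentage
        "conversion_pct": round(100 * t["conversions"] / t["clicks"], 2),
    }

best_by_conversion = max(ab_summary, key=lambda v: ab_summary[v]["conversion_pct"])
print(ab_summary)
print(best_by_conversion)
```

Variant B leads on both measures here (20% vs. 16% CTR, 13% vs. 11.25% conversion rate).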
🎯 Final Dashboard Components
Component | Visual
Campaign ROI leaderboard | Bar chart
CTR & Conversion Rate | KPI summary + gauge
Spend vs. Revenue | Clustered column
Active vs. Ended Campaigns | Pie chart or donut
A/B Testing Comparison | Bar chart or table
Add interactivity:
Date filters (start-end)
Channel filter
ROI slider? (for the ✨ fancy ✨ Excel users)
🧠 Extra Thoughts
💡 High impressions but low conversions? Weak creative or landing page.
💡 High revenue but negative ROI? Overspending on ads.
💡 Campaign active but no conversions? Red flag 🚩 — monitor it.
Boom 💥 — you’ve got yourself a performance dashboard that actually performs.
Ready to dive into 3.4: Inventory Management System and track stock like a warehouse ninja?
🥷📦
You got it! 🧠✨ Let’s wrap up your Data Analysis with Excel + SQL course like a true pro—with a
nice bow 🎀 and a hardcore appendix that’ll make logic operators look cool. Here's the Final
Summary + Appendix:
🏁 Complete Data Analysis Course Summary
📚 Learning Path Recap
🔢 Excel Fundamentals
1.1 Excel Basics
1.2 Cell Referencing
1.3 Conditional Formatting
1.4 Pivot Tables
1.5 Formulas & Functions
1.6 Data Cleaning & Lookup
🧠 SQL Fundamentals
2.1 SQL Basics
2.2 Filtering, Sorting, WHERE
2.3 JOINs Deep Dive
2.4 Aggregation & GROUP BY
2.5 Subqueries & CTEs
2.6 Views & Indexes
2.7 Window Functions
2.8 Date & Time Operations
📊 Case Studies & Dashboards
3.1 Sales Dashboard
3.2 HR Analytics
3.3 Marketing Performance
3.4 Inventory Management
3.5 Customer Segmentation
📎 APPENDIX – Boolean Logic & Operators in Excel vs SQL
This is where Excel and SQL say: “I may look simple, but I’m conditionally powerful.” 💪
🟦 A.1 Boolean Operators – The Quick Match-up
Operation | Excel | SQL
AND | =AND(A2>10, B2="Yes") | WHERE A > 10 AND B = 'Yes'
OR | =OR(A2="X", B2="Y") | WHERE A = 'X' OR B = 'Y'
NOT | =NOT(A2="Closed") | WHERE NOT A = 'Closed'
Equals | =A2=B2 | WHERE A = B
Greater Than | =A2>100 | WHERE A > 100
Less Than | =A2<50 | WHERE A < 50
IF | =IF(A2>10, "High", "Low") | CASE WHEN A > 10 THEN 'High' ELSE 'Low' END
ISBLANK | =ISBLANK(A2) | WHERE A IS NULL
🔍 A.2 Excel Logic Examples
✅ IF with AND/OR
=IF(AND(A2>100, B2="Paid"), "Process", "Hold")
If amount >100 and status is Paid → Process, else Hold
✅ Nested IFs
=IF(A2>90, "Excellent", IF(A2>75, "Good", "Needs Work"))
Score-based grading logic
✅ Boolean Column for Filter
=IF(OR(B2="Yes", C2>5), TRUE, FALSE)
Used for filtering records meeting complex conditions
🧮 A.3 SQL Logic Examples
✅ Classic WHERE with AND/OR
SELECT * FROM orders
WHERE amount > 100 AND status = 'Paid';
✅ NOT Operator
SELECT * FROM users
WHERE NOT country = 'India';
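One place where SQL logic departs from Excel's TRUE/FALSE: a comparison against NULL is neither true nor false, so NOT country = 'India' silently drops rows with no country at all. A small sketch using Python's sqlite3 module and three hypothetical users:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("Asha", "India"), ("Ben", "Canada"), ("Cleo", None)],  # Cleo's country is NULL
)

# NOT country = 'India' evaluates to NULL (unknown) for Cleo, so she is filtered out.
not_india = [r[0] for r in conn.execute(
    "SELECT name FROM users WHERE NOT country = 'India' ORDER BY name")]

# Handle the missing value explicitly to keep rows with no country.
not_india_or_unknown = [r[0] for r in conn.execute(
    "SELECT name FROM users WHERE country IS NULL OR country <> 'India' ORDER BY name")]

print(not_india)
print(not_india_or_unknown)
```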
✅ CASE Logic
SELECT
name,
salary,
CASE
WHEN salary > 70000 THEN 'High'
WHEN salary > 40000 THEN 'Mid'
ELSE 'Low'
END AS salary_band
FROM employees;
✅ Labeling Rows with a Boolean Expression Result
SELECT *,
CASE
WHEN (stock_level < reorder_level) THEN 'Reorder Needed'
ELSE 'Sufficient'
END AS stock_status
FROM inventory;
🔄 A.4 Common Use Cases – Logic in Action
Task | Excel Example | SQL Example
Flag high performers | =IF(score>4.5, "Top", "Average") | CASE WHEN score > 4.5 THEN 'Top' ELSE 'Average' END
Combine logic conditions | =IF(AND(A2>10, B2="X"), "Yes", "No") | WHERE A > 10 AND B = 'X'
Check missing value | =IF(ISBLANK(C2), "Missing", "OK") | WHERE column IS NULL
Conditional categorization | =IF(A2>500, "Premium", "Standard") | CASE WHEN A > 500 THEN 'Premium' ELSE 'Standard' END
Multi-level classification | Nested IF() or IFS() | Multiple WHEN in CASE
🧠 Tips to Master Logic Use:
Always wrap conditions inside IF() (Excel) or CASE (SQL) when categorizing
Use helper columns in Excel to avoid mega-formulas
SQL logic can be layered with CTEs for clean readability
Try creating Boolean filters to power dashboards (like TRUE/FALSE flags for slicers)
🎓 Final Words
You’ve mastered:
Excel like a spreadsheet wizard 🧙
SQL like a data whisperer 🔍
Dashboards that managers dream of 📊
And logic that can outsmart your coffee machine ☕🧠
Next step? Start building real projects. Or integrate Power BI, Python, or Tableau into this workflow
for even more firepower. 🔥
Need flashcards? Mini quizzes? Case study ideas? Just say the word and we’ll keep building.
But for now…
📘 Data Analysis Course: COMPLETE ✅
You. Are. Certified. Dangerous.
Ready for your next challenge, data boss? 😎