
Three SQL Techniques


3 SQL Techniques Every Data Scientist Needs to Know for Faster Queries

Sluggish Queries?
Turn them Lightning-Fast with:
✔️ Indexing
✔️ Partitioning
✔️ Window Functions

@varshacbendre
The Struggle 😫
Queries that are:
❌ Slow: Take forever to run
❌ Inefficient: Struggle with large datasets
❌ Complex: Hard-to-read code

The Solution 👩‍💻


Advanced SQL = Better Queries
✅ Efficient: Faster execution
✅ Scalable: Handles large datasets
✅ Simple: Easy to understand

Indexing
A Shortcut for Your Queries

What it does
Speeds up row retrieval, like an index in a book.

When to use
Searching or filtering on frequently queried columns.

Impact
Dramatically improves SELECT query performance.

Before vs After Indexing
QUERY: Find orders for customer_id = 123

BEFORE INDEXING
Full table scan (10M rows)
Time: 3.2 seconds
Scans ENTIRE table

AFTER INDEXING
Targeted row retrieval
Time: 0.03 seconds
Precise data path

CREATE INDEX Code


CREATE INDEX idx_customer_id
ON orders(customer_id);

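Performance Boost
About 99% less query time in this example (3.2 s → 0.03 s)
No full table scan for lookups on customer_id
Trade-off: the index uses extra storage and adds a small cost to writes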
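
To see the scan-versus-lookup difference for yourself, here is a minimal sketch using Python's built-in sqlite3 module. SQLite and the small generated `orders` table are assumptions made only so the example runs anywhere; the slide's timings come from a different, much larger setup. `EXPLAIN QUERY PLAN` is SQLite's way of showing how a query will be executed.

```python
import sqlite3

# Hypothetical in-memory table; SQLite stands in for the production database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 1000, i * 0.5) for i in range(10_000)],
)

QUERY = "SELECT * FROM orders WHERE customer_id = 123"


def plan(sql):
    """Return SQLite's description of how it intends to run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # 4th column holds the detail text


plan_before = plan(QUERY)  # no index yet: the planner has to scan the table
conn.execute("CREATE INDEX idx_customer_id ON orders(customer_id)")
plan_after = plan(QUERY)   # the planner can now search via idx_customer_id

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan text differs between SQLite versions, but the first plan should describe a scan and the second should name `idx_customer_id`.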

Partitioning
Slice Tables for Speed

What it does
Divides a table into smaller pieces (e.g., by date or region)
Boosts performance and manageability

When to use
Queries that filter on the partition key (e.g., a single year)
Tables with natural divisions such as date or region

Impact
Faster queries
Reduced complexity
Easier archiving

Before vs After Partitioning
QUERY: Analyze data for year = 2023

BEFORE PARTITIONING:
Full table scan (500M rows) ⏳ 12.5 seconds

AFTER PARTITIONING (BY YEAR):
Scans only 2023 data ✨ 0.9 seconds

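CREATE PARTITION FUNCTION Code (SQL Server):

-- Defines the yearly boundaries only. To partition a table, also create
-- a partition scheme on this function and build the table on that scheme.
CREATE PARTITION FUNCTION YearPF (datetime)
AS RANGE RIGHT FOR VALUES
('2022-01-01', '2023-01-01', '2024-01-01');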

PERFORMANCE BOOST:
About 93% less query time in this example (12.5 s → 0.9 s)
Simplified data management
Better scalability
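
It can help to see which partition a given date lands in under RANGE RIGHT. The sketch below imitates that mapping in plain Python; `partition_number` is a hypothetical helper written for illustration, not a SQL Server function, and it assumes the documented RANGE RIGHT rule that each boundary value belongs to the partition on its right.

```python
from bisect import bisect_right
from datetime import date

# Boundary values copied from the YearPF partition function.
BOUNDARIES = [date(2022, 1, 1), date(2023, 1, 1), date(2024, 1, 1)]


def partition_number(value, boundaries=BOUNDARIES):
    """Return the 1-based partition for `value` under RANGE RIGHT semantics.

    RANGE RIGHT places each boundary value in the partition to its right,
    so N boundaries produce N + 1 partitions.
    """
    return bisect_right(boundaries, value) + 1


samples = [
    date(2021, 6, 30),   # before the first boundary
    date(2022, 12, 31),  # last day of 2022
    date(2023, 1, 1),    # exactly on a boundary
    date(2023, 12, 31),  # last day of 2023
    date(2024, 1, 1),    # exactly on the last boundary
]
for d in samples:
    print(d, "-> partition", partition_number(d))
```

Under these boundaries every 2023 date maps to the same partition, which is why a query filtered to 2023 on the partitioning column only needs to read that one slice.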

Window Functions
What it does
Performs calculations across a set of rows related to the current row
Examples: ranking, moving averages

When to use
Aggregations that must keep every row (no GROUP BY collapse)
Time-series analysis
Comparative calculations

Impact
Improved performance
Fewer subqueries
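
As a small illustration of the ranking use case, the sketch below runs `RANK()` through Python's sqlite3 module. The `scores` table and its rows are made up for the example, and it assumes a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, used only to demonstrate ranking.
conn.execute("CREATE TABLE scores (player TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("ana", 90), ("ben", 75), ("cy", 90), ("dee", 60)],
)

ranked = conn.execute(
    """
    SELECT player, points,
           RANK() OVER (ORDER BY points DESC) AS rnk
    FROM scores
    ORDER BY rnk, player
    """
).fetchall()

for row in ranked:
    print(row)
```

Every input row is still present in the result, each with its rank attached. Tied players share a rank and the next rank is skipped, which is how `RANK()` differs from `ROW_NUMBER()`.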

Before vs After Window Functions
Example: 7-Day Sales Average 📊
Before:
SELECT date, sales,
       (SELECT AVG(sales)
        FROM sales s2
        -- date arithmetic syntax varies by database
        WHERE s2.date BETWEEN s1.date - 6 AND s1.date) AS avg_7day
FROM sales s1;

⏳ Execution Time: ~4.2s


After:
SELECT date, sales,
       AVG(sales) OVER (
           ORDER BY date
           -- ROWS counts rows, not days: assumes one row per date
           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
       ) AS avg_7day
FROM sales;

✨ Execution Time: ~0.4s
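
The two queries agree only when the table has exactly one row per day with no gaps. A quick check, sketched with Python's sqlite3 module: the 14 generated rows are invented for the example, SQLite's `date(x, '-6 days')` stands in for the `s1.date - 6` arithmetic, and window functions need SQLite 3.25 or later.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (date TEXT PRIMARY KEY, sales REAL)")

# Hypothetical data: one row per day for 14 consecutive days.
start = date(2024, 1, 1)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [((start + timedelta(days=i)).isoformat(), float(100 + 10 * i)) for i in range(14)],
)

# "Before" style: a correlated subquery evaluated for every row.
subquery_sql = """
SELECT s1.date, s1.sales,
       (SELECT AVG(s2.sales)
        FROM sales s2
        WHERE s2.date BETWEEN date(s1.date, '-6 days') AND s1.date) AS avg_7day
FROM sales s1
ORDER BY s1.date
"""

# "After" style: a single pass with a sliding 7-row window.
window_sql = """
SELECT date, sales,
       AVG(sales) OVER (
           ORDER BY date
           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
       ) AS avg_7day
FROM sales
ORDER BY date
"""

subquery_rows = conn.execute(subquery_sql).fetchall()
window_rows = conn.execute(window_sql).fetchall()

for row in window_rows:
    print(row)
```

If days can be missing or duplicated, the row-based window drifts away from a true 7-day average, so it is worth checking the data before swapping one query for the other.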


Performance Boost:
90% faster
Simplified structure

Pro Tip: Combine with PARTITION BY for group-based calculations!
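
A minimal sketch of that tip, again through Python's sqlite3 module. The `regional_sales` table and its rows are hypothetical, and SQLite 3.25 or later is assumed. Note that `PARTITION BY` inside `OVER (...)` only groups rows for the calculation; it is unrelated to table partitioning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: daily sales for two regions.
conn.execute("CREATE TABLE regional_sales (region TEXT, date TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO regional_sales VALUES (?, ?, ?)",
    [
        ("east", "2024-01-01", 10.0),
        ("east", "2024-01-02", 20.0),
        ("east", "2024-01-03", 30.0),
        ("west", "2024-01-01", 100.0),
        ("west", "2024-01-02", 200.0),
    ],
)

rows = conn.execute(
    """
    SELECT region, date, sales,
           AVG(sales) OVER (
               PARTITION BY region      -- restart the window for each region
               ORDER BY date
               ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
           ) AS avg_7day
    FROM regional_sales
    ORDER BY region, date
    """
).fetchall()

for row in rows:
    print(row)
```

Each region gets its own running average, so the west rows never pull in east values.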

⚡ Recap: The Power of Advanced SQL

1️⃣ Indexing: Locate rows faster with smart shortcuts.
2️⃣ Partitioning: Optimize queries on massive datasets.
3️⃣ Window Functions: Simplify complex calculations.

🚀 These techniques take you from struggling to scaling with SQL.

Ready to Elevate Your Data Science and AI Journey?

Follow for Daily Insights and Expert Tips!
@varshacbendre
