0% found this document useful (0 votes)
7 views7 pages

SQL Optimization Interview Questions

The document discusses when to use JOINs and Common Table Expressions (CTEs) over subqueries in SQL, emphasizing that JOINs enhance performance and maintainability, particularly with large datasets. It highlights the inefficiencies of correlated subqueries, which can lead to significant performance bottlenecks in production databases. Best practices include defaulting to JOINs for related data retrieval, using CTEs for clarity, and validating aggregate logic to avoid duplicates.

Uploaded by

Shiv Dutt Jha
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
7 views7 pages

SQL Optimization Interview Questions

The document discusses when to use JOINs and Common Table Expressions (CTEs) over subqueries in SQL, emphasizing that JOINs enhance performance and maintainability, particularly with large datasets. It highlights the inefficiencies of correlated subqueries, which can lead to significant performance bottlenecks in production databases. Best practices include defaulting to JOINs for related data retrieval, using CTEs for clarity, and validating aggregate logic to avoid duplicates.

Uploaded by

Shiv Dutt Jha
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 7

SQL OPTIMIZATION

INTERVIEW
QUESTION
Interview Question:

Q: When should you use JOINs and CTEs instead of


subqueries in SQL?
Asked by: Deloitte, Infosys

✅ Answer:
• Use JOINs for better performance and maintainability,
especially on large datasets.

• Correlated subqueries are executed once per row —


making them inefficient as data size grows.

• JOINs and CTEs support parallel processing, are more


readable, and scale well with indexed tables.
Why it matters:

• In production databases (like Azure SQL, Snowflake,


or PostgreSQL), correlated subqueries often cause
significant performance bottlenecks.

• They trigger nested loop executions per row,


increasing latency, memory usage, and query
complexity — especially in analytical workloads.

• Using JOINs or CTEs helps databases optimize


execution plans and utilize indexes or partitioning
strategies.
Optimization Tips:

Replace correlated subqueries with LEFT JOINs and


GROUP BY

Use CTEs for clarity and modular query building

Leverage execution plans to validate performance


improvements

Always check query cost before and after


transformation
Examples:

Slow Correlated Subquery:


SELECT name,
(SELECT COUNT(*)
FROM orders o
WHERE o.customer_id = c.id)
FROM customers c;

Executes the subquery for each row in customers

✅ JOIN-based Rewrite:
SELECT c.name, COUNT(o.id)
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.name;

Processes data in a single pass, supports index


scans and is highly scalable.
Best Practice:

• ✅ Default to JOINs when retrieving related data across


tables

• ❌ Avoid correlated subqueries unless absolutely


necessary

• ✅ Use CTEs for layering complex logic and improving


readability

• Watch out for duplicated rows — always validate


aggregate logic when switching from subqueries to joins
Real-World Example:
You're querying customers with their total order count from
a table of 10 million rows:

❌ Inefficient Query (Correlated Subquery):


SELECT name,
(SELECT COUNT(*)
FROM orders o
WHERE o.customer_id = c.id)
FROM customers c;

Takes 3–4 minutes due to nested execution

✅ Optimized JOIN:
SELECT c.name, COUNT(o.id)
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.name;

Runs in seconds — joins leverage indexes and


process rows in parallel

Massive improvement on systems like Azure Synapse,


BigQuery, and Snowflake

You might also like