SQL Query Optimizations

Last Updated : 06 Aug, 2025

Poorly written SQL queries can result in slow performance, high resource costs, locking and blocking issues, and unhappy users. The following are common practices for writing efficient queries.

1. Use Indexes Wisely

Indexes let the database quickly look up rows instead of scanning the entire table. For Example:

If queries frequently filter on a column such as customer_id, create an index on it:

CREATE INDEX idx_orders_customer_id ON orders(customer_id);

SELECT * FROM orders WHERE customer_id = 123;

The query above runs much faster when customer_id is indexed, because the database can seek directly to the matching rows instead of scanning the whole table.

  • Primary Index: Automatically created on the primary key; ensures uniqueness and efficient access.
  • Secondary Index: Created on non-primary key columns to improve query performance. Need to be created explicitly.
  • Clustered Index: Determines the physical order of data in a table; only one per table. In some databases primary indexes are automatically clustered.
  • Non-Clustered Index: Contains pointers to the data; multiple can exist per table.

Best Practices:

  • Index columns that are frequently used in WHERE clauses, JOIN conditions, or ORDER BY clauses.
  • Avoid over-indexing: each index consumes additional storage and slows down write operations (INSERT, UPDATE, DELETE) because every index must be maintained on each write.
  • Regularly monitor and analyze index usage, and drop indexes that queries no longer use.
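As an illustration of the first tip, a single composite index can serve both the filter and the sort of a frequent query (the table and column names here are hypothetical):

```sql
-- Frequent query: recent orders for one customer
-- SELECT order_id, amount FROM orders
-- WHERE customer_id = 123
-- ORDER BY created_at DESC;

-- One composite index covers both the WHERE and the ORDER BY:
CREATE INDEX idx_orders_customer_created
    ON orders (customer_id, created_at);
```

Column order matters here: the equality-filtered column (customer_id) comes first, so within each customer the index entries are already ordered by created_at and the sort can be skipped.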

2. Avoid SELECT *: Choose Only Required Columns

Using SELECT * impacts query performance, especially with large tables or joins. The database engine retrieves every column, even the ones you don’t need, which increases memory usage, slows down data transfer, and makes the execution plan more complex. Example:

Avoid this:

SELECT * FROM products;

Use this instead:

SELECT product_id, product_name, price FROM products;

Why This Helps:

  • Reduces I/O load and memory usage.
  • Enables the optimizer to skip unnecessary columns.
  • Makes queries more readable and maintainable.

3. Limit Rows with WHERE and LIMIT

Fetching more rows than needed is a common issue. Even if you only use 10 rows in your app, the query might retrieve thousands, slowing things down. Use the WHERE clause to filter data precisely and LIMIT to restrict the number of rows returned.

Example:

SELECT name FROM customers
WHERE country = 'USA'
ORDER BY signup_date DESC
LIMIT 50;

Benefits:

  • Reduces CPU and network load.
  • Avoids returning excessive data during analysis or validation.
  • Great for previewing transformations and testing queries.

4. Write Efficient WHERE Clauses

The WHERE clause is one of the most important parts of an SQL query because it filters rows based on conditions. However, how you write it can significantly impact performance. A common mistake is using functions or operations directly on column values in the WHERE clause — this can prevent the database from using indexes, which slows down query execution.

Poor:

SELECT * FROM employees WHERE YEAR(joining_date) = 2022;

Why this is bad: The YEAR() function is applied to every value in the joining_date column. This disables the use of indexes, forcing a full table scan.

Optimized:

SELECT * FROM employees
WHERE joining_date >= '2022-01-01' AND joining_date < '2023-01-01';

Optimization Tips:

  • Don’t use functions on columns (LOWER(column), YEAR(column), etc.)
  • Avoid mathematical operations like salary + 5000 = 100000
  • Rewrite conditions to let the database use available indexes
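For instance, the arithmetic condition from the second tip can be rewritten so the column stands alone on one side of the comparison (a sketch with a hypothetical employees table):

```sql
-- Arithmetic on the column prevents index use:
-- SELECT * FROM employees WHERE salary + 5000 = 100000;

-- Equivalent, index-friendly form:
SELECT * FROM employees WHERE salary = 95000;
```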

5. Avoid Functions on Indexed Columns

Using SQL functions (like UPPER(), LOWER(), DATE()) on indexed columns can prevent the database from using those indexes, leading to slower queries. If the data is stored in a consistent form (for example, lowercase emails), you can compare it directly without a function.

Bad:

SELECT * FROM users WHERE UPPER(email) = '[email protected]';

Good:

SELECT * FROM users WHERE email = '[email protected]';

Why This Matters:

  • Preserves index usability
  • Keeps search efficient
  • Prevents unnecessary computation on large datasets
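If you genuinely need case-insensitive matching, some databases support indexes on expressions, so the function form stays fast. A sketch using PostgreSQL syntax (MySQL 8+ offers functional indexes with slightly different syntax; the users table and sample email are hypothetical):

```sql
-- An expression index matching the function used in the WHERE clause
CREATE INDEX idx_users_email_lower ON users (LOWER(email));

-- This query can now use the index instead of scanning the table:
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
```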

6. Use Joins Smartly

Always join only the tables you need and filter data before joining whenever possible. Use INNER JOIN instead of OUTER JOIN when you don't need unmatched records.

Example:

SELECT u.name, o.amount
FROM users u
JOIN orders o ON u.user_id = o.user_id
WHERE o.amount > 100;

Why This Matters:

  • Reduces join processing time
  • An explicit ON condition prevents accidental Cartesian products
  • Helps the planner optimize the execution path

7. Avoid N+1 Query Problems

N+1 happens when your app makes one query to get a list, then runs additional queries in a loop to get related data. Always aim to fetch related data in one query using JOINs.

Bad (pseudocode):

SELECT * FROM users;
-- For each user: SELECT * FROM orders WHERE user_id = ?

Good:

SELECT u.user_id, u.name, o.order_id, o.amount
FROM users u
JOIN orders o ON u.user_id = o.user_id;

Why This Matters:

  • Reduces database calls
  • Improves latency and throughput
  • Keeps the database from getting overloaded

8. Use EXISTS Instead of IN (for Subqueries)

When checking for existence, EXISTS can be more efficient than IN, especially if the subquery returns a large dataset. (Many modern optimizers rewrite simple IN subqueries into the same plan as EXISTS, so verify with the execution plan before assuming a difference.)

Bad:

SELECT name FROM customers
WHERE customer_id IN (SELECT customer_id FROM orders);

Good:

SELECT name FROM customers
WHERE EXISTS (
    SELECT 1 FROM orders WHERE orders.customer_id = customers.customer_id
);

Why This Matters:

  • EXISTS stops searching once a match is found
  • Often better optimized by query planners
  • Reduces memory use for large subqueries

9. Avoid Wildcards at the Start of LIKE

Using % at the beginning of a LIKE pattern disables index use and leads to full table scans. Prefer a pattern anchored at the start when it expresses what you actually need (note that the two patterns below match different strings).

Bad:

SELECT * FROM users WHERE name LIKE '%john';

Good:

SELECT * FROM users WHERE name LIKE 'john%';

Why This Matters:

  • Keeps searches index-friendly
  • Speeds up pattern matching
  • Reduces scan overhead

10. Consider Denormalization for Performance

While normalization keeps data clean, excessive joins can slow down read-heavy queries. Denormalization (storing some redundant data) can help in scenarios where read performance is more critical than strict normalization.

Example:

  • Store total order amount in the orders table instead of calculating with JOINs every time
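As a sketch of that idea (assuming hypothetical orders and order_items tables), the redundant total is added once and then kept in sync whenever line items change:

```sql
-- Add a redundant column holding the precomputed total
ALTER TABLE orders ADD COLUMN total_amount DECIMAL(10, 2);

-- Backfill it once from the line items
UPDATE orders
SET total_amount = (
    SELECT SUM(quantity * unit_price)
    FROM order_items
    WHERE order_items.order_id = orders.order_id
);
```

After this, dashboards can read total_amount directly; the application (or a trigger) must update it on every insert, update, or delete of order_items, which is the maintenance cost of denormalization.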

Why This Matters:

  • Reduces join complexity
  • Speeds up frequent reads
  • Useful for analytics or dashboards

11. Use Query Execution Plan

Every major DBMS has a way to show the execution plan (like EXPLAIN in MySQL/PostgreSQL). It shows how the SQL engine processes your query.

Example:

EXPLAIN SELECT * FROM orders WHERE user_id = 42;
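The exact command varies by DBMS. For example, PostgreSQL can also execute the query and report actual row counts and timings, while SQLite uses a different keyword:

```sql
-- PostgreSQL: plan plus real execution statistics
EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 42;

-- SQLite equivalent:
-- EXPLAIN QUERY PLAN SELECT * FROM orders WHERE user_id = 42;
```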

Why This Matters:

  • Helps identify full table scans
  • Reveals if indexes are used
  • Guides optimization decisions

12. Use UNION ALL Instead of UNION (if possible)

UNION removes duplicates, which requires an extra duplicate-elimination step (typically a sort or hash). If you don’t need duplicates removed, UNION ALL is faster.

Bad:

SELECT col FROM table1
UNION
SELECT col FROM table2;

Good:

SELECT col FROM table1
UNION ALL
SELECT col FROM table2;

Why This Matters:

  • Avoids unnecessary sorting
  • Faster result merging
  • Better for performance on large sets

13. Avoid SELECT Inside Loops (In Applications)

Don't run a query inside a loop in your app if you can write one efficient query that retrieves all needed data.

Bad (pseudocode):

for id in ids:
    cursor.execute("SELECT name FROM users WHERE id = ?", (id,))

Good (pseudocode):

SELECT id, name FROM users WHERE id IN (1, 2, 3, 4);

Why This Matters:

  • Reduces round trips to the database
  • Lower latency
  • Better network and CPU usage

14. Partition Large Tables

Partitioning breaks a large table into smaller, more manageable chunks. Queries that filter on the partition key are faster because the database scans only the relevant partitions (partition pruning) instead of the whole table.

Example:

  • Partition sales data by year or region
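A sketch using PostgreSQL's declarative range partitioning (syntax differs in MySQL, Oracle, and SQL Server; the sales table is hypothetical):

```sql
-- Parent table partitioned by sale date
CREATE TABLE sales (
    sale_id   BIGINT,
    region    TEXT,
    sale_date DATE NOT NULL,
    amount    NUMERIC
) PARTITION BY RANGE (sale_date);

-- One partition per year; queries filtering on sale_date
-- touch only the matching partition
CREATE TABLE sales_2023 PARTITION OF sales
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');
```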

Why This Matters:

  • Speeds up scans on large datasets
  • Helps with parallelism and archiving
  • Makes indexes more effective within partitions

15. Optimize ORDER BY and GROUP BY

Sorting and grouping can be expensive operations. Always limit the number of rows being sorted or grouped and use indexes that match the ORDER BY columns if possible.

Bad:

SELECT * FROM orders ORDER BY created_at;

Good:

SELECT order_id, amount FROM orders
WHERE created_at >= '2023-01-01'
ORDER BY created_at;

Why This Matters:

  • Reduces memory load for sorting
  • Makes sorting and grouping faster
  • Enhances user-facing query performance
