sql indexes - advanced sql _ bipp analytics

The document provides an overview of SQL indexes, explaining their purpose in optimizing query performance and the associated costs of maintaining them. It discusses techniques for verifying index usage, refactoring queries for better performance, and the use of functional indexes. Additionally, it highlights the benefits of materialized views and execution directives to enhance query efficiency.

Uploaded by

Yanet Cesaire
Copyright
© All Rights Reserved

SQL Indexes
An SQL index is an independently stored data structure that belongs to a table. A table
can have zero or more indexes. When used appropriately, an SQL index can speed up
your query. SQL indexes are primarily used with WHERE, JOIN, or ORDER BY clauses
to optimize access to a small number of records. Indexes are not typically used in
analytical environments where queries access large portions of the tables involved.

In these examples, there is an index on the population table with a key on the
last_name column. This query uses the index:

SELECT ssn, last_name
FROM population
WHERE last_name = 'Smith1234567'

This query cannot use the index because the WHERE clause filters on the first_name
column:

SELECT ssn, last_name
FROM population
WHERE first_name = 'John1234567'
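
The index these examples assume could be created along the following lines. This is a minimal sketch: the index names population_last_name_idx and population_first_name_idx are hypothetical and do not come from the lesson.

```sql
-- Index assumed by the examples: key on last_name (index name is hypothetical)
CREATE INDEX population_last_name_idx
    ON population (last_name);

-- The first_name query could only be considered for an index scan
-- if a separate index on that column existed (also a hypothetical name)
CREATE INDEX population_first_name_idx
    ON population (first_name);
```

The second index is only worth creating if the first_name query is both frequent and slow, because every extra index adds write overhead.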

Cost of SQL Indexes


Since indexes are data structures associated with a table, all changes to the table must
also be applied to its indexes. Inserts, deletes, and updates on tables with indexes are
slower because they perform more write operations.

Before creating an index, evaluate its cost-benefit relationship. You should only create
an index where the benefit exceeds the cost. To identify the indexes with the best
benefit:

1. Identify frequently executed queries with slow response times.
2. Identify which queries can be accelerated using an index.
3. Evaluate cost-benefit ratio of the proposed index.
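
The first step above can be sketched in PostgreSQL as follows. This assumes PostgreSQL 13 or later with the pg_stat_statements extension installed and enabled, which the lesson does not mention. Older versions name the timing columns differently.

```sql
-- Assumption: the pg_stat_statements extension is enabled (PostgreSQL 13+ column names)
-- Lists the statements that consumed the most total execution time
SELECT query,
       calls,             -- how many times the statement was executed
       mean_exec_time,    -- average execution time in milliseconds
       total_exec_time    -- cumulative execution time in milliseconds
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```

Statements with both a high calls value and a high mean_exec_time are the natural candidates for steps 2 and 3.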

Verify Indexes are being Used


If you have a slow query with an index, verify the index is actually being used. One
technique is the EXPLAIN statement. Review Show Me the Best Execution Plan in the
Query Optimization tutorial.

Here is the query:

SELECT ssn, last_name
FROM population
WHERE lower(last_name) = 'smith 10000002'

Run the query with the EXPLAIN execution plan:
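
As a sketch, assuming the PostgreSQL dialect used later in this lesson, the plan is requested by prefixing the query with EXPLAIN. Plain EXPLAIN only shows the planned execution. EXPLAIN ANALYZE would also run the query and report actual timings.

```sql
-- EXPLAIN shows the planned execution without running the query
EXPLAIN
SELECT ssn, last_name
FROM population
WHERE lower(last_name) = 'smith 10000002';
```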

In the EXPLAIN output you see it ran a Sequential Scan instead of the expected Index
Scan on the last_name column. What happened? The problem is the lower()
function. A plain index on a column cannot be used when the WHERE clause applies a
function or operation to that column.

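Refactor your query to compare the column directly. This returns the same rows only
when the stored last_name values already match the case of the literal:

SELECT ssn, last_name
FROM population
WHERE last_name = 'smith 10000002'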

Run the EXPLAIN execution plan on the refactored query:

The expected Index Scan is used.
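
Besides EXPLAIN, PostgreSQL keeps cumulative usage counters per index. The following is a minimal sketch that assumes the PostgreSQL statistics view pg_stat_user_indexes, which the lesson itself does not cover.

```sql
-- Assumption: PostgreSQL; pg_stat_user_indexes is a built-in statistics view
-- An index whose idx_scan stays at 0 under normal workload is never chosen by the planner
SELECT indexrelname AS index_name,
       idx_scan     AS times_used
FROM pg_stat_user_indexes
WHERE relname = 'population'
ORDER BY idx_scan;
```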

SQL Indexes Based on Expressions


As you learned, if you use a function or operation in the WHERE clause on the index key
column, the index is not used in the query. Instead, a sequential scan is run.

To work around this limitation, some databases offer a functional index, which uses an
expression as the index key.

In the PostgreSQL dialect you can use CREATE INDEX to create a functional index
using the expression lower(last_name):

CREATE INDEX functional_index
ON population ( lower(last_name) )

Run the EXPLAIN execution plan on the query:

Success! The query execution plan is using an index scan.
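
A sketch of that check, assuming the functional index above exists. The planner can only match the index when the expression in the WHERE clause is the same as the expression in the index definition, here lower(last_name).

```sql
-- The WHERE expression matches the index expression lower(last_name),
-- so the planner can consider the functional index
EXPLAIN
SELECT ssn, last_name
FROM population
WHERE lower(last_name) = 'smith 10000002';
```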

Query Refactorization
In SQL, there are many ways to write a query to return the same results. You can
refactor a slow query to use different clauses or conditions while returning the same
results.

Here are some refactoring techniques to try to improve query performance:

Reduce the Record Quantity
Avoid Correlated Subqueries
Consider a Materialized View
Add Directives to the Planner

Reduce the Record Quantity


A slow query that returns only a few records may actually be processing many records in
the early stages of the query, with the more restrictive filters (which reduce the result set)
applied at the end of the query processing.
In this case, you can use a Common Table Expression (CTE) to apply the restrictive
filter first, then execute the rest of the query on the CTE result.

In these examples, the query is used to obtain the first last_name in each zip_code.

Here is the standard query, with a 7-second execution time:

Here is the refactored query using a CTE to reduce the number of records. The
execution time is 5.4 seconds.
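
The lesson's own queries are not reproduced in this text, so the following is only an illustrative sketch of the technique. The zip_code range filter is a hypothetical restrictive condition. The MATERIALIZED keyword assumes PostgreSQL 12 or later, where it asks the engine to compute the CTE first instead of folding it into the outer query.

```sql
-- Hypothetical example: apply the restrictive filter inside the CTE,
-- then aggregate only the rows that survived it
WITH filtered AS MATERIALIZED (
    SELECT zip_code, last_name
    FROM population
    WHERE zip_code BETWEEN 'City200' AND 'City250'   -- assumed restrictive filter
)
SELECT zip_code,
       min(last_name) AS first_last_name   -- first last_name in each zip_code
FROM filtered
GROUP BY zip_code;
```

Whether this helps depends on the planner. Compare both forms with EXPLAIN before keeping the refactored one.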

Avoid Correlated Subqueries


A correlated subquery is a subquery that refers to columns of the outer query. The
problem with a correlated subquery is the high number of times it must be executed:
once for each row processed by the outer query. Avoid correlated subqueries whenever
possible.

This example uses a correlated subquery to obtain the zip_code for cities with a
population of 1000 or above. The execution time is 1 minute 37 seconds.

Here is the refactored query without using the correlated subquery. The execution time
is 17.2 seconds.
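
The lesson's queries are not reproduced in this text. As a hedged sketch of the same idea, assuming each row of population represents one person, the two forms could look like this:

```sql
-- Correlated form: the inner count refers to p.zip_code,
-- so it is logically evaluated once per row of the outer query
SELECT DISTINCT p.zip_code
FROM population p
WHERE (SELECT count(*)
       FROM population q
       WHERE q.zip_code = p.zip_code) >= 1000;

-- Refactored form: a single aggregation pass, no reference to an outer query
SELECT zip_code
FROM population
GROUP BY zip_code
HAVING count(*) >= 1000;
```

Both statements return the zip codes that have at least 1000 rows in population.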

Consider a Materialized View


A regular SQL view computes its records at query time, whenever the view is used in a
FROM clause. This online calculation of the view records can affect query performance.
Materialized views, which store their records, can improve query performance in some
cases. There are considerations with using materialized views. For example, the view
must be refreshed frequently for accurate results.

For this region population example, data refreshed daily is accurate enough.

This view groups zip_code values that share the same first 7 characters (for example,
City200) and obtains the population of each region:

CREATE VIEW region_population AS
SELECT substring(zip_code, 1, 7) AS region, count(*) AS pop
FROM population
GROUP BY substring(zip_code, 1, 7)

This is the query to obtain the population for the regions in the range City200 to
City250:

SELECT region, pop
FROM region_population
WHERE region BETWEEN 'City200' AND 'City250'

When the query is executed, the records of the region_population view are calculated as
part of the query, adding extra time to the query execution time.

If you create a materialized view, the view records are not calculated each time the
query is run. This example creates the materialized view (if the regular view with the
same name exists, drop it first):

CREATE MATERIALIZED VIEW region_population AS
SELECT substring(zip_code, 1, 7) AS region, count(*) AS pop
FROM population
GROUP BY substring(zip_code, 1, 7)

Execute the query:

SELECT region, pop
FROM region_population
WHERE region BETWEEN 'City200' AND 'City250'

The query is faster without having to calculate the view records.
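
Because a materialized view stores a snapshot, it has to be refreshed to pick up changes in population. The following is a minimal PostgreSQL sketch. The index name region_population_region_idx is hypothetical, and how the daily refresh gets scheduled is left to whatever job scheduler is available.

```sql
-- Recompute the stored records; readers are blocked while this runs
REFRESH MATERIALIZED VIEW region_population;

-- Optional: a unique index (hypothetical name) allows a non-blocking refresh
CREATE UNIQUE INDEX region_population_region_idx
    ON region_population (region);

-- CONCURRENTLY requires at least one unique index on the materialized view
REFRESH MATERIALIZED VIEW CONCURRENTLY region_population;
```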

Add Directives to the Planner


This is a non-standard technique offered by only a few SQL database engines.

Some database engines support adding execution directives in the query. For example,
you can specify an index to use, or an index to not use, and what table to read first.

For example, this Oracle query includes a directive for which index to use:

SELECT /*+ INDEX (employees emp_department_ix)*/ employee_id, department_id
FROM employees
WHERE department_id > 50;

The optimizer directive is included as a comment that begins with a plus sign, enclosed
in /*+ */, and specifies the index emp_department_ix of the table employees.
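
The other two kinds of directive mentioned above, an index to avoid and the table to read first, can be sketched with the Oracle hints NO_INDEX and LEADING. The departments table and its department_name column are assumptions added for illustration.

```sql
-- Ask the optimizer not to use a specific index
SELECT /*+ NO_INDEX(employees emp_department_ix) */ employee_id, department_id
FROM employees
WHERE department_id > 50;

-- Ask the optimizer to start the join from departments (alias d), then employees (alias e);
-- when a table has an alias, the hint must reference the alias
SELECT /*+ LEADING(d e) */ e.employee_id, d.department_name
FROM employees e
JOIN departments d ON d.department_id = e.department_id;
```

Hints are requests to the optimizer. A hint with a misspelled name or invalid syntax is ignored without raising an error, so confirm the effect in the execution plan.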

Closing Words
In this lesson you learned about indexes, functional indexes, and techniques to refactor
your queries to take advantage of indexes. Keep going, learn SQL and increase your
skills!
