7 Oracle SQL Tuning Tactics You Can Start Implementing Immediately
Kaley Crum
For Oracle Database
THE FOLLOWING PAGES REVEAL 7 TACTICS THAT I HAVE
USED TO TUNE SQL FOR HIGH-PAYING CUSTOMERS
Secret #1. Check For Stale Table Statistics
Whenever Oracle decides how to execute a query, it bases those decisions on table statistics.
What kind of decisions? Things like the order in which to visit the tables, whether the query should
use indexes, and which method Oracle should use to join the tables.
Whenever data in a table changes significantly, you should re-gather statistics on the table. For
example, when you delete or update or insert a large number of rows in a table, then you would want to
gather new statistics on the table.
Go through and check each of the tables in your query to make sure that none of them have stale
statistics. You can get this information from the STALE_STATS column of DBA_TAB_STATISTICS.
If any of the tables in your query have stale statistics, you might consider re-gathering statistics on those
tables.
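A minimal sketch of that check follows. The schema name RACING and the table names in the IN list are placeholders, not objects from any real system, so substitute the tables from your own query:

```sql
-- Placeholder owner and table names: replace them with the tables your query touches.
select owner,
       table_name,
       partition_name,   -- null for the table-level row
       object_type,      -- TABLE, PARTITION or SUBPARTITION
       last_analyzed,    -- when statistics were last gathered
       stale_stats       -- YES means Oracle considers the statistics stale
from dba_tab_statistics
where owner = 'RACING'
and table_name in ('HORSE_RACES', 'REGISTERED_VOTERS')
order by table_name, partition_name
```

A row with STALE_STATS = 'YES', or a LAST_ANALYZED date far older than the last big data load, is a candidate for re-gathering.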
Note: Be careful if a table with stale stats is constantly being queried or modified. Gathering statistics
on an active table can cause contention! Try to refrain from gathering statistics during high-activity
periods.
Something else to consider: You should back up your existing statistics on your table when gathering
new statistics. Sometimes, gathering new stats actually causes more problems than it solves. If your old
stats are backed up, you can quickly put them back if problems arise.
You can use my script to back up old stats and gather new stats, located here:
https://tuningsql.com/backup-old-stats-gather-new
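Separately from that script, the same idea can be sketched with the standard DBMS_STATS package. This is not the script linked above. The schema RACING, the table HORSE_RACES as the target, the holding table STATS_BACKUP, and the label BEFORE_REGATHER are all assumptions chosen for illustration:

```sql
begin
  -- One-time setup: create a holding table for saved statistics.
  -- STATS_BACKUP is a hypothetical name.
  dbms_stats.create_stat_table(ownname => 'RACING', stattab => 'STATS_BACKUP');

  -- Copy the current statistics (table, columns, indexes) into the holding table.
  dbms_stats.export_table_stats(
    ownname => 'RACING',
    tabname => 'HORSE_RACES',
    stattab => 'STATS_BACKUP',
    statid  => 'BEFORE_REGATHER',
    cascade => true);

  -- Gather fresh statistics.
  dbms_stats.gather_table_stats(ownname => 'RACING', tabname => 'HORSE_RACES');
end;
/

-- Only if the new statistics cause problems: put the saved ones back.
begin
  dbms_stats.import_table_stats(
    ownname => 'RACING',
    tabname => 'HORSE_RACES',
    stattab => 'STATS_BACKUP',
    statid  => 'BEFORE_REGATHER',
    cascade => true);
end;
/
```

Run the second block only when you actually need to roll back. On its own, the first block leaves you with fresh statistics plus a saved copy of the old ones.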
Secret #2. Leave Columns In The WHERE Clause Alone
It’s usually best to avoid doing any kind of manipulation or calculation on a table column in the WHERE
clause. For example, suppose you have a column called RACE_START_TIMESTAMP on a table called
HORSE_RACES. You might try to find all the races that start today with a query like this:
select *
from horse_races hr
where trunc(hr.race_start_timestamp) = trunc(sysdate)
That will get you the answer, but there are several drawbacks to phrasing the query this way. If
HORSE_RACES is range partitioned on RACE_START_TIMESTAMP, this query will not use
partition-pruning. It will scan ALL partitions in the table, rather than just reading the partition that
corresponds with today. Also, if there is an index on RACE_START_TIMESTAMP, Oracle won’t use it.
Why? Because you’ve modified the RACE_START_TIMESTAMP column. You’re no longer searching
based on RACE_START_TIMESTAMP. You’re searching on the expression
TRUNC(RACE_START_TIMESTAMP). Therefore, Oracle can’t use partition-pruning or indexes on
RACE_START_TIMESTAMP. Even if there are no indexes and no partition-pruning involved, Oracle still
has to spend a little bit more CPU for every row that it visits, because it has to perform a TRUNC()
operation on the RACE_START_TIMESTAMP column, and then evaluate if it’s equal to
TRUNC(SYSDATE) (this isn’t usually all that much, but why waste CPU if you don’t have to?).
It’s better if you can rewrite the query so that the table columns are not modified. For example, the
following query will yield the same answer. The difference is, the new query can use indexes and take
advantage of partition-pruning. It will also save CPU (assuming RACE_START_TIMESTAMP is a DATE
column):
select *
from horse_races hr
where hr.race_start_timestamp between trunc(sysdate)
and trunc(sysdate) + interval '23:59:59' hour to second
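If the column were a TIMESTAMP rather than a DATE, the BETWEEN form above would miss any row in the final fractional second of the day. One variation that sidesteps this, offered as a sketch rather than the only correct phrasing, is a half-open range that still leaves the column unmodified:

```sql
select *
from horse_races hr
where hr.race_start_timestamp >= trunc(sysdate)      -- midnight today, inclusive
and hr.race_start_timestamp < trunc(sysdate) + 1     -- midnight tomorrow, exclusive
```

Because both comparisons apply to the bare column, indexes and partition-pruning on RACE_START_TIMESTAMP remain available.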
Generally, it’s best to leave the table column unmodified. Rather than issue a query like this:
select *
from registered_voters rv
where add_months(rv.birthday, 21*12) < sysdate -- Older than 21?
...it’s better to issue a query like this, which gives the same answer but leaves BIRTHDAY unmodified:
select *
from registered_voters rv
where rv.birthday < add_months(sysdate, -21 * 12) -- Older than 21?
Secret #3. Avoid Implicit Type Conversion
This is really a “sneakier” version of Secret #2. Have a look at the following query. Notice that the '1'
on the left is in single quotes, making it a string, and the other 1 is without quotes, making it a number.
select count(*)
from dual
where '1' = 1
Whenever Oracle has to compare two different types (e.g. string vs number) Oracle will always silently
convert one of the data types to match the other. Specifically, any time you compare a string with a
number, Oracle will always try to convert the string into a number. So really, the above query has an
invisible TO_NUMBER() wrapped around the first '1' like this:
select count(*)
from dual
where to_number('1') = 1
This means that we can accidentally change table columns into expressions without even knowing it.
Yikes! Have a look at the following query. Keep in mind that ZIP codes in the United States are numeric,
but they’re often stored as a VARCHAR2 to preserve leading zeros. If we issue a query like this:
select *
from customers
where zip_code = 90210
...then we’re comparing a VARCHAR2 data type (ZIP_CODE column) with a NUMBER (90210). This
means that any indexes on ZIP_CODE or any partition pruning won’t be options for us on that column,
because really what’s being issued is this:
select *
from customers
where to_number(zip_code) = 90210
How do you fix this? Wrap the numeric expression in single quotes, so that you’re comparing a string to
a string.
select *
from customers
where zip_code = '90210'
Be sure to always compare numbers with numbers, strings with strings, dates with dates, etc.
Secret #4. Avoid Unnecessary Sorts and Hashes
Go find a large table (say 10 million rows or more) in your database. Make sure it’s a table, not a view.
If you don’t have a large table, you can make a test table by copying the contents of DBA_OBJECTS
repeatedly. Issue the following:
select *
from MY_LARGE_TABLE
Normally you should pull back your first set of rows pretty fast. Note that there are factors that can
cause it to run slowly, but most of the time it will return quickly. Now issue this query:
select *
from MY_LARGE_TABLE
order by 1
Many times, you will see the second query run much slower than the first! This is because we’ve added
an ORDER BY clause, meaning Oracle will (likely) have to perform a sort on the data. The same is true
if we add a DISTINCT operation:
select distinct *
from MY_LARGE_TABLE
Again this query will likely run much slower than the first query. This is due to Oracle doing extra work to
make the record set distinct. Each row must be either hashed or sorted so that Oracle can remove
duplicates.
This can come in subtler forms as well. For example, a UNION does the exact same thing as a UNION
ALL, except it removes duplicates. How does Oracle remove those duplicates? By sorting or hashing
the row set.
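To make that concrete, here is a sketch using two hypothetical tables, ONLINE_ORDERS and STORE_ORDERS, that each have a CUSTOMER_ID column. These names are assumptions for illustration only. The first statement returns the same rows as the second, and both pay for duplicate removal:

```sql
-- UNION: combine both row sets, then remove duplicates.
select customer_id from online_orders
union
select customer_id from store_orders;

-- The same result written out explicitly: UNION ALL plus a DISTINCT.
select distinct customer_id
from (
  select customer_id from online_orders
  union all
  select customer_id from store_orders
);

-- If duplicates are acceptable (or impossible), UNION ALL skips that work.
select customer_id from online_orders
union all
select customer_id from store_orders;
```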
Avoid ORDER BY or DISTINCT clauses if you don’t need them, and prefer a UNION ALL to a UNION if
you can. This allows you to reduce the amount of sorting and hashing in your query.
Secret #5. The “Strip Table” Trick
Many times, the most time-intensive piece of any query is “accessing data.” For example, if your query
is doing a full table scan (you’ll see TABLE ACCESS FULL in the plan), then Oracle is going to read the
table from top to bottom. Consider a table with 100 columns, and suppose that each column is 10 bytes
long. If you’re doing a full table scan on such a table, you have to read through all of the table’s data.
Let’s say you issue a query that only uses 3 of those 100 columns. Oracle will have to traverse through
the whole table and discard 97% of the table’s data as part of the full table scan! If Oracle accesses the
table multiple times via full table scan, you can see where this would get extremely expensive.
How could you fix this? By making a “strip table” or a “skinny table.” This works exceptionally well if it
can be done “ahead of time” (i.e. before the scheduled job run). It tends to work best with full table
scans, especially if there are multiple full table scans involved in the overall job. A skinny table (or a strip
table, same thing) is just a copy of a table made with a regular CREATE TABLE ... AS SELECT ...
statement, but it only includes the columns needed for querying. Additionally, you might consider setting
PCTFREE on the table to 0. Setting PCTFREE to 0 means that Oracle will pack the rows as densely as
possible when loading them into the table. This allows a full table scan to be as efficient as possible.
You should only make this adjustment if you’re not going to be updating rows on the strip table. This is
usually a safe assumption, as these tables are mostly for read-only purposes.
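A minimal sketch of building such a table follows. The three column names are hypothetical stand-ins for whichever columns your job actually reads:

```sql
-- Hypothetical column names: list only the columns the job needs.
create table strip_table
pctfree 0        -- pack rows densely; appropriate only if the rows won't be updated
as
select customer_id,
       order_date,
       order_total
from massive_table_with_100_columns;
```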
Instead of issuing this:
select ...
from MASSIVE_TABLE_WITH_100_COLUMNS ...
...your job would issue this:
select ...
from STRIP_TABLE ...
Usually strip tables are just temporary. Once the job is complete, you can drop the strip table and
re-generate it the next time the job is run. You can extend this same principle of creating a
staging table ahead of time to other things as well. For example, you might include pre-calculated
aggregations, or you might only select pertinent rows. The smaller you make the table, the more efficient
full table scans will be.
Secret #6. Create An Index
Indexing, like “strip tables,” is another trick to help Oracle access less data. Indexing tends to help
only when we’re picking out a very small number of rows from a table.
Note: You DO NOT want to use an index to read large percentages of a table. A full table scan will often
be much faster. If you’re selecting very small percentages, like 5% of the rows in a table, or 1% or 0.5%,
indexes are usually a good idea. If you’re selecting 20%, 30%, or 40% of the rows in a table, those are
usually best left to full table scans.
Many times if you add proper constraints to your table, your indexing work will be half done! For
example, if you create a primary key on a table, or a unique key on a table, those constraints will create
indexes for you. Oracle does not, however, automatically index foreign keys (hopefully, you’ve
implemented these in your database as well!). But 99.9% of the time, you should be indexing your
foreign keys as well. If you don’t have proper foreign keys or primary keys on your tables, you can try
indexing the columns that you use to join to other tables. Those are your “unofficial” primary and foreign
keys.
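As a sketch of the point about constraints, consider two hypothetical tables, STABLES and HORSES, invented here for illustration. The primary keys get indexes automatically. The foreign key does not, so it is indexed by hand:

```sql
-- Hypothetical tables for illustration only.
create table stables (
  stable_id   number primary key,        -- Oracle creates a unique index for this constraint
  stable_name varchar2(100) not null
);

create table horses (
  horse_id    number primary key,        -- also indexed automatically
  stable_id   number not null
              references stables (stable_id),  -- foreign key: no index is created for it
  horse_name  varchar2(100) not null
);

-- Index the foreign key column yourself.
create index horses_stable_id_ix on horses (stable_id);
```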
Note that you shouldn’t “go crazy” with indexes. PLEASE don’t index every column in your table hoping
that “something sticks.” Indexes take up space on your hard drive, and in the database’s memory, and
they will make table modifications slower. This is because any time you change the table, you also have
to go and change the index.
Many times, you’ll want to create indexes based on the WHERE clause of your query (again, only use
indexes if you’re only selecting a small number of records). For example, suppose you have the following
query:
select *
from house_fires
where fire_date between date '2020-01-01' and date '2020-12-31'
and fire_cause = 'Pyroflatulence'
and state = 'Alabama'
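One index that would fit this WHERE clause is sketched below. The index name is an assumption. The column order puts the two equality columns first and the range column last:

```sql
-- Equality columns (FIRE_CAUSE, STATE) lead; the range column (FIRE_DATE) comes last.
create index house_fires_lookup_ix
on house_fires (fire_cause, state, fire_date);
```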
Tip: Column order in an index matters! Whenever creating an index, always list the columns with = first
(in our example, it was FIRE_CAUSE and STATE). Then list any columns that use things like >, <, >=,
<=, or BETWEEN (in our example, FIRE_DATE used BETWEEN).
Secret #7. Parallelism
Parallelism should be one of the ABSOLUTE LAST tactics you use. Here’s why: Suppose you’ve hired
someone to cut your grass. After a while, you notice you don’t hear a lawnmower, so you walk outside. You find
the person you hired on their hands and knees in your front lawn using scissors to cut your grass.
Suppose you think “This is taking forever! I should use parallelism!” So you hire 9 more people and hand
them all pairs of scissors to cut your grass. Suddenly the cost of mowing your lawn has increased 10x because
you’re now paying 10 people instead of 1! The better solution is to fix the inefficient process: take away the
yardworker’s scissors and give them a lawnmower.
It’s the same with databases. If you take an inefficient process, and run it at a parallel degree of 10, you’re just
pouring more CPU power into an already inefficient process. Your database server could use that CPU power to
service other processes that need it. It’s best to ensure you have an efficient query first, and then add parallelism
second.
Here’s the shock that most people have with parallelism: Parallelism does not always make things faster! It
can actually make things go slower!! Here’s why:
Most SQL clients like Toad, SQL Developer, etc. have a button where you can check a query’s explain plan. If you
can’t find the appropriate button, you can always check the plan by issuing:
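A common way to do that from any client is EXPLAIN PLAN followed by a call to DBMS_XPLAN.DISPLAY. The sketch below uses a count against MY_FAVORITE_TABLE purely as a stand-in for your own query:

```sql
-- Step 1: ask Oracle to generate the plan without running the query.
explain plan for
select count(*) from my_favorite_table;

-- Step 2: display the plan that was just generated.
select * from table(dbms_xplan.display);
```

Both statements need to run in the same session, since the second one reads the plan that the first one stored.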
If you believe your query can benefit from parallelism, you can add a parallel degree hint to it. Here’s an example
of how you would add a hint to tell Oracle to run a query with a degree of parallelism of 8. You would change this:
SELECT COUNT(*) FROM MY_FAVORITE_TABLE
...to this:
SELECT /*+ parallel(8) */ COUNT(*) FROM MY_FAVORITE_TABLE
Want More Secrets?