SQL Tuning#1
Overview
• SQL Writing Process
• SQL Standards
• Using Indexes
• The Optimizer
• FROM, WHERE Clauses
• EXPLAIN
• SQL Trace
• Sub-Selects and Joins
• Tips and Tricks
2
SQL Writing Process
SELECT columns
FROM tables
WHERE ... (joins, filters, subqueries)
3
SQL Writing Process
• There are many, many ways to get the right results,
but only one is the fastest way—1000-to-1
improvements are attainable!
4
Pre-Tuning Questions
• How long is too long?
5
SQL Standards
Why are SQL standards important?
• Maintainability, readability
6
SQL Standards
Question: which of these statements are the same?
A. SELECT LNAME FROM EMP WHERE EMPNO = 12;
B. SELECT lname FROM emp WHERE empno = 12;
C. SELECT lname FROM emp WHERE empno = :id;
D. SELECT lname FROM emp
WHERE empno = 12;
7
SQL Standards
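• Answer: None. Oracle reuses a parsed statement only when the SQL text matches exactly, so differences in case, literals versus bind variables, or line breaks each produce a distinct statement.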
8
Tables Used in the Examples
9
SQL Standards: Example
SELECT E.empno,              -- Keywords upper case and left-aligned
       D.dname               -- Columns on new lines
FROM   emp E,                -- Use std. table aliases
       dept D                -- Separate w/ one space
WHERE  E.deptno = D.deptno
AND    (D.deptno = :vardept  -- Use bind variables
OR     E.empno = :varemp);   -- AND/OR on new lines; no space before/after parentheses
10
Indexes: What are they?
• An index is a database object used to speed retrieval
of rows in a table.
11
Indexes and SQL
• If a column appears in a WHERE clause it is a
candidate for being indexed.
12
Example: Query without Index
No index exists for column EMPNO on table EMP, so
a table scan must be performed:
SELECT *
FROM emp
WHERE empno = 8
Table: EMP
empno  fname   lname ...
4      lisa    baker
9      jackie  miller
1      john    larson
3      larry   jones
5      jim     clark
2      mary    smith
7      harold  simmons
8      mark    burns
6      gene    harris
13
Example: Query with Index
Column EMPNO is indexed, so it can be used to find
the requested row:
SELECT *
FROM emp
WHERE empno = 8
Index: PK_EMP on EMP (EMPNO)
Root:     5
Branches: 1, 4 | 5, 9
Leaves:   1 2 3 4 5 6 7 8 9
Table: EMP
empno  fname   lname ...
4      lisa    baker
9      jackie  miller
1      john    larson
3      larry   jones
5      jim     clark
2      mary    smith
7      harold  simmons
8      mark    burns
6      gene    harris
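The two slides above can be tried out with a small script. This is a minimal sketch, not Oracle: it assumes SQLite (via Python's standard sqlite3 module) as a stand-in database, and the index name idx_emp_empno is hypothetical. It asks the planner for its access path before and after the index is created.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the slide's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, fname TEXT, lname TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(4, "lisa", "baker"), (9, "jackie", "miller"), (1, "john", "larson"),
     (3, "larry", "jones"), (5, "jim", "clark"), (2, "mary", "smith"),
     (7, "harold", "simmons"), (8, "mark", "burns"), (6, "gene", "harris")],
)

def access_path(sql):
    """Return the planner's description of how it will read the data."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in plan)  # column 3 holds the detail text

query = "SELECT * FROM emp WHERE empno = 8"

plan_without_index = access_path(query)  # no index yet: whole table is scanned
conn.execute("CREATE INDEX idx_emp_empno ON emp (empno)")  # hypothetical name
plan_with_index = access_path(query)     # index exists: lookup through it

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
print("row found:    ", conn.execute(query).fetchone())
```

The exact plan wording differs between SQLite versions and from Oracle's EXPLAIN PLAN output, but the contrast (scan versus index search) is the same idea.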
14
Indexes: Caveats
• Sometimes a table scan cannot be avoided
15
Indexes: Functions
Using a function, calculation, or other operation on an
indexed column disables the use of the index:
SELECT *
FROM emp
WHERE TRUNC(hiredate) = TRUNC(SYSDATE);   -- Will NOT use index
...
WHERE fname || lname = 'MARYSMITH';       -- Will NOT use index
SELECT *
FROM emp
WHERE hiredate >= TRUNC(SYSDATE)          -- WILL use index
AND hiredate < TRUNC(SYSDATE) + 1;
...
WHERE fname = 'MARY'                      -- WILL use index
AND lname = 'SMITH';
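The concatenation example can be checked the same way. This is a sketch under stated assumptions: SQLite stands in for Oracle, and the index idx_emp_lname on LNAME is hypothetical (the slide does not say which columns are indexed).

```python
import sqlite3

# Assumption: SQLite stand-in; only lname is indexed (hypothetical index name).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, fname TEXT, lname TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(2, "MARY", "SMITH"), (3, "LARRY", "JONES"), (5, "JIM", "CLARK")],
)
conn.execute("CREATE INDEX idx_emp_lname ON emp (lname)")

def access_path(sql):
    """Return the planner's description of how it will read the data."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in plan)

concat_sql = "SELECT * FROM emp WHERE fname || lname = 'MARYSMITH'"
plain_sql = "SELECT * FROM emp WHERE fname = 'MARY' AND lname = 'SMITH'"

plan_concat = access_path(concat_sql)  # expression on the column hides the index
plan_plain = access_path(plain_sql)    # bare column comparison can use the index

print("concatenated:", plan_concat)
print("plain:       ", plan_plain)

# Both forms return the same row here; only the access path differs.
rows_concat = conn.execute(concat_sql).fetchall()
rows_plain = conn.execute(plain_sql).fetchall()
```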
16
Indexes: NOT
Using NOT excludes indexed columns:
SELECT *
FROM dept
WHERE deptno != 0;          -- Will NOT use index
... NOT deptno = 0;
... deptno IS NOT NULL;
SELECT *
FROM dept
WHERE deptno > 0;           -- WILL use index
17
The Optimizer
• The WHERE/FROM rules on the following pages apply
to the Rule-based optimizer (Oracle).
18
FROM Clause: Driving Table
Specify the driving table last in the FROM clause:
SELECT *
FROM dept D,   -- 10 rows
emp E          -- 1,000 rows; driving table is EMP
WHERE E.deptno = D.deptno;
SELECT *
FROM emp E,    -- 1,000 rows
dept D         -- 10 rows; driving table is DEPT
WHERE E.deptno = D.deptno;
19
FROM Clause: Intersection Table
When joining 3 or more tables, use the intersection table
(the one with the most shared columns) as the driving table:
SELECT *
FROM dept D,
salgrade S,
emp E          -- EMP shares columns with DEPT and SALGRADE, so use it as the driving table
WHERE E.deptno = D.deptno
AND E.grade = S.grade;
20
WHERE: Discard Early
Put the WHERE conditions that discard the most rows first:
SELECT *
FROM emp E
WHERE E.empno IN (101, 102, 103)   -- 3 rows
AND E.deptno > 10;                 -- 90,000 rows
21
WHERE: AND Subquery First
When using an "AND" subquery, place it first:
SELECT *                           -- CPU = 156 sec
FROM emp E
WHERE E.sal > 50000
AND 25 > (SELECT COUNT(*)
FROM emp M
WHERE M.mgr = E.empno);
23
WHERE: Filter First, Join Last
When joining and filtering, specify the filter conditions
first and the joins last:
SELECT *
FROM emp E,
dept D
WHERE (E.empno = 123          -- Filter criteria
OR D.deptno > 10)
AND E.deptno = D.deptno;      -- Join criteria
24
Subqueries: IN vs. EXISTS
Use EXISTS instead of IN in subqueries:
-- IN: both tables are scanned
SELECT E.*
FROM emp E
WHERE E.deptno IN (
SELECT D.deptno
FROM dept D
WHERE D.dname = 'SALES');
26
Join vs. EXISTS
Best performance depends on subquery/driving table:
-- EXISTS: better than a join if the number of matching rows in DEPT is small
SELECT *
FROM emp E
WHERE EXISTS (
SELECT 'X'
FROM dept D
WHERE D.deptno = E.deptno
AND D.dname = 'SALES');
27
Explain
Display the access path the database will use (e.g., use
of indexes, sorts, joins, table scans)
• Oracle: EXPLAIN
• Sybase: SHOWPLAN
• DB2: EXPLAIN
Oracle Syntax:
EXPLAIN PLAN
SET STATEMENT_ID = 'statement id'
INTO PLAN_TABLE FOR
statement
28
Explain
Example 1: “IN” subquery
SELECT *
FROM emp E
WHERE E.deptno IN (
SELECT D.deptno
FROM dept D
WHERE D.dname = 'SALES');
Result:
MERGE JOIN
  SORT (JOIN)
    TABLE ACCESS (FULL) OF EMP
  SORT (JOIN)
    VIEW
      SORT (UNIQUE)
        TABLE ACCESS (FULL) OF DEPT
Summary: 3 joins, 1 dynamic view, 2 table scans, 3 sorts
29
Explain
Example 2: "EXISTS" subquery
SELECT *
FROM emp e
WHERE EXISTS (
SELECT 'x'
FROM dept d
WHERE d.deptno = e.deptno
AND d.dname = 'SALES');
Result:
FILTER
  TABLE ACCESS (FULL) OF EMP
  TABLE ACCESS (BY INDEX ROWID) OF DEPT
    INDEX (UNIQUE SCAN) OF PK_DEPT (UNIQUE)
Summary: 1 table scan, 1 index scan, 1 index access
30
Explain
Example 3: Join (no subquery)
SELECT E.*
FROM emp E,
dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;
Result:
NESTED LOOPS
  TABLE ACCESS (FULL) OF EMP
  TABLE ACCESS (BY INDEX ROWID) OF DEPT
    INDEX (UNIQUE SCAN) OF PK_DEPT (UNIQUE)
Summary: 1 table scan, 1 index scan, 1 index access
31
SQL Trace
Use SQL Trace to determine the actual time and
resource costs for a statement to execute.
32
SQL Trace
Step 4: Trace file is created in <USER_DUMP_DEST>
directory on the server (specified by the DBA).
tkprof echd_ora_15319.trc $HOME/prof.out table=plan_table explain=dbuser/passwd
echd_ora_15319.trc       Trace file
$HOME/prof.out           Formatted output file
table=plan_table         Destination for Explain
explain=dbuser/passwd    User/passwd for Explain
33
SQL Trace
Step 6: view the output file:
...
SELECT E.*
FROM emp E, dept D
WHERE D.dname = 'SALES' AND D.deptno = E.deptno;
TIMED_STATISTICS must be turned on to get these values.
34
Tips and Tricks: UNION ALL
Use UNION ALL instead of UNION if there are no
duplicate rows (or if you don't mind duplicates):
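A small runnable illustration of the difference, as a sketch only: SQLite stands in for Oracle, and the tables emp_current and emp_former are hypothetical. UNION has to compare rows to remove duplicates, while UNION ALL simply appends the second result to the first.

```python
import sqlite3

# Assumption: SQLite stand-in; both table names are made up for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp_current (empno INTEGER, lname TEXT);
    CREATE TABLE emp_former  (empno INTEGER, lname TEXT);
    INSERT INTO emp_current VALUES (1, 'larson');
    INSERT INTO emp_current VALUES (2, 'smith');
    INSERT INTO emp_former  VALUES (2, 'smith');
    INSERT INTO emp_former  VALUES (3, 'jones');
""")

# UNION removes the duplicate (2, 'smith') row, which costs a sort or hash step.
union_rows = conn.execute("""
    SELECT empno, lname FROM emp_current
    UNION
    SELECT empno, lname FROM emp_former
    ORDER BY empno
""").fetchall()

# UNION ALL keeps every row from both inputs and skips the duplicate check.
union_all_rows = conn.execute("""
    SELECT empno, lname FROM emp_current
    UNION ALL
    SELECT empno, lname FROM emp_former
    ORDER BY empno
""").fetchall()

print(len(union_rows), "rows with UNION")
print(len(union_all_rows), "rows with UNION ALL")
```

When the inputs cannot overlap, both forms return the same rows, so the duplicate check done by UNION is wasted work.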
35
Tips and Tricks: HAVING vs. WHERE
With GROUP BY, use WHERE instead of HAVING (if the
filter criterion does not apply to a group function):
-- HAVING: rows are filtered after the result set is returned
SELECT deptno,
AVG(sal)
FROM emp
GROUP BY deptno
HAVING deptno IN (10, 20);
-- WHERE: rows are filtered first, possibly leaving far fewer to process
SELECT deptno,
AVG(sal)
FROM emp
WHERE deptno IN (10, 20)
GROUP BY deptno;
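The two forms return the same rows whenever the filter is on a grouping column rather than on an aggregate. A minimal check, assuming SQLite as a stand-in for Oracle and made-up salary data:

```python
import sqlite3

# Assumption: SQLite stand-in with invented sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, deptno INTEGER, sal REAL)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, 10, 1000.0), (2, 10, 3000.0), (3, 20, 2000.0),
     (4, 30, 5000.0), (5, 30, 7000.0)],
)

# HAVING: every department is grouped and averaged, then groups are discarded.
having_rows = conn.execute("""
    SELECT deptno, AVG(sal)
    FROM emp
    GROUP BY deptno
    HAVING deptno IN (10, 20)
    ORDER BY deptno
""").fetchall()

# WHERE: rows for other departments are dropped before any grouping happens.
where_rows = conn.execute("""
    SELECT deptno, AVG(sal)
    FROM emp
    WHERE deptno IN (10, 20)
    GROUP BY deptno
    ORDER BY deptno
""").fetchall()

print(having_rows)
print(where_rows)
```

A filter on an aggregate, such as AVG(sal) > 2500, still has to go in HAVING, because the value does not exist until the groups are formed.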
36
Tips and Tricks: EXISTS vs DISTINCT
Use EXISTS instead of DISTINCT to avoid an implicit sort (if
the column is indexed):
-- DISTINCT: implicit sort is performed to filter duplicate rows
SELECT DISTINCT
e.deptno,
e.lname
FROM dept d,
emp e
WHERE d.deptno = e.deptno;
39
Tips and Tricks: COUNT
Use COUNT(*) instead of COUNT(column):
SELECT COUNT(empno)
FROM emp;
SELECT COUNT(*)        -- ~ 50% faster
FROM emp;
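One caveat when swapping the two: COUNT(column) skips NULLs, so the results only match when the column has no NULL values (as with a primary key like EMPNO). A short sketch, assuming SQLite as a stand-in and a hypothetical nullable MGR column:

```python
import sqlite3

# Assumption: SQLite stand-in; mgr is nullable, empno is the primary key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, mgr INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1)],  # employee 1 has no manager
)

count_star = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
count_empno = conn.execute("SELECT COUNT(empno) FROM emp").fetchone()[0]
count_mgr = conn.execute("SELECT COUNT(mgr) FROM emp").fetchone()[0]

print("COUNT(*)     =", count_star)   # counts rows
print("COUNT(empno) =", count_empno)  # same, because empno is never NULL
print("COUNT(mgr)   =", count_mgr)    # one fewer, the NULL mgr is skipped
```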
40
Tips and Tricks: Self-Join
Use a self-join (joining a table to itself) instead of two
queries on the same table:
-- AFTER: only 1
SELECT E.mgr,
E.lname
FROM emp E,
emp M
WHERE M.deptno = 10
AND E.empno = M.mgr;
41
Tips and Tricks: ROWNUM
Use the ROWNUM pseudo-column to return only the first
N rows of a result set. (For example, if you just want a
sampling of data):
42
Tips and Tricks: ROWID
The ROWID pseudo-column uniquely identifies a row,
and is the fastest way to access a row:
43
Tips and Tricks: Sequences
Use a Sequence to generate unique values for a table:
-- MAX(empno) requires a sort and an index scan
SELECT MAX(empno)
INTO :new_empno
FROM emp;
...
-- INSERT could fail with a Duplicate error if someone else gets there first
INSERT INTO emp
VALUES (:new_empno + 1, ...);
-- Using a Sequence ensures that you always have a unique number,
-- and does not require any table reads
INSERT INTO emp
VALUES (emp_seq.NEXTVAL, ...);
or
SELECT emp_seq.NEXTVAL
INTO :new_empno FROM dual;
44
Tips and Tricks: Connect By
Use CONNECT BY to construct hierarchical queries:
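In Oracle, a manager hierarchy on EMP is typically walked with START WITH mgr IS NULL CONNECT BY PRIOR empno = mgr, using the LEVEL pseudo-column for depth. The sketch below is not that syntax: it assumes SQLite as a stand-in, which has no CONNECT BY, and swaps in a recursive common table expression to produce the same kind of top-down result. The sample rows are invented.

```python
import sqlite3

# Assumption: SQLite stand-in. A recursive CTE replaces Oracle's CONNECT BY.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, lname TEXT, mgr INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, "larson", None),  # top of the hierarchy, no manager
     (2, "smith", 1),
     (3, "jones", 1),
     (4, "baker", 2)],
)

hierarchy = conn.execute("""
    WITH RECURSIVE org (empno, lname, depth) AS (
        -- starting rows: employees without a manager (like START WITH)
        SELECT empno, lname, 1 FROM emp WHERE mgr IS NULL
        UNION ALL
        -- next level: employees whose manager is already in the result
        SELECT e.empno, e.lname, org.depth + 1
        FROM emp e JOIN org ON e.mgr = org.empno
    )
    SELECT depth, lname FROM org ORDER BY depth, empno
""").fetchall()

for depth, lname in hierarchy:
    print("  " * (depth - 1) + lname)  # indent each name by its depth
```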
45
Tips and Tricks: Cartesian Products
Avoid Cartesian products by ensuring that the tables are
joined on all shared keys:
SELECT *
FROM dept,       -- 10 rows
salgrade,        -- 20 rows
emp;             -- 1,000 rows
Result: 10 * 20 * 1,000 = 200,000 rows
SELECT *
FROM dept D,     -- 10 rows
salgrade S,      -- 20 rows
emp E            -- 1,000 rows
WHERE E.deptno = D.deptno
AND E.grade = S.grade;
Result: 1,000 rows
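The row counts on this slide can be reproduced with generated data. A minimal sketch, assuming SQLite as a stand-in for Oracle and synthetic rows sized to match the slide (10 departments, 20 salary grades, 1,000 employees):

```python
import sqlite3

# Assumption: SQLite stand-in with synthetic data sized like the slide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (deptno INTEGER)")
conn.execute("CREATE TABLE salgrade (grade INTEGER)")
conn.execute("CREATE TABLE emp (empno INTEGER, deptno INTEGER, grade INTEGER)")
conn.executemany("INSERT INTO dept VALUES (?)", [(d,) for d in range(1, 11)])
conn.executemany("INSERT INTO salgrade VALUES (?)", [(g,) for g in range(1, 21)])
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    # every employee gets a valid department (1..10) and grade (1..20)
    [(i, i % 10 + 1, i % 20 + 1) for i in range(1, 1001)],
)

# No join conditions: every row is paired with every row of the other tables.
cartesian_count = conn.execute(
    "SELECT COUNT(*) FROM dept, salgrade, emp"
).fetchone()[0]

# Joined on the shared keys: one result row per employee.
joined_count = conn.execute("""
    SELECT COUNT(*)
    FROM dept D, salgrade S, emp E
    WHERE E.deptno = D.deptno
    AND E.grade = S.grade
""").fetchone()[0]

print("without joins:", cartesian_count)
print("with joins:   ", joined_count)
```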
46
Q&A
Structure of Indexes
• B*-tree indexes (default)
• Bitmap indexes
• Function-based indexes
• Invisible indexes (11g)
DISTINCT vs. GROUP BY
• How do the two compare in performance when no aggregate
function is used in the SELECT?