SQL Tuning#1


SQL Tuning Session I

Overview
• SQL Writing Process
• SQL Standards
• Using Indexes
• The Optimizer
• FROM, WHERE Clauses
• EXPLAIN
• SQL Trace
• Sub-Selects and Joins
• Tips and Tricks

SQL Writing Process

Step 1: What information do I need? (Columns)

Step 2: Where is it? (Tables)

Step 3: Write SQL:

SELECT columns
FROM tables
WHERE ... (joins, filters, subqueries)

SQL Writing Process
• There are many, many ways to get the right results,
but only one is the fastest way—1000-to-1
improvements are attainable!

• Inefficient SQL can dramatically degrade the
performance of the entire system

• Developers and DBAs must work together to tune the
database and the application

Pre-Tuning Questions
• How long is too long?

• Is the statement running on near-production
volumes?

• Is the optimal retrieval path being used?

• How often will it execute?

• When will it execute?

SQL Standards
Why are SQL standards important?

• Maintainability, readability

• Performance: If a SQL statement is textually identical
to a (recently) executed statement, its parsed form can
be re-used instead of the statement being reparsed

SQL Standards
Question: which of these statements are the same?
A. SELECT LNAME FROM EMP WHERE EMPNO = 12;
B. SELECT lname FROM emp WHERE empno = 12;
C. SELECT lname FROM emp WHERE empno = :id;
D. SELECT lname FROM emp
WHERE empno = 12;

SQL Standards
• Answer: None

• Whitespace, case, bind variables vs. constants all
matter

• Using standards helps to ensure that equivalent SQL
can be reused.

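The effect of exact text matching can be sketched with a toy cache in Python. This is a simplified model, not Oracle's actual shared SQL area: the `parse_count` helper and the generated statement lists are assumptions made for illustration. It shows that statements A to D are four distinct texts, and that literals force a new parse per value while a bind variable does not.

```python
# Toy model of a statement cache keyed by the exact SQL text.
# It is NOT Oracle's real implementation; it only illustrates text matching.
statements = {
    "A": "SELECT LNAME FROM EMP WHERE EMPNO = 12;",
    "B": "SELECT lname FROM emp WHERE empno = 12;",
    "C": "SELECT lname FROM emp WHERE empno = :id;",
    "D": "SELECT lname FROM emp\nWHERE empno = 12;",
}

# Each variant differs in case, whitespace, or literal vs. bind variable,
# so each one is a separate entry when the key is the raw text.
distinct_texts = set(statements.values())


def parse_count(executed):
    """Count how many statements need a fresh parse (text not seen before)."""
    cache = set()
    parses = 0
    for text in executed:
        if text not in cache:
            cache.add(text)
            parses += 1
    return parses


# Literals: a new text for every employee number.
with_literals = [f"SELECT lname FROM emp WHERE empno = {n};" for n in range(100)]
# Bind variable: the same text every time, only the bound value changes.
with_bind = ["SELECT lname FROM emp WHERE empno = :id;"] * 100

print(len(distinct_texts), parse_count(with_literals), parse_count(with_bind))
```

In this model the 100 literal statements each need their own parse, while the bind-variable form is parsed once and re-used.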
Tables Used in the Examples

DEPT      EMP         SALGRADE
------    --------    --------
deptno    empno       grade
dname     mgr         losal
loc       job         hisal
          deptno
          fname
          lname
          comm
          hiredate
          grade
          sal

SQL Standards: Example
SELECT E.empno,               -- Keywords upper case and left-aligned
       D.dname                -- Columns on new lines
FROM emp E,                   -- Use std. table aliases
     dept D
WHERE E.deptno = D.deptno     -- Separate w/ one space
AND (D.deptno = :vardept      -- Use bind variables
     OR E.empno = :varemp);   -- AND/OR on new lines; no space before/after parentheses

Indexes: What are they?
• An index is a database object used to speed retrieval
of rows in a table.

• The index contains only the indexed value (usually the
key(s)) and a pointer to the row in the table.

• Multiple indexes may be created for a table

• Not all indexes contain unique values

• Indexes may have multiple columns (e.g., Oracle
allows up to 32)

Indexes and SQL
• If a column appears in a WHERE clause it is a
candidate for being indexed.

• If a column is indexed the database can use the
index to find the rows instead of scanning the table.

• If the column is not referenced properly, however,
the database may not be able to use the index and
will have to scan the table anyway.

• Knowing what columns are and are not indexed can
help you write more efficient SQL

Example: Query without Index
No index exists for column EMPNO on table EMP, so
a table scan must be performed:

SELECT *
FROM emp
WHERE empno = 8

Table: EMP
empno  fname   lname    ...
    4  lisa    baker
    9  jackie  miller
    1  john    larson
    3  larry   jones
    5  jim     clark
    2  mary    smith
    7  harold  simmons
    8  mark    burns
    6  gene    harris

Example: Query with Index
Column EMPNO is indexed, so it can be used to find
the requested row:
SELECT *
FROM emp
WHERE empno = 8

Index: PK_EMP on EMP (EMPNO)

                  5
           /             \
        1, 4             5, 9
   1  2  3  4       5  6  7  8  9

Table: EMP
empno  fname   lname    ...
    4  lisa    baker
    9  jackie  miller
    1  john    larson
    3  larry   jones
    5  jim     clark
    2  mary    smith
    7  harold  simmons
    8  mark    burns
    6  gene    harris

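The difference between the two access paths can be sketched in plain Python. This is only an illustration under stated assumptions: the sorted list `INDEX` is a flat stand-in for the leaf level of a B-tree, and `table_scan` and `index_lookup` are hypothetical helper names, not database APIs.

```python
from bisect import bisect_left

# Rows in physical (unsorted) order, as on the slides: (empno, fname, lname).
EMP = [
    (4, "lisa", "baker"), (9, "jackie", "miller"), (1, "john", "larson"),
    (3, "larry", "jones"), (5, "jim", "clark"), (2, "mary", "smith"),
    (7, "harold", "simmons"), (8, "mark", "burns"), (6, "gene", "harris"),
]


def table_scan(empno):
    """Check every row; return (matching row, number of rows examined)."""
    match = None
    examined = 0
    for row in EMP:
        examined += 1
        if row[0] == empno:
            match = row
    return match, examined


# Sorted (key, position in EMP) pairs: the key plus a pointer to the row.
INDEX = sorted((row[0], pos) for pos, row in enumerate(EMP))


def index_lookup(empno):
    """Binary-search the sorted keys, then fetch one row by its position."""
    i = bisect_left(INDEX, (empno, -1))
    if i < len(INDEX) and INDEX[i][0] == empno:
        return EMP[INDEX[i][1]]
    return None


print(table_scan(8))
print(index_lookup(8))
```

The scan touches all nine rows to answer `empno = 8`, while the lookup needs only a few key comparisons and a single row fetch.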
Indexes: Caveats
• Sometimes a table scan cannot be avoided

• Not every column should be indexed: there is
performance overhead on Inserts, Updates, Deletes

• Small tables may be faster with a table scan

• Queries returning a large number (> 5-20%) of the
rows in the table may be faster with a table scan

Indexes: Functions
Using a function, calculation, or other operation on an
indexed column disables the use of the index:

-- Will NOT use index
SELECT *
FROM emp
WHERE TRUNC(hiredate) = TRUNC(SYSDATE);
...
WHERE fname || lname = 'MARYSMITH';

-- WILL use index
SELECT *
FROM emp
WHERE hiredate >= TRUNC(SYSDATE)
AND hiredate < TRUNC(SYSDATE) + 1;   -- half-open range: today only
...
WHERE fname = 'MARY'
AND lname = 'SMITH';

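A minimal sketch of the same rewrite, using Python's built-in sqlite3 module as a stand-in for Oracle (an assumption: SQLite's `date()` plays the role of `TRUNC()`, dates are stored as text, and the sample rows and index name are invented). It checks that the range form returns the same rows as the function form and prints the plan SQLite chooses for each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (empno INTEGER PRIMARY KEY, lname TEXT, hiredate TEXT);
    CREATE INDEX emp_hiredate_idx ON emp (hiredate);
    INSERT INTO emp VALUES
        (1, 'larson', '2024-01-14 23:59:59'),
        (2, 'smith',  '2024-01-15 00:00:00'),
        (3, 'jones',  '2024-01-15 17:30:00'),
        (4, 'baker',  '2024-01-16 00:00:00');
""")

# Function applied to the indexed column (analogous to TRUNC(hiredate)).
FUNC_SQL = "SELECT * FROM emp WHERE date(hiredate) = '2024-01-15'"
# Bare column compared against a half-open range.
RANGE_SQL = ("SELECT * FROM emp WHERE hiredate >= '2024-01-15' "
             "AND hiredate < '2024-01-16'")


def plan(sql):
    """Return SQLite's plan text; the description is the last column."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


func_rows = sorted(conn.execute(FUNC_SQL).fetchall())
range_rows = sorted(conn.execute(RANGE_SQL).fetchall())

print("function:", plan(FUNC_SQL))
print("range:   ", plan(RANGE_SQL))
```

Both queries return employees 2 and 3; only the range form lets SQLite search `emp_hiredate_idx`. Oracle's plans will look different, but the principle is the one stated above.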
Indexes: NOT
Using NOT excludes indexed columns:

-- Will NOT use index
SELECT *
FROM dept
WHERE deptno != 0;
... NOT deptno = 0;
... deptno IS NOT NULL;

-- WILL use index (same rows, assuming deptno is never negative)
SELECT *
FROM dept
WHERE deptno > 0;

The Optimizer
• The WHERE/FROM rules on the following pages apply
to the Rule-based optimizer (Oracle).

• If the Cost-based Optimizer is used, Oracle will attempt
to reorder the statements as efficiently as possible
(assuming statistics are available).

FROM Clause: Driving Table
Specify the driving table last in the FROM Clause:
SELECT *
FROM dept D,   -- 10 rows
     emp E     -- 1,000 rows; driving table is EMP
WHERE E.deptno = D.deptno;

SELECT *
FROM emp E,    -- 1,000 rows
     dept D    -- 10 rows; driving table is DEPT
WHERE E.deptno = D.deptno;

FROM Clause: Intersection Table
When joining 3 or more tables, use the Intersection table
(with the most shared columns) as the driving table:

-- EMP shares columns with DEPT and SALGRADE, so use it as the driving table
SELECT *
FROM dept D,
     salgrade S,
     emp E
WHERE E.deptno = D.deptno
AND E.grade = S.grade;

WHERE: Discard Early
Place the WHERE conditions that discard the maximum
number of rows first:

SELECT *
FROM emp E
WHERE E.empno IN (101, 102, 103)   -- 3 rows
AND E.deptno > 10;                 -- 90,000 rows

WHERE: AND Subquery First
When using an "AND" subquery, place it first:
-- CPU = 156 sec
SELECT *
FROM emp E
WHERE E.sal > 50000
AND 25 > (SELECT COUNT(*)
          FROM emp M
          WHERE M.mgr = E.empno);

-- CPU = 10 sec
SELECT *
FROM emp E
WHERE 25 > (SELECT COUNT(*)
            FROM emp M
            WHERE M.mgr = E.empno)
AND E.sal > 50000;

WHERE: OR Subquery Last
When using an "OR" subquery, place it last:
-- CPU = 100 sec
SELECT *
FROM emp E
WHERE 25 > (SELECT COUNT(*)
            FROM emp M
            WHERE M.mgr = E.empno)
OR E.sal > 50000;

-- CPU = 30 sec
SELECT *
FROM emp E
WHERE E.sal > 50000
OR 25 > (SELECT COUNT(*)
         FROM emp M
         WHERE M.mgr = E.empno);

WHERE: Filter First, Join Last
When Joining and Filtering, specify the Filter condition
first, Joins last.

SELECT *
FROM emp E,
     dept D
WHERE (E.empno = 123           -- Filter criteria
       OR D.deptno > 10)
AND E.deptno = D.deptno;       -- Join criteria

Subqueries: IN vs. EXISTS
Use EXISTS instead of IN in subqueries:
-- IN: both tables are scanned
SELECT E.*
FROM emp E
WHERE E.deptno IN (
    SELECT D.deptno
    FROM dept D
    WHERE D.dname = 'SALES');

-- EXISTS: only the outer table is scanned; the subquery uses an index
SELECT *
FROM emp E
WHERE EXISTS (
    SELECT 'X'
    FROM dept D
    WHERE D.deptno = E.deptno
    AND D.dname = 'SALES');

Subquery vs. Join
Use a Join instead of a Subquery:

-- IN: both tables are scanned
SELECT *
FROM emp E
WHERE E.deptno IN (
    SELECT D.deptno
    FROM dept D
    WHERE D.dname = 'SALES');

-- JOIN: only one table is scanned; the other uses an index
SELECT E.*
FROM emp E,
     dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;

Join vs. EXISTS
Best performance depends on subquery/driving table:
-- EXISTS: better than Join if the number of matching rows in DEPT is small
SELECT *
FROM emp E
WHERE EXISTS (
    SELECT 'X'
    FROM dept D
    WHERE D.deptno = E.deptno
    AND D.dname = 'SALES');

-- JOIN: better than Exists if the number of matching rows in DEPT is large
SELECT E.*
FROM emp E,
     dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;

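Before swapping one form for another it is worth confirming that they return the same rows. The sketch below does that for the IN, EXISTS, and join forms above, using Python's sqlite3 module as a stand-in for Oracle; the sample rows are invented. Note the assumption that `deptno` is unique in DEPT: without it the join form could return an employee more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE emp  (empno INTEGER PRIMARY KEY, lname TEXT, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH'), (30, 'SALES');
    INSERT INTO emp  VALUES (1, 'larson', 10), (2, 'smith', 30),
                            (3, 'jones', 20),  (4, 'baker', 30);
""")

QUERIES = {
    "in": """SELECT E.* FROM emp E
             WHERE E.deptno IN (SELECT D.deptno FROM dept D
                                WHERE D.dname = 'SALES')""",
    "exists": """SELECT E.* FROM emp E
                 WHERE EXISTS (SELECT 'X' FROM dept D
                               WHERE D.deptno = E.deptno
                               AND D.dname = 'SALES')""",
    "join": """SELECT E.* FROM emp E, dept D
               WHERE D.dname = 'SALES'
               AND D.deptno = E.deptno""",
}

# Sort each result so the comparison ignores row order.
results = {name: sorted(conn.execute(sql).fetchall())
           for name, sql in QUERIES.items()}

for name, rows in results.items():
    print(name, rows)
```

All three forms return the two SALES employees here, so the choice between them is purely a performance question.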
Explain
Display the access path the database will use (e.g., use
of indexes, sorts, joins, table scans)

• Oracle: EXPLAIN PLAN
• Sybase: SHOWPLAN
• DB2: EXPLAIN

Oracle Syntax:
EXPLAIN PLAN
    SET STATEMENT_ID = 'statement id'
    INTO PLAN_TABLE FOR
    statement

Requires Select/Insert privileges on PLAN_TABLE

Explain
Example 1: "IN" subquery
SELECT *
FROM emp E
WHERE E.deptno IN (
    SELECT D.deptno
    FROM dept D
    WHERE D.dname = 'SALES');

Result:
MERGE JOIN
  SORT (JOIN)
    TABLE ACCESS (FULL) OF EMP
  SORT (JOIN)
    VIEW
      SORT (UNIQUE)
        TABLE ACCESS (FULL) OF DEPT

Summary: 3 joins, 1 dynamic view, 2 table scans, 3 sorts

Explain
Example 2: "EXISTS" subquery
SELECT *
FROM emp e
WHERE EXISTS (
    SELECT 'x'
    FROM dept d
    WHERE d.deptno = e.deptno
    AND d.dname = 'SALES');

Result:
FILTER
  TABLE ACCESS (FULL) OF EMP
  TABLE ACCESS (BY INDEX ROWID) OF DEPT
    INDEX (UNIQUE SCAN) OF PK_DEPT (UNIQUE)

Summary: 1 table scan, 1 index scan, 1 index access

Explain
Example 3: Join (no subquery)
SELECT E.*
FROM emp E,
     dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;

Result:
NESTED LOOPS
  TABLE ACCESS (FULL) OF EMP
  TABLE ACCESS (BY INDEX ROWID) OF DEPT
    INDEX (UNIQUE SCAN) OF PK_DEPT (UNIQUE)

Summary: 1 table scan, 1 index scan, 1 index access

SQL Trace
Use SQL Trace to determine the actual time and
resource costs for a statement to execute.

Step 1: ALTER SESSION SET SQL_TRACE = TRUE;

Step 2: Execute SQL to be traced:

SELECT E.*
FROM emp E,
     dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;

Step 3: ALTER SESSION SET SQL_TRACE = FALSE;

SQL Trace
Step 4: Trace file is created in <USER_DUMP_DEST>
directory on the server (specified by the DBA).

Step 5: Run TKPROF (UNIX) to create a formatted
output file:

tkprof echd_ora_15319.trc $HOME/prof.out table=plan_table explain=dbuser/passwd

  echd_ora_15319.trc       Trace file
  $HOME/prof.out           Formatted output file
  table=plan_table         Destination for Explain
  explain=dbuser/passwd    User/passwd for Explain

SQL Trace
Step 6: View the output file:

...
SELECT E.*
FROM emp E, dept D
WHERE D.dname = 'SALES' AND D.deptno = E.deptno;

call     count   cpu  elapsed  disk  query  current  rows
-------  -----  ----  -------  ----  -----  -------  ----
Parse        1  0.00     0.00     0      0        0     0
Execute      1  0.00     0.00     0      0        0     0
Fetch        2  0.00     0.00     4     19        3     6
-------  -----  ----  -------  ----  -----  -------  ----
total        4  0.00     0.00     4     19        3     6

(TIMED_STATISTICS must be turned on to get the cpu and elapsed values.)

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 62 (PMARKS)

Rows     Row Source Operation
-------  ---------------------------------------------------
      6  NESTED LOOPS
     14    TABLE ACCESS FULL EMP
     14    TABLE ACCESS BY INDEX ROWID DEPT
     14      INDEX UNIQUE SCAN (object id 4628)

(The Row Source Operation section is the EXPLAIN output.)

Tips and Tricks: UNION ALL
Use UNION ALL instead of UNION if there are no
duplicate rows (or if you don't mind duplicates):

-- UNION: requires sort
SELECT * FROM emp
UNION
SELECT * FROM emp_arch;

-- UNION ALL: no sort
SELECT * FROM emp
UNION ALL
SELECT * FROM emp_arch;

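The semantic difference between the two operators can be sketched with Python's sqlite3 module (a stand-in for Oracle; the sample rows are invented, with one employee deliberately present in both tables).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp      (empno INTEGER, lname TEXT);
    CREATE TABLE emp_arch (empno INTEGER, lname TEXT);
    INSERT INTO emp      VALUES (1, 'larson'), (2, 'smith');
    INSERT INTO emp_arch VALUES (2, 'smith'), (3, 'jones');
""")

# UNION removes duplicate rows across both inputs.
union_rows = conn.execute(
    "SELECT * FROM emp UNION SELECT * FROM emp_arch").fetchall()

# UNION ALL simply concatenates the two results.
union_all_rows = conn.execute(
    "SELECT * FROM emp UNION ALL SELECT * FROM emp_arch").fetchall()

print(len(union_rows), len(union_all_rows))
```

UNION ALL is only a safe substitute when the inputs cannot overlap, or when the extra copy of `(2, 'smith')` is acceptable to the caller.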
Tips and Tricks: HAVING vs. WHERE
With GROUP BY, use WHERE instead of HAVING (if the
filter criterion does not apply to a group function):

-- HAVING: rows are filtered after the grouped result set is built
SELECT deptno,
       AVG(sal)
FROM emp
GROUP BY deptno
HAVING deptno IN (10, 20);

-- WHERE: rows are filtered first, so possibly far fewer to process
SELECT deptno,
       AVG(sal)
FROM emp
WHERE deptno IN (10, 20)
GROUP BY deptno;

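A quick equivalence check for the two forms, sketched with Python's sqlite3 module as a stand-in for Oracle (the salary figures are invented). Because the filter is on the grouping column rather than on an aggregate, moving it from HAVING to WHERE does not change the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (empno INTEGER PRIMARY KEY, deptno INTEGER, sal REAL);
    INSERT INTO emp VALUES (1, 10, 1000), (2, 10, 3000),
                           (3, 20, 2000),
                           (4, 30, 5000), (5, 30, 7000);
""")

# Filter applied after grouping: every department is aggregated first.
having_rows = conn.execute("""
    SELECT deptno, AVG(sal) FROM emp
    GROUP BY deptno
    HAVING deptno IN (10, 20)
    ORDER BY deptno""").fetchall()

# Filter applied before grouping: department 30 is never aggregated.
where_rows = conn.execute("""
    SELECT deptno, AVG(sal) FROM emp
    WHERE deptno IN (10, 20)
    GROUP BY deptno
    ORDER BY deptno""").fetchall()

print(having_rows, where_rows)
```

A condition on the aggregate itself, such as `AVG(sal) > 2500`, cannot be moved this way and has to stay in HAVING.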
Tips and Tricks: EXISTS vs DISTINCT
Use EXISTS instead of DISTINCT to avoid implicit sort (if
the column is indexed):

-- DISTINCT: implicit sort is performed to filter duplicate rows
SELECT DISTINCT
       e.deptno,
       e.lname
FROM dept d,
     emp e
WHERE d.deptno = e.deptno;

-- EXISTS: no sort (returns the same rows provided no two
-- employees share both deptno and lname)
SELECT e.deptno, e.lname
FROM emp e
WHERE EXISTS (
    SELECT 'X'
    FROM dept d
    WHERE d.deptno = e.deptno);

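A closely related rewrite, sketched with Python's sqlite3 module as a stand-in for Oracle, is listing the departments that have at least one employee. This variant and its sample rows are assumptions added for illustration, not the slide's own query. The join repeats a department once per employee and so needs DISTINCT, while EXISTS tests each department once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE emp  (empno INTEGER PRIMARY KEY, lname TEXT, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH'),
                            (30, 'SALES'), (40, 'OPERATIONS');
    INSERT INTO emp  VALUES (1, 'larson', 10), (2, 'smith', 10),
                            (3, 'jones', 30);
""")

# Plain join: department 10 appears once per employee.
join_rows = conn.execute("""
    SELECT d.deptno, d.dname FROM dept d, emp e
    WHERE d.deptno = e.deptno""").fetchall()

# DISTINCT collapses the repeats after the join has produced them.
distinct_rows = sorted(conn.execute("""
    SELECT DISTINCT d.deptno, d.dname FROM dept d, emp e
    WHERE d.deptno = e.deptno""").fetchall())

# EXISTS never produces the repeats in the first place.
exists_rows = sorted(conn.execute("""
    SELECT d.deptno, d.dname FROM dept d
    WHERE EXISTS (SELECT 'X' FROM emp e
                  WHERE e.deptno = d.deptno)""").fetchall())

print(len(join_rows), distinct_rows, exists_rows)
```

The EXISTS form gives the de-duplicated answer directly, which is why no duplicate-removal step is needed.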
Tips and Tricks: Consolidate SQL
Select from Sequences and use SYSDATE in the
statement in which they are used:

-- BEFORE: 3 statements are used to perform 1 Insert
SELECT SYSDATE INTO :vardate FROM dual;
SELECT arch_seq.NEXTVAL INTO :varid FROM dual;
INSERT INTO archive
VALUES (:vardate, :varid, ...);

-- AFTER: only 1 statement is needed
INSERT INTO emp_archive
VALUES (SYSDATE, emp_seq.NEXTVAL, ...);

Tips and Tricks: Consolidate SQL
Consolidate unrelated statements using outer-joins to
the DUAL (dummy) table:

-- BEFORE: 2 round-trips
SELECT dname FROM dept WHERE deptno = 10;
SELECT lname FROM emp WHERE empno = 7369;

-- AFTER: only 1 round-trip
SELECT d.dname,
       e.lname
FROM dept d,
     emp e,
     dual x
WHERE d.deptno (+) = 10
AND e.empno (+) = 7369
AND NVL('X', x.dummy) = NVL('X', e.ROWID (+))
AND NVL('X', x.dummy) = NVL('X', d.ROWID (+));

Tips and Tricks: COUNT
Use COUNT(*) instead of COUNT(column):
SELECT COUNT(empno)
FROM emp;

-- ~ 50% faster
SELECT COUNT(*)
FROM emp;

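One thing to check before making this swap: COUNT(column) counts only non-NULL values, so the two forms agree only when the column cannot be NULL. A minimal sketch using Python's sqlite3 module as a stand-in for Oracle (the sample rows are invented; `comm` is the nullable commission column from the example tables):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (empno INTEGER PRIMARY KEY, lname TEXT, comm REAL);
    INSERT INTO emp VALUES (1, 'larson', NULL), (2, 'smith', 300),
                           (3, 'jones', NULL),  (4, 'baker', 500);
""")

# One query, three counts: all rows, the key column, a nullable column.
count_star, count_empno, count_comm = conn.execute(
    "SELECT COUNT(*), COUNT(empno), COUNT(comm) FROM emp").fetchone()

print(count_star, count_empno, count_comm)
```

For a key column such as `empno` the counts match and COUNT(*) is a drop-in replacement; for `comm` they do not.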
Tips and Tricks: Self-Join
Use a self-join (joining a table to itself) instead of two
queries on the same table:

-- BEFORE: 2 round-trips
SELECT mgr INTO :varmgr FROM emp WHERE deptno = 10;
LOOP...
SELECT mgr, lname FROM emp WHERE mgr = :varmgr;

-- AFTER: only 1
SELECT E.mgr,
       E.lname
FROM emp E,
     emp M
WHERE M.deptno = 10
AND E.empno = M.mgr;

Tips and Tricks: ROWNUM
Use the ROWNUM pseudo-column to return only the first
N rows of a result set. (For example, if you just want a
sampling of data):

-- Returns only the first 10 employees in the table, in no particular order
SELECT *
FROM emp
WHERE ROWNUM <= 10;

Tips and Tricks: ROWID
The ROWID pseudo-column uniquely identifies a row,
and is the fastest way to access a row:

-- Instead of selecting the key column(s), ROWID is used to
-- identify the row for later use
CURSOR retired_emp_cur IS
    SELECT ROWID
    FROM emp
    WHERE retired = 'Y';
...
FOR retired_emp_rec IN retired_emp_cur LOOP
    SELECT fname || ' ' || lname
    INTO :printable_name
    FROM emp
    WHERE ROWID = retired_emp_rec.ROWID;
...

Tips and Tricks: Sequences
Use a Sequence to generate unique values for a table:
-- MAX(empno) requires a sort and an index scan;
-- INSERT could fail with a Duplicate error if someone else gets there first
SELECT MAX(empno)
INTO :new_empno
FROM emp;
...
INSERT INTO emp
VALUES (:new_empno + 1, ...);

-- Using a Sequence ensures that you always have a unique number,
-- and does not require any table reads
INSERT INTO emp
VALUES (emp_seq.NEXTVAL, ...);
-- or
SELECT emp_seq.NEXTVAL
INTO :new_empno FROM dual;

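The duplicate-key failure can be reproduced deterministically. The sketch below uses Python's sqlite3 module as a stand-in for Oracle and simulates two sessions by reading MAX(empno) twice before either insert happens. SQLite has no sequences, so its automatically assigned INTEGER PRIMARY KEY is swapped in here to play the role of `emp_seq.NEXTVAL`; the helper name and sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (empno INTEGER PRIMARY KEY, lname TEXT);
    INSERT INTO emp VALUES (1, 'larson'), (2, 'smith');
""")


def next_empno_from_max():
    """The MAX(empno) + 1 approach: read the current maximum, add one."""
    return conn.execute("SELECT MAX(empno) FROM emp").fetchone()[0] + 1


# Two "sessions" both read the maximum before either one inserts.
session_a = next_empno_from_max()
session_b = next_empno_from_max()

conn.execute("INSERT INTO emp VALUES (?, 'jones')", (session_a,))
try:
    conn.execute("INSERT INTO emp VALUES (?, 'baker')", (session_b,))
    duplicate_error = False
except sqlite3.IntegrityError:
    duplicate_error = True  # second insert collides on the primary key

# Database-generated key: no prior read of the table, no collision.
cur = conn.execute("INSERT INTO emp (lname) VALUES ('baker')")
generated_empno = cur.lastrowid

print(session_a, session_b, duplicate_error, generated_empno)
```

Both sessions compute the same number, so the second insert fails, whereas the database-generated key succeeds without any lookup by the application.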
Tips and Tricks: Connect By
Use CONNECT BY to construct hierarchical queries:

SELECT LPAD(' ',4*(LEVEL-1)) || lname Name,
       Job
FROM emp
WHERE job != 'CLERK'
START WITH job = 'PRESIDENT'
CONNECT BY PRIOR empno = mgr;

Name              Job
----------------  ---------
King              PRESIDENT
    Jones         MANAGER
        Scott     ANALYST
        Ford      ANALYST
    Blake         MANAGER
        Allen     SALESMAN
        Ward      SALESMAN
        Martin    SALESMAN
        Turner    SALESMAN
    Clark         MANAGER

Tips and Tricks: Cartesian Products
Avoid Cartesian products by ensuring that the tables are
joined on all shared keys:

SELECT *
FROM dept,       -- 10 rows
     salgrade,   -- 20 rows
     emp;        -- 1,000 rows
-- Result: 10 * 1000 * 20 = 200,000 rows

SELECT *
FROM dept D,     -- 10 rows
     salgrade S, -- 20 rows
     emp E       -- 1,000 rows
WHERE E.deptno = D.deptno
AND E.grade = S.grade;
-- Result: 1,000 rows

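The row-count arithmetic can be checked on a small scale. The sketch below uses Python's sqlite3 module as a stand-in for Oracle, with invented tables of 2, 3, and 4 rows in place of the 10, 20, and 1,000 above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept     (deptno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE salgrade (grade INTEGER PRIMARY KEY, losal INTEGER, hisal INTEGER);
    CREATE TABLE emp      (empno INTEGER PRIMARY KEY, deptno INTEGER, grade INTEGER);
    INSERT INTO dept     VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH');
    INSERT INTO salgrade VALUES (1, 0, 999), (2, 1000, 1999), (3, 2000, 2999);
    INSERT INTO emp      VALUES (1, 10, 1), (2, 10, 3), (3, 20, 2), (4, 20, 2);
""")

# No join conditions: every combination of rows is returned.
cartesian_count = conn.execute(
    "SELECT COUNT(*) FROM dept D, salgrade S, emp E").fetchone()[0]

# Joined on all shared keys: one row per employee.
joined_count = conn.execute("""
    SELECT COUNT(*) FROM dept D, salgrade S, emp E
    WHERE E.deptno = D.deptno
    AND E.grade = S.grade""").fetchone()[0]

print(cartesian_count, joined_count)
```

With the join conditions in place the result shrinks from 2 * 3 * 4 = 24 rows to the 4 employee rows.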
Q&A

Structure of Indexes
• B*-tree indexes (default)
• Reverse key indexes
• Bitmap indexes
• Function-based indexes
• Invisible indexes (11g)

DISTINCT vs. GROUP BY
Question: How does the performance of DISTINCT and
GROUP BY compare when no aggregate function is used
in the SELECT?

Answer: Both give the same result with little difference
in performance, so they can be interchanged.
