Physical Database Design and Tuning: R&G - Chapter 20
Introduction
We have talked at length about database design:
Conceptual schema: information to capture, tables, columns, views, etc.
Physical schema: indexes, clustering, etc.
Physical design is tightly linked to query optimization.
We must begin by understanding the workload:
The most important queries and how often they arise.
The most important updates and how often they arise.
The desired performance for these queries and updates.
Employee table workload (S = select, I = insert, D = delete, U = update):
Frequency   % of table   Name   Salary   Address
monthly     100          S      S        S
daily       0.1          I      I        I
daily       0.1          D      D        D
monthly     10           S      U
Decisions to Make
What indexes should we create?
Which relations should have indexes? What field(s) should be the search key? Should we build several indexes?
For each index, what kind of index should it be?
Clustered? Dynamic or static?
Should we make changes to the conceptual schema?
For example, denormalize.
Horizontal partitioning, replication, views ...
Index Selection
One approach:
Consider the most important queries in turn.
Consider the best plan using the current indexes, and see if a better plan is possible with an additional index.
If so, create it.
Before creating an index, we must also consider the impact on updates in the workload!
Trade-off: indexes can make queries go faster and updates slower. They require disk space, too.
Or do they? :-) An index can also speed up an update by helping to locate the tuples to be modified.
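The approach above can be tried out on a small scale. The following is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the DBMS; the Emp columns, the data, and the index name emp_dno are hypothetical and chosen only for illustration. It asks the optimizer for its plan with the current indexes, creates a candidate index, and asks again.

```python
import sqlite3

def plan(conn, sql):
    """Return the optimizer's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # Each row is (id, parent, notused, detail); the detail column holds the text.
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
# Hypothetical Emp schema, loosely following the slides' Emp relation.
conn.execute(
    "CREATE TABLE Emp (eid INTEGER PRIMARY KEY, ename TEXT, dno INTEGER, sal REAL)"
)
conn.executemany(
    "INSERT INTO Emp (eid, ename, dno, sal) VALUES (?, ?, ?, ?)",
    [(i, f"emp{i}", i % 50, 1000.0 + i) for i in range(5000)],
)

query = "SELECT ename FROM Emp WHERE dno = 7"

plan_before = plan(conn, query)   # best plan using the current indexes
conn.execute("CREATE INDEX emp_dno ON Emp (dno)")
plan_after = plan(conn, query)    # best plan once the candidate index exists

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording differs between SQLite versions and between database systems, so the output is best read as "full scan" versus "index search" rather than compared literally. The sketch only covers the query side; the cost the new index adds to updates still has to be weighed separately.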
Example 1
[Query garbled in extraction; surviving fragments, in order of appearance: WHERE, D.dname=To (truncated)]
Example 2
[Query garbled in extraction; surviving fragment: E.dno]
Examples of Clustering
[Queries garbled in extraction; surviving fragments, in order of appearance: SELECT, FROM Emp E, E.hobby=Stamp (truncated), E.dno=D.dno]
Index-Only Plans
A number of queries can be answered without retrieving any tuples from one or more of the relations involved if a suitable index is available.

SELECT D.mgr
FROM Dept D, Emp E
WHERE D.dno=E.dno
Index: <E.dno>

SELECT D.mgr, E.eid
FROM Dept D, Emp E
WHERE D.dno=E.dno
Index: <E.dno,E.eid>

SELECT E.dno, COUNT(*)
FROM Emp E
GROUP BY E.dno
Index: <E.dno>

SELECT E.dno, MIN(E.sal)
FROM Emp E
GROUP BY E.dno
Index: <E.dno,E.sal> (B-tree trick!)

SELECT AVG(E.sal)
FROM Emp E
WHERE E.age=25 AND
E.sal BETWEEN 3000 AND 5000
Index: <E.age,E.sal> or <E.sal,E.age>
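The last query can be used to see an index-only plan in practice. This is a minimal sketch, assuming SQLite via Python's sqlite3 module; the Emp columns, the generated data, and the index name emp_composite are hypothetical. It builds the table twice, once with each field order for the composite index, and returns the optimizer's plan together with the query result.

```python
import sqlite3

QUERY = "SELECT AVG(sal) FROM Emp WHERE age = 25 AND sal BETWEEN 3000 AND 5000"

def plan_with_index(index_columns):
    """Build a small Emp table with one composite index on `index_columns`
    and return (plan text, query result) for QUERY."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and data, for illustration only.
    conn.execute(
        "CREATE TABLE Emp (eid INTEGER PRIMARY KEY, dno INTEGER, age INTEGER, sal INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Emp (eid, dno, age, sal) VALUES (?, ?, ?, ?)",
        [(i, i % 10, 20 + i % 30, 1000 + (i * 37) % 6000) for i in range(3000)],
    )
    conn.execute(f"CREATE INDEX emp_composite ON Emp ({index_columns})")
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    detail = " | ".join(row[3] for row in rows)   # fourth column is the plan text
    result = conn.execute(QUERY).fetchone()[0]
    conn.close()
    return detail, result

plan_age_sal, avg_age_sal = plan_with_index("age, sal")
plan_sal_age, avg_sal_age = plan_with_index("sal, age")

print("<age,sal>:", plan_age_sal)
print("<sal,age>:", plan_sal_age)
```

SQLite marks an index-only plan with the words COVERING INDEX; other systems use different terms. Both field orders contain every column the query needs, but the plan text shows which conditions the index can use for searching, and that depends on the order of the fields.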
Horizontal Decompositions
Usual definition of decomposition: a relation is replaced by a collection of relations that are projections.
This is the most important case.
We talked about this at length as part of conceptual database design.
Sometimes, we might want to replace a relation by a collection of relations that are selections.
Each new relation has the same schema as the original, but a subset of the rows.
Collectively, the new relations contain all rows of the original.
Typically, the new relations are disjoint.
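The properties listed above can be checked on a toy relation. This is a minimal sketch, assuming SQLite via Python's sqlite3 module; the Emp columns, the split point of 5000, and the names LowPaidEmp, HighPaidEmp, and AllEmp are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Emp relation with 100 rows and salaries 0, 1000, ..., 9000.
conn.execute("CREATE TABLE Emp (eid INTEGER PRIMARY KEY, ename TEXT, sal INTEGER)")
conn.executemany(
    "INSERT INTO Emp (eid, ename, sal) VALUES (?, ?, ?)",
    [(i, f"emp{i}", 1000 * (i % 10)) for i in range(100)],
)

# Horizontal decomposition: each new relation is a selection on Emp,
# so it has the same schema but only a subset of the rows.
# (Rows with a NULL salary would match neither predicate; there are none here.)
conn.execute("CREATE TABLE LowPaidEmp AS SELECT * FROM Emp WHERE sal < 5000")
conn.execute("CREATE TABLE HighPaidEmp AS SELECT * FROM Emp WHERE sal >= 5000")

# A view that puts the pieces back together for queries that need all rows.
conn.execute(
    "CREATE VIEW AllEmp AS "
    "SELECT * FROM LowPaidEmp UNION ALL SELECT * FROM HighPaidEmp"
)

original = conn.execute("SELECT * FROM Emp ORDER BY eid").fetchall()
rebuilt = conn.execute("SELECT * FROM AllEmp ORDER BY eid").fetchall()
low_count = conn.execute("SELECT COUNT(*) FROM LowPaidEmp").fetchone()[0]
overlap = conn.execute(
    "SELECT COUNT(*) FROM LowPaidEmp L JOIN HighPaidEmp H ON L.eid = H.eid"
).fetchone()[0]

print("all rows preserved:", rebuilt == original)
print("rows in LowPaidEmp:", low_count)
print("rows in both pieces:", overlap)
```

Because the two selection predicates are complementary, the pieces are disjoint and their union gives back the original rows, which is why a plain UNION ALL view is enough here.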
Points to Remember
Indexes must be chosen to speed up important queries (and perhaps some updates!).
Index maintenance adds overhead on updates to key fields.
Choose indexes that can help many queries, if possible.
Build indexes to support index-only strategies.
Clustering is an important decision; only one index on a given relation can be clustered!
The order of fields in a composite index key can be important.
Static indexes may have to be periodically rebuilt.
Statistics have to be periodically updated.
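As one concrete illustration of the last point, the sketch below assumes SQLite via Python's sqlite3 module, where statistics are gathered with the ANALYZE command and stored in the sqlite_stat1 table; other systems use different commands and catalogs. The Emp columns, the data, and the helper index_stats are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Emp relation with an index on dno.
conn.execute("CREATE TABLE Emp (eid INTEGER PRIMARY KEY, dno INTEGER)")
conn.execute("CREATE INDEX emp_dno ON Emp (dno)")
conn.executemany(
    "INSERT INTO Emp (eid, dno) VALUES (?, ?)",
    [(i, i % 10) for i in range(1000)],
)

def index_stats(conn):
    """Return {index name: stat string} from sqlite_stat1,
    or {} if statistics have never been gathered."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE name = 'sqlite_stat1'"
    ).fetchone()
    if not exists:
        return {}
    return dict(
        conn.execute("SELECT idx, stat FROM sqlite_stat1 WHERE idx IS NOT NULL")
    )

stats_before = index_stats(conn)   # nothing gathered yet
conn.execute("ANALYZE")            # gather statistics for the optimizer
stats_after = index_stats(conn)

print("before ANALYZE:", stats_before)
print("after ANALYZE: ", stats_after)
```

The statistics are a snapshot taken when ANALYZE runs; they are not maintained as rows are inserted or deleted afterwards, which is the reason for refreshing them periodically.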