Physical Database Design and Tuning: Module 5, Lecture 3

This document discusses physical database design and tuning. The key steps are understanding the workload, choosing appropriate indexes to optimize important queries and updates, and refining the conceptual schema based on the workload. Indexes should be selected to benefit queries while considering the impact on updates, and composite indexes and clustering are important design decisions. The conceptual schema may also be tuned by choosing a lower normal form, denormalizing relations, or further decomposing relations based on the workload.

Uploaded by

harsha
Physical Database Design and Tuning
Module 5, Lecture 3

Overview

After ER design, schema refinement, and the definition of views, we have the logical and external schemas for our database.
The next step is to choose indexes, make clustering decisions, and to refine the conceptual and external schemas (if necessary) to meet performance goals.
We must begin by understanding the workload:
  The most important queries and how often they arise.
  The most important updates and how often they arise.
  The desired performance for these queries and updates.

Understanding the Workload

For each query in the workload:
  Which relations does it access?
  Which attributes are retrieved?
  Which attributes are involved in selection/join conditions? How selective are these conditions likely to be?

For each update in the workload:
  Which attributes are involved in selection/join conditions? How selective are these conditions likely to be?
  The type of update (INSERT/DELETE/UPDATE), and the attributes that are affected.
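One way to record the answers to these questions is a small workload description. The sketch below is only illustrative: the class names, the field names, and the "executions per hour" unit are assumptions, not part of the lecture.

```python
from dataclasses import dataclass


@dataclass
class QueryProfile:
    """One workload query, described by the questions on this slide (hypothetical layout)."""
    name: str
    relations: list      # relations accessed
    retrieved: list      # attributes retrieved
    conditions: dict     # attribute -> estimated selectivity (fraction of tuples kept)
    frequency: float     # executions per hour (illustrative unit)


@dataclass
class UpdateProfile:
    """One workload update (hypothetical layout)."""
    name: str
    kind: str            # "INSERT", "DELETE", or "UPDATE"
    conditions: dict     # attribute -> estimated selectivity
    affected: list       # attributes changed by the update
    frequency: float


def index_candidates(queries):
    """Rank attributes used in selection/join conditions by total query frequency."""
    weight = {}
    for q in queries:
        for attr in q.conditions:
            weight[attr] = weight.get(attr, 0.0) + q.frequency
    return sorted(weight.items(), key=lambda kv: (-kv[1], kv[0]))


workload = [
    QueryProfile("toy_dept", ["Emp", "Dept"], ["E.ename", "D.mgr"],
                 {"D.dname": 0.01, "E.dno": 0.05}, 100),
    QueryProfile("by_age", ["Emp"], ["E.ename"],
                 {"E.age": 0.02, "E.dno": 0.05}, 20),
]
ranked = index_candidates(workload)
```

Attributes that appear in the conditions of frequent queries rise to the top of `ranked`, which gives a first list of search-key candidates to examine.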

Decisions to Make

What indexes should we create?
  Which relations should have indexes? What field(s) should be the search key? Should we build several indexes?

For each index, what kind of an index should it be?
  Clustered? Hash/tree? Dynamic/static? Dense/sparse?

Should we make changes to the conceptual schema?
  Consider alternative normalized schemas? (Remember, there are many choices in decomposing into BCNF, etc.)
  Should we "undo" some decomposition steps and settle for a lower normal form? (Denormalization.)
  Horizontal partitioning, replication, views ...

Choice of Indexes

One approach: consider the most important queries in turn. Consider the best plan using the current indexes, and see if a better plan is possible with an additional index. If so, create it.
Before creating an index, must also consider the impact on updates in the workload!
  Trade-off: indexes can make queries go faster, updates slower. Require disk space, too.
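The trade-off can be made concrete with a rough net-benefit estimate. This is a minimal sketch under an assumed cost model (frequency times I/Os saved or added per execution); the function names and the numbers are hypothetical.

```python
def net_benefit(query_savings, update_penalties):
    """Each argument is a list of (frequency, io_delta_per_execution) pairs."""
    gain = sum(freq * saved for freq, saved in query_savings)
    loss = sum(freq * extra for freq, extra in update_penalties)
    return gain - loss


def should_create(query_savings, update_penalties):
    """Create the index only if queries gain more than updates lose."""
    return net_benefit(query_savings, update_penalties) > 0


# Candidate index: saves 40 I/Os on a query run 100 times/hour,
# but adds 2 I/Os to an insert run 500 times/hour.
worth_it = should_create([(100, 40)], [(500, 2)])

# Same update cost, but the query is rare and gains little.
not_worth_it = should_create([(10, 5)], [(500, 2)])
```

A real decision would also account for the disk space the index requires, which this sketch ignores.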

Issues to Consider in Index Selection

Attributes mentioned in a WHERE clause are candidates for index search keys.
  Exact match condition suggests hash index.
  Range query suggests tree index.
  Clustering is especially useful for range queries, although it can help on equality queries as well in the presence of duplicates.

Try to choose indexes that benefit as many queries as possible. Since only one index can be clustered per relation, choose it based on important queries that would benefit the most from clustering.
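The difference between the two index kinds can be sketched with in-memory stand-ins: a dictionary plays the role of a hash index and a sorted list plays the role of a tree index. The data and function names below are made up for illustration.

```python
import bisect
from collections import defaultdict

emps = [(1, 25), (2, 31), (3, 25), (4, 42), (5, 38)]   # (eid, age)

# Hash-style index: age -> list of eids. Good for exact-match lookups.
hash_index = defaultdict(list)
for eid, age in emps:
    hash_index[age].append(eid)

# Tree-style index: entries kept sorted by age. Good for range scans.
tree_index = sorted((age, eid) for eid, age in emps)


def equality_lookup(age):
    """One probe finds every eid with the given age."""
    return hash_index.get(age, [])


def range_lookup(lo, hi):
    """Return eids with lo <= age <= hi by binary search on the sorted entries."""
    start = bisect.bisect_left(tree_index, (lo, float("-inf")))
    end = bisect.bisect_right(tree_index, (hi, float("inf")))
    return [eid for _, eid in tree_index[start:end]]
```

The dictionary cannot answer `range_lookup` without examining every key, which is why a range condition points toward a tree index.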

Issues in Index Selection (Contd.)

Multi-attribute search keys should be considered when a WHERE clause contains several conditions.
  If range selections are involved, order of attributes should be carefully chosen to match the range ordering.
  Such indexes can sometimes enable index-only strategies for important queries.
  For index-only strategies, clustering is not important!

When considering a join condition:
  Hash index on inner is very good for Index Nested Loops.
    Should be clustered if join column is not key for inner, and inner tuples need to be retrieved.
  Clustered B+ tree on join column(s) good for Sort-Merge.
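Index Nested Loops with a hash index on the inner relation can be sketched as follows. The tuples and the dictionary standing in for the hash index are illustrative assumptions.

```python
from collections import defaultdict

dept = [(10, "Toy", "Ann"), (20, "Shoe", "Bob")]          # (dno, dname, mgr)
emp = [(1, "Carl", 10), (2, "Dee", 20), (3, "Eve", 10)]    # (eid, ename, dno)

# Hash index on the inner relation's join column (Emp.dno).
emp_by_dno = defaultdict(list)
for row in emp:
    emp_by_dno[row[2]].append(row)


def index_nested_loops(outer_rows):
    """For each outer Dept tuple, probe the index instead of scanning all of Emp."""
    result = []
    for dno, _dname, mgr in outer_rows:
        for _eid, ename, _ in emp_by_dno.get(dno, []):
            result.append((ename, mgr))
    return result


toy_depts = [d for d in dept if d[1] == "Toy"]
joined = index_nested_loops(toy_depts)
```

Here `dno` is not a key of Emp, so one probe returns several inner tuples; on disk, clustering the index would keep those tuples physically close together.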

Example 1

SELECT E.ename, D.mgr
FROM Emp E, Dept D
WHERE D.dname='Toy' AND E.dno=D.dno

Hash index on D.dname supports 'Toy' selection.
  Given this, index on D.dno is not needed.

Hash index on E.dno allows us to get matching (inner) Emp tuples for each selected (outer) Dept tuple.
What if WHERE included: "... AND E.age=25"?
  Could retrieve Emp tuples using index on E.age, then join with Dept tuples satisfying dname selection. Comparable to strategy that used E.dno index.
  So, if E.age index is already created, this query provides much less motivation for adding an E.dno index.
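The query and the two suggested indexes can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the sample rows and index names are invented, and SQLite only builds B-tree indexes, so the hash indexes from the slide are approximated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Dept (dno INTEGER PRIMARY KEY, dname TEXT, mgr TEXT);
    CREATE TABLE Emp  (eid INTEGER PRIMARY KEY, ename TEXT, dno INTEGER, age INTEGER);
    INSERT INTO Dept VALUES (10, 'Toy', 'Ann'), (20, 'Shoe', 'Bob');
    INSERT INTO Emp  VALUES (1, 'Carl', 10, 25), (2, 'Dee', 20, 25), (3, 'Eve', 10, 40);

    CREATE INDEX dept_dname ON Dept (dname);  -- supports the 'Toy' selection
    CREATE INDEX emp_dno    ON Emp (dno);     -- finds matching Emp tuples per Dept tuple
""")

example1 = """
    SELECT E.ename, D.mgr
    FROM Emp E, Dept D
    WHERE D.dname = 'Toy' AND E.dno = D.dno
"""
rows = sorted(conn.execute(example1).fetchall())

# Variant from the slide: an extra equality condition on E.age.
with_age = sorted(conn.execute(example1 + " AND E.age = 25").fetchall())
```

Running `EXPLAIN QUERY PLAN` on either statement shows which index the system actually picks, which is the information the slide asks the designer to reason about.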

Example 2

SELECT E.ename, D.mgr
FROM Emp E, Dept D
WHERE E.sal BETWEEN 10000 AND 20000
AND E.hobby='Stamps' AND E.dno=D.dno

Clearly, Emp should be the outer relation.
  Suggests that we build a hash index on D.dno.

What index should we build on Emp?
  B+ tree on E.sal could be used, OR an index on E.hobby could be used. Only one of these is needed, and which is better depends upon the selectivity of the conditions.
    As a rule of thumb, equality selections more selective than range selections.

As both examples indicate, our choice of indexes is guided by the plan(s) that we expect an optimizer to consider for a query. Have to understand optimizers!
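Which of the two Emp indexes is better can be estimated from selectivities. The sketch below assumes uniformly distributed values, and the statistics (salary range 0 to 100000, 200 distinct hobbies) are invented for illustration.

```python
def range_selectivity(lo, hi, col_min, col_max):
    """Fraction of tuples expected in [lo, hi], assuming uniformly distributed values."""
    overlap = max(0.0, min(hi, col_max) - max(lo, col_min))
    return overlap / (col_max - col_min)


def equality_selectivity(distinct_values):
    """Fraction of tuples expected to match one value, assuming equal-sized groups."""
    return 1.0 / distinct_values


sal_sel = range_selectivity(10000, 20000, 0, 100000)   # E.sal BETWEEN 10000 AND 20000
hobby_sel = equality_selectivity(200)                  # E.hobby = 'Stamps'

# The condition that keeps fewer tuples is the better candidate for the index.
best = "E.hobby" if hobby_sel < sal_sel else "E.sal"
```

With these assumed statistics the equality condition is the more selective one, in line with the rule of thumb on the slide; different statistics could reverse the choice.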

Multi-Attribute Index Keys

To retrieve Emp records with age=30 AND sal=4000, an index on <age,sal> would be better than an index on age or an index on sal.
  Such indexes also called composite or concatenated indexes.
  Choice of index key orthogonal to clustering etc.

If condition is: 20<age<30 AND 3000<sal<5000:
  Clustered tree index on <age,sal> or <sal,age> is best.

If condition is: age=30 AND 3000<sal<5000:
  Clustered <age,sal> index much better than <sal,age> index!

Composite indexes are larger, updated more often.
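Why the attribute order matters for age=30 AND 3000<sal<5000 can be seen by counting how many index entries each ordering has to examine. Sorted lists stand in for the two tree indexes here, and the records are made up.

```python
import bisect

# (age, sal) pairs for a handful of hypothetical Emp records.
records = [(30, 2500), (30, 3500), (30, 4500), (30, 6000),
           (25, 4000), (41, 4200), (52, 3100), (19, 4900)]

age_sal = sorted(records)                              # entries ordered by <age, sal>
sal_age = sorted((sal, age) for age, sal in records)   # entries ordered by <sal, age>


def scan_age_sal(age, lo, hi):
    """age = const AND lo < sal < hi: the matches form one contiguous run."""
    start = bisect.bisect_right(age_sal, (age, lo))
    end = bisect.bisect_left(age_sal, (age, hi))
    run = age_sal[start:end]
    return len(run), run


def scan_sal_age(age, lo, hi):
    """Same predicate on <sal, age>: scan all entries with lo < sal < hi, then filter on age."""
    start = bisect.bisect_right(sal_age, (lo, float("inf")))
    end = bisect.bisect_left(sal_age, (hi, float("-inf")))
    scanned = sal_age[start:end]
    return len(scanned), [(a, s) for s, a in scanned if a == age]


examined_age_sal, matches_a = scan_age_sal(30, 3000, 5000)
examined_sal_age, matches_b = scan_sal_age(30, 3000, 5000)
```

Both orderings return the same two records, but on this data the <age,sal> ordering examines 2 entries while the <sal,age> ordering examines 6.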

Index-Only Plans
A number of queries can be answered without retrieving any tuples from one or more of the relations involved if a suitable index is available.

Index <E.dno>:
SELECT D.mgr
FROM Dept D, Emp E
WHERE D.dno=E.dno

Index <E.dno,E.eid> (tree index!):
SELECT D.mgr, E.eid
FROM Dept D, Emp E
WHERE D.dno=E.dno

Index <E.dno>:
SELECT E.dno, COUNT(*)
FROM Emp E
GROUP BY E.dno

Index <E.dno,E.sal> (tree index!):
SELECT E.dno, MIN(E.sal)
FROM Emp E
GROUP BY E.dno

Index <E.age,E.sal> or <E.sal,E.age> (tree!):
SELECT AVG(E.sal)
FROM Emp E
WHERE E.age=25 AND E.sal BETWEEN 3000 AND 5000
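The MIN(E.sal) case can be tried with Python's built-in sqlite3 module, used here only as a convenient stand-in for a real system. The sample rows and the index name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emp (eid INTEGER PRIMARY KEY, ename TEXT, dno INTEGER, sal INTEGER);
    INSERT INTO Emp VALUES (1, 'Carl', 10, 3000), (2, 'Dee', 20, 4500), (3, 'Eve', 10, 2800);

    -- Composite tree index holding every attribute the query mentions.
    CREATE INDEX emp_dno_sal ON Emp (dno, sal);
""")

query = "SELECT E.dno, MIN(E.sal) FROM Emp E GROUP BY E.dno"
rows = sorted(conn.execute(query).fetchall())

# The last column of each EXPLAIN QUERY PLAN row is a human-readable description.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
```

When the optimizer answers the query from the index alone, SQLite typically describes the step as using a covering index; the exact wording of `plan` varies between SQLite versions, so inspect it rather than rely on a fixed string.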

Summary

Database design consists of several tasks: requirements analysis, conceptual design, schema refinement, physical design and tuning.
  In general, have to go back and forth between these tasks to refine a database design, and decisions in one task can influence the choices in another task.

Understanding the nature of the workload for the application, and the performance goals, is essential to developing a good design.
  What are the important queries and updates? What attributes/relations are involved?

Summary (Contd.)

Indexes must be chosen to speed up important queries (and perhaps some updates!).
  Index maintenance overhead on updates to key fields.
  Choose indexes that can help many queries, if possible.
  Build indexes to support index-only strategies.
  Clustering is an important decision; only one index on a given relation can be clustered!
  Order of fields in composite index key can be important.

Static indexes may have to be periodically rebuilt.
Statistics have to be periodically updated.

Tuning the Conceptual Schema

The choice of conceptual schema should be guided by the workload, in addition to redundancy issues:
  We may settle for a 3NF schema rather than BCNF.
  Workload may influence the choice we make in decomposing a relation into 3NF or BCNF.
  We may further decompose a BCNF schema!
  We might denormalize (i.e., undo a decomposition step), or we might add fields to a relation.
  We might consider horizontal decompositions.

If such changes are made after a database is in use, called schema evolution; might want to mask some of these changes from applications by defining views.
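Masking a schema change with a view can be sketched with Python's built-in sqlite3 module. The split of Emp into EmpCore and EmpPay is a hypothetical decomposition chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- After the change: the original Emp(eid, ename, dno, sal) is split in two.
    CREATE TABLE EmpCore (eid INTEGER PRIMARY KEY, ename TEXT, dno INTEGER);
    CREATE TABLE EmpPay  (eid INTEGER PRIMARY KEY, sal INTEGER);
    INSERT INTO EmpCore VALUES (1, 'Carl', 10), (2, 'Dee', 20);
    INSERT INTO EmpPay  VALUES (1, 3000), (2, 4500);

    -- A view with the old name and columns hides the split from applications.
    CREATE VIEW Emp AS
        SELECT c.eid, c.ename, c.dno, p.sal
        FROM EmpCore c JOIN EmpPay p ON p.eid = c.eid;
""")

# A query written against the old schema still runs unchanged.
old_style = conn.execute("SELECT ename, sal FROM Emp WHERE dno = 10").fetchall()
```

Queries keep working through the view, but each one now pays for the join, so the decomposition should be justified by the workload.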

Summary of Database Tuning

The conceptual schema should be refined by considering performance criteria and workload:
  May choose 3NF or lower normal form over BCNF.
  May choose among alternative decompositions into BCNF (or 3NF) based upon the workload.
  May denormalize, or undo some decompositions.
  May decompose a BCNF relation further!
  May choose a horizontal decomposition of a relation.
  Importance of dependency preservation based upon the dependency to be preserved, and the cost of the IC check.
    Can add a relation to ensure dependency preservation (for 3NF, not BCNF!); or else, can check dependency using a join.

Summary (Contd.)

Over time, indexes have to be fine-tuned (dropped, created, rebuilt, ...) for performance.
  Should determine the plan used by the system, and adjust the choice of indexes appropriately.

System may still not find a good plan:
  Only left-deep plans considered!
  Null values, arithmetic conditions, string expressions, the use of ORs, etc. can confuse an optimizer.

So, may have to rewrite the query/view:
  Avoid nested queries, temporary relations, complex conditions, and operations like DISTINCT and GROUP BY.
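As an illustration of rewriting a query, the sketch below expresses the same request once as a nested query and once as a join, using Python's built-in sqlite3 module. The tables and rows are assumptions; the two forms are equivalent here because dno is a key of Dept, so the join cannot produce duplicate Emp rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Dept (dno INTEGER PRIMARY KEY, dname TEXT, mgr TEXT);
    CREATE TABLE Emp  (eid INTEGER PRIMARY KEY, ename TEXT, dno INTEGER);
    INSERT INTO Dept VALUES (10, 'Toy', 'Ann'), (20, 'Shoe', 'Bob');
    INSERT INTO Emp  VALUES (1, 'Carl', 10), (2, 'Dee', 20), (3, 'Eve', 10);
""")

nested = """
    SELECT E.ename FROM Emp E
    WHERE E.dno IN (SELECT D.dno FROM Dept D WHERE D.dname = 'Toy')
"""

flattened = """
    SELECT E.ename FROM Emp E, Dept D
    WHERE E.dno = D.dno AND D.dname = 'Toy'
"""

nested_rows = sorted(conn.execute(nested).fetchall())
flattened_rows = sorted(conn.execute(flattened).fetchall())
```

Whether the flattened form actually runs faster depends on the optimizer in use, so the plan for each version should be checked on the target system.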
