Lesson 06
SYSTEMS
ICT3273
Nuwan Laksiri
Department of ICT
Faculty of Technology
University of Ruhuna
WHAT WE DISCUSS TODAY
• RECAP QUERY PROCESSING PART I
• OVERVIEW OF QUERY PROCESSING PART II
• SORTING
• JOIN OPERATION
• OTHER OPERATIONS
• EVALUATION OF EXPRESSIONS
NEXT WEEK
• QUERY OPTIMIZATION PART I
• INTRODUCTION
• TRANSFORMATION OF RELATIONAL EXPRESSIONS
• EQUIVALENT RULES
• COST BASED OPTIMIZATION
• HEURISTIC OPTIMIZATION
RECAP
• OVERVIEW
• MEASURES OF QUERY COST
• SELECTION OPERATION
• BASIC ALGORITHMS
• SELECTIONS USING INDICES
• SELECTIONS INVOLVING COMPARISONS
• IMPLEMENTATION OF COMPLEX SELECTIONS
Sorting
• What is Sorting in the context of databases?
• SQL queries can specify that the output be
sorted.
• Several of the relational operations, such as
joins, can be implemented efficiently if the input
relations are sorted.
Sorting
• We may build an index on the relation, and
then use the index to read the relation in sorted
order. This may lead to one disk block access for
each tuple.
• For relations that fit in memory, techniques like
quicksort can be used.
• For relations that don’t fit in memory, external
sort-merge is a good choice.
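The idea behind external sort-merge can be sketched as below. This is a minimal illustration, not a definitive implementation: the function name external_sort and the parameter run_size (how many values fit in memory at once) are hypothetical, runs are kept in temporary text files, and it assumes the number of runs is small enough to merge in a single pass.

```python
import heapq
import os
import tempfile


def external_sort(values, run_size):
    """Sort integers with bounded memory: build sorted runs, then merge them.

    `run_size` is a hypothetical stand-in for the number of values that
    fit in memory at one time.
    """
    run_paths = []

    def write_run(run):
        # Phase 1: sort one memory-sized run and write it to a temporary file.
        run.sort()
        fd, path = tempfile.mkstemp(suffix=".run", text=True)
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{v}\n" for v in run)
        run_paths.append(path)

    run = []
    for v in values:
        run.append(v)
        if len(run) == run_size:
            write_run(run)
            run = []
    if run:
        write_run(run)

    # Phase 2: merge all runs, reading each run file sequentially.
    # Assumption: the runs can all be merged in one pass.
    files = [open(p) for p in run_paths]
    try:
        streams = [(int(line) for line in f) for f in files]
        # The result is collected into a list only to keep the sketch short;
        # a real system would write the merged output back to disk.
        return list(heapq.merge(*streams))
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)


sorted_values = external_sort([5, 3, 9, 1, 7, 2, 8], run_size=3)
```

Here the seven input values produce three runs of at most three values each, which are then merged into one sorted sequence.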
External Sort Merge
• If smaller relation (student) fits entirely in memory, the cost estimate will be 500
block transfers.
Join Operation (Block Nested-Loop Join)
• A variant of nested-loop join in which every block of the inner relation is
paired with every block of the outer relation.
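The block-by-block pairing can be sketched as follows. This is only an illustration under stated assumptions: each relation is modelled as a list of blocks, each block as a list of tuples, and the function name, the sample data, and the takes relation are hypothetical.

```python
def block_nested_loop_join(outer_blocks, inner_blocks, predicate):
    """Join two relations stored as lists of blocks (each block is a list of tuples)."""
    result = []
    for outer_block in outer_blocks:        # read each outer block once
        for inner_block in inner_blocks:    # scan the inner relation once per outer block
            for r in outer_block:           # pair the tuples of the two blocks in memory
                for s in inner_block:
                    if predicate(r, s):
                        result.append(r + s)
    return result


# Hypothetical sample data: student(ID, name) and takes(ID, course_id).
student = [[(1, "Ann"), (2, "Bob")], [(3, "Cy")]]
takes = [[(1, "CS101"), (3, "CS102")], [(1, "CS103")]]

joined = block_nested_loop_join(student, takes, lambda r, s: r[0] == s[0])
```

Because the inner relation is scanned once per outer block rather than once per outer tuple, the number of inner-relation scans drops from the number of outer tuples to the number of outer blocks.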
• Ex: compute and store σ building="Watson" (department),
then compute and store its join with instructor,
and finally compute the projection on name.
Materialization
• Materialized evaluation is always applicable
• Cost of writing results to disk and reading them back can be quite
high
• Our cost formulas for operations ignore cost of writing results to
disk, so
• Overall cost = sum of costs of individual operations +
cost of writing intermediate results to disk
• Double buffering: use two output buffers for each operation, when
one is full write it to disk while the other is getting filled
• Allows overlap of disk writes with computation and reduces
execution time
Pipelining
• Pipelined evaluation : evaluate several operations
simultaneously, passing the results of one operation on to the
next.
• Ex: In the previous expression tree, don’t store the result of
σ building="Watson" (department)
• Instead, pass tuples directly to the join. Similarly, don’t
store the result of the join; pass tuples directly to the projection.
• Much cheaper than materialization: no need to store a
temporary relation to disk.
• Pipelining may not always be possible – e.g., sort, hash-join.
• For pipelining to be effective, use evaluation algorithms that
generate output tuples even as tuples are received for inputs to
the operation.
• Pipelines can be executed in two ways: demand driven and
producer driven
Pipelining
• In demand driven or lazy evaluation
• System repeatedly requests next tuple from top level operation
• Each operation requests next tuple from children operations as
required, in order to output its next tuple
• In between calls, operation has to maintain “state” so it knows what to
return next
• Blocking operations
• Operations are pipelined
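Demand-driven evaluation can be sketched with Python generators, where each operator produces its next tuple only when its parent asks for one, and the generator itself keeps the "state" between calls. This is a minimal sketch under assumptions: the operator function names, the sample rows, and the join attribute dept_name are hypothetical, and the join is a simple nested loop over an in-memory list.

```python
def scan(relation):
    # Leaf operator: hands out the stored tuples one at a time.
    for row in relation:
        yield row


def select(child, predicate):
    # Pulls tuples from its child only when asked for its own next tuple.
    for row in child:
        if predicate(row):
            yield row


def join(left, right_relation, key):
    # Nested-loop join: for each tuple pulled from the left input,
    # scan the right relation (an in-memory list in this sketch).
    for l in left:
        for r in right_relation:
            if l[key] == r[key]:
                yield {**l, **r}


def project(child, columns):
    for row in child:
        yield {c: row[c] for c in columns}


# Hypothetical sample data.
department = [
    {"dept_name": "Physics", "building": "Watson"},
    {"dept_name": "Music", "building": "Packard"},
]
instructor = [
    {"name": "Einstein", "dept_name": "Physics"},
    {"name": "Mozart", "dept_name": "Music"},
    {"name": "Gold", "dept_name": "Physics"},
]

# Projection on name over the join of the selection with instructor.
plan = project(
    join(
        select(scan(department), lambda d: d["building"] == "Watson"),
        instructor,
        "dept_name",
    ),
    ["name"],
)

first = next(plan)                          # the system requests one tuple
rest = [row["name"] for row in plan]        # then pulls the remaining tuples
```

No intermediate relation is stored: the selection result and the join result exist only as tuples flowing upward on request.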
HOME WORK
• Find more details about the concepts discussed in
class by consulting the reference books
SUMMARY
• RECAP B+ TREE
• OVERVIEW
• MEASURES OF QUERY COST
• SELECTION OPERATION
• BASIC ALGORITHMS
• SELECTIONS USING INDICES
• SELECTIONS INVOLVING COMPARISONS
• IMPLEMENTATION OF COMPLEX SELECTIONS
REFERENCES