Layers of a DBMS
[Figure: DBMS layers, top to bottom]
Query
Query processor: query optimization → query execution plan → execution engine
Files and access methods
Buffer management
[Figure: the BUFFER POOL in MAIN MEMORY, made up of frames; each frame either holds a disk page or is a free frame.]
• Idea: Divide and conquer: sort subfiles and merge
• Total cost: $2N(\lceil \log_2 N \rceil + 1)$ page I/Os
[Figure: two-way external merge sort with three buffer pages (INPUT 1, INPUT 2, OUTPUT); the example ends with 8-page runs.]
General External Merge Sort
• More than 3 buffer pages. How can we utilize them?
• To sort a file with N pages using B buffer pages:
– Pass 0: use B buffer pages. Produce $\lceil N/B \rceil$ sorted runs
of B pages each.
– Pass 1, 2, …, etc.: merge B-1 runs at a time.
[Figure: B main-memory buffers between disk and disk: B-1 input buffers (INPUT 1, INPUT 2, ..., INPUT B-1) and one OUTPUT buffer.]
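The two phases above can be simulated in memory. The following is a minimal sketch, not a real disk-based implementation: a "file" is assumed to be a list of pages, each page a list of keys, and the names `external_merge_sort`, `paginate`, and `file_pages` are hypothetical.

```python
import heapq
import random
from itertools import chain

def external_merge_sort(pages, B):
    """Sort a 'file' (list of pages, each a list of keys) using B buffer pages.

    Returns (sorted_pages, number_of_passes).
    """
    if B < 3:
        raise ValueError("need at least 3 buffer pages")
    page_size = max((len(p) for p in pages), default=1)

    def paginate(records):
        # Cut a sorted record list back into pages of page_size records.
        return [records[i:i + page_size] for i in range(0, len(records), page_size)]

    # Pass 0: read B pages at a time, sort them in memory, write one run.
    runs = []
    for i in range(0, len(pages), B):
        records = sorted(chain.from_iterable(pages[i:i + B]))
        runs.append(paginate(records))
    passes = 1

    # Later passes: merge B-1 runs at a time (the remaining buffer is OUTPUT).
    while len(runs) > 1:
        merged = []
        for i in range(0, len(runs), B - 1):
            streams = [chain.from_iterable(run) for run in runs[i:i + B - 1]]
            merged.append(paginate(list(heapq.merge(*streams))))
        runs = merged
        passes += 1
    return (runs[0] if runs else []), passes

# Assumed test data: 108 pages of 2 random keys each, sorted with 5 buffers.
rng = random.Random(0)
file_pages = [[rng.randrange(10_000) for _ in range(2)] for _ in range(108)]
sorted_pages, passes = external_merge_sort(file_pages, B=5)
print(len(sorted_pages), "pages sorted in", passes, "passes")
```

`heapq.merge` stands in for the B-1 input buffers: it consumes each run lazily, one record at a time, which mirrors reading one page per run while filling the output buffer.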
Cost of External Merge Sort
• Number of passes: $1 + \lceil \log_{B-1} \lceil N/B \rceil \rceil$
• Cost = 2N * (# of passes)
• E.g., with 5 buffer pages, to sort 108 page file:
– Pass 0: $\lceil 108/5 \rceil$ = 22 sorted runs of 5 pages each
(last run is only 3 pages)
– Pass 1: $\lceil 22/4 \rceil$ = 6 sorted runs of 20 pages each
(last run is only 8 pages)
– Pass 2: 2 sorted runs, 80 pages and 28 pages
– Pass 3: Sorted file of 108 pages
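The 108-page example can be checked by substituting N = 108 and B = 5 into the pass-count and cost expressions. The following worked evaluation is a sketch of that substitution; it uses $4^2 = 16 < 22 \le 64 = 4^3$ to evaluate the logarithm.

```latex
\begin{align*}
\text{passes} &= 1 + \lceil \log_{B-1} \lceil N/B \rceil \rceil
               = 1 + \lceil \log_{4} \lceil 108/5 \rceil \rceil
               = 1 + \lceil \log_{4} 22 \rceil = 1 + 3 = 4 \\
\text{cost}   &= 2N \cdot \text{passes} = 2 \cdot 108 \cdot 4 = 864 \text{ page I/Os}
\end{align*}
```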
Number of Passes of External Sort
N B=3 B=5 B=9 B=17 B=129 B=257
100 7 4 3 2 1 1
1,000 10 5 4 3 2 2
10,000 13 7 5 4 2 2
100,000 17 9 6 5 3 3
1,000,000 20 10 7 5 3 3
10,000,000 23 12 8 6 4 3
100,000,000 26 14 9 7 4 4
1,000,000,000 30 15 10 8 5 4
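The table can be regenerated by counting how often the number of runs has to be divided by B-1. This is a minimal sketch; the function name `num_passes` is hypothetical, and integer arithmetic is used instead of a floating-point logarithm to avoid rounding problems near exact powers.

```python
def num_passes(N, B):
    """Passes needed to sort N pages with B buffer pages (B >= 3)."""
    runs = (N + B - 1) // B              # Pass 0 leaves ceil(N / B) runs
    passes = 1
    while runs > 1:
        runs = (runs + B - 2) // (B - 1)  # each merge pass: ceil(runs / (B-1))
        passes += 1
    return passes

buffers = [3, 5, 9, 17, 129, 257]
print("N".rjust(13), *[f"B={B}".rjust(6) for B in buffers])
for exponent in range(2, 10):
    N = 10 ** exponent
    print(f"{N:>13,}", *[f"{num_passes(N, B):>6}" for B in buffers])
```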
Cost Model for Our Analysis
As a good approximation, we ignore CPU costs:
– B: The number of data pages
– R: Number of records per page
– D: (Average) time to read or write disk page
– Counting page I/Os ignores the gains from pre-fetching
blocks of pages; thus, even the I/O cost is only
approximate.
– Average-case analysis; based on several simplistic
assumptions.
Assumptions in Our Analysis
• Single record insert and delete.
• Heap Files:
– Equality selection on key; exactly one match.
– Insert always at end of file.
• Sorted Files:
– Files compacted after deletions.
– Selections on sort field(s).
• Hashed Files:
– No overflow buckets, 80% page occupancy.
Cost of Operations
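As a rough illustration of how the cost model is applied, the sketch below evaluates a few operations under the assumptions listed earlier. The formulas are the usual estimates for these file organizations and are stated here as assumptions: a heap-file equality search reads half the file on average, a sorted file is binary-searched, and a hashed file at 80% occupancy spans about 1.25 B pages. The function names and the sample values B = 1000 pages and D = 15 ms are hypothetical.

```python
import math

def heap_costs(B, D):
    return {
        "scan": B * D,
        "equality": 0.5 * B * D,   # key search, exactly one match: half the file on average
        "insert": 2 * D,           # read the last page, write it back
    }

def sorted_costs(B, D):
    return {
        "scan": B * D,
        "equality": D * math.log2(B),   # binary search over B pages
    }

def hashed_costs(B, D):
    return {
        "scan": 1.25 * B * D,      # 80% occupancy: the file spans about 1.25 * B pages
        "equality": D,             # no overflow buckets: one page read
    }

B, D = 1000, 15   # assumed: 1000 data pages, 15 ms per page I/O
for name, costs in [("heap", heap_costs(B, D)),
                    ("sorted", sorted_costs(B, D)),
                    ("hashed", hashed_costs(B, D))]:
    print(name, {op: round(ms, 1) for op, ms in costs.items()})
```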
[Figure: CLUSTERED vs. UNCLUSTERED index; in each case the data entries (index file) refer to records in the data file.]
Index Classification (Contd.)
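• Dense vs. Sparse: If there is at least one data entry per search key value, the index is dense; otherwise it is sparse. Sparse indexes are smaller.
[Figure: sample data file (first record shown: Ashby, 25, 3000) with index entries such as 30 and 44.]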
Index Classification (Contd.)
• Composite Search Keys: Search on a combination of fields.
– Equality query: Every field value is equal to a constant value. E.g. wrt <sal,age> index:
• age=20 and sal=75
– Range query: Some field value is not a constant. E.g.:
• age=20; or age=20 and …
Examples of composite key indexes using lexicographic order.
Data records sorted by name:
name  age  sal
bob   12   10
cal   11   80
joe   12   20
sue   13   75
Index data entries:
<age, sal>: 11,80  12,10  12,20  13,75
<age>:      11  12  12  13
<sal, age>: 10,12  20,12  75,13  80,11
<sal>:      10  20  75  80
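The lexicographic ordering of composite keys can be mimicked with sorted tuples. This is a minimal sketch using the four sample records from the figure; the helper names `equality` and `prefix_range` are hypothetical, and `prefix_range` assumes integer-valued fields.

```python
from bisect import bisect_left, bisect_right

# (name, age, sal) records from the example figure
records = [("bob", 12, 10), ("cal", 11, 80), ("joe", 12, 20), ("sue", 13, 75)]

# Data entries of composite indexes, kept in lexicographic order.
age_sal = sorted((age, sal) for _, age, sal in records)
sal_age = sorted((sal, age) for _, age, sal in records)

def equality(index, key):
    """Equality query: every field of the composite key is a constant."""
    return index[bisect_left(index, key):bisect_right(index, key)]

def prefix_range(index, first):
    """Range query: first field fixed, second field unconstrained.

    Assumes integer fields, so (first + 1,) bounds the range from above.
    """
    return index[bisect_left(index, (first,)):bisect_left(index, (first + 1,))]

print(age_sal)                     # [(11, 80), (12, 10), (12, 20), (13, 75)]
print(sal_age)                     # [(10, 12), (20, 12), (75, 13), (80, 11)]
print(equality(age_sal, (12, 20)))  # [(12, 20)]
print(prefix_range(age_sal, 12))    # [(12, 10), (12, 20)]
```

A condition on age alone matches one contiguous slice of the <age, sal> entries, while in the <sal, age> ordering the same entries are scattered, which is why the field order of a composite key matters.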
[Figure: index page layout: P0 K1 P1 K2 P2 ... Km Pm]
[Figure: example tree. Root: 40; index nodes: 20 33 and 51 63; leaf entries: 10* 15* 20* 27* 33* 37* 40* 46* 51* 55* 63* 97*]
B+ Tree: The Most Widely Used Index
• Insert/delete at $\log_F N$ cost; keep tree height-balanced.
(F = fanout, N = # leaf pages)
• Minimum 50% occupancy (except for root).
Each node contains d <= m <= 2d entries.
The parameter d is called the order of the tree.
[Figure: B+ tree with the Root and Index Entries above the leaf-level Data Entries.]
Example B+ Tree
• Search begins at root, and key comparisons
direct it to a leaf.
• Search for 5*, 15*, all data entries >=
24* ...
Root: 13 17 24 30
Leaves: 2* 3* 5* 7* | 14* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*
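The three searches mentioned on this slide can be traced on the example tree. This is a minimal sketch of a one-level tree held in Python lists; the leaf boundaries are an assumption inferred from the root keys, and the names `find_leaf`, `search`, and `range_from` are hypothetical.

```python
from bisect import bisect_right

root_keys = [13, 17, 24, 30]
leaves = [
    [2, 3, 5, 7],
    [14, 16],
    [19, 20, 22],
    [24, 27, 29],
    [33, 34, 38, 39],
]

def find_leaf(key):
    # Child i holds keys k with root_keys[i-1] <= k < root_keys[i].
    return bisect_right(root_keys, key)

def search(key):
    """Equality search: key comparisons at the root direct us to one leaf."""
    return key in leaves[find_leaf(key)]

def range_from(low):
    """All data entries >= low: find the first leaf, then scan rightwards."""
    i = find_leaf(low)
    result = [k for k in leaves[i] if k >= low]
    for leaf in leaves[i + 1:]:    # stands in for following sibling pointers
        result.extend(leaf)
    return result

print(search(5))        # True
print(search(15))       # False
print(range_from(24))   # [24, 27, 29, 33, 34, 38, 39]
```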
B+ Trees in Practice
• Typical order: 100. Typical fill-factor: 67%.
– average fanout = 133
• Typical capacities:
– Height 4: $133^4$ = 312,900,721 records
– Height 3: $133^3$ = 2,352,637 records
• Can often hold top levels in buffer pool:
– Level 1 = 1 page = 8 Kbytes
– Level 2 = 133 pages = 1 Mbyte
– Level 3 = 17,689 pages = 133 MBytes
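The capacity and page-count figures follow directly from the average fanout. The short sketch below recomputes them exactly, taking the fanout of 133 from the slide; the variable names are hypothetical.

```python
fanout = 133  # average fanout from the slide

# Records reachable through a tree of the given height.
for height in (3, 4):
    print(f"Height {height}: {fanout ** height:,} records")
# Height 3: 2,352,637 records
# Height 4: 312,900,721 records

# Pages on the top three levels (candidates for staying in the buffer pool).
pages_per_level = [fanout ** level for level in range(3)]
print(pages_per_level)  # [1, 133, 17689]
```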
Inserting a Data Entry into a B+ Tree
• Find correct leaf L.
• Put data entry onto L.
– If L has enough space, done!
– Else, must split L (into L and a new node L2)
• Redistribute entries evenly, copy up middle key.
• Insert index entry pointing to L2 into parent of L.
• This can happen recursively
– To split index node, redistribute entries evenly, but
push up middle key. (Contrast with leaf splits.)
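The difference between the two kinds of split can be shown with two small helpers. This is a sketch of the split step only, not a full B+ tree: nodes are plain sorted lists, the function names `split_leaf` and `split_index` and the child labels `C0` to `C5` are hypothetical, and the example values assume a tree of order d = 2.

```python
from bisect import insort

def split_leaf(entries):
    """Split an overfull leaf (2d+1 sorted entries).

    The middle key is COPIED up: it also stays in the new right leaf.
    """
    mid = len(entries) // 2
    left, right = entries[:mid], entries[mid:]
    return left, right, right[0]

def split_index(keys, children):
    """Split an overfull index node (2d+1 keys, 2d+2 child pointers).

    The middle key is PUSHED up: it appears in neither half.
    """
    mid = len(keys) // 2
    left = (keys[:mid], children[:mid + 1])
    right = (keys[mid + 1:], children[mid + 1:])
    return left, keys[mid], right

# Leaf split: insert 8 into a full leaf of an order-2 tree.
leaf = [2, 3, 5, 7]
insort(leaf, 8)
print(split_leaf(leaf))            # ([2, 3], [5, 7, 8], 5)

# Index split: the copied-up key 5 overflows a full index node.
keys = [13, 17, 24, 30]
insort(keys, 5)
children = ["C0", "C1", "C2", "C3", "C4", "C5"]
left, pushed, right = split_index(keys, children)
print(left[0], pushed, right[0])   # [5, 13] 17 [24, 30]
```

Each half of either split keeps at least d entries, which is how the minimum 50% occupancy is preserved.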
Inserting 8* into Example B+ Tree
• Note:
– why minimum occupancy is guaranteed.
– Difference between copy-up and push-up.
[Figure: leaf split into 2* 3* and 5* 7* 8*. Entry to be inserted in parent node: 5. (Note that 5 is copied up and continues to appear in the leaf.)]
[Figure: index node split into 5 13 and 24 30. Entry to be inserted in parent node: 17. (Note that 17 is pushed up and only appears once in the index. Contrast this with a leaf split.)]
Example B+ Tree After Inserting 8*
Root: 17
Index nodes: 5 13 | 24 30
Leaves: 2* 3* | 5* 7* 8* | 14* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*
[Figure: tree with Root: 17; index nodes: 5 13 | 27 30; leaf entries: 2* 3* 5* 7* 8* 14* 16* 22* 24* 27* 29* 33* 34* 38* 39*]
• Observe 'toss' of index entry (on right),
[Figure: leaf entries 22* 27* 29* 33* 34* 38* 39*]