IT3020 L06 Indexing

Database Systems

File Organization and Indexes


This Lecture…
 File Organization

 Indexes
Files of Records

 Pages or blocks are fine for doing I/O, but higher levels of the DBMS operate on records, and files of records.
 FILE: A collection of pages, each containing a collection of records. Must support:
 insert/delete/modify record
 read a particular record (specified using a record id)
 scan all records (possibly with some conditions on the records to be retrieved)


File Organization

 Three types
 Heap File Organization
 Sequential File Organization
 Hashing File Organization
Alternative File Organizations

Many alternatives exist, each ideal in some situations and not so good in others:
 Heap files: Suitable when the typical access is a file scan retrieving all records.
 Search (Equality/Range): needs to scan the file
 Insert: at the end of the file
 Delete: search for the record, then delete it
 Sorted files: Best if records must be retrieved in some order, or only a `range' of records is needed.
 Search (Equality/Range): efficient (binary search)
 Insert: find the position, insert & move records
 Delete: search for the record, delete & move records

Alternative File Organizations… (contd.)
 Hashed files: Good for equality selections.
 File is a collection of buckets. Bucket = primary page plus zero or more overflow pages.
 Hashing function h: h(r) = bucket in which record r belongs. h looks at only some of the fields of r, called the search fields.
Alternative File Organizations… (contd.)
 Hashed files:
 Search (Equality): fast if the selection is on the search key; otherwise scan the table
 Search (Range): needs to scan the file
 Insert: hash to the primary bucket and insert
 Delete: hash to the primary bucket if the search key is available, else scan the file; then delete the record
Example: Library Catalog / Book Index

Indexes

 An index on a file speeds up selections on the search key fields for the index.
 Any subset of the fields of a relation can be the search key for an index on the relation.
 Search key is not the same as key (a minimal set of fields that uniquely identifies a record in a relation).
Characteristics
 Indexes provide fast access
 Indexes take space
 Need to be careful to create only useful indexes
 May slow down certain inserts/updates/deletes (indexes must be maintained)
(Explain on board)

Alternatives for Data Entry k* in Index
 An index contains a collection of data entries, and supports efficient retrieval of all data entries k* with a given key value k.
 Three alternatives:
1. Data record with key value k (Alt. 1)
2. <k, rid of data record with search key value k> (Alt. 2)
3. <k, list of rids of data records with search key k> (Alt. 3)
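
As a rough illustration, a minimal Python sketch of the three alternatives; the rids here are hypothetical (page, slot) pairs, not from the slides:

# Data entries for key k = 42; rids are hypothetical (page, slot) pairs.
alt1 = (42, "some", "full", "record")      # Alt. 1: the data record itself
alt2 = (42, (4, 2))                        # Alt. 2: <k, rid>
alt3 = (42, [(4, 2), (7, 5), (9, 0)])      # Alt. 3: <k, list of rids>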
Terminology
 A file of records containing index entries = index file
 There are several organization techniques for building index files = access methods
Properties of Indexes…

 Clustered vs. unclustered index

[Figure: CLUSTERED vs. UNCLUSTERED index. In both, index entries direct the search for data entries (the index file), and data entries point to data records (the data file); in the clustered case the data records follow the order of the data entries, in the unclustered case they do not.]

 Can have at most one clustered index per table
 Cost of retrieving data records through an index varies greatly based on whether the index is clustered or not!
Properties… (contd.)

 Dense vs. sparse: If there is at least one data entry per search key value (in some data record), then dense.
 Alt. 1 always leads to a dense index.
 Every sparse index is clustered!
 Sparse indexes are smaller; however, some useful optimizations are based on dense indexes.

[Figure: A sparse index on Name and a dense index on Age over the same data file. Data records: Ashby, 25, 3000; Basu, 33, 4003; Bristow, 30, 2007; Cass, 50, 5004; Daniels, 22, 6003; Jones, 40, 6003; Smith, 44, 3000; Tracy, 44, 5004. The sparse Name index holds only Ashby, Cass, Smith; the dense Age index holds every value: 22, 25, 30, 33, 40, 44, 44, 50.]
Properties… (contd.)
 Primary vs. secondary: If the search key contains the primary key, it is called a primary index.
 Unique index: the search key contains a candidate key.
Properties… (contd.)
 Composite search keys: Search on a combination of fields.
 Equality query: Every field value is equal to a constant value. E.g., w.r.t. a <sal,age> index: age=20 and sal=75
 Range query: Some field value is not a constant. E.g.: age=20; or age=20 and sal > 20
 Data entries in the index are sorted by the search key to support range queries.

[Figure: Examples of composite key indexes using lexicographic order. Data records (name, age, sal), sorted by name: bob, 12, 10; cal, 11, 80; joe, 12, 20; sue, 13, 75. Data entries sorted by <age,sal>: 11,80; 12,10; 12,20; 13,75. By <sal,age>: 10,12; 20,12; 75,13; 80,11. By <age>: 11, 12, 12, 13. By <sal>: 10, 20, 75, 80.]
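
A quick illustration of lexicographic order (plain Python, not from the slides): tuples compare field by field, which is exactly how a composite <age,sal> index orders its entries.

# Data entries as (age, sal) pairs, from the figure above.
entries = [(12, 10), (11, 80), (12, 20), (13, 75)]
entries.sort()   # lexicographic: by age, then by sal within equal ages
print(entries)   # [(11, 80), (12, 10), (12, 20), (13, 75)]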
Indexes in SQL…
 Indexes are not part of SQL-92
 However, all major DBMSs provide facilities for index creation
 CREATE INDEX…
 DROP INDEX…
 SQL Server supports indexes (clustered and non-clustered)
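
For instance, a minimal sketch using Python's built-in sqlite3 module; the table and index names here are made up for illustration:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (sid INTEGER, name TEXT, gpa REAL)")

# Create an index on the gpa column, then drop it.
conn.execute("CREATE INDEX idx_students_gpa ON students (gpa)")
conn.execute("DROP INDEX idx_students_gpa")
conn.close()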
Range Searches
 ``Find all students with gpa > 3.0''
 If data is in a sorted file, do a binary search to find the first such student, then scan to find the others.
 Cost of the binary search can be quite high.
 Simple idea: Create an `index' file.

[Figure: Index file with entries k1, k2, …, kN, one per page of the data file (Page 1, Page 2, …, Page N).]

 Can do binary search on the (smaller) index file!
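
A minimal Python sketch of this search pattern, using the standard bisect module; the gpa values are made up for illustration:

import bisect

gpas = [2.1, 2.8, 3.0, 3.0, 3.2, 3.7, 3.9]   # keys from a sorted file

# Binary search for the first entry with gpa > 3.0, then scan rightwards.
first = bisect.bisect_right(gpas, 3.0)
print(gpas[first:])   # [3.2, 3.7, 3.9]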


B+ Tree: The Most Widely Used Index
 Insert/delete at log_F N cost; the tree is kept height-balanced. (F = fanout, N = # leaf pages)
 Minimum 50% occupancy (except for the root). Each node (except the root) contains m entries, where d <= m <= 2d. The parameter d is called the order of the tree.
 Supports equality and range searches efficiently.

[Figure: index entries in the upper levels (direct search); data entries in the leaves (the "sequence set").]
B+ Trees in Practice
 Typical order: 100. Typical fill factor: 67%.
 average fanout = 133
 Typical capacities:
 Height 4: 133^4 = 312,900,721 records
 Height 3: 133^3 = 2,352,637 records
 Can often hold the top levels in the buffer pool:
 Level 1 = 1 page = 8 KB
 Level 2 = 133 pages = 1 MB
 Level 3 = 17,689 pages = 133 MB
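
A quick sanity check of the capacity arithmetic above (plain Python):

# With fanout 133, the slide counts 133**h records for a tree of height h.
fanout = 133
print(fanout ** 3)   # 2352637
print(fanout ** 4)   # 312900721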
B+ Tree…
 Search begins at the root, and key comparisons direct it to a leaf
 Each node has search keys (Ki) and pointers (Pi)
 Pi points to a sub-tree in which all key values K satisfy Ki ≤ K < Ki+1
Search
func tree_search(nodepointer, search key value K) returns nodepointer
// Searches the tree for the leaf that may contain the entry
if *nodepointer is a leaf, return nodepointer;
else,
  if K < K1 then return tree_search(P0, K);
  else,
    if K ≥ Km then return tree_search(Pm, K)   // m = # entries
    else,
      find i such that Ki ≤ K < Ki+1;
      return tree_search(Pi, K)
    end if
end if
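
A runnable Python rendering of the same logic, assuming a simple dict-based node layout (the layout is an assumption for illustration, not from the slides):

import bisect

# Assumed node layout:
#   internal: {"leaf": False, "keys": [K1..Km], "children": [P0..Pm]}
#   leaf:     {"leaf": True,  "keys": [...],    "rids": [...]}
def tree_search(node, K):
    """Return the leaf node that may contain search key K."""
    if node["leaf"]:
        return node
    # bisect_right yields i with keys[i-1] <= K < keys[i],
    # i.e. follow P_i exactly as in the pseudocode above.
    i = bisect.bisect_right(node["keys"], K)
    return tree_search(node["children"][i], K)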
Example B+ Tree…
 Search for 5*, 15*, and all data entries >= 24* ...

[Figure: Root node with keys 13, 17, 24, 30; leaves: 2* 3* 5* 7* | 14* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*]

 Based on the search for 15*, we know it is not in the tree!


Inserting a Data Entry into a B+ Tree
 Find the correct leaf L.
 Put the data entry onto L.
 If L has enough space, done!
 Else, must split L (into L and a new node L2):
 Redistribute entries evenly, copy up the middle key.
 Insert an index entry pointing to L2 into the parent of L.
 This can happen recursively:
 To split an index node, redistribute entries evenly, but push up the middle key. (Contrast with leaf splits; see the sketch below.)
 Splits "grow" the tree; a root split increases its height.
 Tree growth: gets wider or one level taller at the top.
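
A minimal Python sketch of the two split rules, assuming a node of order d that has overflowed to 2d+1 entries (the function names and list-based layout are made up for illustration):

def split_leaf(entries, d):
    """Leaf split: keep d (key, rid) entries, move the rest to new leaf L2.
    The middle key is COPIED up: it goes to the parent AND stays in L2."""
    left, right = entries[:d], entries[d:]
    copy_up_key = right[0][0]   # first key of L2, also kept in L2
    return left, right, copy_up_key

def split_index(keys, children, d):
    """Index-node split: the middle key is PUSHED up; it appears only
    in the parent, not in either half (contrast with the leaf split)."""
    push_up_key = keys[d]
    left = (keys[:d], children[:d + 1])
    right = (keys[d + 1:], children[d + 1:])
    return left, right, push_up_key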
Inserting 8* into Example B+ Tree
 Observe how minimum occupancy is guaranteed in both leaf and index page splits.
 Note the difference between copy-up and push-up; be sure you understand the reasons for this.

[Figure: Leaf split — entry 5 is to be inserted in the parent node (note that 5 is copied up and continues to appear in the leaf); resulting leaves: 2* 3* and 5* 7* 8*. Index split — entry 17 is to be inserted in the parent node (note that 17 is pushed up and appears only once in the index; contrast this with a leaf split); remaining index keys: 5 13 24 30.]
Example B+ Tree After Inserting 8*

[Figure: Root with key 17; second level: 5 13 | 24 30; leaves: 2* 3* | 5* 7* 8* | 14* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*]

 Notice that the root was split, leading to an increase in height.
 In this example, we could avoid the split by redistributing entries; however, this is usually not done in practice.
Deleting a Data Entry from a B+ Tree
 Start at the root, find the leaf L where the entry belongs.
 Remove the entry.
 If L is at least half-full, done!
 If L has only d-1 entries:
 Try to redistribute, borrowing from a sibling (an adjacent node with the same parent as L).
 If redistribution fails, merge L and the sibling.
 If a merge occurred, must delete the entry (pointing to L or the sibling) from the parent of L.
 Merge could propagate to the root, decreasing height.
Example Tree After (Inserting 8*, Then) Deleting 19* and 20* ...

[Figure: Root with key 17; second level: 5 13 | 27 30; leaves: 2* 3* | 5* 7* 8* | 14* 16* | 22* 24* | 27* 29* | 33* 34* 38* 39*]

 Deleting 19* is easy.
 Deleting 20* is done with redistribution. Notice how the middle key is copied up.
... And Then Deleting 24*
 Must merge.
 Observe the `toss' of an index entry (on the right), and the `pull down' of an index entry (below).

[Figure: On the right — after the toss, an index node with key 30 over leaves 22* 27* 29* | 33* 34* 38* 39*. Below — the final tree: root with keys 5 13 17 30; leaves: 2* 3* | 5* 7* 8* | 14* 16* | 22* 27* 29* | 33* 34* 38* 39*]


Duplicates in B+ Trees…
 We have ignored duplicates so far…
 Alternatives…
 Overflow leaf pages
 Duplicate values in the leaf pages
 Make key values unique (by adding rowids)
 Preferred approach in many DBMSs
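
A small illustration of the last alternative (plain Python; the rids are hypothetical (page, slot) pairs): appending the rid makes every entry unique, and entries with equal keys still sort together.

# Entries for the duplicated key 42, made unique by appending the rid.
entries = [(42, (7, 3)), (42, (2, 9)), (42, (4, 1))]
entries.sort()   # equal keys cluster together, ordered by rid
print(entries)   # [(42, (2, 9)), (42, (4, 1)), (42, (7, 3))]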
Hashing

 Hash-based indexes are best for equality selections.
 Cannot support range searches.
 Static and dynamic hashing techniques exist
Static Hashing
 # primary pages fixed, allocated sequentially, never de-allocated; overflow pages if needed.
 h(k) mod N = bucket to which the data entry with key k belongs. (N = # of buckets)

[Figure: key -> h -> h(key) mod N selects one of the primary bucket pages 0 .. N-1, each with a chain of overflow pages.]
Static Hashing… (contd.)
 Buckets contain data entries.
 The hash function works on the search key field of record r. Must distribute values over the range 0...N-1.
 h(key) = (a * key + b) usually works well.
 a and b are constants; lots is known about how to tune h.
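
A minimal sketch in Python; the constants a, b, N here are arbitrary illustrative choices:

N = 8          # number of primary buckets
a, b = 31, 7   # tunable constants

def bucket(key: int) -> int:
    """h(key) = (a * key + b), taken mod N to pick a bucket in 0..N-1."""
    return (a * key + b) % N

print(bucket(42))   # some bucket number in 0..7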
Static Hashing… (contd.)
Problems…
 With insertions, long overflow chains can develop and degrade performance.
 Deletion may waste space.
 Extendible and Linear Hashing: dynamic techniques that fix this problem.
Extendible Hashing

 Situation: A bucket (primary page) becomes full. Why not re-organize the file by doubling the # of buckets?
 Reading and writing all pages is expensive!
 Idea: Use a directory of pointers to buckets; double the # of buckets by doubling the directory, splitting just the bucket that overflowed!
 The directory is much smaller than the file, so doubling it is much cheaper. Only one page of data entries is split. No overflow page!
 The trick lies in how the hash function is adjusted!
Example

[Figure: GLOBAL DEPTH 2 directory (slots 00, 01, 10, 11) pointing to data pages — Bucket A (LOCAL DEPTH 2): 4* 12* 32* 16*; Bucket B (2): 1* 5* 21* 13*; Bucket C (2): 10*; Bucket D (2): 15* 7* 19*.]

 Directory is an array of size 4.
 To find the bucket for r, take the last `global depth' # bits of h(r); we denote r by h(r).
 If h(r) = 5 = binary 101, it is in the bucket pointed to by 01.
 Insert: If the bucket is full, split it (allocate a new page, re-distribute).
 If necessary, double the directory. (As we will see, splitting a bucket does not always require doubling; we can tell by comparing the global depth with the local depth of the split bucket.)
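
A minimal sketch of the directory lookup in Python (the function name is made up for illustration):

def bucket_index(hash_value: int, global_depth: int) -> int:
    """Directory slot = the last `global_depth` bits of the hash value."""
    return hash_value & ((1 << global_depth) - 1)

# h(r) = 5 = binary 101: with global depth 2, the last two bits are 01.
assert bucket_index(5, 2) == 0b01
# 20 = binary 10100: the last 2 bits are 00; the last 3 bits are 100.
assert bucket_index(20, 2) == 0b00
assert bucket_index(20, 3) == 0b100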
Insert h(r)=20 (Causes Doubling)

[Figure: Before — GLOBAL DEPTH 2, directory slots 00-11; Bucket A (LOCAL DEPTH 2): 32* 16* 4* 12*. After — GLOBAL DEPTH 3, directory slots 000-111; Bucket A (LOCAL DEPTH 3): 32* 16*; Bucket A2 (LOCAL DEPTH 3, the `split image' of Bucket A): 4* 12* 20*; Buckets B (1* 5* 21* 13*), C (10*), D (15* 7* 19*) keep LOCAL DEPTH 2.]
Points to Note

 20 = binary 10100. The last 2 bits (00) tell us r belongs in A or A2. The last 3 bits are needed to tell which.
 Global depth of the directory: max # of bits needed to tell which bucket an entry belongs to.
 Local depth of a bucket: # of bits used to determine whether an entry belongs to this bucket.
 Not all splits double the directory size
 Example: Insert 9* (9 = binary 1001 lands in Bucket B, whose local depth (2) is below the global depth (3), so splitting B does not require doubling)
Points to Note (contd.)
 When does a bucket split cause directory doubling?
 Before the insert, the local depth of the bucket = the global depth. The insert causes the local depth to become > the global depth; the directory is doubled by copying it over and `fixing' the pointer to the split image page. (Use of least significant bits enables efficient doubling via copying of the directory!)
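
A minimal sketch of that doubling step, assuming the directory is a Python list of bucket references (names are illustrative):

def double_directory(directory):
    """With the least-significant-bit scheme, doubling is just a copy:
    old slot pattern b becomes slots 0b and 1b, and both still point to
    the same bucket until the split image's pointer is fixed up."""
    return directory + list(directory)

# Usage sketch: after doubling, redirect one slot to the new split image.
# directory = double_directory(directory)
# directory[0b100] = bucket_a2   # hypothetical split image of bucket A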
Comments on Extendible Hashing
 If the directory fits in memory, an equality search is answered with one disk access; else two.
 A 100 MB file with 100-byte records and 4 KB pages contains 1,000,000 records (as data entries) and 25,000 directory elements (100 MB / 4 KB = 25,000 primary pages, one directory element each); chances are high that the directory will fit in memory.
 The directory grows in spurts and, if the distribution of hash values is skewed, can grow large.
 Multiple entries with the same hash value cause problems!
Comments on Extendible Hashing (contd.)
 Delete: If removal of a data entry makes a bucket empty, it can be merged with its `split image'. If each directory element points to the same bucket as its split image, the directory can be halved.
Summary
 File Organizations
 Indexes
 B+ Tree
 Hashing (Extendible)
