04 UW Hashing

Ullman et al.: Database System Principles
Notes 5: Hashing and More

Indexes
… WHERE key = 22
[Figure: each index entry pairs a key with a row pointer into the table.]
Types of Indexes
There are several types of index structures available to
you, depending on the need:
– A B+-tree index is in the form of a balanced tree
and is the default index type.
– A bitmap index has a bitmap for each distinct value
indexed, and each bit position represents a row
that may or may not contain the indexed value.
This is best for low-cardinality columns.
B+ tree index
[Figure: a B+-tree with root, branch, and leaf levels. Each leaf index entry holds an index entry header, the key column length, the key column value, and the ROWID.]
B+-Tree Index
Structure of a B+-tree index
At the top of the index is the root, which contains entries that point to the
next level in the index. At the next level are branch blocks, which in turn
point to blocks at the next level in the index. At the lowest level are the
leaf nodes, which contain the index entries that point to rows in the
table. The leaf blocks are doubly linked to facilitate the scanning of the
index in an ascending as well as descending order of key values.

Format of index leaf entries

An index entry is made up of the following components:
– An entry header, which stores the number of columns and locking information
– Key column length-value pairs, which define the size of a column in the key followed by the value for the column (the number of such pairs is at most the number of columns in the index)
– The ROWID of a row that contains the key values
Index Options
– A unique index ensures that every indexed value is unique.
CREATE UNIQUE INDEX emp1 ON EMP (ename);
DBA_INDEXES.UNIQUENESS -> 'UNIQUE'

– Bitmap index
CREATE BITMAP INDEX emp2 ON EMP (deptno);
DBA_INDEXES.INDEX_TYPE -> 'BITMAP'

– An index can have its key values stored in ascending or descending order.
CREATE INDEX emp3 ON emp (sal DESC);
DBA_IND_COLUMNS.DESCEND -> 'DESC'
Index Options
– A composite index is one that is based on more than one column.
CREATE INDEX emp4 ON emp (empno, sal);
DBA_IND_COLUMNS.COLUMN_POSITION -> 1, 2, …

– A function-based index is an index based on a function’s return value.
CREATE INDEX emp5 ON emp (SUBSTR(ename, 3, 4));
DBA_IND_EXPRESSIONS.COLUMN_EXPRESSION -> 'SUBSTR …'

– A compressed index has repeated key values removed.
CREATE INDEX emp6 ON emp (empno, ename, sal) COMPRESS 2;
DBA_INDEXES.COMPRESSION -> 'ENABLED'
DBA_INDEXES.PREFIX_LENGTH -> 2
Compressed index example
Uncompressed entries:
• online,0,AAAPvCAAFAAAAFaAAa
• online,0,AAAPvCAAFAAAAFaAAg
• online,0,AAAPvCAAFAAAAFaAAl
• online,3,AAAPvCAAFAAAAFaAAq
• online,3,AAAPvCAAFAAAAFaAAt

Compressed entries (each prefix stored once):
• online,0
• AAAPvCAAFAAAAFaAAa
• AAAPvCAAFAAAAFaAAg
• AAAPvCAAFAAAAFaAAl
• online,3
• AAAPvCAAFAAAAFaAAq
• AAAPvCAAFAAAAFaAAt
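The prefix compression in this example can be sketched in a few lines. This is only an illustration of the idea, not Oracle's storage format; the function name `compress_prefix` and the tuple representation of an index entry are assumptions made for the sketch. It expects the entries in index (sorted) order.

```python
from itertools import groupby

def compress_prefix(entries, prefix_length):
    """Store each distinct key prefix once, followed by the remaining columns.

    `entries` must already be in index order, as they are in a leaf block.
    """
    def prefix_of(entry):
        return entry[:prefix_length]

    return [
        (prefix, [entry[prefix_length:] for entry in group])
        for prefix, group in groupby(entries, key=prefix_of)
    ]

# The five entries of the example: (key column 1, key column 2, ROWID).
entries = [
    ("online", 0, "AAAPvCAAFAAAAFaAAa"),
    ("online", 0, "AAAPvCAAFAAAAFaAAg"),
    ("online", 0, "AAAPvCAAFAAAAFaAAl"),
    ("online", 3, "AAAPvCAAFAAAAFaAAq"),
    ("online", 3, "AAAPvCAAFAAAAFaAAt"),
]
compressed = compress_prefix(entries, 2)  # prefix length 2, as in COMPRESS 2
```

With a prefix length of 2, the five entries collapse into two prefix groups, (online, 0) and (online, 3), each followed only by its ROWIDs.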

CS 245 Notes 5
Bitmap Indexes
[Figure: a table stored in file 3, blocks 10 to 12, and the bitmap index built on it.]

Index entries: <Key, Start ROWID, End ROWID, Bitmap>
<Blue, 10.0.3, 12.8.3, 1000100100010010100>
<Green, 10.0.3, 12.8.3, 0001010000100100000>
<Red, 10.0.3, 12.8.3, 0100000011000001001>
<Yellow, 10.0.3, 12.8.3, 0010001000001000010>
Bitmap Indexes
Structure of a bitmap index
A bitmap index is also organized as a B-tree, but the leaf node stores a
bitmap for each key value instead of a list of ROWIDs. Each bit in the
bitmap corresponds to a possible ROWID, and if the bit is set, it means
that the row with the corresponding ROWID contains the key value.
As shown in the diagram, the leaf node of a bitmap index contains the
following:
– An entry header that contains the number of columns and lock info
– Key values consisting of length and value pairs for each key column
– Start ROWID
– End ROWID
– A bitmap segment consisting of a string of bits. (The bit is set when the
corresponding row contains the key value and is unset when the row
does not contain the key value. The Oracle server uses a patented
compression technique to store bitmap segments.)
Bitmap Index
Empno  Status    Region   Gender  Info
101    single    east     male    bracket_1
102    married   central  female  bracket_4
103    married   west     female  bracket_2
104    divorced  west     male    bracket_4
105    single    central  female  bracket_2
106    married   central  female  bracket_3

Empno  REGION='east'  REGION='central'  REGION='west'
101    1              0                 0
102    0              1                 0
103    0              0                 1
104    0              0                 1
105    0              1                 0
106    0              1                 0
Using Bitmap Indexes
SELECT COUNT(*)
FROM CUSTOMER
WHERE MARITAL_STATUS = 'married'
AND REGION IN ('central','west');
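A query like this can be answered from the bitmaps alone: OR the bitmaps of the listed regions, AND the result with the bitmap for 'married', and count the set bits. The sketch below assumes the six sample rows from the Bitmap Index table stand in for the CUSTOMER table; `build_bitmaps` is a hypothetical helper, and Python integers play the role of bitmaps.

```python
# (Empno, Status, Region) for the six sample rows; an assumed stand-in for CUSTOMER.
rows = [
    (101, "single", "east"),
    (102, "married", "central"),
    (103, "married", "west"),
    (104, "divorced", "west"),
    (105, "single", "central"),
    (106, "married", "central"),
]

def build_bitmaps(rows, column):
    """Return {value: bitmap}; bit k of a bitmap is set when row k holds the value."""
    bitmaps = {}
    for position, row in enumerate(rows):
        value = row[column]
        bitmaps[value] = bitmaps.get(value, 0) | (1 << position)
    return bitmaps

status = build_bitmaps(rows, 1)
region = build_bitmaps(rows, 2)

# WHERE MARITAL_STATUS = 'married' AND REGION IN ('central', 'west')
matches = status["married"] & (region["central"] | region["west"])
count = bin(matches).count("1")                      # COUNT(*)
matching_empnos = [row[0] for k, row in enumerate(rows) if matches >> k & 1]
```

No table row is read to compute the count; the rows are only consulted here to list which employees matched.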
Range queries
SELECT * FROM T
WHERE Age BETWEEN 44 AND 55
AND Salary BETWEEN 100 AND 200;

AGE  SALARY
25   60
45   60
50   75
50   100
50   120
70   110
85   140
30   260
25   400
45   350
50   275
60   260

Bitvectors for Age
25: 100000001000
30: 000000010000
45: 010000000100
50: 001110000010
…

Bitvectors for Salary
60: 110000000000
75: 001000000000
100: 000100000000
110: …
Range queries
SELECT * FROM T
WHERE Age BETWEEN 44 AND 55
AND Salary BETWEEN 100 AND 200;

Age BETWEEN 44 AND 55:
45: 010000000100
50: 001110000010
OR -> 011110000110

Salary BETWEEN 100 AND 200:
100: 000100000000
110: 000001000000
120: 000010000000
140: 000000100000
OR -> 000111100000

AND of the two results -> 000110000000, i.e. the records (50, 100) and (50, 120).
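The range query can be replayed mechanically: build one bitvector per distinct value, OR the vectors whose value lies in the range, then AND the two results. The sketch below uses the twelve (AGE, SALARY) records of table T; the helper names `bitvectors` and `or_range` are assumptions for illustration.

```python
ages     = [25, 45, 50, 50, 50, 70, 85, 30, 25, 45, 50, 60]
salaries = [60, 60, 75, 100, 120, 110, 140, 260, 400, 350, 275, 260]

def bitvectors(values):
    """One bit string per distinct value; position k is '1' when record k has it."""
    n = len(values)
    vectors = {}
    for position, value in enumerate(values):
        bits = vectors.setdefault(value, ["0"] * n)
        bits[position] = "1"
    return {value: "".join(bits) for value, bits in vectors.items()}

def or_range(vectors, low, high, n):
    """OR together the bitvectors of all values in [low, high]."""
    acc = 0
    for value, bits in vectors.items():
        if low <= value <= high:
            acc |= int(bits, 2)
    return format(acc, f"0{n}b")

n = len(ages)
age_vectors = bitvectors(ages)
salary_vectors = bitvectors(salaries)

age_hits = or_range(age_vectors, 44, 55, n)          # Age BETWEEN 44 AND 55
salary_hits = or_range(salary_vectors, 100, 200, n)  # Salary BETWEEN 100 AND 200
both = format(int(age_hits, 2) & int(salary_hits, 2), f"0{n}b")
answer = [(ages[k], salaries[k]) for k, bit in enumerate(both) if bit == "1"]
```

Only the records whose bit survives the final AND need to be fetched from the table.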
Compressed bitmaps

1’s in a bit vector will be very rare, so we compress the vector.

Run-length encoding:
run: a sequence of i 0’s followed by a 1
10000001000000000100010000000000001

1. Determine how many bits the binary representation of i has. This is number j.
2. We represent j in "unary" by j-1 1’s and a single 0.
3. Then we follow with i in binary.
Compressed bitmaps

Example: …0100000000000001
run with 13 0’s
i in binary: 1101
j = 4 -> in unary: 1110
Encoding for the run: 11101101

Encoding for i = 0: 00
Encoding for i = 1: 01

We ignore the trailing 0’s, but not the leading 0’s!

Decoding:
Decode the following: 11101101001011 -> 13, 0, 3
Original bitvector: 0000000000000110001
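The encoding and decoding rules can be checked with a short sketch. It follows the three steps of the run-length scheme literally; the function names are assumptions, and bit vectors are handled as Python strings for readability rather than efficiency.

```python
def encode_run(i):
    """Encode a run of i 0's followed by a 1."""
    binary = format(i, "b")             # i in binary, e.g. 13 -> "1101"
    j = len(binary)                     # number of bits of i
    return "1" * (j - 1) + "0" + binary # j in unary, then i in binary

def encode(bitvector):
    """Encode every run; trailing 0's after the last 1 are ignored."""
    codes, run_length = [], 0
    for bit in bitvector:
        if bit == "0":
            run_length += 1
        else:
            codes.append(encode_run(run_length))
            run_length = 0
    return "".join(codes)

def decode(code):
    """Return the list of run lengths stored in an encoded string."""
    runs, pos = [], 0
    while pos < len(code):
        j = 1
        while code[pos] == "1":         # unary part: j-1 1's ...
            j += 1
            pos += 1
        pos += 1                        # ... and a single 0
        runs.append(int(code[pos:pos + j], 2))
        pos += j
    return runs

def to_bitvector(runs):
    return "".join("0" * i + "1" for i in runs)

encoded = encode("0000000000000110001")
decoded_runs = decode("11101101001011")
```

Because the unary prefix announces how many bits of i follow, the codes can be concatenated without separators and still be decoded unambiguously.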

Hashing

key -> h(key) -> bucket holding <key>

Buckets (typically 1 disk block each)

Two alternatives

(1) key -> h(key) -> records
(direct reference, not flexible)

(2) key -> h(key) -> Index -> record
(indirect reference, more flexible)

• Alt (2) for “secondary” search key

Typical implementation

Example hash function

• Key = ‘x1 x2 … xn’, an n-byte character string
• Have b buckets
• h: add x1 + x2 + … + xn
– compute sum modulo b
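As a minimal sketch, this example hash function can be written directly. The function name `h` follows the slide; encoding the key as UTF-8 bytes is an assumption made so that each x is a byte value.

```python
def h(key: str, b: int) -> int:
    """Add up the bytes x1 + x2 + ... + xn of the key and take the sum modulo b."""
    return sum(key.encode("utf-8")) % b

bucket = h("abc", 10)   # (97 + 98 + 99) % 10
```

Note that any two keys with the same bytes in a different order (for example "abc" and "cba") land in the same bucket, which is one reason this may not be the best function.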

• This may not be best function …
• Read Knuth Vol. 3 if you really need to select a good function.

Good hash function: Expected number of keys/bucket is the same for all buckets

Within a bucket:
• Do we keep keys sorted?
• Yes, if CPU time critical & Inserts/Deletes not too frequent

Next: example to illustrate inserts, overflows, deletes

h(K)

EXAMPLE 2 records/bucket

INSERT:
h(a) = 1
h(b) = 2
h(c) = 1
h(d) = 0

Bucket 0: d
Bucket 1: a, c
Bucket 2: b
Bucket 3: (empty)

INSERT:
h(e) = 1

Bucket 1 is full, so e goes into an overflow block:
Bucket 1: a, c -> overflow: e

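The insert example can be replayed with a small sketch of a static hash file with overflow blocks. The class name `Block` and the `hash_values` table (which simply hard-codes h(a) to h(e) from the example) are assumptions for illustration; a real file would store blocks on disk.

```python
class Block:
    """A bucket block holding up to `capacity` keys, with an optional overflow block."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.keys = []
        self.overflow = None

    def insert(self, key):
        if len(self.keys) < self.capacity:
            self.keys.append(key)
            return
        if self.overflow is None:
            self.overflow = Block(self.capacity)  # chain a new overflow block
        self.overflow.insert(key)

    def chain(self):
        """Keys of this block and of every overflow block chained to it."""
        blocks, block = [], self
        while block is not None:
            blocks.append(list(block.keys))
            block = block.overflow
        return blocks

# Hash values taken from the example: h(a)=1, h(b)=2, h(c)=1, h(d)=0, h(e)=1.
hash_values = {"a": 1, "b": 2, "c": 1, "d": 0, "e": 1}

table = [Block(capacity=2) for _ in range(4)]   # buckets 0..3, 2 records/bucket
for key in "abcde":
    table[hash_values[key]].insert(key)

layout = [block.chain() for block in table]
```

A lookup for e now costs two block reads (the primary block of bucket 1, then its overflow block), which is why long chains hurt performance.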
EXAMPLE: deletion

Initial state:
Bucket 0: a
Bucket 1: b, c -> overflow: d
Bucket 2: e
Bucket 3: f, g

Delete e: bucket 2 becomes empty
Delete f: maybe move “g” up
Delete c: d moves up from the overflow block into bucket 1

Rule of thumb:
• Try to keep space utilization between 50% and 80%
  Utilization = (# keys used) / (total # keys that fit)
• If < 50%, wasting space
• If > 80%, overflows significant
  (depends on how good hash function is & on # keys/bucket)

How do we cope with growth?
• Overflows and reorganizations
• Dynamic hashing
  • Extensible
  • Linear

Extensible hashing: two ideas

(a) Use i of b bits output by hash function
    h(K) -> 00110101 (b bits)
    use i of these bits; i grows over time…

(b) Use directory
    h(K)[i] -> directory entry -> to bucket
    h(K)[i] means the first i bits of the output of the hash function

Example: h(k) is 4 bits; 2 keys/bucket

Initial state, i = 1 (the number in parentheses is how many bits the bucket uses):
0 -> 0001 (1)
1 -> 1001, 1100 (1)

Insert 1010: bucket 1 is full, so it splits on 2 bits into
1001, 1010 (2)
1100 (2)
and the directory doubles.

New directory, i = 2:
00, 01 -> 0001 (1)
10 -> 1001, 1010 (2)
11 -> 1100 (2)

Example continued

Insert: 0111, 0000 (i = 2)

0111 fits into the bucket shared by 00 and 01: 0001, 0111 (1)
0000 overflows that bucket, so it splits on 2 bits; the directory does not need to double:

00 -> 0000, 0001 (2)
01 -> 0111 (2)
10 -> 1001, 1010 (2)
11 -> 1100 (2)

Example continued

Insert: 1001 (i = 2)

The bucket for 10 (1001, 1010) is full and already uses 2 bits, so the directory doubles to i = 3 and the bucket splits on 3 bits:

000, 001 -> 0000, 0001 (2)
010, 011 -> 0111 (2)
100 -> 1001, 1001 (3)
101 -> 1010 (3)
110, 111 -> 1100 (2)

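The whole insert example can be replayed with a compact sketch of extensible hashing. It is one possible in-memory rendering under stated assumptions, not a disk implementation: hash values are given directly as 4-bit strings, the class and method names are hypothetical, and a bucket that cannot be split any further raises an error instead of building an overflow chain.

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth   # how many leading hash bits this bucket uses
        self.keys = []

class ExtensibleHashFile:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.i = 1                                # directory uses the first i bits
        self.directory = [Bucket(1), Bucket(1)]   # entries for prefixes 0 and 1

    def _slot(self, h):
        return int(h[:self.i], 2)

    def insert(self, h):
        bucket = self.directory[self._slot(h)]
        if len(bucket.keys) < self.capacity:
            bucket.keys.append(h)
            return
        if bucket.depth == len(h):
            # All bits are already in use (e.g. duplicate keys): splitting cannot help.
            raise OverflowError("an overflow chain would be needed here")
        if bucket.depth == self.i:
            # Double the directory: prefix p becomes p0 and p1, both on the old bucket.
            self.directory = [b for b in self.directory for _ in range(2)]
            self.i += 1
        # Split the full bucket on one more bit.
        bucket.depth += 1
        sibling = Bucket(bucket.depth)
        for slot, b in enumerate(self.directory):
            prefix = format(slot, f"0{self.i}b")
            if b is bucket and prefix[bucket.depth - 1] == "1":
                self.directory[slot] = sibling
        old_keys, bucket.keys = bucket.keys, []
        for key in old_keys:
            self.directory[self._slot(key)].keys.append(key)
        self.insert(h)   # retry; this may trigger another split

    def snapshot(self):
        return [(format(s, f"0{self.i}b"), sorted(b.keys), b.depth)
                for s, b in enumerate(self.directory)]

f = ExtensibleHashFile(capacity=2)
for h in ["0001", "1001", "1100", "1010", "0111", "0000", "1001"]:
    f.insert(h)
final_state = f.snapshot()
```

After the seven inserts the directory has i = 3, and only the bucket that overflowed last has been split on three bits; every other bucket is still shared by two directory entries.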
Extensible hashing: deletion

• No merging of blocks
• Merge blocks and cut directory if possible
  (Reverse insert procedure)

Deletion example:
• Run through insert example in reverse!
But: Typically not implemented

Note: Still need overflow chains
• Example: many records with duplicate keys
[Figure: insert 1100 into the full bucket (1) holding 1101 and 1100. If we split: the result is two buckets (2), one of them holding 1100, 1100.]

Solution: overflow chains
[Figure: insert 1100 into the full bucket (1) holding 1101 and 1100 -> add overflow block instead of splitting.]

Summary Extensible hashing
+ Can handle growing files
  - with less wasted space
  - with no full reorganizations
- Indirection
  (Not bad if directory in memory)
- Directory doubles in size
  (Now it fits, now it does not)

Linear hashing
• Another dynamic hashing scheme
Two ideas:
(a) Use i low order bits of hash
    01110101 (b bits); i grows
(b) File grows linearly

Example b=4 bits, i =2, 2 keys/bucket

Bucket 00: 0000, 1010
Bucket 01: 0101, 1111
Buckets 10, 11: future growth buckets

m = 01 (max used bucket) or n=2 (number of used buckets)

Rule: If h(k)[i] ≤ m, then look at bucket h(k)[i];
else, look at bucket h(k)[i] - 2^(i-1)

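The lookup rule is small enough to state as a function. This is a sketch under the example's conventions: the hash value is passed as a bit string, h(k)[i] is read as its i low-order bits, and the name `bucket_for` is hypothetical.

```python
def bucket_for(h: str, i: int, m: int) -> int:
    """Return the bucket number to look at for hash value h (a bit string).

    i: number of low-order bits currently in use
    m: number of the max used bucket
    """
    candidate = int(h[-i:], 2)          # h(k)[i]: the i low-order bits
    if candidate <= m:
        return candidate
    return candidate - 2 ** (i - 1)     # bucket does not exist yet: drop the top bit

# State of the example: i = 2, m = 01.
placements = {h: bucket_for(h, i=2, m=0b01) for h in ["0000", "1010", "0101", "1111"]}
```

With m = 01, the keys 1010 and 1111 are redirected to buckets 00 and 01, matching where the example stores them until buckets 10 and 11 are created.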
Example b=4 bits, i =2, 2 keys/bucket
• insert 0101
• can have overflow chains!

Bucket 00: 0000, 1010
Bucket 01: 0101, 1111 -> overflow: 0101
Buckets 10, 11: future growth buckets

m = 01 (max used bucket)

Rule: If h(k)[i] ≤ m, then look at bucket h(k)[i];
else, look at bucket h(k)[i] - 2^(i-1)

Note
• In textbook, n is used instead of m
• n=m+1 (n=number of used buckets)
  With m = 01: n=10 (n=2 in decimal)

Example b=4 bits, i =2, 2 keys/bucket
• insert 0101 (continued): the file grows one bucket at a time

m = 01 -> 10: bucket 10 is added; 1010 moves from bucket 00 to bucket 10
m = 10 -> 11: bucket 11 is added; 1111 moves from bucket 01 to bucket 11, and 0101 moves from the overflow block into bucket 01

Bucket 00: 0000
Bucket 01: 0101, 0101
Bucket 10: 1010
Bucket 11: 1111

m = 11 (max used bucket)

Example Continued: How to grow beyond this?

i = 2 -> 3: bucket numbers gain a leading 0 (00 -> 000, 01 -> 001, 10 -> 010, 11 -> 011), and buckets 100, 101, 110, 111 become available for growth.

m = 011 -> 100: bucket 100 is added
m = 100 -> 101: bucket 101 is added; both 0101 records move from bucket 001 to bucket 101

Bucket 000: 0000
Bucket 001: (empty)
Bucket 010: 1010
Bucket 011: 1111
Bucket 100: (empty)
Bucket 101: 0101, 0101

m = 101 (max used bucket)

When do we expand file?
• Keep track of: U = (# used slots) / (total # of slots)
  (# used slots ~ # records, total # of slots ~ # buckets)
• If U > threshold then increase m (and maybe i)

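The linear hashing example, including the expansion rule, can be replayed with the sketch below. The class `LinearHashFile` and its method names are assumptions; a Python list longer than the bucket capacity stands in for an overflow chain, and the 0.8 threshold is borrowed from the 80% rule of thumb rather than prescribed by the slides.

```python
class LinearHashFile:
    """Linear hashing over bit-string hash values (an in-memory sketch)."""

    def __init__(self, i=2, used_buckets=2, capacity=2):
        self.i = i                    # number of low-order bits in use
        self.capacity = capacity      # slots per bucket block
        # A list longer than `capacity` models an overflow chain.
        self.buckets = [[] for _ in range(used_buckets)]

    @property
    def m(self):
        return len(self.buckets) - 1  # max used bucket number

    def address(self, h):
        candidate = int(h[-self.i:], 2)
        if candidate <= self.m:
            return candidate
        return candidate - 2 ** (self.i - 1)

    def insert(self, h):
        self.buckets[self.address(h)].append(h)

    def utilization(self):
        used_slots = sum(len(b) for b in self.buckets)
        return used_slots / (len(self.buckets) * self.capacity)

    def grow(self):
        """Increase m by one (and i if needed), then rehash the bucket that splits."""
        new_bucket = self.m + 1
        if new_bucket == 2 ** self.i:
            self.i += 1
        source = new_bucket - 2 ** (self.i - 1)
        moved, self.buckets[source] = self.buckets[source], []
        self.buckets.append([])
        for h in moved:
            self.buckets[self.address(h)].append(h)

f = LinearHashFile(i=2, used_buckets=2, capacity=2)
for h in ["0000", "1010", "0101", "1111", "0101"]:
    f.insert(h)

THRESHOLD = 0.8                       # assumed threshold for U
while f.utilization() > THRESHOLD:
    f.grow()
after_threshold = [list(b) for b in f.buckets]   # m has reached 11

f.grow()                              # m = 100, which forces i = 3
f.grow()                              # m = 101
after_more_growth = [list(b) for b in f.buckets]
```

Note that the bucket that splits is chosen by position (the next one in linear order), not by which bucket overflowed, which is why an overflow chain can persist for a while.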
Summary Linear Hashing
+ Can handle growing files
  - with less wasted space
  - with no full reorganizations
+ No indirection like extensible hashing
- Can still have overflow chains

Example: BAD CASE
[Figure: some buckets are very full while others are very empty.]
Need to move m here… Would waste space...

Hashing depends on data distribution!

Summary
Hashing
- How it works
- Dynamic hashing
  - Extensible
  - Linear

Next:

• Indexing vs Hashing
• Index definition in SQL
• Multiple key access

Indexing vs Hashing
• Hashing good for probes given key
  e.g., SELECT …
        FROM R
        WHERE R.A = 5

• INDEXING (Including B Trees) good for Range Searches:
  e.g., SELECT …
        FROM R
        WHERE R.A > 5 AND R.A < 10;

Index definition in SQL

• Create index name on rel (attr)
• Create unique index name on rel (attr)
  (defines candidate key)
• Drop INDEX name

Note: CANNOT SPECIFY TYPE OF INDEX (e.g. B-tree, Hashing, …)
OR PARAMETERS (e.g. Load Factor, Size of Hash, ...)

... at least in SQL...

In Oracle you can!

Note: ATTRIBUTE LIST -> MULTIKEY INDEX (next)
e.g., CREATE INDEX foo ON R(A,B,C)

Multi-key Index

Motivation: Find records where DEPT = “Toy” AND SAL > 50k

What kind of indexes can support this query?

Strategy I:
• Use one index, say Dept.
• Get all Dept = “Toy” records
and check their salary

Strategy II:

• Use 2 Indexes; Manipulate Pointers
  Dept index: Toy -> pointers
  Sal index: Sal > 50k -> pointers
  AND -> intersection of pointers

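Strategy II can be sketched as follows: each index yields a set of record pointers, and only the intersection is fetched from the file. All data here is hypothetical (the pointer numbers, the salaries in thousands, and the names `dept_index` and `salary_index`); the salary index is kept sorted so the range Sal > 50k is one contiguous slice.

```python
from bisect import bisect_right

# Hypothetical secondary indexes: values map to record pointers (here plain ints).
dept_index = {"Toy": [2, 5, 8, 11], "Sales": [1, 3]}
salary_index = [(30, 1), (40, 5), (50, 8), (55, 2), (60, 3), (70, 11)]  # sorted by salary (in k)

# Index 1: all pointers with Dept = "Toy".
toy_pointers = set(dept_index["Toy"])

# Index 2: all pointers with Sal > 50k; the sorted index lets us jump to the start.
start = bisect_right(salary_index, (50, float("inf")))
high_salary_pointers = {pointer for _, pointer in salary_index[start:]}

# AND -> intersection of pointers; only these records are read from the file.
result = sorted(toy_pointers & high_salary_pointers)
```

The intersection is computed entirely on pointers, so records that satisfy only one of the two conditions are never fetched.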
Strategy III:

• Multiple Key Index

One idea:
[Figure: index I1 pointing to a further index I3.]

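Example
[Figure: a Dept index with the entries Art, Sales, and Toy; each entry points to a Salary index. The Salary index values shown are 10k, 15k, 17k, 21k and 12k, 15k, 15k, 19k. Example record: Name=Joe, DEPT=Sales, SAL=15k.]
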
For which queries is this index good?
• Find RECs Dept = “Sales” AND SAL = 20k
• Find RECs Dept = “Sales” AND SAL > 20k
• Find RECs Dept = “Sales”
• Find RECs SAL = 20k
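One way to see which of these queries the multiple key index serves well is to model it as a Dept index whose entries lead to per-department salary indexes. The sketch below is an assumption-laden illustration: the assignment of the salary lists to Art and Sales, the record names other than Joe, and all function names are hypothetical, and salaries are in thousands.

```python
from bisect import bisect_left

# Dept index -> per-department Salary index, sorted by (salary, record).
dept_index = {
    "Art":   [(10, "rec1"), (15, "rec2"), (17, "rec3"), (21, "rec4")],
    "Sales": [(12, "rec5"), (15, "Joe"), (15, "rec6"), (19, "rec7")],
    "Toy":   [],
}

def salary_equal(entries, sal):
    lo = bisect_left(entries, (sal,))        # first entry with salary >= sal
    hi = bisect_left(entries, (sal + 1,))    # first entry with salary > sal
    return [record for _, record in entries[lo:hi]]

def dept_and_salary_equal(dept, sal):        # good: one probe in each level
    return salary_equal(dept_index.get(dept, []), sal)

def dept_and_salary_above(dept, sal):        # good: one probe, then a range scan
    entries = dept_index.get(dept, [])
    lo = bisect_left(entries, (sal + 1,))
    return [record for _, record in entries[lo:]]

def dept_only(dept):                         # good: the whole second-level index
    return [record for _, record in dept_index.get(dept, [])]

def salary_only(sal):                        # poor: every second-level index is probed
    hits = []
    for entries in dept_index.values():
        hits.extend(salary_equal(entries, sal))
    return hits

joe_and_peers = dept_and_salary_equal("Sales", 15)
```

The first three query shapes start from a single Dept entry, while a query on salary alone has to visit the salary index of every department, so the index gives it little help.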
