Hash Tables Slides

Hash tables provide a way to store key-value pairs where keys are unique. They use hashing algorithms to map keys to indexes in an array for fast lookup. Collisions are handled using separate chaining, where collided entries are stored in linked lists. Operations like adding, searching, and removing items have O(1) time complexity on average. The document discusses hash table concepts like hashing, collisions, and separate chaining, and provides examples of using hash tables to store HTTP headers.

Hash Tables

Robert Horvick
SOFTWARE ENGINEER

@bubbafat www.roberthorvick.com
Hash Table Overview
- Associative Array

Hashing Algorithms
- Stable
- Uniform Distribution
- Secure

Hash Table Operations


- Add
- Search
- Remove

Demo: State Level Caching


Associative Array
A collection of key/value pairs where the key can only
exist once in the collection
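
As a rough illustration of the "key can only exist once" rule, the sketch below uses .NET's built-in Dictionary<TKey, TValue> as a stand-in for an associative array. The class and method names (AssociativeArrayDemo, CountAfterDuplicateSet, AddRejectsDuplicate) are invented for this example and are not part of the slides.

```csharp
using System;
using System.Collections.Generic;

public static class AssociativeArrayDemo
{
    // Sets the same key twice and reports how many entries the collection holds.
    public static int CountAfterDuplicateSet()
    {
        var headers = new Dictionary<string, string>();
        headers["content-length"] = "8056";
        headers["content-length"] = "9000"; // Overwrites the value; no second entry is created
        return headers.Count;
    }

    // Add() rejects a key that already exists instead of overwriting it.
    public static bool AddRejectsDuplicate()
    {
        var headers = new Dictionary<string, string>();
        headers["content-type"] = "image/png";
        try
        {
            headers.Add("content-type", "text/html");
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

Either way, the collection ends up with one entry per key.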
Associative Array Examples
- HTTP Headers
- Application Configuration
- Environment Variables
- Key/Value Database

Request URL
Response body
Browser version
Referrer
Etc…
HttpHeader headers;

headers["content-length"] = "8056";

headers["content-type"] = "image/png";

HTTP Headers
The header name and value become the key and value in the associative array
Environment Variables
Environment env;

env["SSH_TTY"] = "/dev/pts/0";

env["PATH"] = "/usr/local/jdk/bin/...";

Environment Variables
The variable name and value become the key and value in the associative array
Hash Table
An associative array container that provides O(1) insert,
delete, and search operations on average.
HashTable<String, String> headers;

headers["content-length"] = "8056";

headers["content-type"] = "image/png";

Hash Table
The hash table stores HTTP header data where the key and value are both
strings
HashTable<String, HttpHeaderValue> headers;

headers["content-length"] = IntHeader(8056);

headers["content-type"] = StringHeader("image/png");

Hash Table
Hash table key and value types do not have to be the same
Hash Function
A function that maps data of arbitrary size to data of a
fixed size.
Hash Function Examples
- Verifying downloaded data
- Storing passwords in a database
- Hash table key lookup

hash = f(value)
- Stability
- Uniformity
- Security
Stability
A hash function always generates the same output given
the same input.
Hash Function Stability

Stable

public int StableHash(string input)
{
    int result = 0;

    foreach (byte ascii in input)
    {
        result += ascii;
    }

    return result;
}

Unstable

public int UnstableHash(string input)
{
    int result = DateTime.Now.Second;

    foreach (byte ascii in input)
    {
        result += ascii;
    }

    return result;
}
Hash Function Stability

Stable
StableHash("foo");      324
StableHash("foo");      324
StableHash("foo");      324

Unstable
UnstableHash("foo");    1449447443
UnstableHash("foo");    1449447444
UnstableHash("foo");    1449447445
Uniformity
A hash algorithm should distribute its resulting hash
values uniformly throughout the output space.
Uniform Distribution

(More) Uniform

public uint SDBMHash(string input)
{
    uint hash = 0;

    foreach (byte ascii in input)
    {
        hash = hash * 65599 + ascii;
    }

    return hash;
}

Non-uniform

public int StableHash(string input)
{
    int result = 0;

    foreach (byte ascii in input)
    {
        result += ascii;
    }

    return result;
}
Uniform Distribution

(More) Uniform
SDBMHash("foo");    849955110
SDBMHash("oof");    924308646
SDBMHash("ofo");    923718264

Non-uniform
StableHash("foo");  324
StableHash("oof");  324
StableHash("ofo");  324
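
To make the comparison above concrete, the following sketch counts how many distinct hash values each function produces for the three anagrams. It restates the two slide functions as static methods. UniformityDemo and DistinctCount are hypothetical names, and the explicit unchecked arithmetic is an assumption that makes the wraparound visible.

```csharp
using System;
using System.Collections.Generic;

public static class UniformityDemo
{
    // Additive hash from the slides: the order of characters does not matter.
    public static int AdditiveHash(string input)
    {
        int result = 0;
        foreach (char c in input) result += (byte)c;
        return result;
    }

    // SDBM hash from the slides: each step multiplies, so order matters.
    public static uint SdbmHash(string input)
    {
        uint hash = 0;
        foreach (char c in input) hash = unchecked(hash * 65599 + (byte)c);
        return hash;
    }

    // Counts how many different hash values a set of inputs produces.
    public static int DistinctCount<T>(IEnumerable<string> inputs, Func<string, T> hash)
    {
        var seen = new HashSet<T>();
        foreach (string s in inputs) seen.Add(hash(s));
        return seen.Count;
    }
}
```

For the inputs "foo", "oof", and "ofo", the additive hash yields one distinct value and SDBM yields three, which matches the numbers on the slide.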
Security
A secure hashing algorithm cannot feasibly be inverted (that is, the
input cannot be derived from the output hash).
Username    Password Hash (sdbm)
evelyn      3471203675
brian       969889485
will        1978836480

Password    Hash
aaaaaaaa    3834880256
aaaaaaab    3834880257
aaaaaaac    3834880258
aaaaaaad    3834880259
…           …
zzzzzzzz    528284160

Hash          Match       Actual
3471203675    aagkDhA4    password
969889485     aac0xWRs    football
1978836480    aaabhqCy    jedi
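
The tables above suggest an exhaustive search: hash every candidate string until one matches the stored value. The sketch below shows that idea in a much smaller search space than the slides use. It assumes lowercase letters only and a fixed candidate length. BruteForceDemo, FindMatch, and Search are hypothetical names.

```csharp
using System;

public static class BruteForceDemo
{
    // SDBM hash as shown on the slides.
    public static uint SdbmHash(string input)
    {
        uint hash = 0;
        foreach (char c in input) hash = unchecked(hash * 65599 + (byte)c);
        return hash;
    }

    // Tries every lowercase string of the given length until one hashes to target.
    // Returns an empty string when nothing matches.
    public static string FindMatch(uint target, int length)
    {
        return Search(new char[length], 0, target);
    }

    private static string Search(char[] candidate, int position, uint target)
    {
        if (position == candidate.Length)
        {
            string text = new string(candidate);
            return SdbmHash(text) == target ? text : "";
        }

        for (char c = 'a'; c <= 'z'; c++)
        {
            candidate[position] = c;
            string match = Search(candidate, position + 1, target);
            if (match.Length > 0) return match;
        }

        return "";
    }
}
```

The string returned need not be the original password. Any input with the same hash passes the check, which is why the Match and Actual columns above differ.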
Output Size: 32, 64, 128, 256, and 512 bits
Hash Output Sizes

                 2^32 (4294967296)    2^512 (1.340781e+154)
Checks/sec       Time                 Time
1000             49 days              4.25e+143 years
10,000           4.9 days             4.25e+142 years
100,000          12 hours             4.25e+141 years
1,000,000        1 hour               4.25e+140 years
10,000,000       7 minutes            4.25e+139 years
100,000,000      42 seconds           4.25e+138 years
1,000,000,000    4 seconds            4.25e+137 years
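
The times in this table follow from dividing the number of possible hash values by the check rate. A minimal sketch of that arithmetic, assuming a 365.25-day year (OutputSizeDemo and its method names are hypothetical):

```csharp
using System;

public static class OutputSizeDemo
{
    // Seconds needed to try every value of a hash with the given output size in bits.
    public static double SecondsToExhaust(int outputBits, double checksPerSecond)
    {
        return Math.Pow(2, outputBits) / checksPerSecond;
    }

    public static double ToDays(double seconds)
    {
        return seconds / 86400.0;
    }

    // Assumes a 365.25-day year.
    public static double ToYears(double seconds)
    {
        return seconds / (365.25 * 86400.0);
    }
}
```

For example, ToDays(SecondsToExhaust(32, 1000)) comes to roughly 49.7 days, and ToYears(SecondsToExhaust(512, 1000)) to roughly 4.25e+143 years, in line with the first row.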
Sample Hash Algorithms
Additive Hash

Pros
- Stable
- Fast

Cons
- Poor uniformity
- Poor security

Example: "foo"
Character      f      o      o
Value          102    111    111
Running sum    102    213    324



Folding Hash

Pros
- Stable
- Fast
- Better uniformity

Cons
- Poor security

Example: "lorem ipsum dolor", folded four characters at a time
Chunk     Value         Running sum
"lore"    1701998444    1701998444
"m ip"    1885937773    -707031079
"sum "    544044403     -162986676
"dolo"    1869377380    1706390704
"r"       114           1706390818
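
The slides give no code for the folding hash, so the following is only a sketch inferred from the figures above. It assumes ASCII input, four-byte chunks packed least significant byte first, and 32-bit addition that wraps on overflow. FoldingHashSketch and FoldingHash are hypothetical names.

```csharp
using System;
using System.Text;

public static class FoldingHashSketch
{
    public static int FoldingHash(string input)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(input);
        int hash = 0;

        for (int start = 0; start < bytes.Length; start += 4)
        {
            // Pack up to four bytes into one integer, first byte in the lowest bits.
            int chunk = 0;
            int end = Math.Min(start + 4, bytes.Length);
            for (int i = start; i < end; i++)
            {
                chunk |= bytes[i] << (8 * (i - start));
            }

            // Fold the chunk into the running sum, letting it wrap on overflow.
            hash = unchecked(hash + chunk);
        }

        return hash;
    }
}
```

Under these assumptions, FoldingHash("lorem ipsum dolor") returns 1706390818, the final running sum shown on the slide.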

Djb2 Hash

public ulong Djb2Hash(string input)
{
    ulong hash = 5381;

    foreach (byte c in input)
    {
        hash = hash * 33 + c;
    }

    return hash;
}

Example: "foo"
Character       f         o          o
Running hash    177675    5863386    193491849
Hash Function Comparison

Name        Output Size    Stable    Uniform    Secure
Additive    32             YES       NO         NO
Folding     32             YES       YES        NO
Djb2        64             YES       YES        NO
MD5         128            YES       YES        NO*
SHA-1       160            YES       YES        NO*
SHA-2       224/384        YES       YES        NO*
SHA-2       256-512        YES       YES        YES

*Once considered secure, this hash should no longer be used for secure applications.
Adding Items
public class HashTable<TKey, TValue>        // Key and value type parameters
{
    TValue[] table = new TValue[4];         // Backing array defaults to a size of 4

    private uint Hash(TKey key) { ... }     // A private hash function

    // Indexer supporting get and set operations using the key type
    public TValue this[TKey key]
    {
        get => table[Index(key)];           // Retrieve the value with that key

        set => table[Index(key)] = value;   // Set the value with that key
    }

    private uint Index(TKey key)
    {
        // Cast the length so the modulo stays in uint arithmetic
        return Hash(key) % (uint)table.Length;
    }
}
HashTable<string, string> table = new HashTable<string, string>();

table["content-length"] = "8056";

table["content-type"] = "image/png";

Adding Values
Hash Collision
When multiple distinct keys would be inserted at the
same hash table index.
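
A small sketch of how a collision arises once the hash is reduced to a table index. It assumes the additive hash from earlier and the backing array size of 4. CollisionDemo and IndexFor are hypothetical names.

```csharp
public static class CollisionDemo
{
    // Additive hash from earlier in the deck, returned as an unsigned value.
    public static uint AdditiveHash(string key)
    {
        uint result = 0;
        foreach (char c in key) result += (byte)c;
        return result;
    }

    // Maps a key to a slot in a table with the given number of slots.
    public static uint IndexFor(string key, int tableLength)
    {
        return AdditiveHash(key) % (uint)tableLength;
    }
}
```

With four slots, "foo" and "oof" collide because their hashes are equal (324 and 324, index 0). The keys "a" and "e" collide even though their hashes differ (97 and 101), because both leave a remainder of 1 when divided by 4.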
Separate Chaining
Collisions in a hash table are chained together into a
linked list whose root node is the hash table array entry.
internal class HashTableEntry<TKey, TValue>
{
    public TKey Key;

    public TValue Value;

    public HashTableEntry<TKey, TValue> Next;
}

Separate Chaining
The hash table handles collisions by linking all the values with the same table
index into a linked list of entries.
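
One way the chaining described above could look in code is sketched here. It is not the deck's implementation. It assumes string keys and values, a fixed array of four chains, and string.GetHashCode() as a stand-in for the private hash function. ChainedTable, Set, and TryGet are hypothetical names.

```csharp
using System;

public class ChainedTable
{
    private class Entry
    {
        public string Key;
        public string Value;
        public Entry Next;
    }

    // Each array slot is the root of a linked list of entries.
    private readonly Entry[] buckets = new Entry[4];

    private uint Index(string key)
    {
        // Stand-in hash (an assumption for this sketch).
        uint hash = unchecked((uint)key.GetHashCode());
        return hash % (uint)buckets.Length;
    }

    public void Set(string key, string value)
    {
        uint index = Index(key);
        Entry last = null;

        for (Entry e = buckets[index]; e != null; e = e.Next)
        {
            if (e.Key == key) { e.Value = value; return; } // Existing key: overwrite
            last = e;
        }

        // New key: start the chain, or link the entry after the current tail.
        var added = new Entry { Key = key, Value = value };
        if (last == null) buckets[index] = added;
        else last.Next = added;
    }

    public bool TryGet(string key, out string value)
    {
        for (Entry e = buckets[Index(key)]; e != null; e = e.Next)
        {
            if (e.Key == key) { value = e.Value; return true; }
        }

        value = null;
        return false;
    }
}
```

Appending at the tail mirrors the slides, where the first entry's Next ends up pointing at the second. Inserting at the head of the chain would work equally well and avoids walking to the end.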
table["content-length"] = "8056";
table["content-type"] = "image/png";

After the first assignment:
Key: "content-length"
Value: "8056"
Next: null

After the second assignment, with both keys at the same index:
Key: "content-length"
Value: "8056"
Next: <entry>

Key: "content-type"
Value: "image/png"
Next: null
Hash Table Collisions (figure)
Fill Factor
The percentage of capacity representing the maximum
number of entries before the table will grow. E.g., 0.80
Growth Factor
The multiple to increase the capacity of the hash table
when the fill factor has been exceeded. E.g., 1.50
Hash Table Growth

Use the fill factor to determine if growth is needed

Use the growth factor to allocate a larger array

Determine the new index for the existing items in the hash table

Update the hash table to use the new array
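
The four growth steps above can be sketched as follows, using the example fill factor of 0.80 and growth factor of 1.50. GrowthEntry, GrowthDemo, NeedsGrowth, and Grow are hypothetical names, and string.GetHashCode() stands in for the table's hash function.

```csharp
using System;

public class GrowthEntry
{
    public string Key;
    public string Value;
    public GrowthEntry Next;
}

public static class GrowthDemo
{
    public const double FillFactor = 0.80;
    public const double GrowthFactor = 1.50;

    // Stand-in hash (an assumption for this sketch).
    public static uint Hash(string key)
    {
        return unchecked((uint)key.GetHashCode());
    }

    // Step 1: growth is needed once the entry count exceeds the fill factor.
    public static bool NeedsGrowth(int count, int capacity)
    {
        return count > capacity * FillFactor;
    }

    public static GrowthEntry[] Grow(GrowthEntry[] old)
    {
        // Step 2: allocate a larger array using the growth factor.
        var bigger = new GrowthEntry[(int)Math.Ceiling(old.Length * GrowthFactor)];

        // Step 3: every existing entry gets a new index in the larger array.
        foreach (GrowthEntry head in old)
        {
            GrowthEntry e = head;
            while (e != null)
            {
                GrowthEntry next = e.Next;
                uint index = Hash(e.Key) % (uint)bigger.Length;
                e.Next = bigger[index];
                bigger[index] = e;
                e = next;
            }
        }

        // Step 4: the caller replaces its old array with this one.
        return bigger;
    }
}
```

Entries must be re-indexed rather than copied slot for slot, because the index depends on the array length.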


Iteration
- Begin at first index
- Visit each index
- Visit each entry
- Continue to the end

Items are iterated in the order they are stored in the table.
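
A minimal sketch of that traversal, assuming the table is an array of linked entry chains with string keys and values. IterEntry, IterationDemo, and Entries are hypothetical names.

```csharp
using System.Collections.Generic;

public class IterEntry
{
    public string Key;
    public string Value;
    public IterEntry Next;
}

public static class IterationDemo
{
    public static IEnumerable<KeyValuePair<string, string>> Entries(IterEntry[] buckets)
    {
        // Begin at the first index and visit each index in turn.
        for (int index = 0; index < buckets.Length; index++)
        {
            // Visit each entry chained at this index.
            for (IterEntry e = buckets[index]; e != null; e = e.Next)
            {
                yield return new KeyValuePair<string, string>(e.Key, e.Value);
            }
        }
        // Falling out of the outer loop is the end of the table.
    }
}
```

The order depends on where the hash function placed each key, not on the order in which items were added.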
Finding Items
- Find the index
  - Use the hash function
  - Modulo the hash
- Search the entry list

(Figure: a table with the entries Cat, Goat, Horse, Ox, and Dog.)
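
The lookup steps above can be sketched as a single function over an array of entry chains. FindEntry, FindDemo, and Find are hypothetical names, and string.GetHashCode() stands in for the table's hash function.

```csharp
public class FindEntry
{
    public string Key;
    public string Value;
    public FindEntry Next;
}

public static class FindDemo
{
    // Stand-in hash (an assumption for this sketch).
    public static uint Hash(string key)
    {
        return unchecked((uint)key.GetHashCode());
    }

    // Returns the entry for the key, or null when the key is not in the table.
    public static FindEntry Find(FindEntry[] buckets, string key)
    {
        uint hash = Hash(key);                     // Use the hash function
        uint index = hash % (uint)buckets.Length;  // Modulo the hash

        // Search the entry list at that index.
        for (FindEntry e = buckets[index]; e != null; e = e.Next)
        {
            if (e.Key == key) return e;
        }

        return null;
    }
}
```

Only one chain is examined, so the cost depends on the chain length rather than on the total number of items.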


Removing Items
- Determine the index
- Search the entry list
- Remove the entry

(Figure: a table with the entries Cat, Goat, Horse, Ox, and Dog. After Horse is removed, the slots read Cat, Goat, Dog, Ox.)
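
A sketch of removal from a chained table, under the same assumptions as before: string keys, an array of entry chains, and string.GetHashCode() as a stand-in hash. RemoveEntry, RemoveDemo, and Remove are hypothetical names.

```csharp
public class RemoveEntry
{
    public string Key;
    public string Value;
    public RemoveEntry Next;
}

public static class RemoveDemo
{
    // Stand-in hash (an assumption for this sketch).
    public static uint Hash(string key)
    {
        return unchecked((uint)key.GetHashCode());
    }

    // Returns true when an entry with the key was found and unlinked.
    public static bool Remove(RemoveEntry[] buckets, string key)
    {
        // Determine the index.
        uint index = Hash(key) % (uint)buckets.Length;

        // Search the entry list, remembering the entry before the current one.
        RemoveEntry previous = null;
        for (RemoveEntry e = buckets[index]; e != null; previous = e, e = e.Next)
        {
            if (e.Key != key) continue;

            // Remove the entry by linking around it.
            if (previous == null) buckets[index] = e.Next; // The next entry becomes the root
            else previous.Next = e.Next;
            return true;
        }

        return false;
    }
}
```

When the removed entry is the root of its chain, the next entry moves into the array slot.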
Demo

Add state-level caching
