Hash Maps/Dynamic Resizing

Hash Maps: Dynamic Resizing

Once we have our hash table implemented, we'll want to start adding lots of items. But then, we have to strike a balance.

The smaller the output space of the hash function, the smaller our map is in memory. But if we are using chaining, that leads to longer and longer linked lists, linked lists of size n/m, which gets further from O(1) as the linked lists get longer.

On the other hand, if we make m too big, and make the output space of our hash function too large, we end up wasting space in our hash table.

The concept behind expanding the hash table size dynamically is to maintain the balance on-the-fly. The concept is similar to the array resizing algorithm used by the Arrays/Java/PythonList PythonList class in Java, which implements a dynamically expanding/shrinking array.

Growing the table size, from m to m', requires the following steps:

make table of size m'
build new hash h'
rehash
for item in T: T'.insert(item)

How much time?

O(n) at least
In general, O(n + m + m')

Why new hash?

Hash function is designed to have the specified output space of m
Each key needs to be re-hashed

If n > m, we need to make the table bigger.

If we make the table a constant amount larger, the cost of n inserts is a triangular number, theta(1+2+3+4+...+n) = theta(n^2)
If we make the table twice as large, cost of n insertions is amortized
m' = 2m: cost of n inserts
theta(1 + 2 + 4 + 8 + ... + n)
theta(n)
Only a few - log N - operations are expensive

Similarly, deletions should happen when we get to n < m/4 quarter-full table

Should shrink table size by half
As we saw before, shrinking table size by quarter, we can have single table size or add/remove operation that costs O(N)
Cost of n inserts
Slow operation if we shrink by half when table is empty by half:
slow operation is 2^k <--insert/delete--> (2^k) + 1
This is a Theta(n) operation
When you are a quarter full, you shrink by half
Amortized time is Theta(1)
Little bit tricky to prove...

Amortization

An operation takes "T(n) amortized" if k operations take $\leq k\cdot T(n)$ time

k inserts take theta(k) time, so this is O(1) amortized insert

Flags

Hash Maps/Dynamic Resizing

From charlesreid1