Hash Tables Study Guide: Difference between revisions
From charlesreid1
(Created page with "=Definitions and Variations= =ADTs and Interfaces= =Implementations= =Algorithms for Operations= =Complexity and Cost= =OOP Principles= =Flags= {{StudyGuideFlag}} Cat...") |
|||
| Line 1: | Line 1: | ||
=Definitions and Variations= | =Definitions and Variations= | ||
==Definitions== | |||
'''hash table''' - data structure that uses key or value based lookup system | |||
'''dictionary problem''' - searching for an item in a structure; if item is not found, you learn nothing new | |||
'''hash function''' - a mapping from input objects to a hash table entry | |||
'''bucket array''' - array of data used to store hash table entries, indexed by hash codes (key space) | |||
'''collision''' - when two or more elements that are distinct/different have the same hash code | |||
'''compression function''' - maps the output of the hash code to an integer in the range 0 to N-1 | |||
'''hash code''' - an integer that is uniquely determined by the input key | |||
'''polynomial hash code''' - accounts for positions (not just presence) of various bits, each bit is the coefficient of a polynomial (use Horner's Rule to evaluate): | |||
<math> | |||
x_0 a^{n-1} + x_1 a^{n-2} + \dots + x_{n-2} a + x_{n-1} | |||
</math> | |||
'''cyclic shift hash code''' - shifts the bits of the 32-bit key representation by X bits (left or right) | |||
'''immutable''' - unable to be changed after construction/instantiation; keys must be immutable, or their hash value will change! | |||
'''division method (compression)''' - turns the hash code into an integer between 0 and N-1 via the rather unimaginative | |||
<math> | |||
i \mod N | |||
</math> | |||
'''MAD (multiply and divide) method (a.k.a. universal hashing) (compression''' - use the following compression function: | |||
<math> | |||
(ai + b \mod p ) \mod N | |||
</math> | |||
where p > N and p is prime | |||
'''collision resolution''' - rule used to determine what happens when a collision is found | |||
'''separate chaining''' - using linked lists as buckets, instead of storing elements in an array directly | |||
'''load factor''' - alpha - ratio of number of items N to number of buckets M, cost is proportional to load factor | |||
'''open addressing''' - requires load factor is 1, stores elements directly instead of in buckets | |||
'''linear probing''' - used with open addressing, collision handling technique in which a full element leads to a neighbor search (deletion becomes tricky, mark removed nodes) | |||
'''quadratic probing''' - alternative that provides more desirable properties than linera probing | |||
'''double hashing''' - if we hash a key h(k) and the position is occupied, we can use a second hash function to find a new key, | |||
<math> | |||
h(k) + i \cdot h^{\prime}(k) | |||
</math> | |||
=ADTs and Interfaces= | =ADTs and Interfaces= | ||
Revision as of 09:10, 8 July 2017
Definitions and Variations
Definitions
hash table - data structure that uses key or value based lookup system
dictionary problem - searching for an item in a structure; if item is not found, you learn nothing new
hash function - a mapping from input objects to a hash table entry
bucket array - array of data used to store hash table entries, indexed by hash codes (key space)
collision - when two or more elements that are distinct/different have the same hash code
compression function - maps the output of the hash code to an integer in the range 0 to N-1
hash code - an integer that is uniquely determined by the input key
polynomial hash code - accounts for positions (not just presence) of various bits, each bit is the coefficient of a polynomial (use Horner's Rule to evaluate):
$ x_0 a^{n-1} + x_1 a^{n-2} + \dots + x_{n-2} a + x_{n-1} $
cyclic shift hash code - shifts the bits of the 32-bit key representation by X bits (left or right)
immutable - unable to be changed after construction/instantiation; keys must be immutable, or their hash value will change!
division method (compression) - turns the hash code into an integer between 0 and N-1 via the rather unimaginative
$ i \mod N $
MAD (multiply and divide) method (a.k.a. universal hashing) (compression - use the following compression function:
$ (ai + b \mod p ) \mod N $
where p > N and p is prime
collision resolution - rule used to determine what happens when a collision is found
separate chaining - using linked lists as buckets, instead of storing elements in an array directly
load factor - alpha - ratio of number of items N to number of buckets M, cost is proportional to load factor
open addressing - requires load factor is 1, stores elements directly instead of in buckets
linear probing - used with open addressing, collision handling technique in which a full element leads to a neighbor search (deletion becomes tricky, mark removed nodes)
quadratic probing - alternative that provides more desirable properties than linera probing
double hashing - if we hash a key h(k) and the position is occupied, we can use a second hash function to find a new key,
$ h(k) + i \cdot h^{\prime}(k) $