Hash Functions/Cyclic Permutation
From charlesreid1
See also: Hash Functions
Notes
Cyclic permutation is a method for generating hash functions from data.
The basic idea is to generate a binary representation of the data (typically stored in a 32-bit integer, so wrapped into 32 bits) with a series of 32 0s and 1s, then to cyclically shift the digits of this binary number. If the data are the same, they should result in the same binary number, which will be shifted into the same resulting number when the hash code is computed. Collisions will happen if two objects end up having the same binary representation.
For example, to cyclically shift the digits of 10001111 2 digits to the right, the number would become 111000011.
A simple implementation of the cyclic permutation hash code calculation in Java is given as a static method below:
public static int hashCode(String s) { int shift = 5; int h = 0; for(int i=0; i<s.length(); i++) { h = (h<<(shift)) | (h>>>(32-shift)); h += (int)(s.charAt(i)); } return h; }
The value of the shift can be anywhere from 0 (in which case, an object's hash code is simply its binary representation) to 31 (if the shift is 32, you're back to 0). The actual frequency of collisions depends heavily on this shift value. For example, here is the output of a program that uses a collection of about 7,000 words form James Joyce's Ulysses and computes a hash code for each word, counting the number of collisions:
$ javac HashFunctions.java && java HashFunctions Shift: 0 Hash codes: 7313 Collisions: 6075 Shift: 1 Hash codes: 7313 Collisions: 1678 Shift: 2 Hash codes: 7313 Collisions: 338 Shift: 3 Hash codes: 7313 Collisions: 81 Shift: 4 Hash codes: 7313 Collisions: 13 Shift: 5 Hash codes: 7313 Collisions: 2 Shift: 6 Hash codes: 7313 Collisions: 1 Shift: 7 Hash codes: 7313 Collisions: 2 Shift: 8 Hash codes: 7313 Collisions: 8 Shift: 9 Hash codes: 7313 Collisions: 2 Shift: 10 Hash codes: 7313 Collisions: 27 Shift: 11 Hash codes: 7313 Collisions: 60 Shift: 12 Hash codes: 7313 Collisions: 5 Shift: 13 Hash codes: 7313 Collisions: 2 Shift: 14 Hash codes: 7313 Collisions: 17 Shift: 15 Hash codes: 7313 Collisions: 126 Shift: 16 Hash codes: 7313 Collisions: 752 Shift: 17 Hash codes: 7313 Collisions: 60 Shift: 18 Hash codes: 7313 Collisions: 12 Shift: 19 Hash codes: 7313 Collisions: 0 Shift: 20 Hash codes: 7313 Collisions: 6 Shift: 21 Hash codes: 7313 Collisions: 51 Shift: 22 Hash codes: 7313 Collisions: 25 Shift: 23 Hash codes: 7313 Collisions: 3 Shift: 24 Hash codes: 7313 Collisions: 8 Shift: 25 Hash codes: 7313 Collisions: 3 Shift: 26 Hash codes: 7313 Collisions: 2 Shift: 27 Hash codes: 7313 Collisions: 3 Shift: 28 Hash codes: 7313 Collisions: 23 Shift: 29 Hash codes: 7313 Collisions: 114 Shift: 30 Hash codes: 7313 Collisions: 671 Shift: 31 Hash codes: 7313 Collisions: 2826
As you can see, some keys result in just a few collisions, while others result in thousands.
Flags
Maps and Dictionaries Part of Computer Science Notes
Series on Data Structures
Maps/Dictionaries Maps · Maps/ADT · Maps in Java · Maps/OOP · Maps/Operations and Performance Map implementations: Maps/AbstractMap · Maps/UnsortedArrayMap · Maps/SortedArrayMap Dictionary implementations: Dictionaries/LinkedDict · Dictionaries/ArrayDict
Hashes Hash Maps/OOP · Hash Maps/Operations and Performance Hash Maps/Dynamic Resizing · Hash Maps/Collision Handling with Chaining Hash functions: Hash Functions · Hash Functions/Cyclic Permutation Hash map implementations: Hash Maps/AbstractHashMap · Hash Maps/ChainedHashMap
Skip Lists · Java/ConcurrentSkipList · Java implementations: SkipList
Sets Sets · Sets/ADT · Sets in Java · Sets/OOP · Multisets
|