String Permutations: Difference between revisions
From charlesreid1
| Line 127: | Line 127: | ||
==Generating== | ==Generating== | ||
The task of actually generating all of the permutations of words that contain a certain set of characters is an extremely important one. Knuth covers this in Volume 4 Facsimile 2 of his [[AOCP|Art of Computer Programming]] volumes. | The task of actually generating all of the permutations of words that contain a certain set of characters is an extremely important one. (Knuth covers this exclusively in Volume 4 Facsimile 2 of his [[AOCP|Art of Computer Programming]] volumes.) | ||
We can generate permutations in lexicographic order, or sorted order | We can generate permutations in several ways: | ||
* lexicographic order, or sorted order | |||
* de Bruijn cycles, where we remove an item from the front and add an item to the rear | |||
* binary reflected Gray code, where we change only a single item at a time | |||
Excellent simple algorithm here: http://webhome.csc.uvic.ca/~haron/CoolMulti.pdf | |||
Algorithm: | |||
* Visits every permutation of integer multiset E | |||
* Call to init(E) creates singly linked list that stores elements of E in non-increasing order | |||
* head, min, and inc point to first, second to last, and last nodes, respectively | |||
* All variables are pointers | |||
* phi is null | |||
* If E = {1,1,2,4}, then first three visit(E) calls will produce the configurations: | |||
** 4 2 1 1 | |||
** 1 4 2 1 | |||
** 4 1 2 1 | |||
Here is the algorithm: | |||
<pre> | |||
# head points to first node of current permutation | |||
# i points to ith node of current permutation | |||
# afteri points to the (i+1)st node of current permutation | |||
[head, i afteri] <- init(E) | |||
visit(head) | |||
while (afteri.next != null or afteri.value < head.value) do: | |||
if (afteri.next != null and i.value >= afteri.next.value): | |||
beforek <- afteri | |||
else | |||
beforek <- i | |||
end | |||
k <- beforek.next | |||
beforek.next <- k.next | |||
k.next <- head | |||
if (k.value < head.value): | |||
i <- k | |||
end | |||
afteri <- i.next | |||
head <- k | |||
visit(head) | |||
end | |||
</pre> | |||
Revision as of 19:34, 6 August 2017
Skiena Chapter 7
Question 14
Write a function to find all permutations of the letters in a particular string (example: HELLO).
Enumerating
Enumerating - Easy Case
Let's start by counting the total number of permutations of letters in a particular string.
Consider the simplest case, no repeated letters: ABCD
Each position can hold any of the four letters, but we can only choose a letter once. So the total number of permutations of a k-letter string with no repeated letters is $ k! $
Mathematically, we are picking 4 distinct objects, and we are picking all 4 objects at the same time. In other words, 4 pick 4.
This leads to a total number of possible strings:
$ P(4,4) = \dfrac{4!}{(4-4)!} = \dfrac{4!}{1} = 4! $
We are using pick, not choose, because order matters. Including a 4! on the bottom would indicate that order does not matter, and would lead to one possible outcome (obvious - any permutation with the letters ABCD will always be the same set of letters, but in different order, so there is only one outcome).
Enumerating - Multiset Case
Now consider the less trivial case where we have repeated letters: AAABCDD
To count the number of permutations of a multiset, we should use multinomial coefficients. (Note: see AOCP/Multisets and AOCP/Multinomial Coefficients).
Mathematically, this problem is what is known as a multiset problem, where we have a set of letters each occurring with some frequency:
$ \{ a \cdot A, b \cdot B, c \cdot C, d \cdot D \} $
In this case we can write the number of possible permutations by placing each character, one at a time.
Start by placing the A characters. The number of slots to place the A characters is (a) choose (number of slots):
$ \binom{a+b+c+d}{a} $
Once the As have been placed, we can enumerate the B characters The number of slots to place the B characters is (b) choose (number of slots, less A characters):
$ \binom{b+c+d}{b} $
Next, the Cs can be placed, where the number of slots to place the C characters is (c) choose (number of slots, less A and B characters):
$ \binom{c+d}{c} $
and finally, all of the other characters are placed, leaving the Ds with a single remaining configuration:
$ \binom{d}{d} $
Now, the total number of permutations is the product of each of these. The number of configurations of a multiset is given by the multinomial coefficient
$ \binom{n}{a, b, c, d} = \binom{n}{a} \binom{n-a}{b} \binom{n-a-b}{c} \binom{n-a-b-c}{d} = \dfrac{n!}{a! b! c! d!} $
More generally, for a multiset with k objects (denoted a) occurring a certain number of times (denoted r),
$ \{ r_1 \cdot a_1, r_2 \cdot a_2, \dots, r_k \cdot a_k \} $
then if there are n total slots $ r_1 + r_2 + \dots + r_k = n $, the total number of permutations is given by the multinomial coefficient,
$ \binom{n}{r_1, r_2, \dots, r_k} = \binom{n}{r_1} \binom{n-r_1}{r_2} \binom{n - r_1 - r_2}{r_3} \dots $
This can be expressed more concisely in terms of factorials:
$ \binom{n}{r_1, r_2, \dots, r_k} = \dfrac{n!}{r_1! r_2! \dots r_k!} $
Enumerating - Infinite Letters Case
Suppose we have an infinite pool of characters, and we wish to know the number of words of a given length that can be formed using these infinite pools of letters.
We can still treat this as a multiset problem, but this time the number of characters is infinite. This multiset is denoted:
$ \{ \infty \cdot A, \infty \cdot B, \infty \cdot C, \infty \cdot D \} $
In this case, the number of permutations of length n that can be formed from the 4 different characters is given by the simple expression,
$ n^4 $
More generally, if we have an infinite multiset with k distinct elements,
$ \{ \infty \cdot r_1, \infty \cdot r_2, \dots, \infty \cdot r_k \} $
then the number of words of length n that can be formed is given by:
$ n^k $
Enumerating - Stars and Bars
The stars and bars theorem/approach is another way to look at assembling/enumerating permutations.
If we have n objects being placed into k partitions, we can think of this as placing k-1 bars among the n objects.
In this case, the total number of slots for objects + bars is n + k - 1, and we are placing k - 1 bars, so we have the total number of outcomes given by
$ \binom{n+k-1}{k-1} $
Generating
The task of actually generating all of the permutations of words that contain a certain set of characters is an extremely important one. (Knuth covers this exclusively in Volume 4 Facsimile 2 of his Art of Computer Programming volumes.)
We can generate permutations in several ways:
- lexicographic order, or sorted order
- de Bruijn cycles, where we remove an item from the front and add an item to the rear
- binary reflected Gray code, where we change only a single item at a time
Excellent simple algorithm here: http://webhome.csc.uvic.ca/~haron/CoolMulti.pdf
Algorithm:
- Visits every permutation of integer multiset E
- Call to init(E) creates singly linked list that stores elements of E in non-increasing order
- head, min, and inc point to first, second to last, and last nodes, respectively
- All variables are pointers
- phi is null
- If E = {1,1,2,4}, then first three visit(E) calls will produce the configurations:
- 4 2 1 1
- 1 4 2 1
- 4 1 2 1
Here is the algorithm:
# head points to first node of current permutation
# i points to ith node of current permutation
# afteri points to the (i+1)st node of current permutation
[head, i afteri] <- init(E)
visit(head)
while (afteri.next != null or afteri.value < head.value) do:
if (afteri.next != null and i.value >= afteri.next.value):
beforek <- afteri
else
beforek <- i
end
k <- beforek.next
beforek.next <- k.next
k.next <- head
if (k.value < head.value):
i <- k
end
afteri <- i.next
head <- k
visit(head)
end