Skiena Chapter 7

Question 14

Write a function to find all permutations of the letters in a particular string (example: HELLO).

Enumerating

Enumerating - Easy Case

Let's start by counting the total number of permutations of letters in a particular string.

Consider the simplest case, no repeated letters: ABCD

Each position can hold any of the four letters, but we can only choose a letter once. So the total number of permutations of a k-letter string with no repeated letters is $k!$

Mathematically, we are picking 4 distinct objects, and we are picking all 4 objects at the same time. In other words, 4 pick 4.

This leads to a total number of possible strings:

$P(4,4)={\dfrac {4!}{(4-4)!}}={\dfrac {4!}{1}}=4!$

We are using pick, not choose, because order matters. Including a 4! on the bottom would indicate that order does not matter, and would lead to one possible outcome (obvious - any permutation with the letters ABCD will always be the same set of letters, but in different order, so there is only one outcome).

Enumerating - Multiset Case

Now consider the less trivial case where we have repeated letters: AAABCDD

To count the number of permutations of a multiset, we should use multinomial coefficients. (Note: see AOCP/Multisets and AOCP/Multinomial Coefficients).

Mathematically, this problem is what is known as a multiset problem, where we have a set of letters each occurring with some frequency:

$\{a\cdot A,b\cdot B,c\cdot C,d\cdot D\}$

In this case we can write the number of possible permutations by placing each character, one at a time.

Start by placing the A characters. The number of slots to place the A characters is (a) choose (number of slots):

${\binom {a+b+c+d}{a}}$

Once the As have been placed, we can enumerate the B characters The number of slots to place the B characters is (b) choose (number of slots, less A characters):

${\binom {b+c+d}{b}}$

Next, the Cs can be placed, where the number of slots to place the C characters is (c) choose (number of slots, less A and B characters):

${\binom {c+d}{c}}$

and finally, all of the other characters are placed, leaving the Ds with a single remaining configuration:

${\binom {d}{d}}$

Now, the total number of permutations is the product of each of these. The number of configurations of a multiset is given by the multinomial coefficient

${\binom {n}{a,b,c,d}}={\binom {n}{a}}{\binom {n-a}{b}}{\binom {n-a-b}{c}}{\binom {n-a-b-c}{d}}={\dfrac {n!}{a!b!c!d!}}$

More generally, for a multiset with k objects (denoted a) occurring a certain number of times (denoted r),

$\{r_{1}\cdot a_{1},r_{2}\cdot a_{2},\dots ,r_{k}\cdot a_{k}\}$

then if there are n total slots $r_{1}+r_{2}+\dots +r_{k}=n$ , the total number of permutations is given by the multinomial coefficient,

${\binom {n}{r_{1},r_{2},\dots ,r_{k}}}={\binom {n}{r_{1}}}{\binom {n-r_{1}}{r_{2}}}{\binom {n-r_{1}-r_{2}}{r_{3}}}\dots$

This can be expressed more concisely in terms of factorials:

${\binom {n}{r_{1},r_{2},\dots ,r_{k}}}={\dfrac {n!}{r_{1}!r_{2}!\dots r_{k}!}}$

Enumerating - Infinite Letters Case

Suppose we have an infinite pool of characters, and we wish to know the number of words of a given length that can be formed using these infinite pools of letters.

We can still treat this as a multiset problem, but this time the number of characters is infinite. This multiset is denoted:

$\{\infty \cdot A,\infty \cdot B,\infty \cdot C,\infty \cdot D\}$

In this case, the number of permutations of length n that can be formed from the 4 different characters is given by the simple expression,

$n^{4}$

More generally, if we have an infinite multiset with k distinct elements,

$\{\infty \cdot r_{1},\infty \cdot r_{2},\dots ,\infty \cdot r_{k}\}$

then the number of words of length n that can be formed is given by:

$n^{k}$

Enumerating - Stars and Bars

Main article: Stars and Bars

The stars and bars theorem/approach is another way to look at assembling/enumerating permutations.

If we have n objects being placed into k partitions, we can think of this as placing k-1 bars among the n objects.

In this case, the total number of slots for objects + bars is n + k - 1, and we are placing k - 1 bars, so we have the total number of outcomes given by

${\binom {n+k-1}{k-1}}$

Generating

The task of actually generating all of the permutations of words that contain a certain set of characters is an extremely important one. (Knuth covers this exclusively in Volume 4 Facsimile 2 of his Art of Computer Programming volumes.)

We can generate permutations in several ways:

lexicographic order, or sorted order
de Bruijn cycles, where we remove an item from the front and add an item to the rear
binary reflected Gray code, where we change only a single item at a time
Cool-lexicographic order (see http://webhome.csc.uvic.ca/~haron/CoolMulti.pdf)

Cool-Lexicographic Order

Main article: Cool

This is a nice simple algorithm that implements an iterative, non-recursive, loopless permutation generator.

Algorithm:

Visits every permutation of integer multiset E
Call to init(E) creates singly linked list that stores elements of E in non-increasing order
head, min, and inc point to first, second to last, and last nodes, respectively
All variables are pointers
phi is null
If E = {1,1,2,4}, then first three visit(E) calls will produce the configurations:
- 4 2 1 1
- 1 4 2 1
- 4 1 2 1

Notes on variables:

head points to first node of current permutation
i points to ith node of current permutation
afteri points to the (i+1)st node of current permutation

to apply the prefix shift of length k, the two pointers k and beforek are used:

k points to the kth node of the current permutation
beforek points to the (k-1)st node of the current permutation

Note that the visit(head) call denotes a new permutation that has been generated

visit(head) happens when new permutation pointed to by head
Represents passing control back to consumer requesting permutations
Work in init(E) would also be done by consumer

When does the loop stop?

Condition on loop ensures algorithm continues until it generates tail(E)
Rather than generating tail(E) as separate case after oop ends, it initializes linked list to tail(E)
This is the very first permutation visited

Here is the algorithm:

[head, i afteri] <- init(E)
visit(head)
while (afteri.next != null or afteri.value < head.value) do:
    if (afteri.next != null and i.value >= afteri.next.value):
        beforek <- afteri
    else
        beforek <- i
    end
    k <- beforek.next
    beforek.next <- k.next
    k.next <- head
    if (k.value < head.value):
        i <- k
    end
    afteri <- i.next
    head <- k
    visit(head)
end

Flags

String Permutations

From charlesreid1

Contents