From charlesreid1

Skiena Chapter 7

Question 14

Write a function to find all permutations of the letters in a particular string (example: HELLO).

Enumerating

Enumerating - Easy Case

Let's start by counting the total number of permutations of letters in a particular string.

Consider the simplest case, no repeated letters: ABCD

Each position can hold any of the four letters, but we can only choose a letter once. So the total number of permutations of a k-letter string with no repeated letters is k!

Mathematically, we are picking 4 distinct objects, and we are picking all 4 objects at the same time. In other words, 4 pick 4.

This leads to a total number of possible strings:


P(4,4) = \dfrac{4!}{(4-4)!} = \dfrac{4!}{1} = 4!

We are using pick, not choose, because order matters. Including a 4! on the bottom would indicate that order does not matter, and would lead to one possible outcome (obvious - any permutation with the letters ABCD will always be the same set of letters, but in different order, so there is only one outcome).

Enumerating - Multiset Case

Now consider the less trivial case where we have repeated letters: AAABCDD

To count the number of permutations of a multiset, we should use multinomial coefficients. (Note: see AOCP/Multisets and AOCP/Multinomial Coefficients).

Mathematically, this problem is what is known as a multiset problem, where we have a set of letters each occurring with some frequency:


\{ a \cdot A, b \cdot B, c \cdot C, d \cdot D \}

In this case we can write the number of possible permutations by placing each character, one at a time.

Start by placing the A characters. The number of slots to place the A characters is (a) choose (number of slots):


\binom{a+b+c+d}{a}

Once the As have been placed, we can enumerate the B characters The number of slots to place the B characters is (b) choose (number of slots, less A characters):


\binom{b+c+d}{b}

Next, the Cs can be placed, where the number of slots to place the C characters is (c) choose (number of slots, less A and B characters):


\binom{c+d}{c}

and finally, all of the other characters are placed, leaving the Ds with a single remaining configuration:


\binom{d}{d}

Now, the total number of permutations is the product of each of these. The number of configurations of a multiset is given by the multinomial coefficient


\binom{n}{a, b, c, d} = \binom{n}{a} \binom{n-a}{b} \binom{n-a-b}{c} \binom{n-a-b-c}{d} = \dfrac{n!}{a! b! c! d!}

More generally, for a multiset with k objects (denoted a) occurring a certain number of times (denoted r),


\{ r_1 \cdot a_1, r_2 \cdot a_2, \dots, r_k \cdot a_k \}

then if there are n total slots r_1 + r_2 + \dots + r_k = n, the total number of permutations is given by the multinomial coefficient,


\binom{n}{r_1, r_2, \dots, r_k} = \binom{n}{r_1} \binom{n-r_1}{r_2} \binom{n - r_1 - r_2}{r_3} \dots

This can be expressed more concisely in terms of factorials:


\binom{n}{r_1, r_2, \dots, r_k} = \dfrac{n!}{r_1! r_2! \dots r_k!}

Enumerating - Infinite Letters Case

Suppose we have an infinite pool of characters, and we wish to know the number of words of a given length that can be formed using these infinite pools of letters.

We can still treat this as a multiset problem, but this time the number of characters is infinite. This multiset is denoted:


\{ \infty \cdot A, \infty \cdot B, \infty \cdot C, \infty \cdot D \}

In this case, the number of permutations of length n that can be formed from the 4 different characters is given by the simple expression,


n^4

More generally, if we have an infinite multiset with k distinct elements,


\{ \infty \cdot r_1, \infty \cdot r_2, \dots, \infty \cdot r_k \}

then the number of words of length n that can be formed is given by:

n^k

Enumerating - Stars and Bars

The stars and bars theorem/approach is another way to look at assembling/enumerating permutations.

If we have n objects being placed into k partitions, we can think of this as placing k-1 bars among the n objects.

In this case, the total number of slots for objects + bars is n + k - 1, and we are placing k - 1 bars, so we have the total number of outcomes given by


\binom{n+k-1}{k-1}

Generating

The task of actually generating all of the permutations of words that contain a certain set of characters is an extremely important one. (Knuth covers this exclusively in Volume 4 Facsimile 2 of his Art of Computer Programming volumes.)

We can generate permutations in several ways:

  • lexicographic order, or sorted order
  • de Bruijn cycles, where we remove an item from the front and add an item to the rear
  • binary reflected Gray code, where we change only a single item at a time
  • Cool-lexicographic order (see http://webhome.csc.uvic.ca/~haron/CoolMulti.pdf)

Cool-Lexicographic Order

This is a nice simple algorithm that implements an iterative, non-recursive, loopless permutation generator.

Algorithm:

  • Visits every permutation of integer multiset E
  • Call to init(E) creates singly linked list that stores elements of E in non-increasing order
  • head, min, and inc point to first, second to last, and last nodes, respectively
  • All variables are pointers
  • phi is null
  • If E = {1,1,2,4}, then first three visit(E) calls will produce the configurations:
    • 4 2 1 1
    • 1 4 2 1
    • 4 1 2 1

Notes on variables:

  • head points to first node of current permutation
  • i points to ith node of current permutation
  • afteri points to the (i+1)st node of current permutation

to apply the prefix shift of length k, the two pointers k and beforek are used:

  • k points to the kth node of the current permutation
  • beforek points to the (k-1)st node of the current permutation

Note that the visit(head) call denotes a new permutation that has been generated

  • visit(head) happens when new permutation pointed to by head
  • Represents passing control back to consumer requesting permutations
  • Work in init(E) would also be done by consumer

When does the loop stop?

  • Condition on loop ensures algorithm continues until it generates tail(E)
  • Rather than generating tail(E) as separate case after oop ends, it initializes linked list to tail(E)
  • This is the very first permutation visited

Here is the algorithm:

[head, i afteri] <- init(E)
visit(head)
while (afteri.next != null or afteri.value < head.value) do:
    if (afteri.next != null and i.value >= afteri.next.value):
        beforek <- afteri
    else
        beforek <- i
    end
    k <- beforek.next
    beforek.next <- k.next
    k.next <- head
    if (k.value < head.value):
        i <- k
    end
    afteri <- i.next
    head <- k
    visit(head)
end



Flags