Letter Coverage: Difference between revisions
From charlesreid1
| Line 13: | Line 13: | ||
===A Simpler Example=== | ===A Simpler Example=== | ||
Suppose we are considering 2-letter words taken from a 5-letter alphabet ''abcde''. We can represent a given word as a binary string/bit vector: | Suppose we are considering 2-letter words taken from a 5-letter alphabet ''abcde''. We can represent a given word as a binary string/bit vector: for example, the two-letter word <code>ab</code> would be represented by the bit vector <code>11000</code>. | ||
Now let's look at a set of words and step through the algorithm: | |||
<pre> | <pre> | ||
ab = 11000 | W1 = ab = 11000 | ||
W2 = bc = 01100 | |||
W3 = aa = 10000 | |||
W4 = dd = 00010 | |||
W5 = de = 00011 | |||
W6 = bb = 01000 | |||
</pre> | </pre> | ||
Now, we wish to write a dynamic program that will find the smallest set of words such that taking the union of each bit vector gives 11111. At each step, we seek the words that will maximize the length of the sequence of 1s in the union of the bit vectors, for the minimum number of words. We take the union of the "longest sequence of 1s" bit vector from the prior step, plus the bit vector for the current step. | |||
Start with word W1: this is the only bit vector, so it sets the starting "longest sequence of 1s" bit vector. We wish to maximize "longest sequence of 1s" and minimize number of words. | |||
* W1 solution is therefore <math>11000</math>. The number of 1s is 1. The number of words is 1. | |||
Next is word W2: the "longest sequence of 1s" bit vector is the union of the prior step's "longest sequence of 1s" bit vector and the current word's bit vector. One option: | |||
* union of W1 solution and W2 <math>11000 \bigcup 01100 = 11100</math>. The number of 1s is 3. The number of words is 2. | |||
Next is word W3: the "longest sequence of 1s" bit vector is the union that maximizes the number of 1s and minimizes the number of words. Two options: | |||
* union of W1 solution and W3 <math>11000 \bigcup 10000 = 11000</math>. The number of 1s is 2. The number of words is 2. | |||
* union of W2 solution and W3 <math>11100 \bigcup 10000 = 11100</math>. The number of 1s is 3. The number of words is 3. | |||
Next is word W4: the "longest sequence of 1s" bit vector is the union that maximizes the number of 1s and minimizes the number of words. Three options: | |||
* union of W1 solution and W4 <math>11000 \bigcup 00010 = 11010</math>. The number of 1s is 3. The number of words is 2. | |||
* union of W2 solution and W4 <math>11100 \bigcup 00010 = 11110</math>. The number of 1s is 4. The number of words is 3. (W4 SOLUTION) | |||
* union of W3 solution and W4 <math>11100 \bigcup 00010 = 11110</math>. The number of 1s is 4. The number of words is 4. | |||
Next is word W5: the "longest sequence of 1s" bit vector is the union maximizing number of 1s and minimizing number of words. Four options: | |||
* union of W1 solution and W5 <math>11000 \bigcup 00011 = 11011</math>. The number of 1s is 4. The number of words is 2. | |||
* union of W2 solution and W5 | |||
* union of W3 solution and W5 | |||
* union of W4 solution and W5 | |||
(NOTE: We don't need to consider every possible combination of W1, W2, W3, W4, W5, and W6; we only need to consider each word once, because each word's current solution can be written in terms of the prior word's solution, so we only need to consider solutions for each word. We've already considered the non-solutions and can therefore ignore them because they don't maximize number of 1s and minimize number of words.) | |||
===Definitions=== | ===Definitions=== | ||
Revision as of 09:49, 16 August 2017
Problem Description
The letter/word coverage problem is the problem of finding the minimum number of words from the five-letter-words set (see Five Letter Words) to cover X letters of the alphabet.
The problem has a couple of variations: we might provide a set of letters, and search for the smallest number of words that can cover those particular letters. Or we might give an integer N <= 26, and search for the smallest number of words that can cover the first N letters of the alphabet.
We may also wish to restrict our search to the first M words of the 5757 total five letter words.
For the sake of simplicity, we will start with the problem of considering the first N letters of the alphabet, and the shortest sequence of words that will provide coverage of those first N letters of the alphabet.
Solution
A Simpler Example
Suppose we are considering 2-letter words taken from a 5-letter alphabet abcde. We can represent a given word as a binary string/bit vector: for example, the two-letter word ab would be represented by the bit vector 11000.
Now let's look at a set of words and step through the algorithm:
W1 = ab = 11000 W2 = bc = 01100 W3 = aa = 10000 W4 = dd = 00010 W5 = de = 00011 W6 = bb = 01000
Now, we wish to write a dynamic program that will find the smallest set of words such that taking the union of each bit vector gives 11111. At each step, we seek the words that will maximize the length of the sequence of 1s in the union of the bit vectors, for the minimum number of words. We take the union of the "longest sequence of 1s" bit vector from the prior step, plus the bit vector for the current step.
Start with word W1: this is the only bit vector, so it sets the starting "longest sequence of 1s" bit vector. We wish to maximize "longest sequence of 1s" and minimize number of words.
- W1 solution is therefore $ 11000 $. The number of 1s is 1. The number of words is 1.
Next is word W2: the "longest sequence of 1s" bit vector is the union of the prior step's "longest sequence of 1s" bit vector and the current word's bit vector. One option:
- union of W1 solution and W2 $ 11000 \bigcup 01100 = 11100 $. The number of 1s is 3. The number of words is 2.
Next is word W3: the "longest sequence of 1s" bit vector is the union that maximizes the number of 1s and minimizes the number of words. Two options:
- union of W1 solution and W3 $ 11000 \bigcup 10000 = 11000 $. The number of 1s is 2. The number of words is 2.
- union of W2 solution and W3 $ 11100 \bigcup 10000 = 11100 $. The number of 1s is 3. The number of words is 3.
Next is word W4: the "longest sequence of 1s" bit vector is the union that maximizes the number of 1s and minimizes the number of words. Three options:
- union of W1 solution and W4 $ 11000 \bigcup 00010 = 11010 $. The number of 1s is 3. The number of words is 2.
- union of W2 solution and W4 $ 11100 \bigcup 00010 = 11110 $. The number of 1s is 4. The number of words is 3. (W4 SOLUTION)
- union of W3 solution and W4 $ 11100 \bigcup 00010 = 11110 $. The number of 1s is 4. The number of words is 4.
Next is word W5: the "longest sequence of 1s" bit vector is the union maximizing number of 1s and minimizing number of words. Four options:
- union of W1 solution and W5 $ 11000 \bigcup 00011 = 11011 $. The number of 1s is 4. The number of words is 2.
- union of W2 solution and W5
- union of W3 solution and W5
- union of W4 solution and W5
(NOTE: We don't need to consider every possible combination of W1, W2, W3, W4, W5, and W6; we only need to consider each word once, because each word's current solution can be written in terms of the prior word's solution, so we only need to consider solutions for each word. We've already considered the non-solutions and can therefore ignore them because they don't maximize number of 1s and minimize number of words.)
Definitions
Solution Procedure
To solve dynamic programming problems, apply these three steps:
1. Formulate answer as recurrence relation/recursive algorithm
2. Show that the number of parameters taken on by the recurrence is bounded by a polynomial
3. Specify the order of evaluation for the recurrence so that partial results that are needed are always available