Algorithms/Dynamic Programming
From charlesreid1
Notes
Skiena Chapter 8
Fibonacci dynamic programming example
Skiena begins the chapter with the Fibonacci example.
Worst way to do Fibonacci calculation: using recursion
More efficient way to do Fibonacci calculation: cache the intermediate results
Most efficient way: specify the order of evaluation in such a way that the results that are needed are always available. No caching necessary. It's a bit trivial once you reach this point, but it is still dynamic programming - you're just specifying the order in which to evaluate the sub-problems so that you always have the information you need.
Binomial dynamic programming example
Skiena uses the binomial coefficient as an example of a problem with a single expression that can be evaluated, but that single expression will cause integer overflow.
Instead: use identity,
and the base cases:
To implement this, we can build a table/array, which will contain Pascal's triangle...
Left column: m, top column: n
Now the order of operations is to fill in the edges of Pascal's triangle (the base cases) first
- Initialize column 0 = 1
- Initialize diagonal = 1
Now loop over each row and each column of the row to compute the binomial coefficient:
for i = 1 to n: for j = 1 to i: calculate binomial coefficient
Fuzzy string matching example
The next example Skiena covers is an important one: fuzzy string matching.
We are trying to pattern-match the string P in a larger text T.
To do approximate string matching, we must begin with a cost function. The cost function assigns a cost to each possible modification of our original string.
3 classes of edits:
- Substitution (turning one letter into another)
- Insertion
- Deletion
To find the minimum cost function to turn our pattern P into something in the original text T, we need to work backwards: from the ending point, work backwards 1 edit at a time.
Maintain a table:
- D(i,j) is the distance (the number of differences) between P1, P2, ..., Pi and the segment of T ending at j
- D(i,j) is the minimum of the three possible ways to extend the current string
- In other words, it is the minimum of the following:
Substitution: if then , otherwise
Insert:
Delete:
Pseudocode for program:
# Non-Dynamic Programming Version of Fuzzy String Matching def string_compare(s, t, i, j): # Base case if i==0 or j==0: return # Option: Match match = string_compare(s, t, i-1, j-1) + match(s[i], t[j]) # Option: Insert insert = string_compare(s, t, i, j-1) + insertdelete(t[j]) # Option: Delete delete = string_compare(s, t, i-1, j) + insertdelete(s[i]) # Pick minimum of these return min(match, insert, delete)
But this is horribly inefficient. Why? Because it does not memoize. This means that it is repeating the same tasks over and over again.
Fuzzy string matching dynamic programming example
Define a C struct to hold information for cells. Each cell is a particular D(i,j) location.
Each cell holds cost, and parent cell. Now, in place of a recursive call, we have a table/cell lookup.
There is a fixed number of entries: size of P times size of T . The table has one column for each character in T, and one row for each character in P.
There are two ways to approach this problem:
- We can get the edit distance, but not the actual sequence of edits. This is a bit easier.
- We can get the actual sequence of edits. This is a little harder, but more useful.
To reconstruct the tour, we can use the cost matrix D(i,j) and work backwards to reconstruct the tour from D(i,j).
Here are the functions to define:
- Initialization - this creates the "boundary conditions" or base cases for the dynamic program. This is the starting point that we build on
- Penalty cost - this is a function that determines the actual cost of each edit (modify, insert, delete). The simplest case assigns a cost of 1, but this can tackle more general problems by making it a function.
- Goal cell - this is a function that returns the indices of the endpoint cell. Again, the simplest case is the last cell in the table (which has a fixed size), but making this a function makes it more general.
- Traceback - defining a "trace" action allows printing information about where in the dynamic program we are (for example, printing out the name of each operation, so we can see MMDMDIIMMD to indicate the sequence of edits being made)
Dynamic programming algorithms are easy to design, but getting boundary conditions and index manipulations correct requires great care.
Why is there so much extra fluff?
- Fuzzy string matching is solved, but the extra fluff solves other problems too
- Fuzzy string matching in large text body (slight modifications required to row initialization and goal cell functions)
- Largest common subsequence - finding number of characters in common between two strings, also called the LCS (longest common subsequence) problem; requires modifying match cost function
- Maximum monotone subsequence - finding the fewest numbr of elements to delete to make a sequence monotonic - for example, 243517698 -> 23568 (this is identical to the LCS problem)
Longest Increasing Sequence Example
Code and Example Problems
Link to some practice problems: [1]
Longest Increasing Subsequence
Link: https://git.charlesreid1.com/cs/java/src/master/dynamic-programming
Racing Gems
Link: https://git.charlesreid1.com/cs/java/src/master/dynamic-programming/racing-gems
Maximum Value Contiguous Subsequence
Link: https://git.charlesreid1.com/cs/java/src/master/dynamic-programming/max-val-seq
Letter Coverage
Link: https://git.charlesreid1.com/cs/five-letter-words/src/master/word_coverage
Resources
MIT Open Courseware 6.006
Links to Videos
Dynamic programming 1: Fibonacci, shortest paths: YouTube video link
Dynamic programming 2: Blackjack: YouTube video link
Dynamic programming 3: Parenthesization, edit distance, knapsack: YouTube video link
Dynamic programming 4: guitars, tetris, super mario: YouTube video link
MIT Open Courseware 6.046
Links to Videos
The topic of dynamic programming is covered in the MIT 6.046 open courseware playlist
The course begins with dynamic programming, by covering the divide and conquer approach for finding a convex hull: YouTube video link
It continues with a divide and conquer approach for matrix algebra and Fast Fourier Transforms: YouTube video link
It then covers divide and conquer for van Emde Boas trees: YouTube video link
Recitation with more dynamic programming examples: YouTube video link
Advanced dynamic programming: YouTube video link
Dynamic programming for all pairs, shortest paths: YouTube video link