DP_LCS
The Longest Common Subsequence problem
Given two strings a and b, return the length of their longest common subsequence (LCS).
Note: the characters of a subsequence need not be consecutive in the original string, but the characters of a substring must be.
For example, given string a = “abcdefg” and string b = “bcdg”, the longest common subsequence is “bcdg”, but the longest common substring is “bcd”.
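To make the difference between the two notions concrete, here is a minimal Python sketch that checks the example above. The helper name is_subsequence is an assumption made for illustration; it is not part of the original post.

```python
def is_subsequence(s: str, text: str) -> bool:
    """Return True if the characters of s appear in text in order (gaps allowed)."""
    it = iter(text)
    # `ch in it` consumes the iterator up to the first match,
    # so later characters can only match further to the right.
    return all(ch in it for ch in s)


a = "abcdefg"
b = "bcdg"
print(is_subsequence(b, a))  # subsequence check: gaps are allowed
print(b in a)                # substring check: characters must be adjacent
print("bcd" in a)            # "bcd" is a substring of a
```

The three lines print True, False and True: “bcdg” is a subsequence of a but not a substring, while “bcd” is a substring.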
Analysis:
Most DP problems need an array to store partial solutions, and so does this one.
Assume t[,] is the array that holds the solutions, where t[i, j] is the length of the LCS of the first i characters of a and the first j characters of b (characters are counted from 1, so a[i] is the i-th character of a). We can compute t[i, j] as follows:
- If a[i] == b[j], then increase the length by 1:
t[i, j] = t[i - 1, j - 1] + 1
- else:
t[i, j] = max(t[i - 1, j], t[i, j - 1])
How do we initialize array t? If string a or string b is empty, the length of the LCS is 0. So we initialize the first row and the first column of t to 0, which means t needs one extra row and one extra column to hold these 0s.
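The recurrence and the initialization described above can be sketched in Python as follows. This is a minimal illustration, not necessarily the code the post refers to; the function name lcs_length is an assumption. Python strings are indexed from 0, so the i-th character of a is written a[i - 1].

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # Extra row 0 and column 0 stay at 0: the LCS with an empty string is empty.
    t = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                # Matching characters extend the LCS of the two shorter prefixes.
                t[i][j] = t[i - 1][j - 1] + 1
            else:
                # Otherwise drop the last character of a or of b, whichever is better.
                t[i][j] = max(t[i - 1][j], t[i][j - 1])
    return t[len(a)][len(b)]


print(lcs_length("abcdefg", "bcdg"))
```

For the example strings “abcdefg” and “bcdg” this prints 4, the length of “bcdg”. The table takes (len(a) + 1) × (len(b) + 1) cells, and each cell is filled in constant time.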
The following tables show the detailed steps for computing the solution, using a = “abcde” and b = “bce”.
First, initialize array t
|           | Index a → | 0 | 1 | 2 | 3 | 4 | 5 |
| Index b ↓ |           |   | a | b | c | d | e |
| 0         |           | 0 | 0 | 0 | 0 | 0 | 0 |
| 1         | b         | 0 |   |   |   |   |   |
| 2         | c         | 0 |   |   |   |   |   |
| 3         | e         | 0 |   |   |   |   |   |
Compute the first row
|           | Index a → | 0 | 1 | 2 | 3 | 4 | 5 |
| Index b ↓ |           |   | a | b | c | d | e |
| 0         |           | 0 | 0 | 0 | 0 | 0 | 0 |
| 1         | b         | 0 | 0 | 1 | 1 | 1 | 1 |
| 2         | c         | 0 |   |   |   |   |   |
| 3         | e         | 0 |   |   |   |   |   |
Compute the second and third rows in the same way
|           | Index a → | 0 | 1 | 2 | 3 | 4 | 5 |
| Index b ↓ |           |   | a | b | c | d | e |
| 0         |           | 0 | 0 | 0 | 0 | 0 | 0 |
| 1         | b         | 0 | 0 | 1 | 1 | 1 | 1 |
| 2         | c         | 0 | 0 | 1 | 2 | 2 | 2 |
| 3         | e         | 0 | 0 | 1 | 2 | 2 | 3 |
So the result is the last element, t[3, 5] = 3.
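The hand-computed table can be checked with a short Python sketch that returns the whole table instead of only the last element. The function name lcs_table is an assumption for illustration. To match the layout of the tables above, the rows follow b and the columns follow a.

```python
def lcs_table(a: str, b: str) -> list[list[int]]:
    """Fill the full DP table; rows are indexed by b, columns by a."""
    t = [[0] * (len(a) + 1) for _ in range(len(b) + 1)]
    for i in range(1, len(b) + 1):
        for j in range(1, len(a) + 1):
            if b[i - 1] == a[j - 1]:
                t[i][j] = t[i - 1][j - 1] + 1             # upper-left value plus one
            else:
                t[i][j] = max(t[i][j - 1], t[i - 1][j])   # larger of left and upper value
    return t


for row in lcs_table("abcde", "bce"):
    print(row)
```

The four printed rows are [0, 0, 0, 0, 0, 0], [0, 0, 1, 1, 1, 1], [0, 0, 1, 2, 2, 2] and [0, 0, 1, 2, 2, 3], which agree with the final table.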
When you compute any value t[i, j] in the table, first check whether the two characters for that cell are equal. In the tables above the rows follow b and the columns follow a, so for the cell t[i, j] the characters to compare are b[i] and a[j]; the LCS is symmetric in the two strings, so the recurrence is unchanged.
If they are equal, t[i, j] is its upper-left value in the table plus one. t[1, 2] is such a case: because b[1] = ‘b’ and a[2] = ‘b’, t[1, 2] = t[0, 1] + 1 = 0 + 1 = 1.
If they are different, t[i, j] is the larger of its left value and its upper value in the table. t[1, 3] is such a case: since b[1] = ‘b’ and a[3] = ‘c’ are different, t[1, 3] = max(t[1, 2], t[0, 3]) = max(1, 0) = 1.
Repeat this process until you reach the last element of table t; that is the value you want.
See code