[Algorithms] Longest Common Subsequence

The Longest Common Subsequence (LCS) problem is as follows:

Given two sequences s and t, find the length of the longest sequence r that is a subsequence of both s and t.

Do you know the difference between a substring and a subsequence? A substring is a contiguous run of characters, while a subsequence keeps the original order but need not be contiguous. For example, "abc" is both a substring and a subsequence of "abcde", while "ade" is only a subsequence.
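
To make the distinction concrete, here is a minimal sketch of the two checks. The function names isSubsequence and isSubstring are made up for this illustration; they are not part of the solution that follows.

```cpp
#include <cstddef>
#include <string>

// Returns true if r can be obtained from s by deleting zero or more
// characters without reordering the rest (simple two-pointer scan).
bool isSubsequence(const std::string& r, const std::string& s) {
    std::size_t k = 0;  // index of the next character of r still to be matched
    for (std::size_t i = 0; i < s.size() && k < r.size(); ++i)
        if (s[i] == r[k]) ++k;
    return k == r.size();
}

// Returns true if r occurs in s as one contiguous block.
bool isSubstring(const std::string& r, const std::string& s) {
    return s.find(r) != std::string::npos;
}
```

With these, isSubsequence("ade", "abcde") holds while isSubstring("ade", "abcde") does not, which matches the example above.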

This problem is a classic application of Dynamic Programming. Let's define the sub-problem (state) P[i][j] to be the length of the longest common subsequence of the prefixes s[0..i] and t[0..j], with P[i][j] = 0 whenever i or j is negative (an empty prefix). Then the state equations are

  1. P[i][j] = max(P[i][j - 1], P[i - 1][j]) if s[i] != t[j];
  2. P[i][j] = P[i - 1][j - 1] + 1 if s[i] == t[j].
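
The two rules can also be read directly as a recursion with memoization. The following is only a sketch to illustrate the recurrence, not the version used in the rest of this post; the names lcsPrefix and lcsTopDown are made up here, and a negative index is assumed to stand for an empty prefix.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// lcsPrefix(i, j): LCS length of s[0..i] and t[0..j] (both ends inclusive).
// memo[i][j] == -1 means "not computed yet".
int lcsPrefix(const std::string& s, const std::string& t, int i, int j,
              std::vector<std::vector<int>>& memo) {
    if (i < 0 || j < 0) return 0;             // base case: an empty prefix
    if (memo[i][j] != -1) return memo[i][j];  // reuse a stored answer
    int best;
    if (s[i] == t[j])
        best = lcsPrefix(s, t, i - 1, j - 1, memo) + 1;    // rule 2
    else
        best = std::max(lcsPrefix(s, t, i, j - 1, memo),
                        lcsPrefix(s, t, i - 1, j, memo));  // rule 1
    return memo[i][j] = best;
}

int lcsTopDown(const std::string& s, const std::string& t) {
    int m = static_cast<int>(s.size()), n = static_cast<int>(t.size());
    if (m == 0 || n == 0) return 0;
    std::vector<std::vector<int>> memo(m, std::vector<int>(n, -1));
    return lcsPrefix(s, t, m - 1, n - 1, memo);
}
```

The recursion depth can reach m + n, so for long inputs an iterative table is the safer choice.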

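The value of the last state is the length of the longest common subsequence of the whole of s and t. In the code the indices are shifted by one: dp[i][j] covers the first i characters of s and the first j characters of t, so row 0 and column 0 hold the base case 0. The code is as follows.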

#include <algorithm>
#include <string>
#include <vector>
using namespace std;

int longestCommonSubsequence(string s, string t) {
    int m = s.length(), n = t.length();
    // dp[i][j] = LCS length of the first i characters of s and the first j of t
    vector<vector<int> > dp(m + 1, vector<int> (n + 1, 0));
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++)
            dp[i][j] = (s[i - 1] == t[j - 1] ? dp[i - 1][j - 1] + 1 : max(dp[i - 1][j], dp[i][j - 1]));
    return dp[m][n];
}

This code has both time and space complexity of O(m*n). Note that when we update dp[i][j], we only need dp[i - 1][j - 1], dp[i - 1][j] and dp[i][j - 1]. So we only need to maintain two columns of the table, the previous one and the current one, which reduces the space to O(m). The code is as follows.

int longestCommonSubsequenceSpaceEfficient(string s, string t) {
    int m = s.length(), n = t.length();
    if (m == 0 || n == 0) return 0;  // guard: s[0], t[0] and pre[0] are read below
    int maxlen = 0;
    vector<int> pre(m, 0);  // previous column (for t[j - 1])
    vector<int> cur(m, 0);  // current column (for t[j])
    // First column: each prefix of s against the single character t[0]
    pre[0] = (s[0] == t[0]);
    maxlen = max(maxlen, pre[0]);
    for (int i = 1; i < m; i++) {
        if (s[i] == t[0] || pre[i - 1] == 1) pre[i] = 1;
        maxlen = max(maxlen, pre[i]);
    }
    for (int j = 1; j < n; j++) {
        // First row: the single character s[0] against each prefix of t
        if (s[0] == t[j] || pre[0] == 1) cur[0] = 1;
        maxlen = max(maxlen, cur[0]);
        for (int i = 1; i < m; i++) {
            if (s[i] == t[j]) cur[i] = pre[i - 1] + 1;
            else cur[i] = max(cur[i - 1], pre[i]);
            maxlen = max(maxlen, cur[i]);
        }
        swap(pre, cur);                   // the current column becomes the previous one
        fill(cur.begin(), cur.end(), 0);  // reset the buffer for the next column
    }
    return maxlen;
}

The second column is only needed for retrieving pre[i - 1], so we can hold that value in a single variable and keep only one column. The code becomes shorter and uses about half the memory, while the running time stays O(m*n). However, you may need to run some examples to see how it achieves what the two-column version does.

int longestCommonSubsequenceSpaceMoreEfficient(string s, string t) {
    int m = s.length(), n = t.length();
    vector<int> cur(m + 1, 0);  // one column, indexed by prefix length of s
    for (int j = 1; j <= n; j++) {
        int pre = 0;  // old value of cur[i - 1], i.e. dp[i - 1][j - 1]
        for (int i = 1; i <= m; i++) {
            int temp = cur[i];  // save dp[i][j - 1] before it is overwritten
            cur[i] = (s[i - 1] == t[j - 1] ? pre + 1 : max(cur[i], cur[i - 1]));
            pre = temp;
        }
    }
    return cur[m];
}
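
As one way to run such an example, the sketch below repeats the one-column update but records the column after each character of t has been processed. lcsTrace is a made-up helper name for this illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Same update as the one-column version, but returns a snapshot of the
// column after every pass over s, so the table can be inspected by hand.
std::vector<std::vector<int>> lcsTrace(const std::string& s, const std::string& t) {
    int m = static_cast<int>(s.size()), n = static_cast<int>(t.size());
    std::vector<int> cur(m + 1, 0);
    std::vector<std::vector<int>> snapshots;
    for (int j = 1; j <= n; j++) {
        int pre = 0;  // old value of cur[i - 1], i.e. dp[i - 1][j - 1]
        for (int i = 1; i <= m; i++) {
            int temp = cur[i];  // dp[i][j - 1], needed as pre in the next step
            cur[i] = (s[i - 1] == t[j - 1]) ? pre + 1 : std::max(cur[i], cur[i - 1]);
            pre = temp;
        }
        snapshots.push_back(cur);
    }
    return snapshots;
}
```

For s = "abc" and t = "ac" the two snapshots are {0, 1, 1, 1} and {0, 1, 1, 2}: after 'a' every non-empty prefix of s has an LCS of length 1, and after 'c' only the full string "abc" reaches length 2.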

Now you may try this problem on UVa Online Judge and get Accepted:)

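Of course, the above code only returns the length of the longest common subsequence. If you want to print the LCS itself, you need to keep the full 2-D table and walk it from the bottom-right corner to the top-left: when the two characters match, take that character and move diagonally; otherwise move toward the neighbor with the larger value. The detailed algorithm is clearly explained here. The code is as follows.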

#include <cstdio>  // for printf

int longestCommonSubsequence(string s, string t) {
    int m = s.length(), n = t.length();
    vector<vector<int> > dp(m + 1, vector<int> (n + 1, 0));
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++)
            dp[i][j] = (s[i - 1] == t[j - 1] ? dp[i - 1][j - 1] + 1 : max(dp[i - 1][j], dp[i][j - 1]));
    int len = dp[m][n];
    // Print out the longest common subsequence
    string lcs(len, ' ');
    // Walk back from dp[m][n], filling lcs from its last character to its first
    for (int i = m, j = n, index = len - 1; i > 0 && j > 0;) {
        if (s[i - 1] == t[j - 1]) {
            // Matching characters belong to the LCS: take one and move diagonally
            lcs[index--] = s[i - 1];
            i--;
            j--;
        }
        else if (dp[i - 1][j] > dp[i][j - 1]) i--;  // the larger value lies above
        else j--;                                   // otherwise move left
    }
    printf("%s\n", lcs.c_str());
    return len;
}
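
If the subsequence is needed by the caller rather than on standard output, the same walk can return the string instead. This is a sketch under that assumption; lcsString is a made-up name. When several longest common subsequences exist, which one comes back depends on how ties are broken during the walk.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Builds the full table, then walks it from dp[m][n] back to the origin
// and returns one longest common subsequence of s and t.
std::string lcsString(const std::string& s, const std::string& t) {
    int m = static_cast<int>(s.size()), n = static_cast<int>(t.size());
    std::vector<std::vector<int>> dp(m + 1, std::vector<int>(n + 1, 0));
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++)
            dp[i][j] = (s[i - 1] == t[j - 1]) ? dp[i - 1][j - 1] + 1
                                              : std::max(dp[i - 1][j], dp[i][j - 1]);
    std::string lcs(dp[m][n], ' ');
    int index = dp[m][n] - 1;  // next position of lcs to fill, from the back
    for (int i = m, j = n; i > 0 && j > 0;) {
        if (s[i - 1] == t[j - 1]) {
            lcs[index--] = s[i - 1];  // matched character is part of the LCS
            i--;
            j--;
        } else if (dp[i - 1][j] > dp[i][j - 1]) {
            i--;  // the larger value lies above
        } else {
            j--;  // otherwise move left (ties also go left)
        }
    }
    return lcs;
}
```

For example, lcsString("abcde", "ace") gives "ace", and two strings with no common character give an empty string.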

 

posted by jianchao-li