Longest Common Substring

Problem Statement

Given two strings s1 and s2, find their longest common substring (LCS), i.e. the longest run of consecutive characters that appears in both. For example, for s1 = [111001] and s2 = [11011], the longest common substring is [110], with length 3.

One concise approach is Dynamic Programming (DP), which breaks the problem into smaller overlapping subproblems.

Instead of dealing with arbitrary substrings, we first consider only the common substrings that end at a given pair of positions.

Define dp[i][j] = the length of the longest common substring of s1[0~i] and s2[0~j] that ends exactly at s1[i] and s2[j].

Then the LCS length is the maximum value in the array dp.

To compute dp[i][j], we check whether s1[i] == s2[j]. If so, the common substring ending at (i-1, j-1) is extended by one character, so dp[i][j] = dp[i-1][j-1] + 1; otherwise no common substring ends at this pair and dp[i][j] is zero. Thus:

dp[i][j] = (s1[i] == s2[j] ? (dp[i-1][j-1] + 1) : 0);
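
To make the recurrence concrete, here is a minimal sketch that fills the whole table and returns only the maximum value. The name lcsLength is our own choice for illustration, and the sketch assumes that a missing diagonal neighbour (on the first row or column) counts as 0.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original post): returns only the LCS length.
int lcsLength(const std::string& s1, const std::string& s2) {
    const std::size_t n1 = s1.size(), n2 = s2.size();
    // dp[i][j] = length of the longest common substring ending at s1[i] and s2[j].
    std::vector<std::vector<int>> dp(n1, std::vector<int>(n2, 0));
    int best = 0;
    for (std::size_t i = 0; i < n1; ++i) {
        for (std::size_t j = 0; j < n2; ++j) {
            if (s1[i] == s2[j]) {
                // Assumption: with no diagonal predecessor, the previous length is 0.
                const int diag = (i > 0 && j > 0) ? dp[i - 1][j - 1] : 0;
                dp[i][j] = diag + 1;
            }
            best = std::max(best, dp[i][j]);
        }
    }
    return best;
}
```

For the example from the problem statement, lcsLength("111001", "11011") evaluates to 3.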

Since we also want the substring itself, not just its length, we only need a few modifications.

Whenever we find a dp[i][j] larger than the current maxLen, we update maxLen to dp[i][j].

if(dp[i][j] > maxLen)
    maxLen = dp[i][j];

At the same time, we record the starting index of the new longest substring. In s1, the LCS ends at index i and has length maxLen, so it starts at i + 1 - maxLen, i.e.

if(dp[i][j] > maxLen){
    maxLen = dp[i][j];
    maxIndex = i + 1 - maxLen;
}
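
Once maxIndex and maxLen are known, the substring can be read straight out of s1. The following is a minimal sketch of that idea; the function name longestCommonSubstring is hypothetical, and when several substrings tie for the maximum length it returns the first one found in scan order.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the original post): returns the LCS itself.
std::string longestCommonSubstring(const std::string& s1, const std::string& s2) {
    const int n1 = static_cast<int>(s1.size());
    const int n2 = static_cast<int>(s2.size());
    std::vector<std::vector<int>> dp(n1, std::vector<int>(n2, 0));
    int maxLen = 0;    // best length seen so far
    int maxIndex = 0;  // start of that substring inside s1
    for (int i = 0; i < n1; ++i) {
        for (int j = 0; j < n2; ++j) {
            if (s1[i] != s2[j]) continue;  // dp[i][j] stays 0
            dp[i][j] = (i > 0 && j > 0) ? dp[i - 1][j - 1] + 1 : 1;
            if (dp[i][j] > maxLen) {
                maxLen = dp[i][j];
                maxIndex = i + 1 - maxLen;  // the substring ends at i
            }
        }
    }
    // With maxLen == 0 this yields an empty string.
    return s1.substr(maxIndex, maxLen);
}
```

With the strings from the problem statement, longestCommonSubstring("111001", "11011") gives "110".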

 

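Finally, we need to initialize dp. The first row and the first column have no diagonal predecessor, so each entry is simply 1 if the two characters match and 0 otherwise: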

for(int i = 0; i < s1.length(); ++i)
    dp[i][0] = (s1[i] == s2[0] ? 1 : 0);
 
for(int j = 0; j < s2.length(); ++j)
    dp[0][j] = (s1[0] == s2[j] ? 1 : 0);
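
As an alternative to initializing the borders explicitly, a common variant (not the approach used in this post) pads the table with one extra row and column of zeros, so the recurrence applies to every cell. A minimal sketch, assuming dp is shifted by one so that dp[i][j] describes s1[i-1] and s2[j-1]; the name lcsLengthPadded is hypothetical:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical variant (not part of the original post): padded table, no border setup.
int lcsLengthPadded(const std::string& s1, const std::string& s2) {
    const std::size_t n1 = s1.size(), n2 = s2.size();
    // Row 0 and column 0 stay 0 and stand for "nothing matched yet".
    std::vector<std::vector<int>> dp(n1 + 1, std::vector<int>(n2 + 1, 0));
    int best = 0;
    for (std::size_t i = 1; i <= n1; ++i) {
        for (std::size_t j = 1; j <= n2; ++j) {
            if (s1[i - 1] == s2[j - 1]) {
                dp[i][j] = dp[i - 1][j - 1] + 1;
                best = std::max(best, dp[i][j]);
            }
        }
    }
    return best;
}
```

The trade-off is one extra row and column of memory in exchange for dropping the two initialization loops and the empty-string special case.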

 

 


The complete code is:

#include <string>
#include <vector>
using namespace std;

// On return, sIndex is the start of the LCS in s1 (-1 if there is none)
// and length is its length.
void LCS(const string &s1, const string &s2, int &sIndex, int &length)
{
    int n1 = static_cast<int>(s1.length());
    int n2 = static_cast<int>(s2.length());

    sIndex = -1;
    length = 0;
    if(0 == n1 || 0 == n2)
        return;

    // initialize dp: the interior starts at 0
    vector<vector<int> > dp(n1, vector<int>(n2, 0));
    for(int i = 0; i < n1; ++i)
        dp[i][0] = (s1[i] == s2[0] ? 1 : 0);  // Initialize the first column
    for(int j = 0; j < n2; ++j)
        dp[0][j] = (s1[0] == s2[j] ? 1 : 0);  // Initialize the first row

    // compute max length and index; start from 0 so that matches
    // in the first row and column are counted as well
    for(int i = 0; i < n1; ++i){
        for(int j = 0; j < n2; ++j){
            // interior cells extend the diagonal; border cells are already set
            if(i > 0 && j > 0 && s1[i] == s2[j])
                dp[i][j] = dp[i-1][j-1] + 1;

            if(dp[i][j] > length){
                length = dp[i][j];
                sIndex = i + 1 - length;
            }
        }
    }
}
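
Since each cell only depends on the previous row, the full table is not strictly necessary. The sketch below is one possible space-saving variant, not the author's code: it keeps two rows of a zero-padded table and returns the start index and length as a pair. The name lcsTwoRows and the pair return type are our own assumptions.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the original post).
// Returns {start index in s1, length}; start is -1 when nothing is shared.
std::pair<int, int> lcsTwoRows(const std::string& s1, const std::string& s2) {
    const int n1 = static_cast<int>(s1.size());
    const int n2 = static_cast<int>(s2.size());
    // prev and cur are rows i-1 and i of a table padded with a zero column.
    std::vector<int> prev(n2 + 1, 0), cur(n2 + 1, 0);
    int start = -1, length = 0;
    for (int i = 1; i <= n1; ++i) {
        for (int j = 1; j <= n2; ++j) {
            cur[j] = (s1[i - 1] == s2[j - 1]) ? prev[j - 1] + 1 : 0;
            if (cur[j] > length) {
                length = cur[j];
                start = i - length;  // i is 1-based here: (i - 1) + 1 - length
            }
        }
        std::swap(prev, cur);  // the finished row becomes the previous row
    }
    return {start, length};
}
```

This uses O(n2) extra memory instead of O(n1 * n2), while the running time stays O(n1 * n2).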

 

posted by kid551