[Coding Made Simple] Longest Common Substring

Given two strings, find the length of the longest common substring between them.

 

Solution 1. Brute force search, O(n^2 * m) runtime, O(1) memory

Algorithm.

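String A has O(n^2) substrings, where n is the length of A. For each substring, checking whether it occurs in string B of length m is counted as O(m) time. Once A[i...j] is not found in B, no longer substring starting at i can be found either, so the inner loop stops early.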

public class Solution {
    public int longestCommonSubstring(String A, String B) {
        if(A == null || B == null || A.length() == 0 || B.length() == 0){
            return 0;
        }
        // Start at 0 so that strings with no common character return 0.
        int max = 0;
        for(int i = 0; i < A.length(); i++){
            for(int j = i + 1; j <= A.length(); j++){
                // A.substring(i, j) covers A[i...j - 1], which has length j - i.
                if(B.indexOf(A.substring(i, j)) >= 0){
                    max = Math.max(max, j - i);
                }
                else{
                    // No longer substring starting at i can occur in B.
                    break;
                }
            }
        }
        return max;
    }
}

 

Solution 2. Dynamic Programming, O(n * m) runtime, O(n * m) space

Solution 1 is inefficient because every check starts over from index 0 of string B; the results of earlier checks on shorter substrings are never reused. For example, if A[i...j] occurs in B, then to check whether A[i...j + 1] also occurs in B, we only need to compare A[j + 1] with the character that follows the occurrence of A[i...j] in B. This takes only O(1) time. Solution 1 does not memoize previous results, so each check costs O(m) time.

 

State:

lcs[i][j]:  the length of the longest common substring that ends exactly at A[i - 1] and B[j - 1] (0 if these two characters differ).

Function:

lcs[i][j] = 1 + lcs[i - 1][j - 1],  if A[i - 1] == B[j - 1];

lcs[i][j] = 0, if A[i - 1] != B[j - 1];

Initialization:

lcs[i][0] = 0; lcs[0][j] = 0;

Answer:

the maximum value of lcs[i][j] over all i and j

public class Solution {
    public int longestCommonSubstring(String A, String B) {
        if(A == null || B == null || A.length() == 0 || B.length() == 0){
            return 0;
        }
        int n = A.length();
        int m = B.length();
        // Java zero-fills new int arrays, so lcs[i][0] and lcs[0][j] are already 0.
        int[][] lcs = new int[n + 1][m + 1];
        // Every entry is >= 0, so 0 is a safe starting value for the maximum.
        int max = 0;
        for(int i = 1; i <= n; i++){
            for(int j = 1; j <= m; j++){
                if(A.charAt(i - 1) == B.charAt(j - 1)){
                    // Extend the common substring that ends one character earlier in both strings.
                    lcs[i][j] = lcs[i - 1][j - 1] + 1;
                    max = Math.max(max, lcs[i][j]);
                }
                else{
                    lcs[i][j] = 0;
                }
            }
        }
        return max;
    }
}

 

Follow up question: Find one longest common substring.

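Answer: find a cell (i, j) that holds the maximum value in lcs, then move diagonally up and to the left one cell at a time until reaching a cell whose value is 0. The characters visited along the way, read in reverse order, form the substring. Equivalently, if the maximum value is max, the substring is A.substring(i - max, i).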
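
The diagonal walk described above can be sketched as follows. This is only an illustration built on the Solution 2 table, not a definitive implementation: the class name LongestCommonSubstringFinder and the method name find are hypothetical, and when several substrings tie for the maximum length it simply returns the first one found in row-major order.

```java
// Hypothetical helper class; the name is an assumption, not part of the original solutions.
final class LongestCommonSubstringFinder {

    // Returns one longest common substring of a and b, or "" if there is none.
    static String find(String a, String b) {
        if (a == null || b == null || a.isEmpty() || b.isEmpty()) {
            return "";
        }
        int n = a.length();
        int m = b.length();
        // Same table as in Solution 2; Java zero-fills it on allocation.
        int[][] lcs = new int[n + 1][m + 1];
        // Coordinates of a cell holding the maximum value seen so far.
        int bestI = 0;
        int bestJ = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    lcs[i][j] = lcs[i - 1][j - 1] + 1;
                    if (lcs[i][j] > lcs[bestI][bestJ]) {
                        bestI = i;
                        bestJ = j;
                    }
                }
            }
        }
        // Walk diagonally up-left from the maximum cell until a 0 cell is reached,
        // collecting characters from last to first.
        StringBuilder reversed = new StringBuilder();
        int i = bestI;
        int j = bestJ;
        while (i > 0 && j > 0 && lcs[i][j] > 0) {
            reversed.append(a.charAt(i - 1));
            i--;
            j--;
        }
        return reversed.reverse().toString();
    }
}
```

For example, LongestCommonSubstringFinder.find("abcdxyz", "xyzabcd") returns "abcd". The walk costs only O(max) extra time on top of the O(n * m) table fill.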

 

Related Problems

Longest Common Subsequence

posted @ 2017-08-18 01:00  Review->Improve