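Notes on the KMP String-Matching Algorithm
KMP is not an algorithm for finding the longest common substring. It is a more efficient search algorithm for deciding whether string A contains string B: for a text of length n and a pattern of length m it runs in O(n + m) time, whereas a naive search can take O(n * m) in the worst case.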
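For contrast, here is a minimal sketch of the naive search that KMP improves on. The class and method names (`NaiveSearch`, `naiveFind`) are made up for illustration and are not part of the original notes. After a mismatch, the naive version restarts the comparison one position further along and re-reads characters it has already seen, which is the work KMP avoids.

```java
class NaiveSearch {
    // Hypothetical baseline: returns the index of the first occurrence
    // of pattern in text, or -1 if there is none.
    static int naiveFind(String text, String pattern) {
        for (int start = 0; start + pattern.length() <= text.length(); start++) {
            int d = 0;
            // Compare the pattern with the text from scratch at this start position
            while (d < pattern.length() && text.charAt(start + d) == pattern.charAt(d)) {
                d++;
            }
            if (d == pattern.length()) {
                return start;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(naiveFind("BBC ABCDAB ABCDABCDABDE", "ABCDABD")); // prints 15
    }
}
```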
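The core of KMP is computing the next array. For each position j of the pattern, next[j] is the length of the longest proper prefix of the first j characters that is also a suffix of those characters; next[0] is set to -1 as a sentinel.
For example, the next array of ABCDABD is -1,0,0,0,0,1,2.
The search logic and the logic that builds the next array are very similar: both are an if/else inside a while loop.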
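To make the meaning of the next array concrete, the sketch below computes it directly from the definition by brute force. It is deliberately slow and is not how KMP builds the array; it is only a way to check the values by hand. The names `NextByDefinition` and `nextByDefinition` are assumptions introduced for this example.

```java
import java.util.Arrays;

class NextByDefinition {
    // Brute-force illustration: next[j] is the length of the longest proper
    // prefix of p[0..j) that is also a suffix of p[0..j); next[0] is -1.
    static int[] nextByDefinition(String p) {
        int[] next = new int[p.length()];
        for (int j = 0; j < p.length(); j++) {
            if (j == 0) {
                next[0] = -1;
                continue;
            }
            String head = p.substring(0, j);
            int best = 0;
            // Try every proper prefix length and keep the longest that matches the suffix
            for (int len = 1; len < j; len++) {
                if (head.startsWith(head.substring(j - len))) {
                    best = len;
                }
            }
            next[j] = best;
        }
        return next;
    }

    public static void main(String[] args) {
        // prints [-1, 0, 0, 0, 0, 1, 2]
        System.out.println(Arrays.toString(nextByDefinition("ABCDABD")));
    }
}
```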
How the next array is built:
/**
 * Matching process: the pattern is compared against itself
 * ABADABAEABC
 *  ABADABAEABC
 *
 * ABADABAEABC
 *   ABADABAEABC
 *
 * ABADABAEABC
 *    ABADABAEABC
 *
 * ABADABAEABC
 *     ABADABAEABC
 *
 * ABADABAEABC
 *       ABADABAEABC
 *
 * ABADABAEABC
 *        ABADABAEABC
 *
 * ABADABAEABC
 *         ABADABAEABC
 */
/**
 * KMP matching algorithm
 *
 * @param sStr parent string (the text to search in)
 * @param dStr substring (the pattern to search for)
 * @return index of the first occurrence of dStr in sStr, or -1 if there is none
 */
public static int find(String sStr, String dStr) {
    int sLength = sStr.length();
    int dLength = dStr.length();
    int sIndex = 0, dIndex = 0;
    int[] next = getNextArray2(dStr);
    while (sIndex < sLength && dIndex < dLength) {
        // The current characters match, or dIndex has fallen back to the -1 sentinel
        if (dIndex == -1 || sStr.charAt(sIndex) == dStr.charAt(dIndex)) {
            // Advance both the parent string and the substring by one character
            sIndex++;
            dIndex++;
        } else {
            // Mismatch: sIndex stays where it is, dIndex falls back to next[dIndex]
            System.out.println("sStr ele is " + sStr.charAt(sIndex) + ",dStr ele is " + dStr.charAt(dIndex));
            int temp = dIndex;
            dIndex = next[dIndex];
            System.out.println("current dIndex is " + temp + ",next dIndex is " + dIndex);
        }
    }
    // Matching finished: the whole substring was matched
    if (dIndex == dLength) {
        return sIndex - dLength;
    }
    return -1;
}
How the search for a contained substring proceeds:
/**
 * Matching process
 * BBC ABCDAB ABCDABCDABDE
 * ABCDABD
 *
 * BBC ABCDAB ABCDABCDABDE
 *  ABCDABD
 *
 * BBC ABCDAB ABCDABCDABDE
 *   ABCDABD
 *
 * BBC ABCDAB ABCDABCDABDE
 *    ABCDABD
 *
 * BBC ABCDAB ABCDABCDABDE
 *     ABCDABD
 *
 * BBC ABCDAB ABCDABCDABDE
 *         ABCDABD
 *
 * BBC ABCDAB ABCDABCDABDE
 *           ABCDABD
 *
 * BBC ABCDAB ABCDABCDABDE
 *            ABCDABD
 *
 * BBC ABCDAB ABCDABCDABDE
 *                ABCDABD
 */
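The nine alignments in the trace above can be reproduced with a small sketch that records the offset `sIndex - dIndex` every time the pattern is compared against the text. `KmpTrace`, `buildNext`, and `alignments` are hypothetical names for this example; `buildNext` is assumed to follow the same next-array logic as the notes, with the debug output removed.

```java
import java.util.ArrayList;
import java.util.List;

class KmpTrace {
    // Assumed to mirror the next-array construction in these notes, without printing
    static int[] buildNext(String p) {
        int[] next = new int[p.length()];
        if (p.isEmpty()) {
            return next;
        }
        next[0] = -1;
        int k = -1, j = 0;
        while (j < p.length() - 1) {
            if (k == -1 || p.charAt(k) == p.charAt(j)) {
                next[++j] = ++k;
            } else {
                k = next[k];
            }
        }
        return next;
    }

    // Returns each distinct offset at which the pattern is lined up with the text
    static List<Integer> alignments(String text, String pattern) {
        int[] next = buildNext(pattern);
        List<Integer> offsets = new ArrayList<>();
        int s = 0, d = 0;
        while (s < text.length() && d < pattern.length()) {
            if (d >= 0) {
                int offset = s - d;
                // Record the offset only when the pattern has actually shifted
                if (offsets.isEmpty() || offsets.get(offsets.size() - 1) != offset) {
                    offsets.add(offset);
                }
            }
            if (d == -1 || text.charAt(s) == pattern.charAt(d)) {
                s++;
                d++;
            } else {
                d = next[d];
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        String text = "BBC ABCDAB ABCDABCDABDE";
        String pattern = "ABCDABD";
        for (int offset : alignments(text, pattern)) {
            StringBuilder pad = new StringBuilder();
            for (int i = 0; i < offset; i++) {
                pad.append(' ');
            }
            System.out.println(text);
            System.out.println(pad + pattern);
            System.out.println();
        }
    }
}
```

For this text and pattern the recorded offsets are 0, 1, 2, 3, 4, 8, 10, 11, 15. The jumps from 4 to 8 and from 11 to 15 are where the next array lets the pattern skip ahead without re-reading the text.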
Reference article: https://blog.csdn.net/Thousa_Ho/article/details/72842029
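/**
 * Builds the next array
 *
 * @param destStr the pattern string
 * @return the next array
 */
public static int[] getNextArray2(String destStr) {
    int[] nextArr = new int[destStr.length()];
    // An empty pattern has no next array; avoid writing to nextArr[0]
    if (destStr.isEmpty()) {
        return nextArr;
    }
    nextArr[0] = -1;
    int k = -1, j = 0;
    while (j < destStr.length() - 1) {
        // The characters match, or k has fallen back to the -1 sentinel
        if (k == -1 || (destStr.charAt(k) == destStr.charAt(j))) {
            ++k;
            ++j;
            // Length of the longest identical prefix and suffix of the characters before position j
            nextArr[j] = k;
            System.out.println("nextArr[" + j + "] is " + k);
        } else {
            // Mismatch: k falls back to next[k]
            int temp = k;
            k = nextArr[k];
            System.out.println("before k is " + temp + ",now is " + k);
        }
    }
    return nextArr;
}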