KMP string pattern matching

The function below comes from LeetCode. For details, see the LeetCode problem: Implement strStr().

 

The algorithm is best explained by the comments in the code, which are meant to be read alongside it.

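#include <cstring>   // strlen, NULL

// next[j]: the position in "pattern" to compare next when a mismatch is detected at pattern[j].
// It is the largest valid position k < j, where valid means "pattern[0, ..., k-1]" already matches
// the "text" just before the mismatch. next[j] == -1 means no such position exists.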
 
void getNext(char *pattern, int next[]){
    int i = 0, j = -1;
     
    // If "text[i]" fails to match "pattern[0]", we need to compare
    // "text[i+1]" with "pattern[0]". This is encoded as comparing "text[i]"
    // with the sentinel position "pattern[-1]", so next[0] = -1.
    next[i] = j;    
     
    // while loop 1:
    while(pattern[i] != '\0'){
            // while loop 2:
            while(j >= 0 && pattern[i] != pattern[j]){
                // First, j must be a valid index, so it must not be negative.
                // Then, if "pattern[i]" fails to match "pattern[j]", we can treat it as
                // "text[i]" failing to match "pattern[j]" (the pattern plays the role of the text here).
                // So we compare "pattern[i]" with "pattern[next[j]]" next, since next[j]
                // is the position to try after a failure at position j.
                j = next[j];               
            }
             
            // After the loop above, either j == -1 or "pattern[i - j, ..., i]" matches
            // "pattern[0, ..., j]", so we advance both indices by one.
            ++i; ++j;
             
            // For the new i (call it i_new) we can now determine its "next" value.
            // We know "pattern[i_new - j_new, ..., i_new - 1]" matches "pattern[0, ..., j_new - 1]",
            // so if the text mismatches at "pattern[i_new]", we can shift the pattern and
            // compare the same text character with "pattern[j_new]".
            // P.S.:
            // j_new is also the best such position. The loop above tries candidates from
            // largest to smallest, so any larger valid position would have been found first.
            // Optimization: if pattern[i_new] == pattern[j_new], the comparison at j_new
            // would fail for the same reason, so we go straight to next[j_new].
            if(pattern[i] == pattern[j])
                next[i] = next[j];
            else
                next[i] = j;           
    }  
}
 
 
char *strStr(char *text, char *pattern){
    if(NULL == text || NULL == pattern)
        return NULL;
    if('\0' == pattern[0])
        return text;
     
    // i is the index into text, j is the index into pattern.
    int i = 0, j = 0;
    char *pos = NULL;
    int *next = new int[strlen(pattern) + 1];   // include the '\0'
     
    getNext(pattern, next);
     
    while(text[i] != '\0'){
        // Same fallback as in getNext(): after a mismatch at position j,
        // the next candidate position may mismatch too, so we keep following
        // the "next" table until a character matches or j becomes -1.
        // The j >= 0 check keeps the index valid.
        while(j >= 0 && text[i] != pattern[j])
            j = next[j];
         
        // After the loop, either j == -1 or "text[i - j, ..., i]" matches "pattern[0, ..., j]",
        // so we advance both "text" and "pattern" by one.
        ++i; ++j;
         
        if(pattern[j] == '\0'){
            // Start of the match in text: the current position minus the pattern length.
            pos = (text + i) - j;
            delete[] next;      // release the table before returning
            return pos;
        }      
    }
     
    delete[] next;              // no match: release the table; pos is still NULL
    return pos;
     
}
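
To make the "next" table concrete, here is a minimal sketch of the same idea written with std::string and std::vector. The names buildNext and kmpFind are hypothetical helpers introduced only for this illustration; they are not part of the LeetCode problem or of the code above. The sketch assumes the same optimized table as getNext() and returns an index (or -1) instead of a pointer.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds the same optimized "next" table as getNext(),
// with one extra entry for the position just past the end of the pattern.
std::vector<int> buildNext(const std::string &pattern) {
    const int m = static_cast<int>(pattern.size());
    std::vector<int> next(m + 1);
    int i = 0, j = -1;
    next[0] = -1;                       // sentinel: nothing to fall back to
    while (i < m) {
        // Fall back until pattern[i] can extend the matched prefix or j is -1.
        while (j >= 0 && pattern[i] != pattern[j])
            j = next[j];
        ++i; ++j;
        // If the characters are equal, a retry at j would fail the same way,
        // so reuse next[j]; otherwise j itself is the position to retry.
        if (i < m && pattern[i] == pattern[j])
            next[i] = next[j];
        else
            next[i] = j;
    }
    return next;
}

// Hypothetical helper: index of the first occurrence of pattern in text,
// or -1 if there is none. An empty pattern matches at index 0.
int kmpFind(const std::string &text, const std::string &pattern) {
    if (pattern.empty())
        return 0;
    const std::vector<int> next = buildNext(pattern);
    const int n = static_cast<int>(text.size());
    const int m = static_cast<int>(pattern.size());
    int i = 0, j = 0;
    while (i < n) {
        while (j >= 0 && text[i] != pattern[j])
            j = next[j];
        ++i; ++j;
        if (j == m)
            return i - j;               // start of the match in text
    }
    return -1;
}
```

For the pattern "abab", buildNext gives the table {-1, 0, -1, 0, 2}. Searching "abaabab" for "abab" mismatches at text index 3, falls back to next[3] = 0 without moving backwards in the text, and finds the match at index 3. Using std::vector also avoids managing the table's memory by hand.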

 

posted @ kid551