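When using the KMP algorithm, the first task is to compute the next array of a string, so we need to understand what next means (the best way is to make up a small example and simulate it on scratch paper). Put simply, next[k] says: if the character at index k of the pattern mismatches, matching resumes from index next[k] of the pattern. Equivalently, next[k] is the length of the longest proper prefix of the first k characters that is also a suffix of them, so those next[k] characters are already known to match. This is a bit of a mouthful; write out an example and it becomes clear, or keep reading, since it is explained below, and then try it yourself.

Why can the next array be used to find a repeating prefix (a cycle), and why is the cycle it finds the shortest one?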

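void Getp()
{
    // b, lenb and p are globals: the pattern, its length, and the next array.
    int i = 0, j = -1;
    p[i] = j;                   // next[0] = -1 is a sentinel
    while(i != lenb)
    {
        if(j == -1 || b[i] == b[j])
        {
            i++;  j++;
            p[i] = j;           // longest prefix of b[0..i-1] that is also its suffix
        }
        else
            j = p[j];           // mismatch: fall back to the next shorter candidate
    }
}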
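To try the routine on a small example without the global arrays, here is a minimal sketch of the same loop. The name build_next and the use of std::string and std::vector are my own choices for illustration, not part of the original code; the convention (next[0] == -1, one entry more than the string length) is the one used by Getp() above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): returns the next array of s.
// next[0] == -1, and for i >= 1 next[i] is the length of the longest proper
// prefix of s[0..i-1] that is also a suffix of it. The result has
// s.size() + 1 entries.
std::vector<int> build_next(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> next(n + 1);
    int i = 0, j = -1;
    next[0] = -1;
    while (i < n) {
        if (j == -1 || s[i] == s[j]) {
            ++i;
            ++j;
            next[i] = j;   // s[0..j-1] matches the end of s[0..i-1]
        } else {
            j = next[j];   // fall back to a shorter candidate
        }
    }
    return next;
}
```

For "abcabd" this gives -1 0 0 0 1 2 0, and for "aaaa" it gives -1 0 1 2 3.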

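In my view, computing the next array itself uses the KMP idea: on a mismatch, fall back to the previous next value, see j = next[j] (written j = p[j] in the code, where the array is called p). The conclusion first: at position i, if i % (i - next[i]) == 0 (and next[i] != 0, as explained further down), then the string has become cyclic and a cycle ends exactly at index i-1. Why is that?

Suppose that at position i-1 the string is cyclic (a cycle finishes at i-1). If the character at index i then mismatches, the way next is computed means we have to fall back, right?

Since the string is cyclic (by assumption), doesn't next[i] point to the character right after the end of the previous cycle?

Yes. The previous cycle ends at next[i]-1 and the current one ends at i-1. So how long is one cycle?

It is (i - 1) - (next[i] - 1) = i - next[i], which is the cycle length (assuming the cycle really holds). But how do we know whether it really holds?

We assumed that indices 0 through i-1 are cyclic, which is i characters in total. If i % (i - next[i]) == 0, the total number of characters is an exact multiple of the cycle length, so the cycle does hold.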

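One more point: if next[i] == 0, the prefix is not cyclic even though the equation above holds. A counterexample:

index  0  1  2  3  4  5
char   a  b  c  a  b  d
next  -1  0  0  0  1  2

For indices 1, 2 and 3 the next value is 0, so i % (i - next[i]) = i % i == 0, yet this is not a cycle (the prefix would be a single "cycle" of length i).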
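The rule together with the next[i] != 0 guard can be sanity-checked against a brute-force search. The sketch below is my own illustration (the names build_next, brute_cycle, next_cycle and rule_matches_brute_force are hypothetical): for every prefix length it compares the cycle length given by the rule with the smallest cycle length found by trying every divisor.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same loop as Getp(): next[0] == -1, next[i] = longest proper prefix of
// s[0..i-1] that is also its suffix.
std::vector<int> build_next(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> next(n + 1);
    int i = 0, j = -1;
    next[0] = -1;
    while (i < n) {
        if (j == -1 || s[i] == s[j]) { ++i; ++j; next[i] = j; }
        else j = next[j];
    }
    return next;
}

// Brute force: smallest d < len dividing len such that the first len
// characters are s[0..d-1] repeated len / d times; 0 if there is none.
int brute_cycle(const std::string& s, int len) {
    for (int d = 1; d < len; ++d) {
        if (len % d != 0) continue;
        bool ok = true;
        for (int k = d; k < len && ok; ++k) ok = (s[k] == s[k - d]);
        if (ok) return d;
    }
    return 0;
}

// The rule from the text: cycle length len - next[len] if it divides len
// and next[len] != 0, otherwise 0 (not cyclic). Expects len >= 1.
int next_cycle(const std::vector<int>& next, int len) {
    const int d = len - next[len];
    return (next[len] != 0 && len % d == 0) ? d : 0;
}

// True if the rule and the brute force agree on every prefix of s.
bool rule_matches_brute_force(const std::string& s) {
    const std::vector<int> next = build_next(s);
    for (int len = 1; len <= static_cast<int>(s.size()); ++len)
        if (next_cycle(next, len) != brute_cycle(s, len)) return false;
    return true;
}
```

On "abcabd" both sides report no cycle for every prefix, including lengths 1, 2 and 3 where dropping the guard would wrongly report one.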

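That settles the explanation. Now, why is the cycle length we get the smallest one?

Because next[i] is the longest prefix of the first i characters that is also their suffix, a mismatch always falls back to the nearest cycle. The longest overlap gives the smallest shift, so i - next[i] is the smallest cycle length.

Why is the repetition count the largest?

The cycle length is the smallest, so the number of repetitions must be the largest.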

To sum up: if position i of the next array satisfies i % (i - next[i]) == 0 && next[i] != 0, then the first i characters are cyclic, and

cycle length: i - next[i]

repetition count: i / (i - next[i])
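The summary translates almost directly into code. The following is a sketch under my own naming (build_next and cyclic_prefixes are hypothetical helpers, not from the original): it lists every prefix length i that is cyclic, together with its repetition count.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Same loop as Getp(): next[0] == -1, next[i] = longest proper prefix of
// s[0..i-1] that is also its suffix.
std::vector<int> build_next(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> next(n + 1);
    int i = 0, j = -1;
    next[0] = -1;
    while (i < n) {
        if (j == -1 || s[i] == s[j]) { ++i; ++j; next[i] = j; }
        else j = next[j];
    }
    return next;
}

// For every prefix length i with i % (i - next[i]) == 0 && next[i] != 0,
// record the pair (i, repetition count), in increasing order of i.
std::vector<std::pair<int, int>> cyclic_prefixes(const std::string& s) {
    const std::vector<int> next = build_next(s);
    std::vector<std::pair<int, int>> result;
    for (int i = 1; i <= static_cast<int>(s.size()); ++i) {
        const int cycle = i - next[i];           // candidate cycle length
        if (next[i] != 0 && i % cycle == 0)
            result.emplace_back(i, i / cycle);   // i / cycle repetitions
    }
    return result;
}
```

For "aabaabaabaab" the pairs are (2, 2), (6, 2), (9, 3) and (12, 4): the prefix "aa" is "a" twice, "aabaab" is "aab" twice, and so on.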

---------------------------------------------------

Oulipo

Time Limit: 3000/1000 MS (Java/Others) Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 7280 Accepted Submission(s): 2911

Problem Description

The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter 'e'. He was a member of the Oulipo group. A quote from the book:

Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…

Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T's is not unusual. And they never use spaces.

So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {'A', 'B', 'C', …, 'Z'} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.

Input

The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:

One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000.

Output

For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.

Sample Input

3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN

Sample Output

1
3
0

Source

华东区大学生程序设计邀请赛_热身赛


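// Count how many times the short string (the pattern) occurs in the long string. A classic way to put it: in KMP, when the current character of the pattern fails to match, the characters already matched are used to find the new position in the pattern to continue from, so the pattern does not have to be compared from the beginning again, which saves time.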

#include <cstdio>
#include <cstring>
using namespace std;
const int M  = 10010 + 10;
int p[M], lena, lenb, ans;     // p: next array of the word a (indices 0..lena)
char a[M], b[M * 100];         // a: word W (pattern), b: text T

// Build the next array of the pattern a; fills p[0..lena].
void Getp()
{
    int i = 0, j = -1;
    p[i] = j;
    while(i != lena)
    {
        if(j == -1 || a[i] == a[j])
        {
            i++;  j++;
            p[i] = j;
        }
        else
            j = p[j];
    }
}
// Count the occurrences of a in b; occurrences may overlap.
void Kmp()
{
    int i = 0, j = 0;
    while(i != lenb)
    {
        if(j == -1 || b[i] == a[j])   // test j == -1 first so a[-1] is never read
            i++, j++;
        else
            j = p[j];
        if(j == lena)
        {
            ans++;
            j = p[j];                 // fall back so overlapping matches are found
        }
    }
    printf("%d\n", ans);
}
int main()
{
    int t;
    scanf("%d", &t);
    while(t--)
    {
        scanf("%s %s", a, b);         // word first, then text
        lenb = strlen(b);
        lena = strlen(a);
        ans = 0;
        Getp();
        Kmp();
    }
    return 0;
}
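For experimenting outside the judge, the counting loop can also be wrapped in a function that returns the count instead of printing it. This is a sketch with names of my own (build_next, count_overlapping) and with std::string instead of the global buffers; the matching logic follows the Kmp() routine above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same loop as Getp(): next[0] == -1, next[i] = longest proper prefix of
// s[0..i-1] that is also its suffix.
std::vector<int> build_next(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> next(n + 1);
    int i = 0, j = -1;
    next[0] = -1;
    while (i < n) {
        if (j == -1 || s[i] == s[j]) { ++i; ++j; next[i] = j; }
        else j = next[j];
    }
    return next;
}

// Number of occurrences of word in text; occurrences may overlap.
// Returns 0 for an empty word (an assumption; the problem guarantees |W| >= 1).
int count_overlapping(const std::string& text, const std::string& word) {
    if (word.empty()) return 0;
    const std::vector<int> next = build_next(word);
    const int n = static_cast<int>(text.size());
    const int m = static_cast<int>(word.size());
    int i = 0, j = 0, count = 0;
    while (i < n) {
        if (j == -1 || text[i] == word[j]) {
            ++i;
            ++j;
            if (j == m) {      // whole word matched, ending at text[i-1]
                ++count;
                j = next[m];   // keep the longest overlap for the next match
            }
        } else {
            j = next[j];       // mismatch: shift the word, keep i in place
        }
    }
    return count;
}
```

On the three sample cases it returns 1, 3 and 0.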

剪花布条 (Cutting the Patterned Cloth Strip)

Time Limit: 1000/1000 MS (Java/Others) Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 11270 Accepted Submission(s): 7231

Problem Description

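There is a patterned cloth strip with some designs on it, and a ready-to-use small ornament strip that also has some designs. Given a cloth strip and an ornament strip, work out the largest number of ornament strips that can be cut from the cloth strip.

Input

The input contains several data sets, each a pair of a cloth strip and an ornament strip. The strips are written in visible ASCII characters; there are as many kinds of design as there are visible ASCII characters. Neither the cloth strip nor the ornament strip is longer than 1000 characters. When the # character is encountered, stop processing.

Output

Output the maximum number of ornament strips that can be cut from the cloth strip; if not even one can be cut, just output 0. Each result goes on its own line.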

Sample Input

abcde a3
aaaaaa aa
#

Sample Output

0
3

Author

qianneng

Source

冬练三九之二


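// When cutting the string into pieces, note two points: p[] is computed for the short string (the same is true when finding the position of one string inside another); and the loop that builds p[] only runs up to len - 1, because the cut pieces must not overlap and p[len] is never needed.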

#include <cstdio>
#include <cstring>
#include <iostream>
using namespace std;
char a[1010], b[1010];        // a: cloth strip (text), b: ornament strip (pattern)
int p[1010], lena, lenb;      // p: next array of b
// Build p[0..lenb-1]; p[lenb] is not needed because pieces must not overlap.
void Getp()
{
    int i = 0, j = -1;
    p[i] = j;
    while(i < lenb - 1)           // stop one step early, see the note above
    {
        if(j == -1 || b[i] == b[j])
        {
            i++; j++; p[i] = j;
        }
        else
            j = p[j];
    }
}
// Count the non-overlapping occurrences of b in a, scanning left to right.
int Kmp()
{
    Getp();
    int cnt = 0, i = 0, j = 0;
    while(i < lena)
    {
        if(j == -1 || a[i] == b[j])
        {
            i++; j++;
        }
        else
            j = p[j];
        if(j == lenb)
        {
            cnt++;
            j = 0;                // restart the pattern; do not rely on a stale p[lenb]
        }
    }
    return cnt;
}
int main()
{
    while(cin >> a && a[0] != '#')   // also stops cleanly at end of input
    {
        cin >> b;
        lena = strlen(a);
        lenb = strlen(b);
        printf("%d\n", Kmp());
    }
    return 0;
}
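The only difference from the overlapping count is what happens after a full match: the pattern restarts from index 0 instead of falling back to the next value. A minimal sketch of that variant follows, with hypothetical names (build_next, count_non_overlapping) and std::string in place of the global arrays. Taking each match as early as possible from left to right is the usual greedy argument for getting the maximum number of pieces.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same loop as Getp(): next[0] == -1, next[i] = longest proper prefix of
// s[0..i-1] that is also its suffix.
std::vector<int> build_next(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> next(n + 1);
    int i = 0, j = -1;
    next[0] = -1;
    while (i < n) {
        if (j == -1 || s[i] == s[j]) { ++i; ++j; next[i] = j; }
        else j = next[j];
    }
    return next;
}

// Number of non-overlapping occurrences of piece in cloth, taken greedily
// from left to right. Returns 0 for an empty piece (an assumption).
int count_non_overlapping(const std::string& cloth, const std::string& piece) {
    if (piece.empty()) return 0;
    const std::vector<int> next = build_next(piece);
    const int n = static_cast<int>(cloth.size());
    const int m = static_cast<int>(piece.size());
    int i = 0, j = 0, count = 0;
    while (i < n) {
        if (j == -1 || cloth[i] == piece[j]) {
            ++i;
            ++j;
            if (j == m) {   // one piece cut out, ending at cloth[i-1]
                ++count;
                j = 0;      // start over: the next piece may not reuse characters
            }
        } else {
            j = next[j];
        }
    }
    return count;
}
```

On the two sample cases it returns 0 and 3; on "aaaaa" with piece "aa" it returns 2, where the overlapping count would be 4.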

posted on 2015-08-07 13:09 cleverbiger