[Suffix Array] [RMQ] HDU 6194 - string string string (2017 ICPC Shenyang Online Contest)
string string string
Time Limit: 2000/1000 MS (Java/Others) Memory Limit: 32768/32768 K (Java/Others)
Problem Description
Uncle Mao is a wonderful ACMER. One day he met an easy problem, but Uncle Mao was so lazy that he left the problem to you. I hope you can give him a solution.
Given a string s, we define a substring that happens exactly k times as an important string, and you need to find out how many substrings which are important strings.
Input
The first line contains an integer T (T≤100) implying the number of test cases.
For each test case, there are two lines:
the first line contains an integer k (k≥1) which is described above;
the second line contains a string s (length(s) ≤ 10^5).
It's guaranteed that ∑length(s) ≤ 2×10^6.
Output
For each test case, print the number of the important substrings in a line.
Sample Input
2
2
abcabc
3
abcabcabcabc
Sample Output
6
9
Solution
Problem: given a string, count how many distinct substrings occur in it exactly k times.
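To pin down what is being counted, here is a minimal brute-force sketch. It is not part of the original solution: the function name `countImportantBrute` is made up for illustration, and the approach is far too slow for the real limits. It is only meant for checking small cases such as the samples.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper (not from the original post): tallies every occurrence
// of every substring, then counts the distinct substrings seen exactly k times.
// Cubic-ish time and memory in the worst case, so use it on short strings only.
long long countImportantBrute(const std::string& s, int k) {
    std::map<std::string, int> occurrences;
    for (std::size_t start = 0; start < s.size(); ++start)
        for (std::size_t length = 1; start + length <= s.size(); ++length)
            ++occurrences[s.substr(start, length)];

    long long result = 0;
    for (const auto& entry : occurrences)
        if (entry.second == k) ++result;
    return result;
}
```

On the first sample, `countImportantBrute("abcabc", 2)` counts a, b, c, ab, bc and abc (ca occurs only once), which matches the expected output of 6.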
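Approach: build the suffix array and the height array of the string. A window of k-1 consecutive entries of the height array corresponds to k suffixes that are adjacent in sorted order, and the minimum over the window (an O(1) query with RMQ) is the length of the longest prefix those k suffixes share. If the height values just outside both ends of the window are smaller than that minimum, the window contributes to the answer: it adds the minimum minus the larger of the two outside values, because any shorter prefix also occurs in a neighbouring suffix and therefore more than k times. The case k=1 needs special handling, since the window is empty; this is where the suffix array itself comes in, with the length of the suffix taking the place of the window minimum. There are quite a few details, so please see the code.
One more pitfall that nearly made me cry: since there are multiple test cases, be sure to clear the height array every time. If you don't, and the current string is shorter than the previous one, then reading height[n+1] (n = len - 1 in the code) is expected to give 0 but actually gives a height value left over from the previous string. Otherwise you will spend ten years finding out where the WA comes from.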
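The window rule can be sketched in isolation, without the fast suffix array construction or the sparse table. The version below is only an illustration under my own assumptions: the name `countImportantBySuffixes`, the 0-based rank indexing, and the `lcp` array with zero sentinels at both ends are hypothetical choices, and the suffixes are sorted naively, so it is suitable for small inputs only.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical reference version of the window rule (names are illustrative).
long long countImportantBySuffixes(const std::string& s, int k) {
    const int n = static_cast<int>(s.size());
    if (k < 1 || k > n) return 0;

    // Naive suffix array: sort suffix start positions by direct comparison.
    std::vector<int> sa(n);
    for (int i = 0; i < n; ++i) sa[i] = i;
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });

    // lcp[r] = longest common prefix of the suffixes at ranks r-1 and r.
    // lcp[0] and lcp[n] stay 0 and act as sentinels.
    std::vector<int> lcp(n + 1, 0);
    for (int r = 1; r < n; ++r) {
        int a = sa[r - 1], b = sa[r], h = 0;
        while (a + h < n && b + h < n && s[a + h] == s[b + h]) ++h;
        lcp[r] = h;
    }

    long long total = 0;
    // Each window covers the k suffixes at ranks lo..hi.
    for (int lo = 0; lo + k - 1 < n; ++lo) {
        const int hi = lo + k - 1;
        // Longest prefix shared by all k suffixes; for k == 1 this is simply
        // the length of the single suffix.
        int common = n - sa[lo];
        for (int r = lo + 1; r <= hi; ++r) common = std::min(common, lcp[r]);
        // Prefixes of at most this length also occur in a suffix outside the window.
        const int outside = std::max(lcp[lo], lcp[hi + 1]);
        if (common > outside) total += common - outside;
    }
    return total;
}
```

For "abcabc" with k = 2, the sorted suffixes are abc, abcabc, bc, bcabc, c, cabc, so the inner `lcp` values are 3, 0, 2, 0, 1. The windows with a nonzero minimum contribute 3, 2 and 1, for a total of 6. The inner loop over the window is what the RMQ in the full solution replaces.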
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 101111;
char s[MAXN];
int t1[MAXN], t2[MAXN], cc[MAXN], sa[MAXN], rnk[MAXN], height[MAXN];
int len;

// True if suffixes a and b have equal (first half, second half) rank pairs.
bool cmp(int *y, int a, int b, int k) {
    int a2 = a + k >= len ? -1 : y[a + k];
    int b2 = b + k >= len ? -1 : y[b + k];
    return y[a] == y[b] && a2 == b2;
}

// Prefix doubling with counting sort; sa[r] = start of the suffix of rank r (0-based).
void make_sa() {
    int *x = t1, *y = t2;
    int m = 30;  // initial key range; s[i] - 'a' assumes lowercase letters
    for (int i = 0; i < m; i++) cc[i] = 0;
    for (int i = 0; i < len; i++) ++cc[x[i] = s[i] - 'a'];
    for (int i = 1; i < m; i++) cc[i] += cc[i - 1];
    for (int i = len - 1; i >= 0; i--) sa[--cc[x[i]]] = i;

    for (int k = 1; k <= len; k <<= 1) {
        // Order by second key: suffixes with no second half come first.
        int p = 0;
        for (int i = len - k; i < len; i++) y[p++] = i;
        for (int i = 0; i < len; i++)
            if (sa[i] >= k) y[p++] = sa[i] - k;

        // Stable counting sort by first key.
        for (int i = 0; i < m; i++) cc[i] = 0;
        for (int i = 0; i < len; i++) ++cc[x[y[i]]];
        for (int i = 1; i < m; i++) cc[i] += cc[i - 1];
        for (int i = len - 1; i >= 0; i--) sa[--cc[x[y[i]]]] = y[i];

        // Recompute ranks for the doubled length.
        swap(x, y);
        m = 1;
        x[sa[0]] = 0;
        for (int i = 1; i < len; i++)
            x[sa[i]] = cmp(y, sa[i], sa[i - 1], k) ? m - 1 : m++;

        if (m >= len) break;  // all ranks distinct
    }
}

// height[r] = LCP of the suffixes of rank r-1 and r, for r = 1..len-1.
void make_height() {
    for (int i = 0; i < len; i++) rnk[sa[i]] = i;
    height[0] = 0;
    int k = 0;
    for (int i = 0; i < len; i++) {
        if (!rnk[i]) { k = 0; continue; }
        int j = sa[rnk[i] - 1];
        if (k) k--;
        while (s[i + k] == s[j + k]) k++;  // stops at the terminating '\0'
        height[rnk[i]] = k;
    }
}

int n, minl[MAXN][18], lg[MAXN];

// Sparse table over height[1..len]; minl[i][j] = min of height[i .. i + 2^j - 1].
void S_table() {
    for (int j = 1; (1 << j) <= len; ++j)
        for (int i = 1; i + (1 << j) - 1 <= len; ++i)
            minl[i][j] = min(minl[i][j - 1], minl[i + (1 << (j - 1))][j - 1]);
}

// Minimum of height[l..r], 1 <= l <= r <= len.
int rmq(int l, int r) {
    int k = lg[r - l + 1];
    return min(minl[l][k], minl[r - (1 << k) + 1][k]);
}

int main() {
    int T, k;
    lg[1] = 0;
    for (int i = 2; i < MAXN; ++i) lg[i] = lg[i >> 1] + 1;  // integer floor(log2(i))

    scanf("%d", &T);
    while (T--) {
        long long ans = 0;  // the count can exceed the range of int
        // !!! Clear per test case: height[len] is read below and must be 0.
        memset(height, 0, sizeof height);
        scanf("%d", &k);
        scanf("%s", s);
        len = strlen(s);
        make_sa();
        make_height();
        n = len - 1;
        for (int i = 1; i <= len; ++i) minl[i][0] = height[i];
        S_table();

        if (k == 1) {
            // A suffix's prefixes longer than both neighbouring LCPs are unique.
            for (int i = 1; i <= n + 1; ++i) {
                int tmp = len - sa[i - 1];
                if (height[i - 1] < tmp && height[i] < tmp)
                    ans += tmp - max(height[i - 1], height[i]);
            }
            printf("%lld\n", ans);
            continue;
        }

        // Window height[i .. i+k-2] covers the k suffixes of rank i-1 .. i+k-2.
        for (int i = 1; i <= n - k + 2; ++i) {
            int tmp = rmq(i, i + k - 2);
            if (height[i - 1] < tmp && height[i + k - 1] < tmp)
                ans += tmp - max(height[i - 1], height[i + k - 1]);
        }
        printf("%lld\n", ans);
    }
    return 0;
}