HDU4787 GRE Words Revenge (Aho-Corasick automaton, sqrt decomposition, merging)
Problem
Source
http://acm.hdu.edu.cn/showproblem.php?pid=4787
Description
Now Coach Pang is preparing for the Graduate Record Examinations as George did in 2011. Each day, Coach Pang can:
"+w": learn a word w
"?p": read a paragraph p, and count the number of learnt words. Formally speaking, count the number of substrings of p that are learnt words.
Given the records of N days, help Coach Pang to find the count. For convenience, the only characters that occur in the words and paragraphs are '0' and '1'.
Input
The first line of the input file contains an integer T, which denotes the number of test cases. T test cases follow.
The first line of each test case contains an integer N (1 <= N <= $10^5$), which is the number of days. Each of the following N lines contains either "+w" or "?p". Both p and w are 01-strings in this problem.
Note that the input file has been encrypted. For each string that occurs, let L be the result of the last "?" operation. The string given to you has been shifted L times (the shifted version of string $s_1 s_2 \ldots s_k$ is $s_k s_1 s_2 \ldots s_{k-1}$). You should decrypt the string to the original one before you process it. Note that L equals 0 at the beginning of each test case.
The test data guarantees that for each test case, the total length of the words does not exceed $10^5$ and the total length of the paragraphs does not exceed $5 \times 10^6$.
Output
For each test case, first output a line "Case #x:", where x is the case number (starting from 1).
And for each "?" operation, output a line containing the result.
Sample Input
2
3
+01
+01
?01001
3
+01
?010
?011
Sample Output
Case #1:
2
Case #2:
1
0
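The decryption rule from the Input section is easy to get backwards, so here is a minimal sketch of it. Since each shift moves the last character to the front, undoing L shifts amounts to rotating the given string left by L modulo its length. The function name `decrypt` and its parameter names are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper (name chosen for this sketch): undo `shifts`
// right-rotations of a non-empty string by rotating it left.
std::string decrypt(std::string s, long long shifts) {
    assert(!s.empty() && shifts >= 0);
    // Rotating by a multiple of the length is a no-op, so reduce first.
    const auto k = static_cast<std::string::difference_type>(
        shifts % static_cast<long long>(s.size()));
    // std::rotate makes the element at begin()+k the new first element,
    // which is exactly a left rotation by k.
    std::rotate(s.begin(), s.begin() + k, s.end());
    return s;
}
```

In the second sample, the last line "?011" arrives after an answer of 1, so it decrypts to "110", which contains no occurrence of "01"; this agrees with the expected output 0.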
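Analysis
The problem asks us to process N operations in order. Each operation either learns a word, or reads a paragraph and reports how many substrings of the paragraph are words learnt so far.
Maintain two Aho-Corasick automata, a large one and a small one. Each new word is inserted into the small automaton, whose fail pointers are then rebuilt. The small automaton has a cap on its node count: once the cap is exceeded, it is merged into the large one, the large one is rebuilt, and the small one is cleared. A word that is already present in either automaton is skipped, because a repeated word must be counted only once (see the first sample). That is all there is to it.
The time complexity of this approach: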
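- Let the size cap of the small automaton be $\sqrt L$, where $L$ is the total length of the inserted patterns. There are at most $L$ insertions, and each rebuild of the fail pointers can be done in time linear in the automaton size, i.e. $O(\sqrt L)$, so the small automaton costs $O(L\sqrt L)$ in total;
- For the large automaton, there are at most $\frac L{\sqrt L}=\sqrt L$ merges. Each merge costs $O(\sqrt L)$ and each rebuild costs $O(L)$, so the total is again $O(L\sqrt L)$.
- A query simply runs the text through both automata, which can also be done in linear time, i.e. $O(L+\sum |p|)$, where $p$ ranges over the paragraphs.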
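The linear bound in the last bullet relies on handling each text position in amortized constant time. That requires knowing, for every node, how many patterns end somewhere on its fail chain; walking the fail chain at every position, as the `match` function in the code section does, can take longer on adversarial input. A minimal sketch of the precomputed variant follows. The type `CountingAC` and its member `cnt` are hypothetical names introduced here, not part of the original solution.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical type (not part of the original code): a binary-alphabet
// Aho-Corasick trie where cnt[v] is the number of patterns ending at v
// or at any node on v's fail chain.
struct CountingAC {
    std::vector<int> ch[2];      // ch[b][v]: child of v on bit b, 0 if absent
    std::vector<int> fail;
    std::vector<char> flag;      // flag[v] != 0: a pattern ends at v
    std::vector<long long> cnt;

    CountingAC() { newNode(); }  // node 0 is the root

    int newNode() {
        ch[0].push_back(0);
        ch[1].push_back(0);
        fail.push_back(0);
        flag.push_back(0);
        cnt.push_back(0);
        return static_cast<int>(fail.size()) - 1;
    }

    void insert(const std::string& w) {
        int x = 0;
        for (char c : w) {
            assert(c == '0' || c == '1');
            int b = c - '0';
            if (ch[b][x] == 0) {
                int v = newNode();  // allocate first: push_back may reallocate
                ch[b][x] = v;
            }
            x = ch[b][x];
        }
        flag[x] = 1;
    }

    // BFS over the trie: a node's fail target is shallower, so its cnt is
    // already final when the node itself is processed.
    void build() {
        std::queue<int> que;
        for (int b = 0; b < 2; ++b) {
            int v = ch[b][0];
            if (v) { fail[v] = 0; cnt[v] = flag[v]; que.push(v); }
        }
        while (!que.empty()) {
            int x = que.front(); que.pop();
            for (int b = 0; b < 2; ++b) {
                int v = ch[b][x];
                if (v == 0) continue;
                int t = fail[x];
                while (t && ch[b][t] == 0) t = fail[t];
                fail[v] = ch[b][t];
                cnt[v] = flag[v] + cnt[fail[v]];
                que.push(v);
            }
        }
    }

    // Count occurrences of all patterns in text, one cnt lookup per position.
    long long match(const std::string& text) const {
        int x = 0;
        long long total = 0;
        for (char c : text) {
            int b = c - '0';
            while (x && ch[b][x] == 0) x = fail[x];
            x = ch[b][x];
            total += cnt[x];
        }
        return total;
    }
};
```

The `while` loop in `match` only ever undoes depth gained by earlier steps, so its total work over a text is bounded by the text length. `build` must be called again after any insertion, exactly as the original solution calls `getfail`.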
It feels a bit like magic.
Code
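To see where $\sqrt L$ comes from, the same accounting can be repeated with a generic cap $B$ on the small automaton ($B$ is a symbol introduced for this sketch, and the bound is a rough one that ignores lower-order terms):

$$
T(B) \;=\; \underbrace{O(L \cdot B)}_{\text{rebuilds of the small automaton}} \;+\; \underbrace{O\!\left(\frac{L}{B}\,(B + L)\right)}_{\text{merges and rebuilds of the large automaton}} \;=\; O\!\left(L B + \frac{L^2}{B}\right)
$$

The two terms balance when $L B = \frac{L^2}{B}$, that is, $B = \sqrt L$, which gives $T = O(L\sqrt L)$. A smaller cap makes merges more frequent, while a larger one makes every insertion more expensive. Note that the code in this post uses a fixed cap of 2000 nodes rather than $\sqrt{10^5} \approx 316$.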
#include<cstdio>
#include<cstring>
#include<queue>
#include<algorithm>
using namespace std;
#define MAXN 100100

struct AC_auto{
  int ch[MAXN][2],fail[MAXN],tn;
  bool flag[MAXN];
  // Clear the nodes in use and reset the node counter.
  void init(){
    for(int i=0; i<=tn; ++i){
      ch[i][0]=ch[i][1]=flag[i]=0;
    }
    tn=0;
  }
  void insert(char *s){
    int x=0;
    for(int i=0; s[i]; ++i){
      int y=s[i]-'0';
      if(ch[x][y]==0) ch[x][y]=++tn;
      x=ch[x][y];
    }
    flag[x]=1;
  }
  // Rebuild fail pointers by BFS; missing edges stay 0 so that
  // later insertions and merges still see the plain trie.
  void getfail(){
    for(int i=0; i<=tn; ++i) fail[i]=0;
    queue<int> que;
    for(int i=0; i<2; ++i){
      if(ch[0][i]) que.push(ch[0][i]);
    }
    while(!que.empty()){
      int x=que.front(); que.pop();
      for(int i=0; i<2; ++i){
        if(ch[x][i]==0) continue;
        que.push(ch[x][i]);
        int tmp=fail[x];
        while(tmp && ch[tmp][i]==0){
          tmp=fail[tmp];
        }
        fail[ch[x][i]]=ch[tmp][i];
      }
    }
  }
  // Count occurrences of all stored words in s by walking
  // the fail chain at every position.
  int match(char *s){
    int x=0,ret=0;
    for(int i=0; s[i]; ++i){
      int y=s[i]-'0';
      while(x && ch[x][y]==0) x=fail[x];
      x=ch[x][y];
      int tmp=x;
      while(tmp){
        if(flag[tmp]) ++ret;
        tmp=fail[tmp];
      }
    }
    return ret;
  }
  // Return whether s is already stored as a word.
  bool query(char *s){
    int x=0;
    for(int i=0; s[i]; ++i){
      int y=s[i]-'0';
      if(ch[x][y]==0) return 0;
      x=ch[x][y];
    }
    return flag[x];
  }
}ac,buf;

// Merge the subtree of buf rooted at v into ac at node u.
void dfs(int u,int v){
  for(int i=0; i<2; ++i){
    if(buf.ch[v][i]==0) continue;
    if(ac.ch[u][i]==0){
      ac.ch[u][i]=++ac.tn;
      ac.ch[ac.tn][0]=ac.ch[ac.tn][1]=0;
      ac.flag[ac.tn]=0;
    }
    if(buf.flag[buf.ch[v][i]]) ac.flag[ac.ch[u][i]]=1;
    dfs(ac.ch[u][i],buf.ch[v][i]);
  }
}
void join(){
  dfs(0,0);
  buf.init();
  ac.getfail();
}

char str[5111111],s[5111111];
int main(){
  int t;
  scanf("%d",&t);
  for(int cse=1; cse<=t; ++cse){
    printf("Case #%d:\n",cse);
    ac.init();
    buf.init();
    int n;
    scanf("%d",&n);
    int lastans=0;
    char op;
    while(n--){
      scanf(" %c",&op);
      scanf("%s",str);
      int len=strlen(str);
      // Decrypt: rotate the input left by lastans.
      for(int i=0; i<len; ++i){
        s[i]=str[(i+lastans)%len];
      }
      s[len]=0;
      if(op=='+'){
        // Skip words that were already learnt.
        if(ac.query(s) || buf.query(s)) continue;
        buf.insert(s);
        buf.getfail();
        if(buf.tn>2000) join();
      }else{
        lastans=ac.match(s)+buf.match(s);
        printf("%d\n",lastans);
      }
    }
  }
  return 0;
}