
HDU 4416 Good Article Good sentence (suffix automaton)

Problem Description
In middle school, teachers used to encourage us to collect pretty sentences so that we could use them in our own articles. One of my classmates, ZengXiao Xian, wanted to get sentences that were different from everyone else's, because he thought distinct pretty sentences might help him get a high score on his article.
Assume that all of the sentences came from some articles. ZengXiao Xian intended to pick from Article A. The number of his classmates is n. The i-th classmate picked from Article Bi. Now ZengXiao Xian wants to know how many different sentences he could pick from Article A that do not belong to any of his classmates' articles. To simplify the problem, ZengXiao Xian wants to know how many different strings are substrings of string A but are not substrings of any string Bi. Of course, you will help him, won't you?
 

 

Input
The first line contains an integer T, the number of test cases.
For each test case:
The first line contains an integer n, the number of classmates.
The second line is the string A. Each of the next n lines contains one string; the i-th of them is Bi.
The length of string A does not exceed 100,000 characters, the total length of all strings Bi does not exceed 100,000, and all strings consist only of lowercase characters 'a' to 'z'.
 

 

Output
For each case, print the case number and the number of substrings that ZengXiao Xian can find.
 

 

Sample Input
3
2
abab
ab
ba
1
aaa
bbb
2
aaaa
aa
aaa
 
Sample Output
Case 1: 3
Case 2: 3
Case 3: 1
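
As a sanity check on the samples above, here is a minimal brute-force sketch. It is not the intended solution (it is roughly cubic in the length of A, far too slow for the real limits), and the names bruteForceCount and checkSamples are made up for this illustration. It only pins down what is being counted.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Brute-force reference: counts the distinct substrings of a that occur in no bs[i].
// Only suitable for tiny inputs.
long long bruteForceCount(const std::string& a, const std::vector<std::string>& bs) {
    std::set<std::string> seen;  // all distinct substrings of a
    for (size_t i = 0; i < a.size(); ++i)
        for (size_t len = 1; i + len <= a.size(); ++len)
            seen.insert(a.substr(i, len));

    long long count = 0;
    for (const std::string& s : seen) {
        bool inSomeB = false;
        for (const std::string& b : bs)
            if (b.find(s) != std::string::npos) { inSomeB = true; break; }
        if (!inSomeB) ++count;
    }
    return count;
}

// Reproduces the three sample cases of the problem statement.
void checkSamples() {
    assert(bruteForceCount("abab", {"ab", "ba"}) == 3);  // aba, bab, abab
    assert(bruteForceCount("aaa", {"bbb"}) == 3);        // a, aa, aaa
    assert(bruteForceCount("aaaa", {"aa", "aaa"}) == 1); // aaaa
}
```

A checker like this is handy for comparing against the automaton solution on small random strings.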
 

 

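Feed the strings Bi one by one through the suffix automaton built from A. Each time the matching reaches a state, update p, the longest matched length recorded for that state; the substrings represented by that state whose length is greater than p are exactly the ones we are looking for. When computing the answer we also have to push p up to the state's pre state (its suffix-link parent), so the total must be computed in reverse topological order.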
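
The counting rule described above can be sketched in isolation. This is an illustrative version under assumptions: the names countUnmatched, len, link, matched and order are hypothetical, states are 0-based with the root's link set to -1, and order is assumed to list the states by increasing len with the root first. A state v represents the lengths in (len[link[v]], len[v]], so it contributes len[v] - max(matched[v], len[link[v]]) new substrings.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// len[v]     : longest substring represented by state v
// link[v]    : suffix-link parent of v (-1 for the root)
// matched[v] : longest length reached in state v while matching any Bi (0 if never reached)
// order      : states sorted by increasing len, root first
long long countUnmatched(const std::vector<int>& len, const std::vector<int>& link,
                         std::vector<int> matched, const std::vector<int>& order) {
    assert(!order.empty() && link[order[0]] < 0);  // assumption: order starts at the root

    // Reverse topological order: push matches from longer states to their parents,
    // capped at the parent's own maximum length.
    for (size_t k = order.size(); k-- > 1;) {
        int v = order[k], p = link[v];
        matched[p] = std::min(std::max(matched[p], matched[v]), len[p]);
    }

    long long total = 0;
    for (int v : order) {
        if (link[v] < 0) continue;  // the root represents only the empty string
        total += len[v] - std::max(matched[v], len[link[v]]);
    }
    return total;
}
```

For A = "aaaa" the automaton is a chain with len = {0,1,2,3,4}. Matching "aaa" leaves matched = {0,0,0,3,0}; after propagation only the last state contributes, giving 4 - max(0, 3) = 1, which agrees with sample case 3.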

 

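I solved a few problems following the hihocoder code and found that hihocoder's SAM code really is not as elegant as other people's, so I decided to adopt someone else's style:

Why I think this style of code is good:

[Struct]:

                1. When writing matrix code I am used to putting the functions inside the struct; it feels more convenient.

                2. Different problems need different tweaks, and making those tweaks inside the struct makes it harder to mix things up and introduce bugs.

                3. Functions inside a struct reportedly run a bit faster (I no longer remember where I read this).

[Hats off to the author]:

       This is the first time I have seen Max written this way. Very slick.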

 

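Notes on the processing:

             1. The matching part works on the same principle as hiho1465.

             2. For the topological order, instead of using in-degrees (ind) and a queue as in the earlier hihocoder problems, this code uses a radix sort (a counting sort on len) to obtain the topological sequence. Both approaches reach the same result, but this one is more concise.

             3. Pay attention to whether start is 1 or 0; each choice has its own boundary conditions.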
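
The radix-sort step in point 2 can be sketched on its own. This is a minimal version under assumptions: the function name orderByLen is hypothetical, and states are 0-based here with the root at index 0, unlike the 1-based numbering in the solution below. It relies on the fact that a suffix-link parent always has a strictly smaller len than its child, so sorting by len yields a valid topological order.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the state indices sorted by increasing len (stable counting sort).
// Traversing the result backwards visits every state before its suffix-link parent.
std::vector<int> orderByLen(const std::vector<int>& len) {
    int maxLen = 0;
    for (int l : len) {
        assert(l >= 0);  // lengths are never negative
        maxLen = std::max(maxLen, l);
    }

    std::vector<int> cnt(maxLen + 2, 0);
    for (int l : len) ++cnt[l + 1];            // cnt[l + 1] = number of states with length l
    for (int l = 1; l <= maxLen + 1; ++l)
        cnt[l] += cnt[l - 1];                  // cnt[l] = number of states with length < l

    std::vector<int> order(len.size());
    for (int v = 0; v < (int)len.size(); ++v)
        order[cnt[len[v]]++] = v;              // next free slot among states of this length
    return order;
}
```

Regarding point 3: with a 0-based root the "no parent" marker has to be something like -1, whereas the 1-based solution below can use 0 both as "no transition" and "no parent", which is why its loops test p and next[p][x] directly.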

 

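Takeaways: after writing the automaton several times I feel I am finally getting the hang of it, 233... Suffix arrays have fallen out of favor with me for now; later I will come back and analyze these problems again with suffix arrays.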

#include<iostream>
#include<cstring>
#include<cstdio>
#include<algorithm>
#include<string>
using namespace std;
typedef long long ll;
void Min(ll &a,ll b) {  if(a>b)  a=b; }
void Max(ll &a,ll b) {  if(a<b)  a=b; }
const int maxn=1e5+10;
const int Sigma=26;
char S[maxn];
struct SAM{
    int len[maxn<<1],fa[maxn<<1];// len is maxlen; fa is slink (suffix link); next is trans
    int next[maxn<<1][Sigma];
    int cnt[maxn<<1],b[maxn<<1];
    ll  dp[maxn<<1];
    int sz,last;
    void init(){//start from 1
        sz=last=1;
        len[1]=fa[1]=0;
        memset(next[1],0,sizeof(next[1]));
    }
    void add(int x){// np, nq are newly created states; p, q are working variables for the transitions
        int p=last,np=++sz;last=np;
        len[np]=len[p]+1;
        memset(next[np],0,sizeof(next[np]));
        while(p&&!next[p][x]) next[p][x]=np,p=fa[p];
        if(!p) fa[np]=1;
        else {
             int q=next[p][x];
             if(len[q]==len[p]+1) fa[np]=q;
             else{
                    int nq=++sz;
                    memcpy(next[nq],next[q],sizeof(next[q]));// nq is a clone of q; the ancestors of p are redirected to it below
                    fa[nq]=fa[q],fa[np]=fa[q]=nq;
                    len[nq]=len[p]+1;
                    while(p&&next[p][x]==q) next[p][x]=nq,p=fa[p];
             }
        }    
    }
    void sort(){
        memset(cnt,0,sizeof(cnt));
        for(int i=1;i<=sz;i++) ++cnt[len[i]];// radix (counting) sort by len gives the topological order
        for(int i=1;i<=sz;i++) cnt[i]+=cnt[i-1];
        for(int i=1;i<=sz;i++) b[cnt[len[i]]--]=i;
    }
    void solve(int n){
        memset(dp,0,sizeof(dp));
        while(n--){
            scanf("%s",S);
            int q=1,l=0;// start matching from the start state of the automaton
            for(char *p=S;*p;++p){
                int x=*p-'a';
                while(q>1&&!next[q][x]) q=fa[q],l=len[q];
                if(next[q][x]) q=next[q][x],++l;
                Max(dp[q],l);
            }
        }
        ll ans=0;
        for(int i=sz;i>1;i--){
            Max(dp[fa[b[i]]],dp[b[i]]);// propagate from longer states to shorter ones
            Min(dp[fa[b[i]]],len[fa[b[i]]]);// cap at the parent's own maximum length
        }
        for(int i=1;i<=sz;i++){
            ll minlen=dp[i];
            if(fa[i]) Max(minlen,len[fa[i]]);
            ans+=len[i]-minlen;
        }
        printf("%lld\n",ans);
    }
};
SAM sam;
int main()
{
    int T,Case=0;scanf("%d",&T);
    while(T--){
        sam.init();
        int n;
        scanf("%d",&n);
        scanf("%s",S);
        for(char *p=S;*p;++p) sam.add(*p-'a');// walking the string with a pointer, a trick I picked up here
        sam.sort();    
        printf("Case %d: ",++Case);
        sam.solve(n);
    }
    return 0;
}
 

 

posted @ 2017-11-25 20:19  nimphy