SPOJ 7758 Growing Strings: Aho-Corasick automaton + DP

7758. Growing Strings

Problem code: MGLAR10

 

 

Gene and Gina have a particular kind of farm. Instead of growing animals and vegetables, as is usually the case in regular farms, they grow strings. A string is a sequence of characters. Strings have the particularity that, as they grow, they add characters to the left and/or to the right of themselves, but they never lose characters, nor insert new characters in the middle.
Gene and Gina have a collection of photos of some strings at different times during their growth. The problem is that the collection is not annotated, so they forgot which string each photo belongs to. They want to put together a wall to illustrate string-growing procedures, but they need your help to find an appropriate sequence of photos.
Each photo illustrates a string. The sequence of photos must be such that if s_i comes immediately before s_{i+1} in the sequence, then s_{i+1} is a string that may have grown from s_i (i.e., s_i appears as a consecutive substring of s_{i+1}). Also, they do not want to use repeated pictures, so all strings in the sequence must be different.
Given a set of strings representing all available photos, your job is to calculate the size of the largest sequence they can produce following the guidelines above.

 

Input

 

Each test case is given using several lines. The first line contains an integer N representing the number of strings in the set (1 ≤ N ≤ 10^4). Each of the following N lines contains a different non-empty string of at most 1000 lowercase letters of the English alphabet. Within each test case, the sum of the lengths of all strings is at most 10^6.

The last test case is followed by a line containing one zero.

 

Output

 

For each test case output a single line with a single integer representing the size of the largest sequence of photos that can be produced.

 

Sample

input
6
plant
ant
cant
decant
deca
an
2
supercalifragilisticexpialidocious
rag
0

output
4
2

 

-----------

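The problem gives n strings and asks us to pick some of them to form a sequence in which each string is a substring of the one after it. We have to report the maximum possible length of such a sequence.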
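For small inputs the task can be checked directly with a quadratic brute force. The sketch below is not part of the original solution: the function name `longestChainBrute` is made up for illustration, and it relies on the problem's guarantee that all strings are distinct. It is far too slow for N = 10^4 strings of length 1000, but it is handy for cross-checking a faster solution.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Brute-force reference (hypothetical helper, not the author's code).
// Returns the length of the longest sequence in which every string is a
// substring of the next one. Assumes all strings are distinct.
int longestChainBrute(std::vector<std::string> words) {
    // A distinct substring is strictly shorter, so after sorting by length
    // every possible predecessor of words[i] sits at an index j < i.
    std::sort(words.begin(), words.end(),
              [](const std::string& a, const std::string& b) {
                  return a.size() < b.size();
              });

    std::vector<int> best(words.size(), 1);  // best[i]: longest chain ending at words[i]
    int answer = 0;
    for (std::size_t i = 0; i < words.size(); ++i) {
        for (std::size_t j = 0; j < i; ++j) {
            if (words[i].find(words[j]) != std::string::npos) {
                best[i] = std::max(best[i], best[j] + 1);
            }
        }
        answer = std::max(answer, best[i]);
    }
    return answer;
}
```

On the first sample, one longest sequence is an, ant, cant, decant, which has length 4.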

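Insert all strings into a trie and build the fail pointers of the Aho-Corasick automaton. For every trie node, let sum be the length of the longest sequence that uses only words which are substrings of the string spelled by that node. A substring of a node's string is either a substring of its parent's string (the same string without its last character) or a suffix of the node's string, and every word that is a proper suffix of the node's string is also a suffix of the string at the node its fail pointer points to (or is that string itself). So, for a child of temp: sum of the child = max(sum of its parent temp, sum of the node its fail pointer points to) + the number of words ending at the child (0 or 1, since all strings are different). The answer is the largest sum over all nodes.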

temp->next[i]->sum=max( temp->sum, temp->next[i]->fail->sum )+temp->next[i]->count;
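The recurrence can also be written over an array-based trie. The following is a minimal sketch under stated assumptions, not the author's implementation: the names `longestChain`, `child`, `isWord` and `fail` are chosen here for illustration, input is assumed to contain only lowercase letters, and all strings are assumed distinct. Nodes are processed in BFS order, so the parent and the fail target of a node already have their `sum` when the node is reached.

```cpp
#include <algorithm>
#include <array>
#include <queue>
#include <string>
#include <vector>

// Sketch of the trie + fail-pointer DP (hypothetical names throughout).
int longestChain(const std::vector<std::string>& words) {
    std::array<int, 26> empty;
    empty.fill(-1);
    std::vector<std::array<int, 26>> child(1, empty);  // node 0 is the root
    std::vector<int> isWord(1, 0);

    // Build the trie.
    for (const std::string& w : words) {
        int node = 0;
        for (char ch : w) {
            int c = ch - 'a';  // assumes lowercase input
            if (child[node][c] == -1) {
                child[node][c] = static_cast<int>(child.size());
                child.push_back(empty);
                isWord.push_back(0);
            }
            node = child[node][c];
        }
        isWord[node] = 1;
    }

    std::vector<int> fail(child.size(), 0), sum(child.size(), 0);
    std::queue<int> bfs;
    bfs.push(0);
    int answer = 0;
    while (!bfs.empty()) {
        int u = bfs.front();
        bfs.pop();
        for (int c = 0; c < 26; ++c) {
            int v = child[u][c];
            if (v == -1) continue;
            // Fail pointer of v: walk the fail chain of u until a node
            // with an outgoing edge c is found; default to the root.
            fail[v] = 0;
            if (u != 0) {
                int p = fail[u];
                while (true) {
                    if (child[p][c] != -1) { fail[v] = child[p][c]; break; }
                    if (p == 0) break;
                    p = fail[p];
                }
            }
            // The recurrence from the text above.
            sum[v] = std::max(sum[u], sum[fail[v]]) + isWord[v];
            answer = std::max(answer, sum[v]);
            bfs.push(v);
        }
    }
    return answer;
}
```

Growing the vectors on demand keeps memory proportional to the number of trie nodes actually created, which is at most the total input length plus one.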

-----------

#include <iostream>
#include <cstring>
#include <queue>
#include <cstdio>
using namespace std;

const int CHARSET = 26;
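const int MAX_N_NODES = 2111111; // upper bound on trie nodes; the total input length is at most 10^6, so 10^6 + 1 nodes would already suffice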
struct Aho_Corasick{
    struct Node{
        Node *next[CHARSET];
        Node *fail;
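        int count; // number of inserted words that end exactly at this node
        int sum;   // longest sequence using only words that are substrings of this node's string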
        Node(){
            memset(next,0,sizeof(next));
            fail = NULL;
            count = 0;
            sum=0;
        }
        void clear(){
            memset(next,0,sizeof(next));
            fail = NULL;
            count = 0;
            sum=0;
        }
    };
    queue<Node*>Q;
    Node *root;
    Node nodePool[MAX_N_NODES], *cur;
    Node* newNode(){
        Node* t=cur++;
        t->clear();
        return t;
    }
    void init(){
        cur=nodePool;
        root=newNode();
    }
    void insert(char *str){
        Node* p=root;
        int index;
        int len=strlen(str);
        for (int i=0;i<len;i++){
            index=str[i]-'a';
            if(p->next[index]==NULL) p->next[index]=newNode();
            p=p->next[index];
        }
        p->count++;
    }
    void build_ac_automation(){
        int i;
        while (!Q.empty()) Q.pop();
        root->fail=NULL;
        Q.push(root);
        while(!Q.empty()){
            Node* temp=Q.front();
            Q.pop();
            Node* p=NULL;
            for(i=0;i<CHARSET;i++){
                if(temp->next[i]!=NULL){ // child exists: compute its fail pointer
                    if (temp==root){
                        temp->next[i]->fail=temp;
                        temp->next[i]->sum=temp->sum+temp->next[i]->count;
                    }
                    else{
                        p = temp->fail;
                        while(p!=NULL){
                            if(p->next[i]!=NULL){ // found the fail pointer
                                temp->next[i]->fail=p->next[i];
                                break;
                            }
                            p=p->fail;
                        }
                        if(p==NULL) temp->next[i]->fail=root; // nothing matched on the chain: fail pointer is the root
                        //==========
                        temp->next[i]->sum=max( temp->sum, temp->next[i]->fail->sum )+temp->next[i]->count;
                        //==========
                    }
                    Q.push(temp->next[i]);
                }
            }
        }
    }
    int query(){ // return the largest sum over all trie nodes, which is the answer
        int res=-1;
        for (Node* it=nodePool;it!=cur;it++){
            res=max(res,it->sum);
        }
        return res;
    }
}AC;
int main()
{
    char s[21111];
    int n;
    while (~scanf("%d",&n)){
        if (n==0) break;
        AC.init();
        for (int i=0;i<n;i++){
            scanf("%1000s",s); // gets() was removed in C++14; each string has at most 1000 letters
            AC.insert(s);
        }
        AC.build_ac_automation();
        int ans=AC.query();
        printf("%d\n",ans);
    }
    return 0;
}





posted on 2013-09-05 16:43 by 电子幼体
