字符串-AC自动机(详细图解)

AC自动机


AC自动机 (Aho-Corasick automaton)是KMP的升级版。即KMP是单模匹配算法,处理一个文本串中查找一个模式串的问题;而AC自动机能在一个文本串中同时查找多个不同的模式串 ,是多模匹配算法。 
其实KMP也能做多模匹配,对每一个模式串做一次KMP,复杂度是 O ( k ( n + m ) ) O(k(n+m)) ) ;AC自动机算法只需搜索一遍,搜索时匹配所有的模式串,复杂度是 O ( k m + n m ) O(km+nm) ) ,当 m < < k m<<k k 时,AC自动机优势很大。

原理

建议先了解 KMP 和 字典树 。 
那么如何同时匹配所有的模式串 P P P ? 
这时结合KMP和字典树就可以开始秀了。如果把所有的 P P P 做成一个字典树,然后在匹配的时候查找这个 P P P 对应的 n e x t next xt ,即失败指针 F a i l Fail l , 使当前字符失配时跳转到具有最长公共前后缀的字符继续匹配 (同KMP),这样就能实现同时搜索的快速匹配了。

步骤

  • 构建字典树
  • 构造fail失败指针
  • 搜索待处理文本

Fail指针 
同KMP的next一样,Fail指针是AC自动机的核心,是在树上指出失配后下一个跳转的位置,而不用全部回溯,大大减少时间。那么Fail是怎么跳转的? 
以HDU-2222的样例为例说明,模式串P={“she”,“he”,“say”,“shr”,“her”},文本串S=“yasherhs”。

1.构建字典树 
在这里插入图片描述

  1. 构造fail指针 
    2.1 用bfs实现,将root子节点入队(第二层),并将其fail指向root。 
    在这里插入图片描述 
    2.2 h出队,父节点h的fail指针所指节点是root;此时root没有对应为e的子节点,匹配失败,则e的fail指针指向root,表示没有匹配序列,然后入队e;同样的s出队,其子节点a同理。 
    在这里插入图片描述 
    2.3 此时循环到s的子节点h,父节点s的fail指针所指节点也是root,但与前面不同的是:root有值为h的子节点,匹配成功,此时fail应指向匹配节点。 
    在这里插入图片描述 
    2.4 以此类推,求出所有fail指针,右侧e的父节点h的fail指针所指节点是左侧h,而左侧h有值为e的子节点,匹配成功,即右侧e的fail指向左侧e(蓝线),如图。 
    在这里插入图片描述
  2. 搜索待处理文本 
    ①首先根结点下无y和a,第1、2条线还是指向根结点; 
    ②从she开始一直可以匹配,即线3、4、5,到节点e(绿底),更新答案; 
    ③下一个字符是r,匹配失败,到fail指针所指节点(蓝线所指),即线6; 
    ④此时匹配到了r(线7),发现模式串末尾标记,更新答案; 
    ⑤下一个字符h,失配,回到fail所指(线8) 
    ⑥然后继续匹配,成功(线9) 
    ⑦继续下一个字符s,失配回溯(线10) 
    ⑧继续匹配,成功(线11)。最后一个字符结束,退出 
    在这里插入图片描述

模板

                      void   insert   (   char   *  p  )   {   //构建字典树   int  u  =   0   ;   int  ls  =   strlen   (  p  )   ;   for   (   int  i  =   0   ;  i  <  ls  ;  i  ++   )   {   int  v  =  p  [  i  ]   -   'a'   ;   if   (  trie  [  u  ]   [  v  ]   ==   0   )  trie  [  u  ]   [  v  ]   =   ++  pos  ;  u  =  trie  [  u  ]   [  v  ]   ;   }  cnt  [  u  ]   ++   ;   //当前节点单词数+1   }   void   getFail   (   )   {   //求fail  queue  <   int   >  q  ;   for   (   int  i  =   0   ;  i  <   26   ;  i  ++   )   {   //入队root子节点(第二层)   if   (  trie  [   0   ]   [  i  ]   )   {  fail  [  trie  [   0   ]   [  i  ]   ]   =   0   ;  q  .   push   (  trie  [   0   ]   [  i  ]   )   ;   }   }   while   (   !  q  .   empty   (   )   )   {   int  cur  =  q  .   front   (   )   ;   //当前父节点  q  .   pop   (   )   ;   for   (   int  i  =   0   ;  i  <   26   ;  i  ++   )   {   //26个字母   if   (  trie  [  cur  ]   [  i  ]   )   {   //存在子节点,将其fail指向对应匹配节点(父节点fail所指节点的对应子节点)  fail  [  trie  [  cur  ]   [  i  ]   ]   =  trie  [  fail  [  cur  ]   ]   [  i  ]   ;  q  .   push   (  trie  [  cur  ]   [  i  ]   )   ;   }   else   //若不存在相关子节点,字典树中赋值为fail所指节点  trie  [  cur  ]   [  i  ]   =  trie  [  fail  [  cur  ]   ]   [  i  ]   ;   }   }   }   int   query   (   char   *  s  )   {   //查询s中出现几个p   int  cur  =   0   ,  ans  =   0   ,  ls  =   strlen   (  s  )   ;   for   (   int  i  =   0   ;  i  <  ls  ;  i  ++   )   {  cur  =  trie  [  cur  ]   [  s  [  i  ]   -   'a'   ]   ;   for   (   int  j  =  cur  ;  j  &&  cnt  [  j  ]   ;  j  =  fail  [  j  ]   )   {   //一直向下寻找,直到匹配失败  ans  +   =  cnt  [  j  ]   ;   //更新答案  cnt  [  j  ]   =   0   ;   //防止重复计算   }   }   return  ans  ;   }   

例题

HDU-2222Keywords Search

HDU-2222Keywords Search

Problem Description 
In the modern time, Search engine came into the life of everybody like Google, Baidu, etc. 
Wiskey also wants to bring this feature to his image retrieval system. 
Every image have a long description, when users type some keywords to find the image, the system will match the keywords with description of image and show the image which the most keywords be matched. 
To simplify the problem, giving you a description of image, and some keywords, you should tell me how many keywords will be match. 
Input 
First line will contain one integer means how many cases will follow by. 
Each case will contain two integers N means the number of keywords and N keywords follow. (N <= 10000) 
Each keyword will only contains characters ‘a’-‘z’, and the length will be not longer than 50. 
The last line is the description, and the length will be not longer than 1000000. 
Output 
Print how many keywords are contained in the description. 
Sample Input 


she 
he 
say 
shr 
her 
yasherhs 
Sample Output 
3

分析 : 
模板题,分析同上

                      #  include   <bits/stdc++.h>    using   namespace  std  ;   typedef   long   long  ll  ;   const   int  maxn  =   1000006   ;   int  trie  [  maxn  ]   [   26   ]   ;   //字典树   int  cnt  [  maxn  ]   ;   //记录单词出现次数   int  fail  [  maxn  ]   ;   //失败时的回溯指针   int  pos  ;   void   insert   (   char   *  p  )   {   int  u  =   0   ;   int  ls  =   strlen   (  p  )   ;   for   (   int  i  =   0   ;  i  <  ls  ;  i  ++   )   {   int  v  =  p  [  i  ]   -   'a'   ;   if   (  trie  [  u  ]   [  v  ]   ==   0   )  trie  [  u  ]   [  v  ]   =   ++  pos  ;  u  =  trie  [  u  ]   [  v  ]   ;   }  cnt  [  u  ]   ++   ;   }   void   getFail   (   )   {  queue  <   int   >  q  ;   for   (   int  i  =   0   ;  i  <   26   ;  i  ++   )   {   if   (  trie  [   0   ]   [  i  ]   )   {  fail  [  trie  [   0   ]   [  i  ]   ]   =   0   ;  q  .   push   (  trie  [   0   ]   [  i  ]   )   ;   }   }   while   (   !  q  .   empty   (   )   )   {   int  cur  =  q  .   front   (   )   ;  q  .   pop   (   )   ;   for   (   int  i  =   0   ;  i  <   26   ;  i  ++   )   {   if   (  trie  [  cur  ]   [  i  ]   )   {  fail  [  trie  [  cur  ]   [  i  ]   ]   =  trie  [  fail  [  cur  ]   ]   [  i  ]   ;  q  .   push   (  trie  [  cur  ]   [  i  ]   )   ;   }   else  trie  [  cur  ]   [  i  ]   =  trie  [  fail  [  cur  ]   ]   [  i  ]   ;   }   }   }   int   query   (   char   *  s  )   {   int  cur  =   0   ,  ans  =   0   ,  ls  =   strlen   (  s  )   ;   for   (   int  i  =   0   ;  i  <  ls  ;  i  ++   )   {  cur  =  trie  [  cur  ]   [  s  [  i  ]   -   'a'   ]   ;   for   (   int  j  =  cur  ;  j  &&  cnt  [  j  ]   ;  j  =  fail  [  j  ]   )   {  ans  +   =  cnt  [  j  ]   ;  cnt  [  j  ]   =   0   ;   }   }   return  ans  ;   }   int   main   (   )   {   int  n  ,  t  ;   char  s  [  maxn  ]   ,  p  [  maxn  ]   ;   //不喜欢传参,全局也行   scanf   (   "%d"   ,   &  t  )   ;   while   (  t  --   )   {   scanf   (   "%d"   ,   &  n  )   ;   memset   (  trie  ,   0   ,   sizeof   (  trie  )   )   ;   memset   (  cnt  ,   0   ,   sizeof   (  cnt  )   )   ;  fail  [   0   ]   =  pos  =   0   ;   for   (   int  i  =   0   ;  i  <  n  ;  i  ++   )   {   scanf   (   "%s"   ,  p  )   ;   insert   (  p  )   ;   }   getFail   (   )   ;   scanf   (   "%s"   ,  s  )   ;   printf   (   "%d\n"   ,   query   (  s  )   )   ;   }   return   0   ;   }   

HDU-2896病毒侵袭

HDU-2896病毒侵袭

Problem Description 
当太阳的光辉逐渐被月亮遮蔽,世界失去了光明,大地迎来最黑暗的时刻。。。。在这样的时刻,人们却异常兴奋——我们能在有生之年看到500年一遇的世界奇观,那是多么幸福的事儿啊~~ 
但网路上总有那么些网站,开始借着民众的好奇心,打着介绍日食的旗号,大肆传播病毒。小t不幸成为受害者之一。小t如此生气,他决定要把世界上所有带病毒的网站都找出来。当然,谁都知道这是不可能的。自学英语教材小t却执意要完成这不能的任务,他说:“子子孙孙无穷匮也!”(愚公后继有人了)。 
万事开头难,小t收集了好多病毒的特征码,又收集了一批诡异网站的源码,他想知道这些网站中哪些是有病毒的,又是带了怎样的病毒呢?顺便还想知道他到底收集了多少带病毒的网站。这时候他却不知道何从下手了。所以想请大家帮帮忙。小t又是个急性子哦,所以解决问题越快越好哦~~ 
Input 
第一行,一个整数N(1<=N<=500),表示病毒特征码的个数。 
接下来N行,每行表示一个病毒特征码,特征码字符串长度在20—200之间。 
每个病毒都有一个编号,依此为1—N。 
不同编号的病毒特征码不会相同。 
在这之后一行,有一个整数M(1<=M<=1000),表示网站数。 
接下来M行,每行表示一个网站源码,源码字符串长度在7000—10000之间。 
每个网站都有一个编号,依此为1—M。 
以上字符串中字符都是ASCII码可见字符(不包括回车)。 
Output 
依次按如下格式输出按网站编号从小到大输出,带病毒的网站编号和包含病毒编号,每行一个含毒网站信息。 
web 网站编号: 病毒编号 病毒编号 … 
冒号后有一个空格,病毒编号按从小到大排列,两个病毒编号之间用一个空格隔开,如果一个网站包含病毒,病毒数不会超过3个。 
最后一行输出统计信息,如下格式 
total: 带病毒网站数 
冒号后有一个空格。 
Sample Input 

aaa 
bbb 
ccc 

aaabbbccc 
bbaacc 
Sample Output 
web 1: 1 2 3 
total: 1

分析 : 
套AC自动机,把P构建字典树,查询时用vector记录经过哪些单词,注意初始化。 
注意有坑最后输出total要带换行,字符串是ASCLL码范围0-126,还有要排序。

                      #  include   <bits/stdc++.h>    using   namespace  std  ;   const   int  maxn  =   100005   ;   int  trie  [  maxn  ]   [   130   ]   ;   int  fail  [  maxn  ]   ,  cnt  [  maxn  ]   ;   int  vis  [  maxn  ]   ,  tag  [  maxn  ]   ;   int  n  ,  m  ,  pos  =   0   ,  total  =   0   ;   char  s  [  maxn  ]   ,  p  [   202   ]   ;  queue  <   int   >  q  ;  vector  <   int   >  ans  ;   void   insert   (   int  idx  )   {   int  u  =   0   ,  lp  =   strlen   (  p  )   ;   for   (   int  i  =   0   ;  i  <  lp  ;  i  ++   )   {   int  v  =  p  [  i  ]   ;   if   (  trie  [  u  ]   [  v  ]   ==   0   )  trie  [  u  ]   [  v  ]   =   ++  pos  ;  u  =  trie  [  u  ]   [  v  ]   ;   }  cnt  [  u  ]   ++   ;  tag  [  u  ]   =  idx  ;   }   void   getfail   (   )   {   for   (   int  i  =   0   ;  i  <   130   ;  i  ++   )   {   if   (  trie  [   0   ]   [  i  ]   )   {  fail  [  trie  [   0   ]   [  i  ]   ]   =   0   ;  q  .   push   (  trie  [   0   ]   [  i  ]   )   ;   }   }   while   (   !  q  .   empty   (   )   )   {   int  cur  =  q  .   front   (   )   ;  q  .   pop   (   )   ;   for   (   int  i  =   0   ;  i  <   130   ;  i  ++   )   {   if   (  trie  [  cur  ]   [  i  ]   )   {  fail  [  trie  [  cur  ]   [  i  ]   ]   =  trie  [  fail  [  cur  ]   ]   [  i  ]   ;  q  .   push   (  trie  [  cur  ]   [  i  ]   )   ;   }   else  trie  [  cur  ]   [  i  ]   =  trie  [  fail  [  cur  ]   ]   [  i  ]   ;   }   }   }   void   query   (   )   {   int  ls  =   strlen   (  s  )   ,  u  =   0   ;   for   (   int  i  =   0   ;  i  <  ls  ;  i  ++   )   {  u  =  trie  [  u  ]   [  s  [  i  ]   ]   ;   for   (   int  j  =  u  ;  j  &&   !  vis  [  j  ]   &&  cnt  [  j  ]   ;  j  =  fail  [  j  ]   )   {  ans  .   push_back   (  tag  [  j  ]   )   ;  vis  [  j  ]   =   1   ;   }   }   }   int   main   (   )   {   scanf   (   "%d"   ,   &  n  )   ;   for   (   int  i  =   1   ;  i  <=  n  ;  i  ++   )   {   scanf   (   "%s"   ,  p  )   ;   insert   (  i  )   ;   }   scanf   (   "%d"   ,   &  m  )   ;   getfail   (   )   ;   for   (   int  i  =   1   ;  i  <=  m  ;  i  ++   )   {  ans  .   clear   (   )   ;   memset   (  vis  ,   0   ,   sizeof   (  vis  )   )   ;   scanf   (   "%s"   ,  s  )   ;   query   (   )   ;   if   (   !  ans  .   empty   (   )   )   {  total  ++   ;   sort   (  ans  .   begin   (   )   ,  ans  .   end   (   )   )   ;   printf   (   "web %d:"   ,  i  )   ;   for   (   int  j  =   0   ;  j  <  ans  .   size   (   )   ;  j  ++   )   printf   (   " %d"   ,  ans  [  j  ]   )   ;   printf   (   "\n"   )   ;   }   }   printf   (   "total: %d\n"   ,  total  )   ;   return   0   ;   }   

HDU-3065病毒侵袭持续中

HDU-3065病毒侵袭持续中

Problem Description 
小t非常感谢大家帮忙解决了他的上一个问题。然而病毒侵袭持续中。在小t的不懈努力下,他发现了网路中的“万恶之源”。这是一个庞大的病毒网站,他有着好多好多的病毒,但是这个网站包含的病毒很奇怪,这些病毒的特征码很短,而且只包含“英文大写字符”。当然小t好想好想为民除害,但是小t从来不打没有准备的战争。知己知彼,百战不殆,小t首先要做的是知道这个病毒网站特征:包含多少不同的病毒,每种病毒出现了多少次。大家能再帮帮他吗? 
Input 
第一行,一个整数N(1<=N<=1000),表示病毒特征码的个数。 
接下来N行,每行表示一个病毒特征码,特征码字符串长度在1—50之间,并且只包含“英文大写字符”。任意两个病毒特征码,不会完全相同。 
在这之后一行,表示“万恶之源”网站源码,源码字符串长度在2000000之内。字符串中字符都是ASCII码可见字符(不包括回车)。 
Output 
按以下格式每行一个,输出每个病毒出现次数。未出现的病毒不需要输出。 
病毒特征码: 出现次数 
冒号后有一个空格,按病毒特征码的输入顺序进行输出。 
Sample Input 

AA 
BB 
CC 
ooxxCC%dAAAoen…END 
Sample Output 
AA: 2 
CC: 1 
Hint 
Hit: 
题目描述中没有被提及的所有情况都应该进行考虑。比如两个病毒特征码可能有相互包含或者有重叠的特征码段。 
计数策略也可一定程度上从Sample中推测。

分析 : 
统计各单词出现次数,记录各单词末尾节点对应单词编号,查询时若经过则维护更新对应单词数量。 
注意数组大小和初始化,还有神坑 多组数据 

                      #  include   <bits/stdc++.h>    using   namespace  std  ;   const   int  maxn  =   50004   ;   int  trie  [  maxn  ]   [   130   ]   ;   int  fail  [  maxn  ]   ,  num  [  maxn  ]   ,  cnt  [  maxn  ]   ;   char  p  [   1003   ]   [   55   ]   ,  s  [   2000006   ]   ;   int  n  ,  pos  ;   void   insert   (   int  idx  )   {   int  lp  =   strlen   (  p  [  idx  ]   )   ,  u  =   0   ;   for   (   int  i  =   0   ;  i  <  lp  ;  i  ++   )   {   int  v  =  p  [  idx  ]   [  i  ]   ;   if   (  trie  [  u  ]   [  v  ]   ==   0   )  trie  [  u  ]   [  v  ]   =   ++  pos  ;  u  =  trie  [  u  ]   [  v  ]   ;   }  num  [  u  ]   =  idx  ;   //对应单词编号   }   void   getfail   (   )   {  queue  <   int   >  q  ;   for   (   int  i  =   0   ;  i  <   130   ;  i  ++   )   {   if   (  trie  [   0   ]   [  i  ]   )   {  fail  [  trie  [   0   ]   [  i  ]   ]   =   0   ;  q  .   push   (  trie  [   0   ]   [  i  ]   )   ;   }   }   while   (   !  q  .   empty   (   )   )   {   int  cur  =  q  .   front   (   )   ;  q  .   pop   (   )   ;   for   (   int  i  =   0   ;  i  <   130   ;  i  ++   )   {   if   (  trie  [  cur  ]   [  i  ]   )   {  fail  [  trie  [  cur  ]   [  i  ]   ]   =  trie  [  fail  [  cur  ]   ]   [  i  ]   ;  q  .   push   (  trie  [  cur  ]   [  i  ]   )   ;   }   else  trie  [  cur  ]   [  i  ]   =  trie  [  fail  [  cur  ]   ]   [  i  ]   ;   }   }   }   void   query   (   )   {   int  u  =   0   ,  ls  =   strlen   (  s  )   ;   for   (   int  i  =   0   ;  i  <  ls  ;  i  ++   )   {  u  =  trie  [  u  ]   [  s  [  i  ]   ]   ;   for   (   int  j  =  u  ;  j  ;  j  =  fail  [  j  ]   )   {  cnt  [  num  [  j  ]   ]   ++   ;   //经过时更新数量++   }   }   }   int   main   (   )   {   while   (   ~   scanf   (   "%d"   ,   &  n  )   )   {   memset   (  trie  ,   0   ,   sizeof   (  trie  )   )   ;   memset   (  cnt  ,   0   ,   sizeof   (  cnt  )   )   ;   memset   (  num  ,   0   ,   sizeof   (  num  )   )   ;  pos  =   0   ;   for   (   int  i  =   1   ;  i  <=  n  ;  i  ++   )   {   scanf   (   "%s"   ,  p  [  i  ]   )   ;   insert   (  i  )   ;   }   getfail   (   )   ;   scanf   (   "%s"   ,  s  )   ;   query   (   )   ;   for   (   int  i  =   1   ;  i  <=  n  ;  i  ++   )   if   (  cnt  [  i  ]   )   printf   (   "%s: %d\n"   ,  p  [  i  ]   ,  cnt  [  i  ]   )   ;   }   return   0   ;   }   

POJ-2778DNA Sequence

POJ-2778DNA Sequence

Description 
It’s well known that DNA Sequence is a sequence only contains A, C, T and G, and it’s very useful to analyze a segment of DNA Sequence,For example, if a animal’s DNA sequence contains segment ATC then it may mean that the animal may have a genetic disease. Until now scientists have found several those segments, the problem is how many kinds of DNA sequences of a species don’t contain those segments. 
Suppose that DNA sequences of a species is a sequence that consist of A, C, T and G,and the length of sequences is a given integer n. 
Input 
First line contains two integer m (0 <= m <= 10), n (1 <= n <=2000000000). Here, m is the number of genetic disease segment, and n is the length of sequences. 
Next m lines each line contain a DNA genetic disease segment, and length of these segments is not larger than 10. 
Output 
An integer, the number of DNA sequences, mod 100000. 
Sample Input 
4 3 
AT 
AC 
AG 
AA 
Sample Output 
36

分析 : 
求不包含病毒串的长度为n的DNA串有多少种。 
AC自动机+ 矩阵快速幂 :把病毒串建字典树,并在末尾标记;题目保证节点数不超过100,那么就可以用一个二维矩阵, i i i 行 j j j 列的值表示从节点 i i i 转移到节点 j j j 的方案数,然后进行 n n n 次转移,即该矩阵的 n n n 次幂(离散数学结论),然后统计从根节点0到其他节点的方案数(即所有长度为n的合法串)。 
注意long long。

                      #  include   <cstdio>    #  include   <cstring>    #  include   <queue>    #  include   <map>    using   namespace  std  ;   const   int  maxn  =   102   ;   const   int  mod  =   100000   ;   int  trie  [  maxn  ]   [   4   ]   ,  fail  [  maxn  ]   ,  tail  [  maxn  ]   ;   int  n  ,  m  ,  pos  ;   char  s  [   15   ]   ;  map  <   char   ,   int   >  idx  ;   void   insert   (   )   {   int  ls  =   strlen   (  s  )   ,  u  =   0   ;   for   (   int  i  =   0   ;  i  <  ls  ;  i  ++   )   {   int  v  =  idx  [  s  [  i  ]   ]   ;   if   (  trie  [  u  ]   [  v  ]   ==   0   )  trie  [  u  ]   [  v  ]   =   ++  pos  ;  u  =  trie  [  u  ]   [  v  ]   ;   }  tail  [  u  ]   =   1   ;   }   void   getfail   (   )   {  queue  <   int   >  q  ;   for   (   int  i  =   0   ;  i  <   4   ;  i  ++   )   {   if   (  trie  [   0   ]   [  i  ]   )   {  fail  [  trie  [   0   ]   [  i  ]   ]   =   0   ;  q  .   push   (  trie  [   0   ]   [  i  ]   )   ;   }   }   while   (   !  q  .   empty   (   )   )   {   int  cur  =  q  .   front   (   )   ;  q  .   pop   (   )   ;   for   (   int  i  =   0   ;  i  <   4   ;  i  ++   )   {   if   (  trie  [  cur  ]   [  i  ]   )   {  fail  [  trie  [  cur  ]   [  i  ]   ]   =  trie  [  fail  [  cur  ]   ]   [  i  ]   ;  q  .   push   (  trie  [  cur  ]   [  i  ]   )   ;   }   else  trie  [  cur  ]   [  i  ]   =  trie  [  fail  [  cur  ]   ]   [  i  ]   ;  tail  [  trie  [  cur  ]   [  i  ]   ]   |   =  tail  [  trie  [  fail  [  cur  ]   ]   [  i  ]   ]   ;   //注意是或,只要包含病毒就不行   }   }   }   struct  matrix  {   long   long  a  [  maxn  ]   [  maxn  ]   ;   matrix   (   )   {   memset   (  a  ,   0   ,   sizeof   (  a  )   )   ;   }   }   ;  matrix  operator   *   (   const  matrix  &  x  ,   const  matrix  &  y  )   {  matrix  m  ;   for   (   int  i  =   0   ;  i  <=  pos  ;   ++  i  )   for   (   int  j  =   0   ;  j  <=  pos  ;   ++  j  )   for   (   int  k  =   0   ;  k  <=  pos  ;   ++  k  )  m  .  a  [  i  ]   [  j  ]   =   (  m  .  a  [  i  ]   [  j  ]   +  x  .  a  [  i  ]   [  k  ]   *  y  .  a  [  k  ]   [  j  ]   )   %  mod  ;   return  m  ;   }  matrix  fastm   (  matrix a  ,   int  n  )   {  matrix res  ;   for   (   int  i  =   0   ;  i  <=  pos  ;   ++  i  )  res  .  a  [  i  ]   [  i  ]   =   1   ;   while   (  n  )   {   if   (  n  &   1   )  res  =  res  *  a  ;  a  =  a  *  a  ;  n  >>=   1   ;   }   return  res  ;   }   int   main   (   )   {  idx  [   'A'   ]   =   0   ,  idx  [   'C'   ]   =   1   ;  idx  [   'T'   ]   =   2   ,  idx  [   'G'   ]   =   3   ;   while   (   ~   scanf   (   "%d%d"   ,   &  m  ,   &  n  )   )   {  pos  =   0   ;   memset   (  trie  ,   0   ,   sizeof   (  trie  )   )   ;   memset   (  tail  ,   0   ,   sizeof   (  tail  )   )   ;   while   (  m  --   )   {   scanf   (   "%s"   ,  s  )   ;   insert   (   )   ;   }   getfail   (   )   ;  matrix x  ;   for   (   int  i  =   0   ;  i  <=  pos  ;   ++  i  )   //构建初始矩阵   if   (   !  tail  [  i  ]   )   //如果本身不含病毒   for   (   int  j  =   0   ;  j  <   4   ;   ++  j  )   if   (   !  tail  [  trie  [  i  ]   [  j  ]   ]   )   //其子节点也不含病毒  x  .  a  [  i  ]   [  trie  [  i  ]   [  j  ]   ]   ++   ;   //那么节点i到该子节点是可行方案+1  x  =   fastm   (  x  ,  n  )   ;   int  ans  =   0   ;   for   (   int  i  =   0   ;  i  <=  pos  ;   ++  i  )   //统计  ans  =   (  ans  +  x  .  a  [   0   ]   [  i  ]   )   %  mod  ;   printf   (   "%d\n"   ,  ans  )   ;   }   return   0   ;   }   

HDU-2296Ring

HDU-2296Ring

Problem Description 
For the hope of a forever love, Steven is planning to send a ring to Jane with a romantic string engraved on. The string’s length should not exceed N. The careful Steven knows Jane so deeply that he knows her favorite words, such as “love”, “forever”. Also, he knows the value of each word. The higher value a word has the more joy Jane will get when see it. 
The weight of a word is defined as its appeared times in the romantic string multiply by its value, while the weight of the romantic string is defined as the sum of all words’ weight. You should output the string making its weight maximal. 
Input 
The input consists of several test cases. The first line of input consists of an integer T, indicating the number of test cases. Each test case starts with a line consisting of two integers: N, M, indicating the string’s length and the number of Jane’s favorite words. Each of the following M lines consists of a favorite word Si. The last line of each test case consists of M integers, while the i-th number indicates the value of Si. 
Technical Specification

  1. T ≤ 15
  2. 0 < N ≤ 50, 0 < M ≤ 100.
  3. The length of each word is less than 11 and bigger than 0.
  4. 1 ≤ Hi ≤ 100.
  5. All the words in the input are different.
  6. All the words just consist of ‘a’ - ‘z’.

Output 
For each test case, output the string to engrave on a single line. 
If there’s more than one possible answer, first output the shortest one. If there are still multiple solutions, output the smallest in lexicographically order. 
The answer may be an empty string. 
Sample Input 

7 2 
love 
ever 
5 5 
5 1 
ab 

Sample Output 
lovever 
abab 
Hint 
Sample 1: weight(love) = 5, weight(ever) = 5, so weight(lovever) = 5 + 5 = 10 
Sample 2: weight(ab) = 2 * 5 = 10, so weight(abab) = 10

分析 : 
AC自动机+DP:给出若干待权值的模式串,输出长度不超过n的最大权值且字典序最小的S。先建AC自动机,定义状态 d p [ s t e p , u ] dp[step,u] ] 表示长度为step、在u节点上的最大权值。状态转移方程为

  • d p [ s t e p , u ] = m a x ( d p [ s t e p − 1 , v ] + c o s t [ u ] ) dp[step,u]=max(dp[step-1,v]+cost[u]) a− 
    其中,v为能到达u的前一个节点,cost是权值。
                      #  include   <bits/stdc++.h>    using   namespace  std  ;   const   int  maxn  =   1003   ;   int  trie  [  maxn  ]   [   26   ]   ,  fail  [  maxn  ]   ;   int  val  [  maxn  ]   ,  cost  [   102   ]   ,  dp  [   55   ]   [  maxn  ]   ;   char  h  [   105   ]   [   15   ]   ;  string path  [   55   ]   [  maxn  ]   ;   int  t  ,  n  ,  m  ,  pos  ;   void   insert   (   char   *  s  ,   int  idx  )   {   int  ls  =   strlen   (  s  )   ,  u  =   0   ;   for   (   int  i  =   0   ;  i  <  ls  ;   ++  i  )   {   int  v  =  s  [  i  ]   -   'a'   ;   if   (  trie  [  u  ]   [  v  ]   ==   0   )  trie  [  u  ]   [  v  ]   =   ++  pos  ;  u  =  trie  [  u  ]   [  v  ]   ;   }  val  [  u  ]   =  cost  [  idx  ]   ;   }   void   getfail   (   )   {  queue  <   int   >  q  ;  fail  [   0   ]   =   0   ;   for   (   int  i  =   0   ;  i  <   26   ;   ++  i  )   {   if   (  trie  [   0   ]   [  i  ]   )   {  fail  [  trie  [   0   ]   [  i  ]   ]   =   0   ;  q  .   push   (  trie  [   0   ]   [  i  ]   )   ;   }   }   while   (   !  q  .   empty   (   )   )   {   int  cur  =  q  .   front   (   )   ;  q  .   pop   (   )   ;   for   (   int  i  =   0   ;  i  <   26   ;   ++  i  )   {   if   (  trie  [  cur  ]   [  i  ]   )   {  fail  [  trie  [  cur  ]   [  i  ]   ]   =  trie  [  fail  [  cur  ]   ]   [  i  ]   ;  q  .   push   (  trie  [  cur  ]   [  i  ]   )   ;   }   else  trie  [  cur  ]   [  i  ]   =  trie  [  fail  [  cur  ]   ]   [  i  ]   ;   }   }   }   bool   cmp   (  string s  ,  string t  )   {   if   (  t  ==   ""   )   return   true   ;   if   (  s  .   size   (   )   <  t  .   size   (   )   )   return   true   ;   if   (  s  .   size   (   )   >  t  .   size   (   )   )   return   false   ;   return  s  <  t  ;   }  string  solve   (   )   {   for   (   int  i  =   0   ;  i  <=  n  ;   ++  i  )   for   (   int  j  =   0   ;  j  <=  pos  ;   ++  j  )  path  [  i  ]   [  j  ]   =   ""   ;   int  mx  =   0   ;   for   (   int  i  =   0   ;  i  <=  n  ;   ++  i  )   {   for   (   int  j  =   0   ;  j  <=  pos  ;   ++  j  )   {   if   (  dp  [  i  ]   [  j  ]   ==   -   1   )   continue   ;   for   (   int  k  =   0   ;  k  <   26   ;   ++  k  )   {   if   (  dp  [  i  +   1   ]   [  trie  [  j  ]   [  k  ]   ]   <  dp  [  i  ]   [  j  ]   +  val  [  trie  [  j  ]   [  k  ]   ]   )   {  dp  [  i  +   1   ]   [  trie  [  j  ]   [  k  ]   ]   =  dp  [  i  ]   [  j  ]   +  val  [  trie  [  j  ]   [  k  ]   ]   ;  path  [  i  +   1   ]   [  trie  [  j  ]   [  k  ]   ]   =  path  [  i  ]   [  j  ]   +   (   char   )   (   'a'   +  k  )   ;   }   else   if   (  dp  [  i  +   1   ]   [  trie  [  j  ]   [  k  ]   ]   ==  dp  [  i  ]   [  j  ]   +  val  [  trie  [  j  ]   [  k  ]   ]   )   {   if   (   cmp   (  path  [  i  ]   [  j  ]   +   (   char   )   (   'a'   +  k  )   ,  path  [  i  +   1   ]   [  trie  [  j  ]   [  k  ]   ]   )   )  path  [  i  +   1   ]   [  trie  [  j  ]   [  k  ]   ]   =  path  [  i  ]   [  j  ]   +   (   char   )   (   'a'   +  k  )   ;   }   }   if   (  i  >   0   )  mx  =   max   (  mx  ,  dp  [  i  ]   [  j  ]   )   ;   }   }   if   (  mx  ==   0   )   return   ""   ;  string res  =   ""   ;   for   (   int  i  =   1   ;  i  <=  n  ;   ++  i  )   for   (   int  j  =   0   ;  j  <=  pos  ;   ++  j  )   {   if   (  dp  [  i  ]   [  j  ]   ==  mx  &&   cmp   (  path  [  i  ]   [  j  ]   ,  res  )   )   {  res  =  path  [  i  ]   [  j  ]   ;   }   }   return  res  ;   }   int   main   (   )   {   scanf   (   "%d"   ,   &  t  )   ;   while   (  t  --   )   {   memset   (  trie  ,   0   ,   sizeof   (  trie  )   )   ;   memset   (  val  ,   0   ,   sizeof   (  val  )   )   ;   memset   (  dp  ,   -   1   ,   sizeof   (  dp  )   )   ;  pos  =  cost  [   0   ]   =  dp  [   0   ]   [   0   ]   =   0   ;   scanf   (   "%d%d"   ,   &  n  ,   &  m  )   ;   for   (   int  i  =   0   ;  i  <  m  ;   ++  i  )   scanf   (   "%s"   ,  h  [  i  ]   )   ;   for   (   int  i  =   1   ;  i  <=  m  ;   ++  i  )   scanf   (   "%d"   ,   &  cost  [  i  ]   )   ;   for   (   int  i  =   0   ;  i  <  m  ;   ++  i  )   insert   (  h  [  i  ]   ,  i  +   1   )   ;   getfail   (   )   ;  cout  <<   solve   (   )   <<   "\n"   ;   }   return   0   ;   }   

原创不易,请勿转载 ( 本不富裕的访问量雪上加霜 ) 
博主首页: https://blog.csdn.net/qq_45034708 
如果文章对你有帮助,记得一键三连❤

posted @ 2020-09-30 15:36  小琪琪来啦  阅读(266)  评论(0编辑  收藏  举报
欢迎大家来到我的博客:武汉雅思 | 武汉托福 | 新航道 | 雅思培训 | dnfsf | 天龙sf | 热血江湖sf | 天龙sf | dnfsf