Keywords Search(AC自动机)

 In the modern time, Search engine came into the life of everybody like Google, Baidu, etc. 
 Wiskey also wants to bring this feature to his image retrieval system. 
 Every image have a long description, when users type some keywords to find the image, the system will match the keywords with description of image and show the image which the most keywords be matched. 
 To simplify the problem, giving you a description of image, and some keywords, you should tell me how many keywords will be match. 

Input

 First line will contain one integer means how many cases will follow by. 
 Each case will contain two integers N means the number of keywords and N keywords follow. (N <= 10000) 
 Each keyword will only contains characters 'a'-'z', and the length will be not longer than 50. 
 The last line is the description, and the length will be not longer than 1000000. 
Output

 Print how many keywords are contained in the description.

Sample Input

1
5
she
he
say
shr
her
yasherhs

Sample Output

3

多模匹配,AC自动机的模板题。

复制代码
 1 #include <bits/stdc++.h>
 2 
 3 using namespace std;
 4 
 5 int tree[500005][28], counword[500005], fail[500005];
 6 int coun;
 7 
 8 void AddWord(char a[], int len)   //建树,把每一个模式串加到树上
 9 {
10     int root, i;
11     root = 0;
12     for(i=0;i<len;i++)
13     {
14         if(tree[root][a[i]-'a']==0) tree[root][a[i]-'a'] = ++coun;   
//tree为0表示为空(就算是root也没表示字母),
//那么就给当前节点赋上++coun,一一对应的标记节点
15 root = tree[root][a[i]-'a']; 16 } 17 counword[root]++; //当前节点表示的单词个数 18 } 19 20 void getfail() //fail指针的意义类似于KMP的next数组,当前节点匹配失败的时候,
//我们可以借助fail到达前缀能对应上当前串后缀的节点,匹配继续
21 { 22 queue <int> q; //每个节点都有自己的fail指针,用队列挨个看 23 int now, i; 24 fail[0] = 0; 25 for(i=0;i<26;i++) //第二层不为空的全部垫在队列里 26 { 27 if(tree[0][i]) 28 { 29 fail[tree[0][i]] = 0; 30 q.push(tree[0][i]); 31 } 32 }
33 while(!q.empty()) 34 { 35 now = q.front(); 36 q.pop(); 37 for(i=0;i<26;i++) 38 { 39 if(tree[now][i])
//now的该子节点不为空,fail指向的是now(i的父亲)的fail的i儿子,节点不为空,有意义,进队列
40 { 41 fail[tree[now][i]] = tree[fail[now]][i]; 42 q.push(tree[now][i]); 43 } 44 else tree[now][i] = tree[fail[now]][i];
//该节点为空,该节点不要,其编号被赋成now(i的父亲)的fail的i儿子,
//节点本来为空,不要进队列,否则会有相同节点重复进队列
45 } 46 } 47 } 48 49 long long query(char a[], int len) //树和fail准备就绪,跑一遍文本串出结果 50 { 51 int now, i, j; 52 long long re; 53 now = 0; 54 re = 0; 55 for(i=0;i<len;i++) 56 { 57 now = tree[now][a[i]-'a']; 58 for(j=now;j&&counword[j]!=-1;j=fail[j]) 59 { 60 re+=counword[j]; 61 counword[j] = -1; 62 } 63 } 64 return re; 65 } 66 67 int main() 68 { 69 int t, n, len, i; 70 long long re; 71 char a[55], txt[1000005]; 72 scanf("%d", &t); 73 while(t--) 74 { 75 coun = 0; 76 memset(counword, 0, sizeof(counword)); 77 memset(tree, 0, sizeof(tree)); 78 scanf("%d", &n); 79 getchar(); 80 for(i=0;i<n;i++) 81 { 82 scanf("%s", a); 83 len = strlen(a); 84 AddWord(a, len); 85 } 86 getfail(); 87 scanf("%s", txt); 88 len = strlen(txt); 89 re = query(txt, len); 90 printf("%lld\n", re); 91 } 92 return 0; 93 }

复制代码

 

单模匹配KMP,多模用字典树+fail



posted @   Xxiaoyu  阅读(472)  评论(0编辑  收藏  举报
编辑推荐:
· 从 HTTP 原因短语缺失研究 HTTP/2 和 HTTP/3 的设计差异
· AI与.NET技术实操系列:向量存储与相似性搜索在 .NET 中的实现
· 基于Microsoft.Extensions.AI核心库实现RAG应用
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
阅读排行:
· winform 绘制太阳,地球,月球 运作规律
· AI与.NET技术实操系列(五):向量存储与相似性搜索在 .NET 中的实现
· 超详细:普通电脑也行Windows部署deepseek R1训练数据并当服务器共享给他人
· 【硬核科普】Trae如何「偷看」你的代码?零基础破解AI编程运行原理
· 上周热点回顾(3.3-3.9)
点击右上角即可分享
微信分享提示