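KMP vs. Sunday: a comparison
For an explanation of the Sunday algorithm, see: http://blog.csdn.net/caianye/article/details/6096610
Please credit the source when reposting what follows. By CrazyAC
1. Finding the first position at which the pattern occurs in the text
Case 1: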
char src[]="jfkdsahdiojdaigfgthlipjgffg";
char des[]="gffg";
KMP: (49 comparisons)
#include <cstdio>
#include <cstring>

// No "using namespace std" here: with a modern compiler, a global array
// named next would clash with std::next.
char src[] = "jfkdsahdiojdaigfgthlipjgffg";
char des[] = "gffg";
int next[10];   // holds next[0..m], so the pattern may have at most 9 characters
int n, m;

// Build the KMP next (failure) table for the pattern des.
void getNext() {
    int i = 0, j = -1;
    n = (int)std::strlen(src);
    m = (int)std::strlen(des);
    next[0] = -1;
    while (i < m) {
        if (j == -1 || des[i] == des[j]) {
            i++;
            j++;
            next[i] = j;
        } else {
            j = next[j];
        }
    }
}

// Search for des in src and count the loop iterations in cnt.
void solve() {
    int i = 0, j = 0;
    int cnt = 0;
    while (i < n && j < m) {
        cnt++;
        if (j == -1 || src[i] == des[j]) {
            i++;
            j++;
        } else {
            j = next[j];
        }
    }
    if (j == m)
        std::printf("%d\n", i - j + 1);   // 1-based position of the match
    else
        std::printf("-1\n");
    std::printf("cnt = %d\n", cnt);
}

int main() {
    getNext();
    solve();
    return 0;
}
Sunday: (10 comparisons)
#include <cstdio>
#include <cstring>

char src[] = "jfkdsahdiojdaigfgthlipjgffg";
char des[] = "gffg";
int next[26], cnt;   // shift table; assumes only lowercase letters 'a'..'z'

int sunday() {
    int i, pos = 0;
    int len_s = (int)std::strlen(src);
    int len_d = (int)std::strlen(des);
    // Characters absent from the pattern shift the window past themselves.
    for (i = 0; i < 26; ++i)
        next[i] = len_d + 1;
    // Otherwise shift by the distance from the rightmost occurrence to the end, plus 1.
    for (i = 0; i < len_d; ++i)
        next[des[i] - 'a'] = len_d - i;
    while (pos < len_s - len_d + 1) {
        for (i = 0; i < len_d; ++i) {
            ++cnt;
            if (src[pos + i] != des[i])
                break;
        }
        if (i == len_d)
            return pos;                  // 0-based position of the match
        if (pos + len_d >= len_s)
            break;                       // no character after the window: stop
        pos += next[src[pos + len_d] - 'a'];
    }
    return -1;
}

int main() {
    cnt = 0;
    std::printf("%d\n", sunday());
    std::printf("cnt = %d\n", cnt);
    return 0;
}
Sunday clearly comes out ahead here. But if the test data is instead
char src[]="aaaaaaaaaaaaaaaaaaaaba";
char des[]="aaaaaaaba";
KMP needs 35 comparisons, while Sunday needs 105!
2. Counting how many times the pattern occurs in the text (hdoj 1686)
Sunday: Time Limit Exceeded
#include <cstdio>
#include <cstring>

char src[1000005], des[10005];
int next[26];   // int rather than char: shifts go up to len_d + 1, which overflows a char

int sunday() {
    int i, pos = 0, sum = 0;
    int len_s = (int)std::strlen(src);
    int len_d = (int)std::strlen(des);
    for (i = 0; i < 26; ++i)
        next[i] = len_d + 1;
    for (i = 0; i < len_d; ++i)
        next[des[i] - 'A'] = len_d - i;   // input consists of uppercase letters
    while (pos < len_s - len_d + 1) {
        for (i = 0; i < len_d; ++i) {
            if (src[pos + i] != des[i])
                break;
        }
        if (i == len_d)
            ++sum;                        // one more occurrence
        if (pos + len_d >= len_s)
            break;                        // no character after the window: stop
        pos += next[src[pos + len_d] - 'A'];
    }
    return sum;
}

int main() {
    // freopen( "c:/aaa.txt", "r", stdin);
    int T;
    std::scanf("%d", &T);
    while (T--) {
        std::scanf("%s %s", des, src);
        std::printf("%d\n", sunday());
    }
    return 0;
}
KMP: 93 ms
#include <cstdio>
#include <cstring>

// No "using namespace std" here: with a modern compiler, a global array
// named next would clash with std::next.
char src[1000010], des[10010];
int next[10010];
int len_d, len_s;

// Build the KMP next (failure) table for the pattern des.
void getNext() {
    int i = 0, j = -1;
    next[0] = -1;
    while (i < len_d) {
        if (j == -1 || des[i] == des[j]) {
            i++;
            j++;
            next[i] = j;
        } else {
            j = next[j];
        }
    }
}

// Count all occurrences of des in src, overlapping ones included.
void solve() {
    int i = 0, j = 0, sum = 0;
    while (i < len_s) {
        if (j == -1 || src[i] == des[j]) {
            i++;
            j++;
        } else {
            j = next[j];
        }
        if (j == len_d) {
            ++sum;
            j = next[j];   // fall back so overlapping occurrences are found
        }
    }
    std::printf("%d\n", sum);
}

int main() {
    // freopen( "c:/aaa.txt", "r", stdin);
    int T;
    std::scanf("%d", &T);
    while (T--) {
        std::scanf("%s %s", des, src);
        len_d = (int)std::strlen(des);
        len_s = (int)std::strlen(src);
        getNext();
        solve();
    }
    return 0;
}
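In summary, Sunday's advantage does not show up in competitive programming. ACM-style contests put heavy weight on running time, so the test data is built to push solutions against the time limit, and exactly that kind of data tends to hit Sunday's worst case.
Take the data set from above: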
char src[]="aaaaaaaaaaaaaaaaaaaaba";
char des[]="aaaaaaaba";
Under Sunday's rule, matching proceeds character by character until the pattern's b is reached:
aaaaaaaaaaaaaaaaaaaaba
aaaaaaaba
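Here a mismatch occurs (text 'a' against pattern 'b'). In Sunday, the shift is determined by the text character just past the current window: shift = (distance from that character's rightmost occurrence in the pattern to the end of the pattern) + 1. That character is 'a', and the rightmost 'a' is the last character of the pattern, so the distance is 0 and the shift is 1.
Matching therefore starts over from the second 'a' of the text:
aaaaaaaaaaaaaaaaaaaaba
 aaaaaaaba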
This is hardly any better than brute force.
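With KMP, when the mismatch at b occurs, i = j = 7. Then j = next[j] = 6, so the next step only compares src[i] with des[j]:
aaaaaaaaaaaaaaaaaaaaba
 aaaaaaaba
The comparisons for the pattern's six leading a's, des[0..5], are skipped because they are already known to match, whereas Sunday performs them all over again.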