A Fast String Search Algorithm (Boyer-Moore)
2005-07-05 10:53 灵感之源
On CodeProject, in the article "Efficient Boyer-Moore Search in Unicode Strings", the author leseul demonstrates the power of the Boyer-Moore algorithm. The code can be downloaded here:
Download source - 10.2 Kb
Download demo project - 5.18 Kb
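For readers who have not seen the algorithm, here is a minimal sketch of its core idea: compare the pattern right to left and, on a mismatch, skip ahead by more than one position. This is the simplified Boyer-Moore-Horspool variant (bad-character shifts only), not leseul's implementation. The class name HorspoolSketch and its IndexOf method are hypothetical names chosen for illustration, and the sketch assumes a non-negative startIndex and an ordinal, case-sensitive comparison.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: Boyer-Moore-Horspool (bad-character shifts only).
public static class HorspoolSketch
{
    // Returns the index of the first occurrence of pattern in text at or
    // after startIndex, or -1 if there is none (ordinal, case-sensitive).
    public static int IndexOf(string text, string pattern, int startIndex)
    {
        int m = pattern.Length;
        if (m == 0) return startIndex <= text.Length ? startIndex : -1;

        // For every character in pattern[0..m-2], record how far the window
        // may move when that character sits under the window's last slot.
        // A dictionary is used because char covers 65,536 possible values.
        Dictionary<char, int> shift = new Dictionary<char, int>();
        for (int i = 0; i < m - 1; i++)
            shift[pattern[i]] = m - 1 - i;

        int pos = startIndex;
        while (pos <= text.Length - m)
        {
            // Compare the window against the pattern from right to left.
            int j = m - 1;
            while (j >= 0 && text[pos + j] == pattern[j]) j--;
            if (j < 0) return pos; // every character matched

            // Shift according to the text character under the window's last
            // slot; a character absent from the table allows a full shift of m.
            int s;
            pos += shift.TryGetValue(text[pos + m - 1], out s) ? s : m;
        }
        return -1;
    }
}
```

With the test strings used below, HorspoolSketch.IndexOf("AbCaBc", "aBc", 0) skips the first three characters in a single step, because 'C' does not occur in the shift table.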
I wrote a performance test:
using System;
using System.Text;

// CIBMSearcher is assumed to come from the CodeProject download above;
// HiPerfTimer is a high-resolution timer class that is not shown in this post.
public class Program
{
    public static void Main()
    {
        string pattern = "AbC";
        string target = "AbCaBc";
        string pressure;

        // Build a large test string by repeating the target 10,000,000 times.
        StringBuilder b = new StringBuilder();
        int count = 10000000;
        for (int i = 0; i < count; i++)
        {
            b.Append(target);
        }
        pressure = b.ToString();

        // BM, case-insensitive
        HiPerfTimer time = new HiPerfTimer();
        time.Start();
        CIBMSearcher BMS = new CIBMSearcher(pattern, false);
        int index = BMS.Search(pressure, 0);
        while (index >= 0)
        {
            index = BMS.Search(pressure, index + pattern.Length);
        }
        time.Stop();
        Console.WriteLine("BM without case senstive:" + time.Duration);
        GC.Collect();

        // BM, case-sensitive
        time = new HiPerfTimer();
        time.Start();
        BMS = new CIBMSearcher(pattern, true);
        index = BMS.Search(pressure, 0);
        while (index >= 0)
        {
            index = BMS.Search(pressure, index + pattern.Length);
        }
        time.Stop();
        Console.WriteLine("BM with case senstive:" + time.Duration);
        GC.Collect();

        // SS (String.IndexOf). Note: IndexOf(string) is a case-sensitive,
        // culture-sensitive search, despite the label printed below.
        time = new HiPerfTimer();
        time.Start();
        index = pressure.IndexOf(pattern);
        while (index >= 0)
        {
            index = pressure.IndexOf(pattern, index + pattern.Length);
        }
        time.Stop();
        Console.WriteLine("SS without case senstive:" + time.Duration);
        GC.Collect();

        Console.ReadLine();
    }
}
The results are as follows:
BM without case senstive:1.2411443536895
BM with case senstive:0.707685620917367
SS without case senstive:1.77157282256596
SS stands for SubString, i.e. the built-in String.IndexOf search.
My machine is a Pentium 4 2.8 GHz with 1 GB of RAM.
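One caveat about the SS measurement: String.IndexOf(string) compares using the current culture by default, which is typically slower than a plain character-by-character comparison. A fairer baseline might pass StringComparison.Ordinal explicitly. The following is only a sketch under that assumption; OrdinalBaselineSketch, CountOrdinal and TimeCountOrdinal are hypothetical names, and System.Diagnostics.Stopwatch stands in for HiPerfTimer.

```csharp
using System;
using System.Diagnostics;

// Hypothetical sketch: an ordinal IndexOf baseline for the benchmark.
public static class OrdinalBaselineSketch
{
    // Counts non-overlapping occurrences of pattern using an ordinal
    // (culture-independent, case-sensitive) comparison.
    public static int CountOrdinal(string text, string pattern)
    {
        if (pattern.Length == 0)
            throw new ArgumentException("pattern must not be empty");

        int hits = 0;
        int index = text.IndexOf(pattern, 0, StringComparison.Ordinal);
        while (index >= 0)
        {
            hits++;
            // Resume the search just past the previous match.
            index = text.IndexOf(pattern, index + pattern.Length, StringComparison.Ordinal);
        }
        return hits;
    }

    // Runs CountOrdinal once and returns the elapsed time in seconds.
    public static double TimeCountOrdinal(string text, string pattern, out int hits)
    {
        Stopwatch watch = Stopwatch.StartNew();
        hits = CountOrdinal(text, pattern);
        watch.Stop();
        return watch.Elapsed.TotalSeconds;
    }
}
```

Counting the matches also gives a quick sanity check that each search variant finds the same number of occurrences before their timings are compared.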
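This gives a glimpse of the power of BM: in this run, the "BM with case senstive" search was roughly 2.5 times faster than IndexOf, and the "BM without case senstive" search about 1.4 times faster. I expect that the function from my earlier post, "An Efficient Case-Insensitive String Replace Function (Comparison of Several Methods)", could be improved considerably with it.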
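To make the idea concrete, here is a rough sketch of how a replace function can be built around a search call. It is not the function from the earlier post; ReplaceSketch and ReplaceIgnoreCase are hypothetical names, and the built-in IndexOf with StringComparison.OrdinalIgnoreCase is used as a stand-in for the searcher. The two marked lines are where a call such as BMS.Search(text, start) from the test above could presumably be substituted.

```csharp
using System;
using System.Text;

// Hypothetical sketch: case-insensitive replace built on a search call.
public static class ReplaceSketch
{
    public static string ReplaceIgnoreCase(string text, string pattern, string replacement)
    {
        if (pattern.Length == 0) return text;

        StringBuilder result = new StringBuilder(text.Length);
        int copyFrom = 0; // start of the text not yet copied to the result

        // Search call #1 (stand-in for a Boyer-Moore searcher).
        int index = text.IndexOf(pattern, 0, StringComparison.OrdinalIgnoreCase);
        while (index >= 0)
        {
            // Copy the unmatched text before the hit, then the replacement.
            result.Append(text, copyFrom, index - copyFrom);
            result.Append(replacement);
            copyFrom = index + pattern.Length;

            // Search call #2 (stand-in for a Boyer-Moore searcher).
            index = text.IndexOf(pattern, copyFrom, StringComparison.OrdinalIgnoreCase);
        }

        // Copy whatever remains after the last match.
        result.Append(text, copyFrom, text.Length - copyFrom);
        return result.ToString();
    }
}
```

Since the loop spends most of its time inside the search call, a faster searcher should translate fairly directly into a faster replace, although that would need to be measured.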
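That is exactly why efficient algorithms matter! I won't study the algorithm itself for now; today is too busy, so I'll see whether I have time to look into it tonight.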
Note: the code was originally posted unformatted because the blog's code-insertion feature was broken and could not be used.
Click here to download my test code.