JavaScript Performance Showdown: Counting Occurrences of a Specific Character in a String

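July 15, 2011, 23:34:18
Performance showdown: counting character occurrences in a string
Original post: "javascript 统计哪个字符出现的次数最多–修正版" (JavaScript: finding which character occurs most often, revised version)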

var str = "The officials say tougher legislation is needed because some \
telecommunications companies in recent years have begun new services and made \
system upgrades that create technical obstacles to surveillance. They want to \
increase legal incentives and penalties aimed at pushing carriers like Verizon, \
AT&T, and Comcast to ensure that any network changes will not disrupt their \
ability to conduct wiretaps." +
    "An Obama administration task force that includes officials from the \
Justice and Commerce Departments, the F.B.I. and other agencies recently began \
working on draft legislation to strengthen and expand a 1994 law requiring \
carriers to make sure their systems can be wiretapped. There is not yet \
agreement over the details, according to officials familiar with the \
deliberations, but they said the administration intends to submit a package to \
Congress next year." +
    "To bolster their case, security agencies are citing two previously \
undisclosed episodes in which major carriers were stymied for weeks or even \
months when they tried to comply with court-approved wiretap orders in criminal \
or terrorism investigations, the officials said.",
  count = 0,
  index = 0,
  arrStr = [],
  oLetter = {};
str = str.replace(/\s/g, ''); // Method_3 and the normal method gave wrong results before; it turned out this step was missing

for (var i = 0; i < 5000; i++) { //create a long text
    arrStr.push(str);
}

str = arrStr.join(""); // Why did the original code join with ","? I found it also counted the ",", so I removed the ",".

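if (typeof console === 'undefined') { // shim for hosts that have no console
  console = {
    stacks : [],
    log : function(){
      // join all arguments, because the calls below pass two strings
      console.stacks.push(Array.prototype.slice.call(arguments).join(' '));
    },
    show : function(){
      alert(console.stacks.join('\n'));
      console.stacks = [];
    }
  };
}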

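My method: traverse the string with str.replace(RegExp, Function).
For how str.replace(RegExp, function) works, see my previous post, "JavaScript replace(RegExp, Function) 详解" (a detailed explanation of JavaScript replace(RegExp, Function)).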
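As a quick refresher on that mechanism, here is a minimal sketch of how replace invokes a function once per match. The names calls and recordMatch are made up for this illustration and do not appear in the benchmark code.

```javascript
// Sketch: how String.prototype.replace calls a function replacement.
// `calls` and `recordMatch` are illustrative names only.
var calls = [];

function recordMatch(match, offset) {
  // With no capture groups in the pattern, the callback receives the
  // matched text, its offset in the input, and then the whole input string.
  calls.push(match + '@' + offset);
  return match; // the return value is what gets substituted for the match
}

var result = 'a b a'.replace(/\S/g, recordMatch);
// calls is now ['a@0', 'b@2', 'a@4'] and result is 'a b a'
```

Because the callback runs for every match of a global pattern, it can be used purely for its side effect (tallying), with the returned string thrown away.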

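function method_replace_RegExp_function(){
  function counter(match) {  // called once per match; tallies the matched character
    if(visited[match]){
      visited[match]++;
    } else {
      visited[match] = 1;
    }
    // nothing is returned, so the discarded result string gets "undefined" for every match
  }
  var count = 0, index = 0, visited = {};
  var begin = (new Date()).getTime();

  str.replace(/\S/g, counter); // used only to visit every non-whitespace character
  for (var i in visited) { // find the character with the highest tally
    if (visited[i] > count) {
      count = visited[i];
      index = i;
    }
  }

  var end = + new Date();
  console.log("Method_replace_RegExp_Function:\nMost frequent character: " + index + ", occurring " + count + " times", "elapsed: " + (end - begin) + " ms");
}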
// A plain "normal" method that I thought of afterwards
function method_normal(){
  var count = 0, index = 0, visited = {}, tmp = '';
  var begin = (new Date()).getTime();

  for(var i = 0; i < str.length; i++){ // tally each character in a single pass
    tmp = str.charAt(i);
    if(visited[tmp]){
      visited[tmp]++;
    } else {
      visited[tmp] = 1;
    }
  }
  for (var key in visited) { // find the character with the highest tally
    if (visited[key] > count) {
      count = visited[key];
      index = key;
    }
  }

  var end = + new Date();
  console.log("Method_normal:\nMost frequent character: " + index + ", occurring " + count + " times", "elapsed: " + (end - begin) + " ms");
}

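// method_2 and method_3 are defined in the original post and are not reproduced here
method_2();
method_3();
method_replace_RegExp_function();
method_normal();
if (console.show) { console.show(); } // only needed by browsers without console support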
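
The driver above depends on method_2 and method_3 from the original post, so it cannot run from this listing alone. As a self-contained sanity check, the sketch below runs the two counting strategies from this post on a short sample and confirms that they agree. The helper names (countWithReplace, countWithLoop, mostFrequent) are hypothetical, and this scaled-down version is not the benchmark itself.

```javascript
// Hypothetical helper names; a scaled-down check, not the original benchmark.
function countWithReplace(text) {
  var visited = {};
  text.replace(/\S/g, function (match) {
    visited[match] = (visited[match] || 0) + 1;
    return match; // keep the discarded result string the same as the input
  });
  return visited;
}

function countWithLoop(text) {
  var visited = {};
  for (var i = 0; i < text.length; i++) {
    var ch = text.charAt(i);
    visited[ch] = (visited[ch] || 0) + 1;
  }
  return visited;
}

function mostFrequent(visited) {
  var best = { ch: '', count: 0 };
  for (var key in visited) {
    if (visited[key] > best.count) {
      best = { ch: key, count: visited[key] };
    }
  }
  return best;
}

// Strip whitespace first, as the benchmark does.
var sample = 'needed because'.replace(/\s/g, '');
var viaReplace = mostFrequent(countWithReplace(sample));
var viaLoop = mostFrequent(countWithLoop(sample));
console.log(viaReplace.ch, viaReplace.count, viaLoop.ch, viaLoop.count);
// prints: e 5 e 5
```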

Output in several environments:

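Maxthon 3.1.3.600
Method_2:
Most frequent character: e, occurring 610000 times elapsed: 7128 ms
Method_3:
Most frequent character: e, occurring 610000 times elapsed: 6757 ms
Method_replace_RegExp_Function:
Most frequent character: e, occurring 610000 times elapsed: 4399 ms
Method_normal:
Most frequent character: e, occurring 610000 times elapsed: 5925 ms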

Node.exe 2011.07.14 v0.5.1 http://nodejs.org
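> method_2();
Method_2:
Most frequent character: e, occurring 610000 times elapsed: 3141 ms
> method_3();
Method_3:
Most frequent character: e, occurring 610000 times elapsed: 1560 ms
> //method_replace_RegExp_function();
// this one simply dies...
> method_normal();
Method_normal:
Most frequent character: e, occurring 610000 times elapsed: 1045 ms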

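Firefox 3.6.3, Firebug 1.7.3
Method_2:
Most frequent character: e, occurring 610000 times elapsed: 12046 ms
Method_3:
Most frequent character: e, occurring 610000 times elapsed: 10488 ms
Method_replace_RegExp_Function:
Most frequent character: e, occurring 610000 times elapsed: 6836 ms
Method_normal:
Most frequent character: e, occurring 610000 times elapsed: 5351 ms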

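IE9:
LOG: Method_2:
Most frequent character: e, occurring 610000 times elapsed: 18411 ms
LOG: Method_3:
Most frequent character: e, occurring 610000 times elapsed: 10968 ms
LOG: Method_replace_RegExp_Function:
Most frequent character: e, occurring 610000 times elapsed: 1651 ms
LOG: Method_normal:
Most frequent character: e, occurring 610000 times elapsed: 12339 ms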

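Summary: do not put blind faith in the search power of regular expressions. Every regex match is itself a loop over the string,
so keep the number of regex matches small; making good use of String.replace(RegExp, Function) is the efficient choice.
In the results above it was the fastest method in Maxthon and IE9, although the plain loop was faster in Firefox 3.6.3 and in Node, where the replace version died.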
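
To make the "each match is a loop" point concrete, here is a hedged sketch that contrasts one regex scan per distinct character with a single replace pass. countPerCharRegex is a hypothetical approach written for this illustration; it is not method_2 or method_3 from the original post, and the scans field is just a counter added to show how many times the string gets walked.

```javascript
// Hypothetical illustration; not method_2 or method_3 from the original post.
function countPerCharRegex(text) {
  var counts = {}, scans = 0;
  for (var i = 0; i < text.length; i++) {
    var ch = text.charAt(i);
    if (counts[ch] === undefined) {
      // Escape regex metacharacters so that, e.g., '.' matches literally.
      var pattern = new RegExp(ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'g');
      counts[ch] = text.match(pattern).length; // one full scan of `text`
      scans++;
    }
  }
  return { counts: counts, scans: scans };
}

function countSinglePass(text) {
  var counts = {};
  text.replace(/\S/g, function (match) {
    counts[match] = (counts[match] || 0) + 1;
    return match;
  });
  return { counts: counts, scans: 1 }; // a single global replace walks the string once
}

var perChar = countPerCharRegex('AT&T.');
var single = countSinglePass('AT&T.');
// perChar.scans is 4 (one scan per distinct character: A, T, &, .)
// single.scans is 1; both report counts.T === 2
```

With an alphabet of a few dozen distinct characters, the per-character version rescans a long text a few dozen times, which is the cost the summary warns about.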

 

http://www.cn09.com/thread-150-1-1.html

posted @ 2011-07-16 19:13  Arliang