Comparing regular-expression speed in C++ and Perl

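I wrote two pairs of programs; the two programs in each pair do the same thing:

testv.pl vs testv.cpp (count the distinct sequences with a hash table)

testreg.pl vs testreg.cpp (run a regular-expression search on each sequence)

The code is as follows: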

////////testreg.cpp/////////
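#include<fstream>
#include<iostream>
#include<regex>
#include<string>
using namespace std;
int main(int argc, char ** argv)
{
   if(argc < 2)
   {
       cerr<<"usage: "<<argv[0]<<" <fastq file>"<<endl;
       return 1;
   }
   fstream in(argv[1], fstream::in);
   int line_count = 0;   // number of lines read so far
   string line_content;
   regex reg("[ATCG]");  // matches any line containing A, T, C or G
   while(getline(in, line_content))
   {
       line_count++;
       // line 2 of every 4-line FASTQ record is the sequence
       if(line_count % 4 == 2)
       {
           if(regex_search(line_content, reg))
           {
               cout<<1<<endl;  // endl flushes the stream on every line
           }
       }
   }
   return 0;
}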
////////testreg.pl/////////
#!/usr/bin/perl
use strict;
use 5.010;
my $file = shift;
open SEQ, '<', $file or die "$!";
while(<SEQ>) {
   chomp;
   if($. % 4 == 2) {
       if(/[ATCG]/) {
           say 1;
       }
   }
}
////////testv.cpp/////////
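#include<fstream>
#include<iostream>
#include<string>
#include<unordered_map>
using namespace std;
int main(int argc, char ** argv)
{
   if(argc < 2)
   {
       cerr<<"usage: "<<argv[0]<<" <fastq file>"<<endl;
       return 1;
   }
   fstream in(argv[1], fstream::in);
   int line_count = 0;   // number of lines read so far
   string line_content;
   typedef unordered_map<string, int> mapdef;
   mapdef mymap;         // sequence -> number of occurrences
   while(getline(in, line_content))
   {
       line_count++;
       // line 2 of every 4-line FASTQ record is the sequence
       if(line_count % 4 == 2)
       {
           mymap[line_content]++;
       }
   }
   // number of distinct sequences
   cout<<mymap.size()<<endl;
   return 0;
}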
////////testv.pl/////////
#!/usr/bin/perl

use strict;
use 5.010;

my $file = shift;
open SEQ, '<', $file or die "$!";
my %hash;
while(<SEQ>) {
   chomp;
   if($. % 4 == 2) {
       $hash{$_}++;
   }
}
say scalar(keys %hash);

I measured the run time of each program with the shell's time command. The commands and results:

time perl testv.pl Input

# a.out built from testv.cpp
time ./a.out Input

time perl testreg.pl Input | wc -l

# a.out built from testreg.cpp
time ./a.out Input | wc -l

program        real        user        sys
testv.pl       0m0.141s    0m0.121s    0m0.011s
testv.cpp      0m0.077s    0m0.054s    0m0.012s
testreg.pl     0m0.142s    0m0.122s    0m0.006s
testreg.cpp    0m0.251s    0m0.104s    0m0.137s
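
The time figures above cover the whole process, including startup, file reading and output. To look at the matching step alone, one option is to time it inside the program. The following is a minimal sketch, assuming the lines are already in memory; count_matches and mean_ms are hypothetical helper names, not part of the programs above, and no results from it are reported here.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: count the lines that contain at least one of
// A, T, C, G, using the same pattern as testreg.cpp.
std::size_t count_matches(const std::vector<std::string>& lines) {
    static const std::regex reg("[ATCG]");
    std::size_t hits = 0;
    for (const std::string& line : lines) {
        if (std::regex_search(line, reg)) ++hits;
    }
    return hits;
}

// Hypothetical helper: call f() `reps` times and return the mean
// wall-clock time per call in milliseconds.
template <typename F>
double mean_ms(int reps, F f) {
    assert(reps > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count() / reps;
}
```

A call such as mean_ms(10, [&] { count_matches(lines); }) would then give the average cost of one pass over lines, without the I/O that time also counts.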

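Here Input is a FASTQ file containing 54914 DNA sequences. Each FASTQ record spans four lines with the sequence on the second line, which is why every program only looks at lines where the line number modulo 4 equals 2.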
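
The record layout can be made explicit in a small function. This is a sketch only, assuming well-formed four-line records with no wrapped sequences; sequence_lines and its two parameters are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: return the sequence line of every record.
// Assumes each record is exactly `lines_per_record` lines long and the
// sequence sits on line `seq_offset` (1-based) of the record.
std::vector<std::string> sequence_lines(std::istream& in,
                                        int lines_per_record = 4,
                                        int seq_offset = 2) {
    assert(lines_per_record > 0);
    assert(seq_offset >= 1 && seq_offset <= lines_per_record);
    std::vector<std::string> seqs;
    std::string line;
    long line_count = 0;
    while (std::getline(in, line)) {
        ++line_count;
        // With the defaults this is the same test as line_count % 4 == 2.
        if ((line_count - 1) % lines_per_record == seq_offset - 1) {
            seqs.push_back(line);
        }
    }
    return seqs;
}
```

Passing a std::istringstream holding two records, for example "@r1\nACGT\n+\nIIII\n@r2\nNNNN\n+\nIIII\n", yields the two sequence lines ACGT and NNNN.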

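The numbers show that C++ has no advantage once regular expressions are involved. In the hash-table test, testv.cpp needed about half the wall-clock time of testv.pl (0.077s vs 0.141s real), but testreg.cpp was slower than testreg.pl (0.251s vs 0.142s real) and stalls noticeably before printing its results. Most of that gap is sys time (0.137s vs 0.006s); the user times are close (0.104s vs 0.122s).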
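
For a pattern as simple as [ATCG], the same test can be written without <regex> at all. The sketch below is one possible variant, not benchmarked in this post; has_base and check_agreement are hypothetical names, and check_agreement only confirms on sample strings that the plain search gives the same answer as the regex.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Same answer as regex_search(line, regex("[ATCG]")): true if the line
// contains at least one of the characters A, T, C, G.
bool has_base(const std::string& line) {
    return line.find_first_of("ATCG") != std::string::npos;
}

// Check on sample input that the plain search agrees with the regex.
void check_agreement(const std::vector<std::string>& samples) {
    const std::regex reg("[ATCG]");
    for (const std::string& s : samples) {
        assert(has_base(s) == std::regex_search(s, reg));
    }
}
```

Separately, testreg.cpp writes each result with endl, which flushes the stream on every line; writing '\n' instead would avoid those flushes. Whether either change closes the gap with Perl was not measured here.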

Posted on 2015-08-26 11:46 by Namlike