Is there a grep that is better, faster, and more powerful than grep?

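Until you have seen this software, read its introduction, and actually tried it, it is hard to believe. The tool is called ack, and its website bills it as "better than grep, a search tool for programmers". A Gentoo developer described it as "an exciting new tool that can replace grep in some cases", and he also went through the "top ten reasons" listed on the ack site one by one. Even if it is faster, though, ack cannot fully replace grep, because it needs the Perl interpreter and the standard Perl modules at run time. And since it is interpreted Perl, how can it be faster than grep, which is written in C? That's really a problem. The same Gentoo developer is writing a series of daily ack tips; the link is: ack daily tips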

Top ten reasons ack wins out:

It's blazingly fast because it only searches the stuff you want searched.

Wait, how does it know what I want? A DWIM-Interface at last? Not quite. First off, ack is faster than grep for simple searches. Here's an example:
```
$ time ack 1Jsztn-000647-SL exim_main.log >/dev/null
real    0m3.463s
user    0m3.280s
sys     0m0.180s
$ time grep -F 1Jsztn-000647-SL exim_main.log >/dev/null
real    0m14.957s
user    0m14.770s
sys     0m0.160s
```
Two notes: first, yes, the file was in the page cache before I ran ack; second, I even made it easy for grep by telling it explicitly I was looking for a fixed string (not that it helped much, the same command without -F was faster by about 0.1s). Oh and for completeness, the exim logfile I searched has about two million lines and is 250M. I've run those tests ten times for each, the times shown above are typical.

So yes, for simple searches, ack is faster than grep. Let's try with a more complicated pattern, then. This time, let's use the pattern (klausman|gentoo) on the same file. Note that we have to use -E for grep to use extended regexen, which ack in turn does not need, since it (almost) always uses them. Here, grep takes its sweet time: 3:56, nearly four minutes. In contrast, ack accomplished the same task in 49 seconds (all times averaged over ten runs, then rounded to integer seconds).
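A comparison like this can be reproduced on a smaller scale with a loop that repeats the same search and reports the total time. The sketch below is only an illustration: the synthetic log, the run count, and the variable names are assumptions, not the setup behind the numbers above, and it times grep only (swap in ack on the marked line if it is installed).

```shell
#!/bin/sh
# Build a small synthetic log; the contents are made up for this sketch.
log=$(mktemp)
i=0
while [ "$i" -lt 2000 ]; do
  echo "line $i user=klausman host=gentoo.org"
  echo "line $i nothing to see"
  i=$((i + 1))
done > "$log"

runs=10
# Count matching lines once, as a sanity check on the pattern.
matches=$(grep -c -E '(klausman|gentoo)' "$log")

start=$(date +%s)
n=0
while [ "$n" -lt "$runs" ]; do
  grep -E '(klausman|gentoo)' "$log" > /dev/null   # replace with ack to compare
  n=$((n + 1))
done
end=$(date +%s)

echo "matching lines: $matches"
echo "total seconds for $runs runs: $((end - start))"
rm -f "$log"
```

On a file this small the total will usually round to zero seconds; the loop only becomes informative with logs closer to the size described above.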

As for the "being clever" side of speed, see points 5 and 6 below.

ack is pure Perl, so it runs on Windows just fine.

This isn't relevant to me, since I don't use Windows for anything where I might need grep. That said, it might be a killer feature for others.

The standalone version uses no non-standard modules, so you can put it in your ~/bin without fear.

OK, this is not so much a feature as a hard criterion. If I needed extra modules for the whole thing to run, that'd be a deal breaker. I already have tons of libraries, I don't need more undergrowth around my dependency tree.

Searches recursively through directories by default, while ignoring .svn, CVS and other VCS directories.

This is a feature, yet one that wouldn't pry me away from grep: -r is there (though it distinctly feels like an afterthought). Since ack ignores a certain set of files and directories, its recursive capabilities were there from the start, making it feel more seamless.

ack ignores most of the crap you don't want to search

To be precise:
- VCS directories
- blib, the Perl build directory
- backup files like foo~ and #foo#
- binary files, core dumps, etc.

Most of the time, I don't want to search those (and have to exclude them with grep -v from find results). Of course, this ignore mode can be switched off in ack (-u). All that said, it sure makes command lines shorter (and easier to read and construct). Also, this is the first spot where ack's Perl-centrism shows. I don't mind, even though I prefer that other language with P.
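For comparison, this is roughly what the find-plus-grep version of that filtering looks like. It is a sketch under stated assumptions: the directory names, the backup-file patterns, and the throwaway test tree are stand-ins chosen for illustration, and they cover only part of what ack skips.

```shell
#!/bin/sh
# Throwaway tree with one real source file and several files we want skipped.
tree=$(mktemp -d)
mkdir -p "$tree/src" "$tree/.svn" "$tree/CVS"
echo 'needle in source'       > "$tree/src/main.c"
echo 'needle in backup'       > "$tree/src/main.c~"
echo 'needle in svn metadata' > "$tree/.svn/entries"
echo 'needle in cvs metadata' > "$tree/CVS/Entries"

# Prune VCS directories, skip editor backups, then list files that match.
hits=$(cd "$tree" && find . \( -name .svn -o -name CVS -o -name .git \) -prune \
  -o -type f ! -name '*~' ! -name '#*#' -exec grep -l needle {} + | sort)

echo "$hits"   # prints ./src/main.c
rm -rf "$tree"
```

Pruning with find, rather than filtering its output through grep -v, also keeps find from descending into the VCS directories in the first place.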

Ignoring .svn directories means that ack is faster than grep for searching through trees.

Dupe. See point 5.

Lets you specify file types to search, as in --perl or --nohtml.

While at first glance, this may seem limited, ack comes with a plethora of definitions (45 if I counted correctly), so it's not as perl-centric as it may seem from the example. This feature saves command-line space (if there's such a thing), since it avoids wild find-constructs. The docs mention that --perl also checks the shebang line of files that don't have a suffix, but make no mention of the other "shipped" file type recognizers doing so.
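The suffix-then-shebang rule described for --perl can be sketched in a few lines of shell. This is not ack's code: is_perl_file is a hypothetical helper, and the suffix list in it is an assumption made for the example.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file looks like Perl.
is_perl_file() {
  name=${1##*/}                    # strip the directory part
  case "$name" in
    *.pl|*.pm|*.t) return 0 ;;     # assumed suffix list
    *.*)           return 1 ;;     # some other suffix: not Perl
  esac
  # No suffix at all: fall back to the shebang line.
  head -n 1 "$1" | grep -q '^#!.*perl'
}

dir=$(mktemp -d)
echo 'print 1;'                      > "$dir/tool.pl"
printf '#!/usr/bin/perl\nprint 1;\n' > "$dir/report"
printf '#!/bin/sh\necho 1\n'         > "$dir/deploy"
echo 'notes about perl'              > "$dir/notes.txt"

found=""
for f in "$dir/deploy" "$dir/notes.txt" "$dir/report" "$dir/tool.pl"; do
  if is_perl_file "$f"; then
    found="$found ${f##*/}"
  fi
done

echo "Perl files:$found"   # prints Perl files: report tool.pl
rm -rf "$dir"
```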

File-filtering capabilities usable without searching with ack -f. This lets you create lists of files of a given type.

This mostly is a consequence of the feature above. Even if it weren't there, you could simply search for "."

Color highlighting of search results.

While I've looked upon color in shells as kinda childish for a while, I wouldn't want to miss syntax highlighting in vim, colors for ls (if they're not as sucky as the defaults we had for years) or match highlighting for grep. It's really neat to see that yes, the pattern you grepped for indeed matches what you think it does. Especially during evolutionary construction of command lines and shell scripts.

Uses real Perl regular expressions, not a GNU subset

Again, this doesn't bother me much. I use egrep/grep -E all the time, anyway. And I'm no Perl programmer, so I don't get withdrawal symptoms every time I use another regex engine.

Allows you to specify output using Perl's special variables

This sounds neat, yet I don't really have a use case for it. Also, my perl-fu is weak, so I probably won't use it anyway. Still, might be a killer feature for you.

The docs have an example:

ack '(Mr|Mr?s). (Smith|Jones)' --output='$&'
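For readers without much Perl: $& holds the text that matched, so the example prints only the matched part of each line. The closest grep analogue is the -o option of GNU and BSD grep. A small sketch, with made-up input and with the dot after the title escaped on the assumption that a literal period is meant:

```shell
#!/bin/sh
# Sample lines are invented for illustration.
matched=$(printf '%s\n' \
  'Dear Mr. Smith,' \
  'Hello Mrs. Jones and Ms. Jones' \
  'no titles on this line' |
  grep -oE '(Mr|Mr?s)\. (Smith|Jones)')

# One match per output line: Mr. Smith, Mrs. Jones, Ms. Jones
echo "$matched"
```

Unlike --output, grep -o cannot rearrange or decorate the match; it only prints it.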

Many command-line switches are the same as in GNU grep:

Specifically mentioned are -w, -c and -l. It's always nice if you don't have to look up all the flags every time.
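As a reminder of what those three flags do, here they are with plain grep on a throwaway directory (the file names and contents are invented for the example). Going by the list above, the same letters should carry over to ack.

```shell
#!/bin/sh
dir=$(mktemp -d)
printf 'foo\nfoobar\nbar foo\n' > "$dir/a.txt"
printf 'nothing here\n'         > "$dir/b.txt"

count_all=$(grep -c foo "$dir/a.txt")          # lines containing "foo" anywhere
count_word=$(grep -cw foo "$dir/a.txt")        # lines where "foo" is a whole word
files=$(cd "$dir" && grep -l foo a.txt b.txt)  # names of files with a match

echo "$count_all $count_word $files"   # prints 3 2 a.txt
rm -rf "$dir"
```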

Command name is 25% fewer characters to type! Save days of free-time! Heck, it's 50% shorter compared to grep -r

Okay, now we have proof that not only can the ack webmaster not count, he's also making up reasons for fun. Works for me.

posted on 2010-06-18 21:11 drswinghead