grep has gained a few competitors over the years. ack and grin both aim to fill in the gaps in grep's functionality, and
provide a style of interaction that is focused on searching large code
repositories. By default, they search recursively, colorize their output, and
ignore certain "obvious" files that no one generally wants to search through.
Ignoring those files both cuts noise out of search results and lets the tools
run faster.

I've used all three over the years; mostly grep and grin. I've encountered
surprising issues with ack and grin, both in terms of performance and behavior:

ack quickly scared me away with its default behavior of only searching
files that match a hard-coded whitelist of file suffixes. This appeared to
be addressed with the addition of the -a option, but it turns
out: not really.

With grin, I recently realized that the performance of 1.1 was
shockingly bad, but that horror was short-lived once I found that grin
1.2 was out and had fixed the problem.

The discovery and resolution of these issues has caused me to switch back
and forth a few times, and finally I decided to do a performance test of all
three, just to make sure I wasn't missing out on something.

An attempt at a speed comparison

Considering that the files/directories that you ignore from your search will
play a big role in determining how fast your search executes, I wanted to
equalize the set of files/directories that the three tools ignore, to give a
fair comparison. This turned out to be problematic. It resulted in the
following commands:

I ran this on my personal code projects directory, which has all kinds of
random stuff that's built up over the years. Code, images, executables,
archive files, SQLite databases, and so on. Here are the results:

Tool    Time
grep    12.524s
ack     25.873s
grin    14.703s

There were a few exceptions with grin; it can't ignore file patterns other
than literal suffixes, so ack's default ignoring of #.+#$,
[._].*\.swp$, and core\.\d+$ could not be
applied to grin. Thankfully, all of those filenames are pretty rare, so it
shouldn't matter much.

ack's file suffix ignoring mechanism, which is a bit circuitous, turns out
to be completely disabled when using -a. With -a, it is impossible to
ignore any files; you can only ignore literal directory names (no
globs). In other words, you can modify ack's whitelist of file suffixes, but
if you want to forgo the whitelist and use a blacklist approach instead,
tough luck. You either use the whitelist or you search everything.

If you are okay with the whitelist approach, ack is pretty close in
performance to the other two. It performed the same search as above in
13.988 seconds. This number can't be compared strictly to the others, but it's
as close as I can get.

So in short, performance is fairly uniform, with grep being the fastest by a
fairly small margin (around 10-20%).
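To put numbers on that margin, here is a quick sketch of the arithmetic using the timings reported above. The labels and the helper name percent_slower are mine, chosen for illustration; grep is taken as the baseline.

```python
# Timings in seconds, copied from the results reported above.
GREP_SECONDS = 12.524
timings = {
    "grin": 14.703,
    "ack (first run)": 25.873,
    "ack (whitelist)": 13.988,
}


def percent_slower(seconds, baseline=GREP_SECONDS):
    """Return how much slower `seconds` is than `baseline`, in percent."""
    return (seconds / baseline - 1.0) * 100.0


# Round to one decimal place; the raw timings do not justify more precision.
slowdowns = {name: round(percent_slower(t), 1) for name, t in timings.items()}

for name, pct in slowdowns.items():
    print(f"{name}: {pct}% slower than grep")
```

grin and whitelist-mode ack land inside the 10-20% band; the first ack run is the outlier at roughly twice grep's time.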

Filtering files and general usability

grep

grep did not have the --exclude-dir option until
version 2.5.3. That was released in 2007 or 2008 (it's surprisingly
hard to track down the date), but Ubuntu 10.04 is still using grep 2.5.1. In
light of this, and to be fair to grep with regard to any recent performance
enhancements, I installed the newest (2.6.3) package from Launchpad.

Now that I had the --exclude-dir option available to
me, I had a lot of trouble with it. If you tell it to ignore .*
(any "dot directories"), and you then pass . as the directory to
search, it will immediately exit without having done anything. It might seem
obvious why when I state it that way, but I was truly baffled for a little
while. One solution is to pass `pwd` instead of ., but
then all of the filenames in your search results will have their full,
absolute path shown, and that's usually quite long and ridiculous to sift
through. Another solution is to never ignore .*, but rather
ignore specific names like .git and .svn. You can
even ignore almost every dot-dir you'll encounter in the real world by using
--exclude-dir='.[a-zA-Z0-9]*'. This will fail if a dot-dir starts
with anything other than an ASCII alphanumeric character, but it should be good
for the vast majority of cases. By the way, .?* and
.??* mysteriously do not work. For me, they
prevent grep from recursing. I don't understand that at all. It may be some
weird artifact of the options grep is passing to fnmatch().
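The difference between the .* glob and the narrower character-class glob can be illustrated with Python's fnmatch module. This is only a sketch of the glob semantics: Python's matcher is not the one grep uses, and the sample names are made up. It does not explain the .?* and .??* behavior.

```python
from fnmatch import fnmatchcase

# Hypothetical directory names, including "." (the search root itself).
names = [".", ".git", ".svn", "._cache", "src"]


def matching_names(pattern, candidates=names):
    """Return the candidates that the glob `pattern` would exclude."""
    return [name for name in candidates if fnmatchcase(name, pattern)]


# ".*" also matches ".", so excluding it excludes the search root.
broad = matching_names(".*")

# The character-class glob requires an ASCII alphanumeric after the dot,
# so "." survives, but so does the unusual "._cache".
narrow = matching_names(".[a-zA-Z0-9]*")

print(broad)
print(narrow)
```

The broad glob matches "." itself, while the narrow one leaves "." alone and also lets "._cache" slip through, which is the trade-off described above.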

grep also fails to exclude a directory glob that
looks/like/this*. I'm not sure why this happens either.

Beyond those issues, grep, unlike ack and grin, has a pretty complete set of
options for excluding files and directories.

There are some other issues with grep. You probably know about these. Most
of them have solutions now.

Regex syntax is limited. Solution: use -E or -P.

No coloring. Solution: use --color=auto.

Annoying error messages on broken symlinks and other filesystem
oddities. Solution: use -s.
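Those three fixes can be combined into one invocation. The sketch below only assembles the argument list, using options mentioned in this post plus -r and -e. The function name, the default excluded directories, and the sample pattern are my own choices for illustration, not a recommended configuration.

```python
import shlex


def build_grep_command(pattern, directory=".", exclude_dirs=(".git", ".svn")):
    """Assemble a recursive grep command line as a list of arguments."""
    argv = [
        "grep",
        "-r",            # recurse into directories
        "-E",            # extended regex syntax
        "-s",            # suppress errors about unreadable or missing files
        "--color=auto",  # colorize when writing to a terminal
    ]
    argv += [f"--exclude-dir={name}" for name in exclude_dirs]
    argv += ["-e", pattern, directory]
    return argv


command = build_grep_command(r"TODO|FIXME")
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(command))
```

Keeping the command as a list means it could also be handed to subprocess.run without going through a shell.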

grin

I've had a couple of issues with grin over time. The first is a lack of a
-w (word) option. You can simulate it by doing
\bpattern\b, but that's pretty tedious. The author did not seem
very interested in implementing this feature when I asked. ack and grep have
it.
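For reference, this is roughly what the \bpattern\b workaround does, sketched with Python's re module. The helper name and sample lines are made up, and \b is only an approximation of grep's -w, which can differ for patterns that begin or end with non-word characters.

```python
import re


def word_regex(pattern):
    """Wrap `pattern` in word boundaries, grouping it so alternations stay intact."""
    return re.compile(r"\b(?:" + pattern + r")\b")


lines = ["grep is old", "grepping away", "use grep."]
matches = [line for line in lines if word_regex("grep").search(line)]

print(matches)
```

"grepping away" is skipped because no word boundary follows "grep" there.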

My other issue with grin is well known by the author:

[...] setuptools installs scripts indirectly; the scripts
installed to $prefix/bin or Python2x\Scripts use setuptools' pkg_resources
module to load the exact version of grin egg that installed the script,
then runs the script's main() function. This [...] can add substantial
startup overhead [...]. If you want the response of grin to be snappier, I
recommend installing custom scripts that just import the grin module and
run the appropriate main() function.
-- From the grin PyPI page

Not only does the default script start up a bit slower than it could, but if
you hit control-C soon after grin starts up, you might get an ugly Python
traceback, because grin hasn't gotten to its KeyboardInterrupt try/except
statement yet.

This is more a Python packaging limitation than a problem with grin per se.
Nonetheless, it's another annoyance to deal with as a user, and fair or not,
it makes it less appealing.
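A custom script along the lines the author recommends might look like the sketch below. It assumes grin is installed and that the module exposes an entry point named grin_main; treat that name as an assumption and check it against your installed version. The run_cli helper is my own invention. The import is done inside the try block so that a control-C during import is also caught, though one that arrives during interpreter startup still won't be.

```python
def run_cli(load_main):
    """Import and run a command-line entry point, exiting quietly on Ctrl-C.

    `load_main` is a zero-argument callable that performs the (possibly slow)
    import and returns the real main() function.
    """
    try:
        main = load_main()  # the import happens inside the try block
        return main()
    except KeyboardInterrupt:
        return 130  # conventional exit status for SIGINT (128 + 2)


def load_grin_main():
    # ASSUMPTION: grin is importable and its entry point is called grin_main.
    import grin
    return grin.grin_main


# In a real script, the last line would be something like:
#     raise SystemExit(run_cli(load_grin_main))
```

Because nothing here touches pkg_resources, the startup overhead described in the quote is avoided.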

ack

Its default whitelisting behavior is a really poor choice in my opinion. If
it isn't familiar with a given file extension, it will simply ignore it. Since
you have no idea it's ignoring it, you won't know that you missed something
until there is some unfortunate side effect. That can be downright dangerous
when refactoring big, old, ugly code that has stuff in all kinds of
unpredictable filenames. A coworker and I have both sadly run into this
problem while working on a messy legacy PHP project where some PHP files had
names ending with ".inc". I can picture this whitelist behavior biting a lot
of people in the ass when they don't realize that's how ack works.

The default whitelisting approach might be forgivable if it were possible to
turn it off and go with a blacklisting approach, but that, according to the
author, is simply not supported.
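The difference between the two approaches is easy to see in a small sketch. The suffix sets and filenames below are hypothetical and are not ack's actual lists; the point is only what happens to a file whose suffix appears in neither set.

```python
import os

# Hypothetical suffix sets, for illustration only (not ack's real lists).
KNOWN_SUFFIXES = {".php", ".py", ".js"}
IGNORED_SUFFIXES = {".png", ".zip", ".sqlite"}


def whitelist_search_set(filenames):
    """Search only files whose suffix is explicitly known."""
    return [f for f in filenames if os.path.splitext(f)[1] in KNOWN_SUFFIXES]


def blacklist_search_set(filenames):
    """Search everything except files whose suffix is explicitly ignored."""
    return [f for f in filenames if os.path.splitext(f)[1] not in IGNORED_SUFFIXES]


files = ["index.php", "db_helpers.inc", "logo.png"]

print(whitelist_search_set(files))  # the .inc file is silently skipped
print(blacklist_search_set(files))  # the .inc file is searched
```

With a whitelist, an unknown suffix silently drops out of the search; with a blacklist, an unknown suffix is searched by default.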

A summary of the file exclusion madness

These are all of the variations of file/directory exclusions I could dream
up, and their support across these three tools:

Excluding Files                    grep   grin            ack
fixed name (foo)                   OK     suffixes only   -
glob (fo*)                         OK     -               -
fixed name w/path (path/to/foo)    OK     -               -
glob w/path (path/to/fo*)          -      -               -

Excluding Directories              grep   grin            ack
fixed name (foo)                   OK     OK              OK
glob (fo*)                         OK     -               -
fixed name w/path (path/to/foo)    OK     -               -
glob w/path (path/to/fo*)          OK     -               -

The ultimate grep setup

This brings grep 95% of the way towards doing what I appreciate about grin
and ack. You do still have to pass the directory name to search, whereas ack
and grin will default to the current directory if you don't tell them
otherwise. However, I can live with typing another space and period.

The cgrep alias will force colors on, which you can pipe through
less -R if you want to page the output.

And dreams for the future

What I think would work amazingly would be a hierarchical set of exclusion
rule files. Let's give them the filename .grepignore. You could
have a .grepignore in your home directory which would list
files/dirs you always want to ignore. Then in each project's
directory (each a descendant of your home directory) you could have another
.grepignore file that would ignore the specific files that you
want to ignore in that project. $GREP would then ignore the union of the
rules in all the .grepignore files from / down to the directory
you're in. This seems like it would be elegant, simple, and effective.
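As a rough sketch of the collection step, a wrapper could walk from the filesystem root down to the current directory and merge every .grepignore it finds. The file format here is my assumption: one pattern per line, blank lines skipped, and lines starting with # treated as comments. The function name is hypothetical too.

```python
from pathlib import Path

IGNORE_FILENAME = ".grepignore"


def collect_ignore_patterns(start_dir, filename=IGNORE_FILENAME):
    """Return the union of patterns from every ignore file between / and start_dir.

    Patterns from outer directories come first; duplicates are dropped.
    """
    start = Path(start_dir).resolve()
    # Path.parents runs from the nearest parent up to the root; reverse it
    # so the walk goes from the root down to the starting directory.
    directories = list(start.parents)[::-1] + [start]

    patterns = []
    for directory in directories:
        rules_file = directory / filename
        if not rules_file.is_file():
            continue
        for line in rules_file.read_text().splitlines():
            line = line.strip()
            # ASSUMED format: skip blank lines and "#" comments.
            if line and not line.startswith("#") and line not in patterns:
                patterns.append(line)
    return patterns
```

The resulting list could then be turned into --exclude and --exclude-dir arguments for grep, subject to the glob limitations covered earlier.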

I looked into implementing something like this in grin, but it turns out
there may be a good reason for grin not being able to ignore
multi/level/paths/with/glob*s -- Python's fnmatch function is (re-)implemented
in pure Python and does not use the system fnmatch. Thus, it's impossible to
use the FNM_PATHNAME flag, which enables sane multi-level
globbing. Python's fnmatch thinks that the glob foo*bar matches
foo/x/y/z/bar, which is strange and contrary to most other
tools.
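The behavior is easy to reproduce, and one possible workaround is to match one path segment at a time so that * can never cross a slash. The pathname_match helper below is a hypothetical sketch of that idea, only a rough approximation of FNM_PATHNAME (it does not handle bracket expressions that contain a slash, for example).

```python
from fnmatch import fnmatchcase

# Python's fnmatch lets "*" match across "/" characters.
crosses_slashes = fnmatchcase("foo/x/y/z/bar", "foo*bar")


def pathname_match(path, pattern):
    """Match segment by segment so that wildcards never cross a '/'."""
    path_parts = path.split("/")
    pattern_parts = pattern.split("/")
    if len(path_parts) != len(pattern_parts):
        return False
    return all(
        fnmatchcase(part, pat) for part, pat in zip(path_parts, pattern_parts)
    )


print(crosses_slashes)
print(pathname_match("foo/x/y/z/bar", "foo*bar"))
print(pathname_match("path/to/foo", "path/to/fo*"))
```

With the segment-wise version, foo*bar no longer matches foo/x/y/z/bar, while path/to/fo* still matches path/to/foo.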

Implementing hierarchical exclusion rule files in grep would certainly be
more laborious, since it's written in C instead of Python. I may try doing it
with some kind of wrapper script instead. Anyone wanna beat me to it?