Searching Files on UNIX

On MPE you can display files using the :Print command, Fcopy, Magnet,
or Qedit (with pattern-match searches). On HP-UX you can display files
using cat or, better, more (which searches for strings with the
slash "/" command), and Qedit (including searches of $Include files,
and so on). But if you really want to search for patterns of text
like a UNIX guru, grep is the tool for you.

MPE users will take a while to remember that
more, like most UNIX tools,
responds to a Return by printing the next line,
not the next screen. Use the Spacebar to print the next
page. Type "q" to quit.
To scan ahead to find a string pattern, type "/" and enter
a regular expression to match.
For further help, type "h".

The grep program is a standard UNIX utility that searches through
a set of files for an arbitrary text pattern, specified through a
regular expression. Also check the man pages for egrep and fgrep.
The MPE equivalents are MPEX and Magnet, both third-party products.
By default, grep is case-sensitive (use -i to ignore case).
By default, grep ignores the context of a string (use -w to
match words only).
By default, grep shows the lines that match (use -v to show
those that don't match).
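
The three defaults above are easier to remember once you have seen them
side by side. The following is only a sketch: the sample file and its four
lines are invented for illustration, and it assumes a grep that accepts
the -c (count matching lines) option along with -i, -w and -v.

```shell
# Build a small throwaway file to search (contents are hypothetical).
sample=$(mktemp)
printf '%s\n' 'if x then' 'IF y THEN' 'stiff breeze' 'endif' > "$sample"

# Default: case-sensitive, and "if" may appear anywhere in the line.
default_hits=$(grep -c 'if' "$sample")   # if x then, stiff breeze, endif

# -i: ignore case, so "IF y THEN" now matches too.
nocase_hits=$(grep -ci 'if' "$sample")

# -w: match whole words only, so "stiff" and "endif" drop out.
word_hits=$(grep -cw 'if' "$sample")

# -v: show the lines that do NOT match.
inverse=$(grep -v 'if' "$sample")

echo "default=$default_hits nocase=$nocase_hits word=$word_hits"
echo "not matching: $inverse"
rm -f "$sample"
```

The options can be combined, as in -ci above, which is usually more
convenient than typing each one separately.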

Regular expressions are a pattern-matching notation used by many UNIX tools.
They describe a pattern to match: a sequence of characters,
not words, within a line of text.
Here is a quick summary of the special characters used
in the grep tool and their meaning:

[ ]           =  match any one of the enclosed characters, as in [aeiou].
                 Use Hyphen "-" for a range, as in [0-9].

[^ ]          =  match any one character except those enclosed in [ ],
                 as in [^0-9].

. (Period)    =  match a single character of any value, except end of line.

* (Asterisk)  =  match zero or more of the preceding character or expression.

\{x,y\}       =  match x to y occurrences of the preceding.

\{x\}         =  match exactly x occurrences of the preceding.

\{x,\}        =  match x or more occurrences of the preceding.
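
To see each special character from the summary in action, here is a small
sketch. The sample strings are made up for illustration, and the counts
rely on -c, which reports the number of matching lines.

```shell
# Hypothetical sample data: one short string per line.
sample=$(mktemp)
printf '%s\n' 'cat' 'ct' 'c9t' 'a1b' 'a12b' 'a123b' > "$sample"

# [aeiou]: exactly one vowel between "c" and "t".
vowel=$(grep 'c[aeiou]t' "$sample")

# [^0-9]: exactly one non-digit between "c" and "t".
nondigit=$(grep 'c[^0-9]t' "$sample")

# Period: any single character between "c" and "t" (cat and c9t).
anychar=$(grep -c 'c.t' "$sample")

# Asterisk: zero or more "a" between "c" and "t" (cat and ct).
star=$(grep -c 'ca*t' "$sample")

# \{1,2\}: one to two digits between "a" and "b" (a1b and a12b).
one_to_two=$(grep -c 'a[0-9]\{1,2\}b' "$sample")

# \{2\}: exactly two digits between "a" and "b".
exactly_two=$(grep 'a[0-9]\{2\}b' "$sample")

# \{2,\}: two or more digits between "a" and "b" (a12b and a123b).
two_or_more=$(grep -c 'a[0-9]\{2,\}b' "$sample")

echo "$vowel $nondigit $anychar $star $one_to_two $exactly_two $two_or_more"
rm -f "$sample"
```

Note that the "a" and "b" on either side of the digits are what make the
counts exact. Without them, [0-9]\{2\} would also match inside "a123b",
because a pattern may match any part of a line.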

As an MPE user, you may
find regular expressions difficult to use at first.
Please persevere, because they are used in many
UNIX tools, from more to perl.
Unfortunately, some tools use simple regular expressions, others use
extended regular expressions, and some extended features have been merged
into the simple tools, so it can look as if every tool has its own
syntax. Not only that, regular expressions use some of the
same characters as shell wildcards,
but with different meanings.
What do you expect of an operating system built by graduate students?

Since you usually type regular expressions within
shell commands, it is good practice to enclose the regular
expression in single quotes (') to stop the shell from expanding
it before passing the argument to your search tool.
Here are some examples using grep:

Back Slash "\" is used to escape the next
symbol, that is, to turn off the special meaning that it has.
To look for a Caret "^" at the start of a line, the
expression is ^\^.
Period "." matches any single character. So b.b will
match "bob", "bib", "b-b", etc.
Asterisk "*" does not mean the same thing in regular expressions
as in wildcarding; it is a modifier that applies to the
preceding single character, or expression such as [0-9].
An asterisk matches zero or more of what precedes it.
Thus [A-Z]* matches any number of upper-case letters,
including none, while [A-Z][A-Z]* matches one
or more upper-case letters.
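
The points above can be tried out with a few lines of shell. This is a
sketch only: the sample file is invented, and setting LC_ALL=C is an
assumption made here so that [A-Z] means just the ASCII upper-case letters.
Every pattern is in single quotes so the shell passes it to grep untouched.

```shell
# Assumption: use the C locale so [A-Z] covers only ASCII upper-case letters.
LC_ALL=C; export LC_ALL

# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' '^ caret first' 'bob' 'b-b' 'bb' 'ABC123' '123' > "$sample"

# ^\^ : the first ^ anchors the start of the line, \^ is a literal caret.
caret=$(grep '^\^' "$sample")

# b.b : "bob" and "b-b" match, "bb" has nothing between the two b's.
bdotb=$(grep -c 'b.b' "$sample")

# [A-Z]* also matches "no upper-case letters at all", so every line matches.
star_all=$(grep -c '[A-Z]*' "$sample")

# [A-Z][A-Z]* insists on at least one upper-case letter.
one_or_more=$(grep '[A-Z][A-Z]*' "$sample")

echo "caret line: $caret"
echo "b.b=$bdotb  [A-Z]*=$star_all  [A-Z][A-Z]*: $one_or_more"
rm -f "$sample"
```

The [A-Z]* result is the one that surprises people coming from wildcards:
a pattern that can match nothing at all will match every line in the file.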

The vi editor uses \< and \> to anchor a match
at the beginning and end of a word, respectively.
A word boundary is either the edge of the line or any character
except a letter, digit or underscore "_".
To look for if, but skip stiff, the
expression is \<if\>.
For the
same logic in grep, invoke it with the -w option.
And remember that
regular expressions are case-sensitive. If you don't
care about the case, the expression to match "if" would be
[Ii][Ff],
where the characters in square brackets define a character
set from which the pattern must match one character. Alternatively, you
could invoke grep with the -i option to ignore case.
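
A short sketch shows how the word-boundary rule and the two ways of
ignoring case behave in grep. The sample lines are hypothetical; "what_if"
is included because the underscore counts as part of a word.

```shell
# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' 'if ready' 'stiff' 'If not' 'what_if' 'go: if' > "$sample"

# -w: "if" must stand alone as a word. "stiff" and "what_if" are skipped,
# and "If not" is skipped because the match is still case-sensitive.
whole_word=$(grep -cw 'if' "$sample")

# [Ii][Ff] and -i are two ways to ask for the same thing: either case.
by_class=$(grep -c '[Ii][Ff]' "$sample")
by_option=$(grep -ci 'if' "$sample")

# Combining -i and -w: any case, but whole words only.
word_any_case=$(grep -ciw 'if' "$sample")

echo "word=$whole_word class=$by_class option=$by_option both=$word_any_case"
rm -f "$sample"
```

For a short word the character-set form is manageable, but for longer
strings the -i option is far less typing.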