sed, awk, and perl

sed, awk, and perl are among the Unix utilities that implement
regular expressions. They are used mostly for tasks requiring
pattern matching and substitution.

They are widely used for data manipulation, searching, and general
programming. Although they were originally developed for Unix and are
integrated into it, they have been ported to most other computing
environments, including PCs.

sed is a stream editor, which follows commands
just like an interactive editor, but is designed to run in batch mode,
to perform repetitive search-and-replace commands untouched by human
hand. It deals with individual characters and thus is more useful
for phonological manipulation than large-scale textual analysis. It is
cryptic, though no more so than, say, Turkish Vowel Harmony.
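As a rough illustration of the character-level, batch-mode substitution
described above, the following shell sketch uses sed to reduce each word
to a consonant/vowel skeleton. The function name cv_skeleton, the sample
words, and the simplified five-vowel inventory are assumptions made up
for this example; a real phonological analysis would need a proper
character inventory.

```shell
# Hypothetical helper: map lowercase ASCII vowels to V and consonants to C.
# Each -e gives sed one substitute command; the trailing g applies it to
# every match on the line, not just the first.
cv_skeleton() {
    sed -e 's/[aeiou]/V/g' \
        -e 's/[bcdfghjklmnpqrstvwxyz]/C/g'
}

# Runs unattended on every line of input, with no interactive editing.
printf 'kitap evler\n' | cv_skeleton
```

The vowels are replaced first, with an uppercase V, so the second
command (which matches only lowercase consonants) leaves them alone.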

awk (named after its authors: Aho, Weinberger, and Kernighan)
is a text-oriented pattern-matching language that is at its best
and most powerful when coping with large amounts of moderately
structured data.
For instance, one can perform text analysis on Usenet posts using awk.
It is less cryptic than sed, and works at the word level rather than
the character level. It can do anything that sed can, but sed is faster
and simpler for what it does. awk exists in several dialects, including
nawk ('new awk'), with a richer command set, and gawk ('GNU awk'), part
of the GNU operating system from the Free Software Foundation.
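To make "moderately structured data" concrete, here is a minimal shell
sketch that uses awk to total a numeric field per author. The record
layout (an author name followed by a word count), the sample data, and
the function name count_by_author are all invented for illustration.

```shell
# Hypothetical input: one record per line, "author word_count".
# awk splits each line into whitespace-separated fields: $1, $2, ...
count_by_author() {
    awk '{ total[$1] += $2 }                       # accumulate per author
         END { for (a in total) print a, total[a] }' |
    sort    # awk's "for (a in total)" order is unspecified, so sort it
}

printf 'alice 120\nbob 45\nalice 30\n' | count_by_author
```

The point of the sketch is that awk addresses fields by position,
whereas the same job in sed would have to be spelled out character by
character.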

Both awk and sed exist on every Unix system; consult
the local man pages for details of your specific implementation. They are
also available for most microcomputer systems, including DOS. Both are
line-oriented, and both have limitations, despite their utility.

Perl, by contrast, is a full-featured programming language designed
to be useful for handling text. It will do everything sed and
awk can, and plenty more besides. The script that runs The Chomskybot
is written in Perl, and so are most of the CGI scripts that drive
search engines and other Web programs.