How would you sort these? Certainly not with sort; whatever you do, you'll end up with a mish-mash of unmatched addresses, names, and zip codes. The chunksort script will do the trick.
Let's look at the part of the script that does the real work.

The script starts with a lot of option processing that we don't show here. It's incredibly thorough, and allows you to use any sort option except -o. It also adds a new -a option, which allows you to sort based on different lines of a multiline entry.
Say you're sorting an address file, and the street address is on the second line of each entry. The command chunksort -a +3 would sort the file based on the zip codes. I'm not sure if this is really useful (you can't, for example, sort on the third field of the second line), but it's a nice bit of additional functionality.

The body of the script (after the option processing) is conceptually simple. It uses gawk (33.12) to collapse each multiline record into a single line, with the CTRL-a character to mark where the line breaks were. After this processing, a few addresses from a typical address list might look like this:

Now that we've converted the original file into a list of one-line entries, we have something that sort can handle. So we just use sort, with whatever options were supplied on the command line. After sorting, tr (35.11) "unpacks" this single-line representation, converting each CTRL-a back to a newline and restoring the file to its original form.
Notice that the gawk script added an extra CTRL-a to the end of each output line, so tr outputs an extra newline; that, plus the newline from the gawk print command, gives a blank line between entries. (Thanks to Greg Ubben for this improvement.)

There are lots of interesting variations on this script. You can substitute grep for the sort command, allowing you to search for multiline entries: for example, to look up addresses in an address file. This would require slightly different option processing, but the script would be essentially the same.