Shell Pipes by Example

Pipes, piping, pipelines… whatever you call them, are very powerful – in fact, they are one of the core tenets of the philosophy behind UNIX (and therefore Linux). They are also, really, very simple, once you understand them. The way to understand them, is by playing with them, but if you don’t know what they do, you don’t know where to start… Catch-22!

So, here are some simple examples of how the pipe works.

Let’s see the code

$ grep steve /etc/passwd | cut -d: -f 6
/home/steve
$
What did this do? There are two UNIX commands there: grep and cut. The command “grep steve /etc/passwd” finds all lines in the file /etc/passwd which contain the text “steve” anywhere in the line. In my case, this has one result:steve:x:1000:1000:Steve Parker,,,:/home/steve:/bin/bash
The second command, “cut -d: -f6” cuts the line by the delimiter (-d) of a colon (“:“), and gets field (-f) number 6. This is, in the /etc/passwd file, the home directory of the user.

So what? Show me some more

This is the main point of this article; once you’ve seen a few examples, it normally all becomes clear.

What I did here, was three commands: “find . -type f -ls” finds regular files, and lists them in an “ls”-style format: permissions, owner, size, etc.
“cut -c14-” cuts out the first 14 characters, which mess up the formatting on this website (!), and aren’t very interesting.
“sort -n -k 5” does a numeric (-n) sort, on field 5 (-k5), which is the size of the file.
So this gives me a list of the files in this directory (and subdirectories), ordered by file size. That’s much more useful than “ls -lS“, which restricts itself to the current directory, but not subdirectories.

(As an aside, I have to admit that I only concocted this by trying to think of an example; it actually seems really useful, and worth making into an alias… I must do a post about “alias” some time!)

So how does it work?

This seems pretty straightforward: get lines containing “steve” from the input file (“grep steve /etc/passwd“), and get the sixth field (where fields are marked by colons) (“cut -d: -f6“). You can read the full command from left to right, and see what happens, in that order.

How does it really work?

EG1 Explained

There are some gotchas when you start to look at the plumbing. Because we’re using the analogy of a pipe (think of water flowing through a pipe), the OS actually sets up the commands in the reverse order. It calls cutfirst, then it calls grep. If you have (for example) a syntax error in your cut command, then grep will never be called.
What actually happens is this:

A “pipe” is set up – a special entity which can take input, which it passes, line by line, to its output.

cut is called, and its input is set to be the “pipe”.

grep is called, and its output is set to be the “pipe”.

As grep generates output, it is passed through the pipe, to the waiting cut command, which does its own simple task, of splitting the fields by colons, and selecting the 6th field as output.

EG2 Explained

For EG2, “sort” is called first, which ties to the second (rightmost) pipe for its input. Then “cut” is called, which ties to the second pipe for its output, and the first (leftmost) pipe for its input. Then, “find” is called, which ties to the first pipe for its output.
So, the output of “find” is piped into “cut“, which strips off the first 14 characters of the “find” output. This is then passed to “sort“, which sorts on field 5 (of what it receives as input), so the output of the entire pipeline, is a numerically sorted list of files, ordered by size.