Environment Variables

An environment variable is a variable that is made available
to commands as part of the environment that the shell maintains.

Environment variables are useful for a variety of reasons:

they are used to "pass" information into commands

they mask the complexities of the directory structure allowing
users to navigate without knowing path and file names

they can be used to minimize typing and the entering of data

Personal environment variables can be defined in your
.profile (or .bash_profile) file stored
in your HOME directory. The .profile file
is executed every time you log into the system.

Prior to executing the .profile, the shell executes
/etc/profile. This ensures that all users start with
a common environment upon successful login.

On most systems, the following environment variables are
automatically set for you.

HOME

absolute path of your home directory

PATH

used by your shell to find programs

PS1

your shell prompt

SHELL

the name of the shell program you are using

LOGNAME

your login account name

TERM

the type of terminal you are using

MAIL

absolute path of where your email is stored

An environment variable has a name and an associated value. Prefixing
an environment variable with a dollar sign causes the shell to get the
variable's value. The echo command can be used to display the
value of an environment variable. Here are some examples.
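As a minimal sketch of the dollar-sign and echo usage described above (the variable name greeting is made up for illustration, and HOME is assumed to be set, as it is on most systems):

```shell
# 'greeting' is a hypothetical variable used only for this illustration.
greeting="hello"

# Prefixing the name with a dollar sign makes the shell substitute its value.
echo "$greeting"

# The same works for variables the system sets for you, such as HOME.
echo "Your home directory is $HOME"
```

The double quotes keep the substituted value together as a single argument to echo, which matters when a value contains spaces.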

One of the most useful environment variables is PATH.
PATH is used by the shell to locate commands.
The following is a common value assigned to the PATH
environment variable: /bin:/usr/bin:/usr/local/bin.
When a command is entered, the shell (using the value of PATH)
first looks in the /bin directory for the program and
executes it if it finds it; otherwise, it searches /usr/bin
for the command, followed by /usr/local/bin. If the shell
searches all the components without finding the executable file, then
it prints a "command not found" error message.
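The left-to-right search described above can be imitated with a small shell function. This is only a simplified sketch: find_in_path is a hypothetical helper, not something the shell provides, and a real shell also handles built-in commands, aliases and cached lookups.

```shell
# find_in_path is a hypothetical helper that mimics, in simplified form,
# how the shell walks the components of PATH from left to right.
find_in_path() {
    cmd=$1
    old_ifs=$IFS
    IFS=:                                   # split PATH on colons
    for dir in $PATH; do
        IFS=$old_ifs
        if [ -z "$dir" ]; then dir=.; fi    # an empty component means the current directory
        if [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
            echo "$dir/$cmd"                # first match wins; later directories are ignored
            return 0
        fi
    done
    IFS=$old_ifs
    echo "$cmd: command not found" >&2
    return 1
}

find_in_path ls
```

Because the first match wins, the order of the components in PATH decides which program runs when two directories contain a command with the same name.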

By default, the shell does not look in your current working directory
for the command. This can be changed by modifying your PATH
as follows.

$ PATH=$PATH:. or PATH=$PATH::
Important note: If you do put the current working directory on
the PATH, then it should be the last component. You never want
it to be the first component.

If you enjoy playing games, then you may alter the PATH
as follows.

PATH=$PATH:/usr/games

You can use your own environment variables for abbreviations.
For example:

$ letters=$HOME/personal/letters
$ cd $letters

If you have an environment variable that you want available to other
commands, then you need to export it.

$ EDITOR=/usr/bin/vi; export EDITOR
Now the EDITOR environment variable can be used by email
programs, pager programs, and so on.
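The effect of export can be seen by starting a child shell. In this sketch, demo_local and demo_exported are hypothetical names chosen for illustration and are assumed not to exist in your environment already.

```shell
# demo_local and demo_exported are hypothetical names for this illustration.
demo_local="only in this shell"
demo_exported="visible to children"
export demo_exported

# The child shell started by 'sh -c' inherits only exported variables.
sh -c 'echo "demo_local=[$demo_local] demo_exported=[$demo_exported]"'
# prints: demo_local=[] demo_exported=[visible to children]
```

The single quotes stop the parent shell from substituting the values itself, so the child shell does the lookup.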

Environment variables are often used to "hide" the complexities
of the directory structure. In other words, you can access a specific
directory by name without having to know its absolute path.

By convention, personal environment variables are spelled in lower case to
help distinguish them from those set up by the system.

Environment variable names must begin with a letter or an underscore.

The env program can be used to display all the
environment variables you have defined.

A variable can be removed from the environment using the
unset command. Syntax:

unset name
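A short sketch of unset in use follows. The variable name scratch is hypothetical, and the ${name-text} form simply substitutes text when the variable is not set.

```shell
# 'scratch' is a hypothetical variable used only for this illustration.
scratch="temporary value"
echo "before: ${scratch-<not set>}"
# prints: before: temporary value

unset scratch
echo "after: ${scratch-<not set>}"
# prints: after: <not set>
```

Note that unset takes the variable's name without a dollar sign.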

Standard Input/Output/Error

In Unix, a file is a sequence of bytes that is stored somewhere
on a storage device (e.g. a disk). The content of a file does
not have any significant meaning to the OS (it is simply a
sequence of bytes). The "structure" of the bytes
may have to conform to a particular format in order to be used
by specific applications (i.e. programs or commands).

Every command, when invoked, has three I/O (input/output)
streams opened for it: two for output and one for input.
The output streams are called standard output and
standard error. The input stream is called
standard input. In "C" terminology,
these I/O streams have the names stdout,
stderr and stdin, respectively.

A command can get (read) input from the standard input stream,
but it doesn't know where the input comes from (it could come
from a file, the keyboard, or another command).

The output of a command (if any) can be written to either
the standard output stream or the standard error stream.
The command doesn't need to know where the output is going
(it could go to a file, the screen, or another command).

The shell assigns a file descriptor to each of the I/O
streams: standard input is 0, standard output is
1, and standard error is 2.

By default, standard input is the keyboard, and standard output
and standard error are both the screen. Re-directing these
streams is accomplished on the command-line when the command
is invoked using the re-direction operators.
[Note: It can also be done internally by the command itself.]
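The two output streams can be told apart by their descriptor numbers. The sketch below uses a hypothetical function named report and borrows the 2> and 2>&1 operators that are explained in the following sections.

```shell
# 'report' is a hypothetical command that writes to both output streams.
report() {
    echo "normal result"              # written to standard output (descriptor 1)
    echo "something went wrong" >&2   # written to standard error (descriptor 2)
}

out_only=$(report 2>/dev/null)        # discard descriptor 2, capture descriptor 1
err_only=$(report 2>&1 >/dev/null)    # capture descriptor 2, discard descriptor 1

echo "stdout carried: $out_only"
echo "stderr carried: $err_only"
```

The function itself never needs to know where either stream ends up; the caller decides that on the command-line.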

Redirecting Standard Output and Standard Error

In some instances you want to save the output of a command into
a file. This can be accomplished using the shell's
redirection operators > and
>>.

Here are examples.

$ who > who.out
Executes the command 'who' and redirects the output into
a file named "who.out". No output is seen on
the screen. The file "who.out" is created in
your current working directory.
$ date > /tmp/now
Executes the command 'date' and redirects the output to a
file named "now" stored in the "/tmp"
directory. (Note: If the directory "/tmp" contains
a directory named "now", then the shell will not execute the
command.)
$ ls -l>myfiles
Use of whitespace around the > operator is optional.
Executes the 'ls -l' command with output redirected into a
file named "myfiles".

If you redirect the output of a command into a file that doesn't exist,
then the shell creates the file for you (assuming permissions are not
a problem). If you redirect the output of a command into a file that
already exists, then the content of the existing file is replaced with
the command's output (again, if allowed).

$ ps > /etc/foo
This command should fail on most Unix systems. We are attempting
to create a file named "foo" in the "/etc"
directory. "/etc" is a system-level directory and
regular users are not allowed to write to it.

If you want to redirect the output of a command and have it appended
to the content of an existing file, then you must use the
>> operator.

$ who > cmd.log
The 'who' command is executed and its output redirected into
a file named "cmd.log".
$ date >> cmd.log
The 'date' command is executed and its output is appended to
the file named "cmd.log". (If file "cmd.log"
doesn't exist, then the system creates it.) Again, the use of
whitespace around the operator is optional.
$ ps >> cmd.log
The 'ps' command is executed and its output is appended to the
file named "cmd.log".
$ echo 'mygroup:*:40000:' >> /etc/group
This command should fail on most Unix systems when executed as
a regular (i.e. non-root) user. We are attempting to add a
group to a file that should be read-only.
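The difference between > and >> can be checked safely in a scratch directory. This sketch assumes the mktemp command is available (it is on most modern systems) and that the shell's noclobber option is off, which is the default.

```shell
workdir=$(mktemp -d)        # scratch directory so no real files are touched
log="$workdir/cmd.log"

echo "first"  > "$log"      # file does not exist yet, so the shell creates it
echo "second" > "$log"      # file exists: its old content is replaced
echo "third" >> "$log"      # file exists: the new line is appended

log_content=$(cat "$log")
echo "$log_content"
# prints:
# second
# third

rm -rf "$workdir"
```

The line "first" is gone because the second > truncated the file before writing.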

Re-directing Both Output and Error

The following are some examples of how you can re-direct the
standard error stream. Note: these examples work with Bourne,
Korn and Bash shells, but they do not work with "C" shell.

Assume cmd is some Unix command.
$ cmd >out 2>err
'cmd' is executed. Standard output is re-directed to a file
named "out" and standard error is re-directed to a
file named "err".
$ cmd >out 2>&1
All output (both standard output and standard error) is
redirected into a file named "out".
$ cmd 2>foo
The standard error is re-directed to a file named "foo".
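The following sketch exercises the first two forms against a hypothetical function named noisy that writes one line to each output stream. It assumes a Bourne-style shell and the mktemp command.

```shell
workdir=$(mktemp -d)

# 'noisy' is a hypothetical stand-in for cmd.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

noisy >"$workdir/out" 2>"$workdir/err"   # separate files for each stream
noisy >"$workdir/both" 2>&1              # both streams into one file

out_content=$(cat "$workdir/out")
err_content=$(cat "$workdir/err")
both_content=$(cat "$workdir/both")
rm -rf "$workdir"

echo "out:  $out_content"
echo "err:  $err_content"
echo "both: $both_content"
```

In the second form the order matters: 2>&1 makes standard error a copy of wherever standard output points at that moment, so it has to come after >file.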

Redirecting Standard Input

Many commands obtain their input from files specified on the
command-line. Most of these commands also work if no files
are specified. In these cases, the command reads input from
the standard input stream, which by default is the keyboard.
The cat command provides a good example.

$ cat /etc/group
The command opens the file /etc/group and uses
the content of the file for input.
$ cat
Now I am entering data (via the keyboard) that is
going to be used as input to the 'cat' command.
This is the standard input stream. You tell the
shell that you are done entering data by typing
a <CTRL-D> character. [Note: on an ASCII
system, <CTRL-D> is EOT.]
<CTRL-D>
The 'cat' command is executed without any arguments; therefore,
it gets input from the standard input stream.
$ cat < /etc/group
From a user perspective this has the same effect as if
/etc/group was specified as a command-line argument,
but internally the standard input stream was re-directed
from the keyboard to the file /etc/group.
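One visible sign of the internal difference is that a command reading standard input is never told a file name. The sketch below uses a small sample file in a scratch directory (the file name groups is made up) rather than /etc/group.

```shell
workdir=$(mktemp -d)
printf 'root\nstaff\n' > "$workdir/groups"    # hypothetical sample file

from_arg=$(cat "$workdir/groups")             # cat opens the file itself
from_stdin=$(cat < "$workdir/groups")         # the shell opens it; cat reads descriptor 0

named_count=$(wc -l "$workdir/groups")        # wc knows the name and prints it
anonymous_count=$(wc -l < "$workdir/groups")  # wc sees only a stream, so no name is printed
rm -rf "$workdir"

echo "$named_count"
echo "$anonymous_count"
```

Both cat invocations produce identical output, while the two wc invocations differ only in whether the file name appears next to the count.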

In some cases you want to execute a command and have it get
input from both the standard input stream and a file.

$ cat - /etc/group
This is the content of /etc/group:
==================================
<CTRL-D>
On some systems, the use of - may not be supported by
all commands. The file /dev/stdin can be used instead.
$ cat /dev/stdin /etc/group
This is the content of /etc/group:
==================================
<CTRL-D>

Redirecting Input and Output

You can execute a command and have both the input and output
streams re-directed.

$ cat < /etc/group > foo
Execute the 'cat' command re-directing standard input to
the file /etc/group (i.e. the content of /etc/group
becomes the standard input stream). The output of the
'cat' command is re-directed to the file foo.
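The same pattern works with any command that reads standard input. Here is a sketch using sort on a made-up sample file in a scratch directory:

```shell
workdir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$workdir/unsorted"   # hypothetical sample data

# Input comes from one file, output goes to another.
sort < "$workdir/unsorted" > "$workdir/sorted"

sorted_content=$(cat "$workdir/sorted")
rm -rf "$workdir"

echo "$sorted_content"
# prints:
# apple
# fig
# pear
```

Avoid naming the same file on both sides: the shell truncates the output file before the command starts reading, so the input would be lost.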

Pipes

Recall, the output of a command can be redirected by using the
> operator.

Many commands receive their input from a file or from
the standard input stream. Input can be redirected into
a command by using the < operator.

$ sort /etc/group
The sort program displays the content of /etc/group
sorted in alphabetical order.
$ sort < /etc/group
Instead of getting its input from a file, the sort command
sorts the standard input stream (which in this case happens to be
the content of the /etc/group file).

In many cases you need to take the output of one command and
use it as input to another command. This can be easily
accomplished by using the | (or pipe) operator.

$ ls -l | grep "Oct 28"
Do a long listing and pipe the output into the grep
command searching for the pattern "Oct 28". The output
of the command sequence will be a long listing of all
files that were created and/or modified on "Oct 28".
$ wc -l /etc/passwd
wc -l counts and prints the number of lines found
in the file argument /etc/passwd.
$ who | wc -l
The output of the who command is piped into the
word count program wc. When executed with the -l
option, wc prints the number of lines it finds
in its input.
$ cut -f1 -d":" /etc/passwd | sort | uniq | wc -l
The 'cut' command prints the values of field
one found in the /etc/passwd file having colon
delimited fields. The output is sorted using
the Unix 'sort' command and the sorted output
is sent into the 'uniq' program which eliminates
duplicate values. The unique list of values is
then counted by the 'wc -l' command.
$ cat /etc/passwd | tr '[a-z]' '[A-Z]' | grep THURM | wc -l
This pipeline prints the number of lines in the /etc/passwd
file that contain the string "thurm". Searching
is not case sensitive. Exercise: describe what is
going on.
$ tail -100 $logs | grep "^csnet:" 2>/dev/null | sort | pr -s | more
Exercise: describe what is going on.
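To see a pipeline produce a predictable result, the cut | sort | uniq | wc -l sequence shown earlier can be run against a small made-up file instead of /etc/passwd. The file name and its contents below are assumptions for illustration only.

```shell
workdir=$(mktemp -d)

# Hypothetical colon-delimited sample; 'alice' appears twice on purpose.
printf 'alice:x:100\nbob:x:101\nalice:x:102\ncarol:x:103\n' > "$workdir/passwd.sample"

# Field one -> sorted -> duplicates removed -> lines counted.
# tr strips the padding that some versions of wc put around the number.
unique_names=$(cut -f1 -d":" "$workdir/passwd.sample" | sort | uniq | wc -l | tr -d ' ')
rm -rf "$workdir"

echo "$unique_names"
# prints: 3
```

The sort stage is needed because uniq only removes duplicate lines that are adjacent to each other.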

Early Unix History and Evolution

The following was copy/pasted from Ritchie's website.

Pipes appeared in Unix in 1972, well after the PDP-11
version of the system was in operation, at the suggestion
(or perhaps insistence) of M. D. McIlroy, a long-time
advocate of the non-hierarchical control flow that
characterizes coroutines. Some years before pipes
were implemented, he suggested that commands should
be thought of as binary operators, whose left and
right operand specified the input and output files.
Thus a copy utility would be commanded by
inputfile copy outputfile

* An asterisk matches 0 or more characters in a file name.
? A question mark matches any single character.
[ ] Square brackets can surround a choice of characters to match.

Note: asterisk does not match file names that start with a dot
(i.e. hidden files).

$ ls -x
d01.shtml d02.shtml d03.shtml d04.shtml d05
Display all file names found in the current directory.
$ ls -x *.shtml
d01.shtml d02.shtml d03.shtml d04.shtml
Display all file names that end with the string .shtml
$ ls -x ?05
d05
Display all file names that are 3 characters long and
end in 05
$ ls -x d0[1-2]*
d01.shtml d02.shtml
Display all file names that start with d0 and are at
least three characters long. The third character must
fall in the range of 1 to 2 (inclusive) and can be
followed by 0 or more characters.
$ ls -x *5*l
d05.shtml
Display all file names that have a 5 somewhere in them
(except the last character) and that end with a lowercase
ell character.
$ ls -x d0[123].shtml
d01.shtml d02.shtml d03.shtml
Display all file names that start with d0 followed
by a 1, 2, or 3, followed by .shtml. Every file
name must be 9 characters long.
$ ls -x ???
d05
Display all file names that are three characters long.
$ cat d*
Display the contents of all files whose names start with a lowercase dee.
$ rm *.shtml
Removes all files that end with the string .shtml
$ grep "the end" d0?
Searches for the pattern "the end" in all files having names
that are three characters long and start with d0
$ mv *0* /tmp
Moves all files having a 0 in their name to the /tmp directory.
$ mv temp[0-9] /tmp
Moves all files having the name temp followed by a single
digit to the /tmp directory.
$ mv [A-Z]* /tmp
Moves all files beginning with an uppercase letter to
the /tmp directory.
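A few of the patterns above can be tried without risk by recreating the sample file names in a scratch directory and using echo, which simply prints whatever the shell expanded. This is a sketch that assumes the mktemp command is available.

```shell
workdir=$(mktemp -d)
( cd "$workdir" && touch d01.shtml d02.shtml d03.shtml d04.shtml d05 )

star=$(cd "$workdir" && echo *.shtml)       # any name ending in .shtml
question=$(cd "$workdir" && echo ?05)       # exactly one character, then 05
range=$(cd "$workdir" && echo d0[1-2]*)     # d0, then 1 or 2, then anything
rm -rf "$workdir"

echo "$star"        # prints: d01.shtml d02.shtml d03.shtml d04.shtml
echo "$question"    # prints: d05
echo "$range"       # prints: d01.shtml d02.shtml
```

Each cd runs inside a subshell, so the current working directory of the calling shell is left unchanged.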

When you enter a command-line that contains meta-characters, the
shell expands the command-line to include the fully named
files. In other words, commands see complete file names; they do
not know about the meta-characters.

Assume we have a directory that has the following files:
x.out y.out z.err w.out t.err c.obj
$ rm *.out
The shell expands *.out and the following command-line
is executed: rm x.out y.out w.out
$ rm *.junk
No file names are found ending with .junk therefore *.junk
is the argument that is passed to the 'rm' program. The 'rm'
command will try to open a file named *.junk and will fail
resulting in an error message.
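What a command actually receives can be inspected with a tiny function that reports its arguments. In this sketch show_args is a hypothetical helper, and the behaviour shown for the unmatched pattern is the default of Bourne-style shells such as sh and bash; other shells or shell options may behave differently.

```shell
workdir=$(mktemp -d)
( cd "$workdir" && touch x.out y.out z.err w.out t.err c.obj )

# show_args is a hypothetical helper: it prints how many arguments it was given.
show_args() {
    echo "$# argument(s): $*"
}

matched=$(cd "$workdir" && show_args *.out)
unmatched=$(cd "$workdir" && show_args *.junk)
rm -rf "$workdir"

echo "$matched"      # prints: 3 argument(s): w.out x.out y.out
echo "$unmatched"    # prints: 1 argument(s): *.junk
```

The function never sees the pattern *.out, only the three names the shell substituted for it.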

Caution needs to be exercised when using meta-characters on the
command-line. The following has happened to many users:

$ rm x *
The user wants to remove all files starting with x but
the space between the x and * causes all files to be removed.
The shell expands the * to all file names which in turn get
passed onto the 'rm' command.
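One common precaution, offered here as a suggestion rather than something the shell enforces, is to put echo in front of a destructive command first so that the expanded command-line is printed instead of executed. The file names below are made up for the sketch.

```shell
workdir=$(mktemp -d)
( cd "$workdir" && touch x1 x2 keep.txt )

# echo prints the command-line that rm would have received.
intended=$(cd "$workdir" && echo rm x*)
mistyped=$(cd "$workdir" && echo rm x *)
rm -rf "$workdir"

echo "$intended"    # prints: rm x1 x2
echo "$mistyped"    # prints: rm x keep.txt x1 x2
```

The preview of the mistyped version makes it obvious that keep.txt would have been removed as well.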

There is a maximum length that the command-line can end up being. For
example, if you have a directory containing 1000 files with long file
names, then a command like rm * may not work.

What happens if you have a file name that has an asterisk in it? For
example, suppose we have a directory containing the following files:

Introduction to the grep Command

The grep command is used to search for a pattern
in a file or list of files. The pattern used by grep
is called a regular expression. On some Unix systems,
by default, the grep command supports basic-REs.
[grep: global regular expression print (g/re/p)
or general regular expression parser]

$ grep Unix d01.shtml
Print all lines from the file d01.shtml containing
the pattern Unix.
$ grep -i UNIX d*.shtml
Search all files starting with the character 'd' and ending
in the string ".shtml" for the pattern UNIX.
The search is not case sensitive; therefore, Unix, UNix,
uniX, UNIX, unix, ..., all match.
$ grep "#include <iostream.h>" *.c
Look for the string #include <iostream.h> in
all files ending in ".c". Since the pattern contains
spaces and shell metacharacters, it must be quoted
(here, using double quotes).
$ grep -l "while (" *.cpp
Search all "*.cpp" files for the pattern "while ("
and only display the names of the files with one
or more matching lines, not the lines themselves.
$ grep -v UNIX foo
Display only those lines from the file foo that
do not contain the pattern UNIX.
$ grep -v -c UNIX foo
Display a count of the number of lines from the file
foo that do not contain the pattern UNIX.
$ who | grep "^jdoe "
Find out if jdoe is logged in.
$ grep Unix foo >/dev/null 2>&1
$ echo $?
Search the file foo for the pattern Unix and re-direct
the output to /dev/null. Use the exit status of the grep
command to determine if the pattern was found (0 indicates
that it was, 1 implies it wasn't).
$ crypt some_key < roster.done | grep jdoe
Find the password entry for user jdoe in the encrypted
file roster.done that was encrypted using some_key.
$ grep -E "unix|UNIX" foo
$ egrep "unix|UNIX" foo
-E causes grep to run as egrep. egrep supports
extended-REs (Regular Expressions). Search for either
the pattern unix or the pattern UNIX.
$ grep -F hello foo
$ fgrep hello foo
-F causes grep to run as fgrep. If the pattern is
a string literal (i.e. fixed), then fgrep may be used.
$ pgrep -u root
pgrep is a customized grep that is used to search
the process table. -u LOGNAME displays a list of
PIDs for user LOGNAME.
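Several of the grep options above can be verified against a three-line sample file. The file content is invented for this sketch, and the counts in the comments follow from that content.

```shell
workdir=$(mktemp -d)
printf 'Unix is fun\nI like UNIX\nnothing here\n' > "$workdir/foo"

any_case=$(grep -i -c unix "$workdir/foo")          # lines 1 and 2 match
without=$(grep -v -c UNIX "$workdir/foo")           # lines 1 and 3 lack "UNIX"
either=$(grep -E -c "unix|UNIX" "$workdir/foo")     # only line 2 matches

# The exit status alone tells us whether the pattern was found.
if grep Unix "$workdir/foo" >/dev/null 2>&1; then
    found=yes
else
    found=no
fi
rm -rf "$workdir"

echo "$any_case $without $either $found"
# prints: 2 2 1 yes
```

Testing the exit status directly in an if statement is equivalent to running the command and then inspecting $?.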

Introduction to the find Command

The find command locates files that match a given set
of criteria in a hierarchy of directories. A criterion may
be a file name or a specified property of a file (such as its
modification date, size, or type). You can also direct the
command to remove, print, or otherwise act on the file.

find pathname search-options action-options
pathname -- directory from which find begins the search; the
search is recursive (sub-directories, if any, are
also searched)
search-options -- identifies the file you are interested in
action-options -- tells what to do once the file is found
$ find . -print
Begin searching from current working directory and print
the name of all files found.
$ find / -name foo.c -print
Begin searching for a file named foo.c from the root
directory and print its name if found. More than one
file can be found with the name foo.c. Caution: if you
are not super-user, then you will not have permission to
search many directories.
$ find $HOME -name "*.c" -print
Begin searching in the HOME directory for all files
that end in ".c". Print their names, if found.
$ find . -type d -print
Begin searching from current working directory and print
all files that are of type directory.
$ find /tmp -type f -print
Begin searching from the /tmp directory and print
all files that are regular files.
$ find . -mtime 10 -print
Find and display files last modified exactly 10 days ago.
$ find . -mtime -10 -print
Find and display files last modified less than 10 days ago.
$ find . -mtime +10 -print
Find and display files last modified more than 10 days ago.
$ find . -atime +10 -print
Find and display files last accessed (read) more
than 10 days ago.
$ find . -newer .lastbkup -print
Find and display files modified more recently than
.lastbkup was.
$ find . -user jdoe -size +50 -print
Find and display files owned by jdoe that are
larger than 50 blocks in size.
$ find . -type f -exec chmod 644 {} \;
Find and change permissions on all regular files.
$ find . -name foo.c -mtime +30 -exec rm {} \;
Find and remove all instances of the file foo.c
that were last modified more than 30 days ago.
$ find . -name foo.c -mtime +30 -ok rm {} \;
Find and interactively remove all instances of the
file foo.c that were last modified more than 30 days ago.
$ find . -inum 8888 -print
Find and display all files having i-node number 8888.
This is a handy way to remove files that have goofy
(e.g. non-printable) characters in their names.
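A few of the find options above can be tried on a throwaway directory tree. The directory and file names in this sketch are made up, and touch -t is used only to give one file an old modification time.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/src/lib"
touch "$workdir/src/main.c" "$workdir/src/lib/util.c" "$workdir/src/notes.txt"
touch -t 200001010000 "$workdir/src/old.log"    # pretend this file dates from 1 Jan 2000

c_files=$(find "$workdir" -name "*.c" -print | sort)      # name criterion, quoted so find sees the pattern
dir_count=$(find "$workdir" -type d -print | wc -l)       # the top directory, src and src/lib
old_files=$(find "$workdir" -type f -mtime +10 -print)    # modified more than 10 days ago
rm -rf "$workdir"

echo "$c_files"
echo "$dir_count"
echo "$old_files"
```

Quoting "*.c" matters: without the quotes the shell would try to expand the pattern in the current directory before find ever ran.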