Shell Scripting – Best Practices

Most programming languages have a set of “best practices” that should be followed when writing code in that language. However, I have not been able to find a comprehensive list for shell scripting, so I have decided to write my own, based on my experience writing shell scripts over the years.

A note on portability: Since I mainly write shell scripts to run on systems which have Bash 4.2 installed, I don’t need to worry about portability much, but you might need to! The list below is written with Bash 4.2 (and other modern shells) in mind. If you are writing a portable script, some points will not apply. Needless to say, you should perform sufficient testing after making any changes based on this list.

Here is my list of best practices for shell scripting (in no particular order):

Use functions

Unless you’re writing a very small script, use functions to modularise your code and make it more readable, reusable and maintainable. The template I use for all my scripts is shown below. As you can see, all code is written inside functions. The script starts off with a call to the main function.

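#!/bin/bash
set -e
# Prints a usage message.
usage() {
  :  # placeholder; bash rejects an empty function body
}
my_function() {
  :  # placeholder
}
# Entry point; receives the script's arguments.
main() {
  :  # placeholder
}
main "$@"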

Document your functions

Add sufficient documentation to your functions to specify what they do and what arguments are required to invoke them. Here is an example:

# Processes a file.
# $1 - the name of the input file
# $2 - the name of the output file
process_file() {
  :  # placeholder; the function body goes here
}

Use shift to read function arguments

Instead of referring to $1, $2, etc. throughout a function, assign each argument to a named variable and call shift after each one, as shown below. Because every assignment reads $1, reordering the arguments later only means reordering the lines.

# Processes a file.
# $1 - the name of the input file
# $2 - the name of the output file
process_file() {
  local -r input_file="$1"; shift
  local -r output_file="$1"; shift
}

Declare your variables

If your variable is an integer, declare it as such. Also, make all your variables readonly unless you intend to change their value later in your script. Use local for variables declared within functions. This helps convey your intent. If portability is a concern, use typeset instead of declare. Here are a few examples:
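A minimal sketch of such declarations follows. Every variable and function name in it (config_file, retry_count, count_lines and so on) is hypothetical and chosen only for illustration.

```shell
#!/bin/bash

# Hypothetical names, chosen only for illustration.
declare -r config_file="/etc/myapp.conf"   # read-only string
declare -i retry_count=3                   # integer
declare -ir max_retries=5                  # read-only integer
declare -a pending_files=()                # indexed array

# Counts the newline-terminated lines in a file.
# $1 - the name of the file
count_lines() {
  local -r file="$1"    # read-only, visible only inside this function
  local -i total=0      # integer, visible only inside this function
  local line
  while IFS= read -r line; do
    total+=1            # arithmetic addition, because total is an integer
  done < "$file"
  echo "$total"
}

retry_count+=1          # arithmetic addition as well: the value is now 4
```

Without the integer attribute, += would append the character 1 to the string instead of adding to the number, which is one reason to declare the type up front.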

Quote your variables

To prevent word-splitting and file globbing you must quote all variable expansions. In particular, you must do this if you are dealing with filenames that may contain whitespace (or other special characters). Consider this example:
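A small sketch of the difference, assuming a hypothetical filename that contains a space and a hypothetical helper, count_args, that reports how many arguments it received:

```shell
#!/bin/bash

# Hypothetical filename containing a space.
file="my report.txt"

# Prints the number of arguments it receives.
count_args() {
  echo "$#"
}

unquoted=$(count_args $file)     # word-splitting yields two arguments
quoted=$(count_args "$file")     # quoting keeps the filename as one argument
echo "unquoted: $unquoted, quoted: $quoted"
```

With the default IFS, the unquoted expansion is split into my and report.txt, so a command such as rm would be handed two filenames that do not exist.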

Avoid unnecessary cat

If you are not familiar with the infamous Useless Use of Cat award, it is worth looking up. The cat command should only be used for concatenating files, not for sending the contents of a file to another command.

# instead of
cat file | command
# use
command < file

Avoid unnecessary echo

You should only use echo if you want to output some text to stdout, stderr, file etc. If you want to send text to another command, don’t echo it through a pipe! Use a here-string instead. Note that here-strings are not portable (but most modern shells support them) so use a heredoc if you are writing a portable script. (See my earlier post: Useless Use of Echo.)

# instead of
echo text | command
# use
command <<< text
# for portability, use a heredoc
command << END
text
END

Avoid unnecessary grep

Piping from grep to awk or sed is unnecessary. Since both awk and sed can grep, you don’t need the grep in your pipeline. (Check out my previous post: Useless Use of Grep.)
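As a sketch, assuming some hypothetical sample input made of name and status pairs, the grep stage can be folded into an awk or sed address:

```shell
#!/bin/bash

# Hypothetical sample input: "name status" pairs.
input=$'alpha ok\nbeta error\ngamma ok'

# instead of
#   grep error | awk '{print $1}'
# use
with_awk=$(awk '/error/ {print $1}' <<< "$input")

# instead of
#   grep ok | sed 's/ ok$//'
# use
with_sed=$(sed -n '/ok/ s/ ok$//p' <<< "$input")

echo "$with_awk"
echo "$with_sed"
```

The -n flag together with the p command makes sed print only the lines that matched, which is the part grep was doing in the longer pipeline.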

Don’t parse ls

Never parse the output of ls. The problem is that ls outputs filenames separated by newlines, so if you have a filename containing a newline character you won’t be able to parse it correctly. It would be nice if ls could output null delimited filenames but, on most systems, it can’t (GNU coreutils 9.0 added a --zero option, but it is not available on older or non-GNU systems). Instead of ls, use file globbing or an alternative command which outputs null terminated filenames, such as find -print0.

Use globbing

Globbing (or filename expansion) is the shell’s way of generating a list of files matching a pattern. In bash, you can make globbing more powerful by enabling extended pattern matching operators using the extglob shell option. Also, enable nullglob so that you get an empty list if no matches are found. Globbing can be used instead of find in some cases and, once again, don’t parse ls! Here are a couple of examples:
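The sketch below illustrates both options. It assumes a hypothetical temporary directory populated with a few sample files; none of these names come from a real project.

```shell
#!/bin/bash
shopt -s nullglob extglob

# Hypothetical working directory with a few sample files.
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b c.txt" "$dir/notes.md" "$dir/photo.jpg"

# All .txt files, collected into an array.
txt_files=( "$dir"/*.txt )

# extglob: everything except .txt and .md files.
other_files=( "$dir"/!(*.txt|*.md) )

# nullglob: no matches gives an empty array rather than the literal pattern.
log_files=( "$dir"/*.log )

# Quoting "${txt_files[@]}" keeps names with spaces intact.
for f in "${txt_files[@]}"; do
  echo "text file: ${f##*/}"
done
echo "${#other_files[@]} other, ${#log_files[@]} log"
rm -rf "$dir"
```

Note that shopt -s extglob has to run before bash parses any line that uses a pattern such as !(...), which is why it sits at the top of the script.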

Use null delimited output

In order to correctly handle filenames containing whitespace and newline characters, you should use null delimited output, which results in each item being terminated by a NUL (00) character instead of a newline. Many tools support this. For example, find -print0 outputs filenames followed by a null character and xargs -0 reads arguments separated by null characters.
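A minimal sketch of both sides, assuming a hypothetical temporary directory that contains deliberately awkward filenames. The loop reads the find output through process substitution, written < <(...).

```shell
#!/bin/bash

# Hypothetical directory containing awkward filenames.
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with space.txt" "$dir/"$'with\nnewline.txt'

# Read NUL-terminated names from find; IFS= and -r keep each name verbatim,
# and -d '' tells read to stop at a NUL byte instead of a newline.
count=0
while IFS= read -r -d '' file; do
  count=$((count + 1))
done < <(find "$dir" -type f -name '*.txt' -print0)
echo "found $count files"

# xargs -0 splits its input on NUL bytes in the same way.
find "$dir" -type f -name '*.txt' -print0 | xargs -0 rm -f
remaining=$(find "$dir" -type f | wc -l)
rmdir "$dir"
```

A newline-based loop over the same directory would see the third file as two separate names.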

Use process substitution

In most cases, if a command takes a file as an input, the file can be replaced by the output of another command using process substitution: <(command). This saves you from having to write out a temp file, passing that temp file to the command and finally deleting the temp file. This is shown below:
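One possible illustration, assuming two hypothetical unsorted lists held in variables. The comm command expects two sorted files, which would normally call for temp files.

```shell
#!/bin/bash

# Hypothetical unsorted inputs.
list_a=$'cherry\napple\nbanana'
list_b=$'banana\ndate\napple'

# instead of
#   sort <<< "$list_a" > a.tmp
#   sort <<< "$list_b" > b.tmp
#   comm -12 a.tmp b.tmp
#   rm a.tmp b.tmp
# use
common=$(comm -12 <(sort <<< "$list_a") <(sort <<< "$list_b"))
echo "$common"
```

Each <(...) expands to a path that comm can open like an ordinary file, so nothing has to be cleaned up afterwards.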

Use [[ ... ]] and (( ... )) for conditions

Prefer [[ ... ]] over [ ... ] because it is safer and provides a richer set of features. Use (( ... )) for arithmetic conditions because it allows you to perform comparisons using familiar mathematical operators such as < and > instead of -lt and -gt. Note that if you desire portability, you have to stick to the old-fashioned [ ... ]. Here are a few examples:
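A short sketch with hypothetical values, showing pattern matching, regular-expression matching and an arithmetic condition:

```shell
#!/bin/bash

# Hypothetical values.
name="report 2024.txt"
count=7
empty=""

# [[ ]] does not word-split, so unquoted variables are safe, and it adds
# pattern (==) and regular-expression (=~) matching.
if [[ $name == *.txt && -z $empty ]]; then
  kind="text"
fi
if [[ $name =~ ([0-9]{4}) ]]; then
  year="${BASH_REMATCH[1]}"
fi

# (( )) uses arithmetic operators instead of -lt and -gt.
if (( count > 5 && count < 10 )); then
  range="between 5 and 10"
fi
echo "$kind $year $range"
```

With [ ... ], the first test would need quotes around both variables, and the unquoted name would be split into two words and break the test.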

One warning about #19 (though you already cover this a bit in #4): if you do end up using pipes, set -e may result in some difficult to debug issues during loops, because a failed command will only exit the subshell created by the pipe. Please excuse this contrived example:
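The following is one way to reproduce the effect; it is a sketch written for this purpose, not necessarily the commenter’s own example. The helper names run_demo and run_demo_pipefail are hypothetical, and each one runs its script in a child bash so that set -e cannot end the calling shell.

```shell
#!/bin/bash

# Runs a small script in a child bash (hypothetical helper).
run_demo() {
  bash -c '
    set -e
    # "false" fails inside the subshell on the left of the pipe; set -e ends
    # that subshell only, and the pipeline reports the status of "cat".
    { false; echo "not reached"; } | cat
    echo "still running"
  '
}

# The same script with pipefail: the pipeline now fails, so set -e exits.
run_demo_pipefail() {
  bash -c '
    set -eo pipefail
    { false; echo "not reached"; } | cat
    echo "still running"
  '
}

output=$(run_demo)
output_pipefail=$(run_demo_pipefail) || status=$?
echo "without pipefail: '$output'"
echo "with pipefail: '$output_pipefail' (exit status $status)"
```

Setting pipefail alongside set -e is one common way to make such failures visible to the main script.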
