Executing commands safely from Python

Python provides multiple ways to execute commands on the system it is running
on. Some of them are inherently unsafe; others are safe by nature but easy to
use in an unsafe way.

Here I set out to document the current ways to execute commands with the
modules included in Python 3's standard library, along with their pros and
cons. This article assumes that you are familiar with shells. You don't need to
know everything about them, but you do need to know their basic syntax. I also
assume you are using Python 3 on Linux, although the concepts carry over to
other languages and operating systems.

Command Injection

To start out you need to understand why executing commands from Python can be
dangerous. The underlying problem applies to all languages and is called
command injection. There are some examples on the OWASP pages and the CWE-77
page; I will provide my own here.

Consider a program that restarts the service named by the argument it
receives. I call this program service.py. To do its job it executes a command
with a function called os.system.
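
A minimal sketch of what service.py could look like, assuming the service name
arrives as the first command-line argument. The helper names build_command and
restart_service are only illustrative; they are split out so that the string
reaching os.system is easy to inspect.

```python
import os
import sys


def build_command(service_name):
    # Naive string concatenation: whatever is in service_name ends up
    # verbatim in the command line that the shell will parse.
    return "systemctl restart " + service_name


def restart_service(service_name):
    # os.system hands the whole string to a shell (sh -c '...').
    return os.system(build_command(service_name))


# Only act when exactly one argument was given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    restart_service(sys.argv[1])
```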

If we call our program with python service.py nginx the string that gets put
into our os.system-call will be the string systemctl restart nginx and all
is good in the world. However, if someone calls our program as
python service.py 'nginx;cat /etc/passwd' our executed command will become:

systemctl restart nginx;
cat /etc/passwd

I have added the newline myself for clarity. Our program was never intended to
read the /etc/passwd file at all! This is a command injection. It comes in many
shapes and forms, and it is something you want to prevent.

You need to be especially careful anywhere input is passed into a command that
will be executed. This can be in scripts such as the example above, in websites,
in network protocols, and elsewhere. Input can also come from places you
wouldn't expect, which is why I won't call it user input in this article. It
can be, for example, an HTTP request made by your application that is tampered
with by a man-in-the-middle attack on an unsafe network, which puts the client
at risk.

How does a command get executed?

Before I can talk about how to prevent these types of attacks it is important
to dive a tiny bit deeper. How does a command get executed by your operating
system?

In general your operating system's C library provides a family of functions
called the exec* functions, where the * stands for a variety of letters (execl,
execlp, execle, execv, execvp, and execvpe). They are documented in the
man pages.

These seem a bit daunting, but all these functions follow the same pattern.
Each takes either a path or a file to execute. If the function takes a file,
the full path to that file is found by searching the directories listed in the
PATH environment variable.

Some of these functions also let you pass the environment for the executable
that will be run. What they all have in common is their shape: an executable
followed by a varying number of arguments.
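
To make the pattern concrete, here is a small sketch that uses Python's thin
wrappers around these functions, os.fork and os.execvp. The function name
run_with_exec is illustrative, and the sketch assumes a POSIX system such as
Linux.

```python
import os
import sys


def run_with_exec(argv):
    """Run argv[0] with the given argument list and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child process: replace ourselves with the requested executable.
        # execvp takes a file name (looked up on PATH) followed by the full
        # argument list, where argv[0] is conventionally the program name.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # Only reached when the exec itself failed.
            os._exit(127)
    # Parent process: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    # No string parsing happens here; each list item is one argument.
    print(run_with_exec([sys.executable, "-c", "print('hello from exec')"]))
```

Notice that nothing in this sketch ever splits a string: the caller has to
supply the executable and its arguments as separate items.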

This means that whenever we execute a string such as systemctl restart nginx,
something needs to split that string into the parts systemctl, restart, and
nginx and hand them to one of the functions in the exec* family. This tends to
be done by your shell.

If we jump back to our previous os.system program, it calls the system
function in your standard C library, which in turn executes the command
sh -c 'systemctl restart nginx'. This lets the sh executable, which is a
shell, parse the command into the parts needed by the exec* function it
uses.
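
The sketch below imitates what system does by calling sh -c directly, with a
harmless echo standing in for systemctl. It assumes a POSIX sh is available on
PATH; the function name run_like_system is illustrative.

```python
import subprocess


def run_like_system(command):
    # Roughly what the C library's system() does: hand the whole string to
    # sh -c and let the shell decide where commands and arguments begin.
    result = subprocess.run(
        ["sh", "-c", command], capture_output=True, text=True
    )
    return result.stdout


if __name__ == "__main__":
    # The shell sees the ';' and runs two separate commands.
    print(run_like_system("echo restart nginx;echo injected"))
```

Here the shell prints two lines, one per echo, because it treats the ; as a
command separator.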

Shells

As soon as a shell gets involved in parsing your command, the characters in
that command become dangerous. Shells allow executing multiple commands at
once, and they have built-ins that let you do things without calling external
commands. Anyone who gains control of a parameter that gets fed to a shell can
chain whatever they want onto it, and shells get involved in places where you
sometimes don't expect them.

Can we make arguments passed to shells safe? No, not really. You want to
use a function which does not use a shell at all to prevent shell-based
exploits.

Ways to execute commands in Python

Python 3 offers a variety of ways to execute commands, but one stands out: the
subprocess module.

The subprocess module allows us to execute commands without opening a shell to
parse our string into the appropriate parts. This puts us at minimal risk of
being exploited.

Note: Of course the program you are executing through subprocess can still have
its own flaws that allow it to be subverted to do things you don't want.

Let's make a version of our previous program using subprocess. Subprocess offers
many functions but they all follow the same rules for their arguments:
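
A minimal sketch of service.py rewritten with subprocess.run, again assuming
the service name arrives as the first command-line argument. The helper names
build_argv and restart_service are illustrative.

```python
import subprocess
import sys


def build_argv(service_name):
    # Each list item becomes exactly one argument; no shell parses them.
    return ["systemctl", "restart", service_name]


def restart_service(service_name):
    # check=False keeps the sketch simple: a failing systemctl does not raise.
    return subprocess.run(build_argv(service_name), check=False).returncode


# Only act when exactly one argument was given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    restart_service(sys.argv[1])
```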

The functions in subprocess take either a list of arguments or a single string.
Remember the earlier explanation of the exec* family of functions.

When you pass a list to subprocess, the first item becomes the executable
handed to the exec* function and each remaining item is passed as a separate
argument.

This means the arguments are never interpreted by a shell, which makes it
impossible for someone to inject additional commands through one.
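
One way to convince yourself is to look at what the child process actually
receives. In this sketch the child is a tiny Python program that prints its
own arguments; show_received_args is an illustrative name.

```python
import subprocess
import sys


def show_received_args(*args):
    # The child is a small Python program that prints the arguments it got.
    child = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]
    result = subprocess.run(child + list(args), capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # The whole string, ';' included, arrives as one single argument.
    print(show_received_args("nginx;cat /etc/passwd"))
```

The malicious input from earlier reaches the child as one ordinary argument,
so the ; has no special meaning.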

If you pass a single string to subprocess such as:

import subprocess

subprocess.run("systemctl restart nginx")

Then that string becomes the first argument to the exec* function without any
splitting, and the argument list is left empty. If you execute the command
above, the exec* function will look for an executable called
systemctl restart nginx on your PATH, which most likely does not exist.
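
You can observe this on Linux with a harmless command. In the sketch below,
try_run is an illustrative helper that reports the outcome instead of raising;
it assumes an echo executable exists on PATH.

```python
import subprocess


def try_run(args):
    # Report the outcome as a string instead of raising, for easy inspection.
    try:
        subprocess.run(args, capture_output=True)
    except FileNotFoundError:
        return "not found"
    return "ran"


if __name__ == "__main__":
    # A single string is treated as the name of one executable.
    print(try_run("echo hello world"))
    # A list names the executable and its argument separately.
    print(try_run(["echo", "hello world"]))
```

On a typical system the single string fails with FileNotFoundError, while the
list form runs echo normally.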

Passing a list of arguments is a safe way to execute commands in Python, even
when input is passed as arguments to your executable.

shell=True

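The functions in subprocess take an additional keyword argument called shell,
which can be set to True. If you do so, you should pass a single string, which
is handed to the shell in the same way as before: sh -c 'command'. If you pass
a list instead, only the first item is used as the command string and the
remaining items become arguments to the shell itself, so it will be passed as:

sh -c 'first item' 'second item' 'third item'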
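
A small sketch of how a list behaves with shell=True on a POSIX system, using
a harmless echo so that it is safe to run. The name shell_with_list is
illustrative.

```python
import subprocess


def shell_with_list(args):
    # With shell=True on POSIX, a list becomes: sh -c args[0] args[1] ...
    result = subprocess.run(args, shell=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Extra items become the shell's own $0 and $1, not arguments to echo.
    print(shell_with_list(["echo $0 $1", "first", "second"]))
    # Here "hello" becomes $0, so echo runs without any arguments.
    print(repr(shell_with_list(["echo", "hello"])))
```

The second call is the surprising one: only the first item is the command, so
echo prints an empty line.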

What if I need a shell?

Executing commands in the safe way as described above means that you can't use
those handy shell features you are used to such as |, <, > and their friends.

Most of these features can be implemented separately in Python. If you need
a |, it is often better to execute the first command, store its output, and then
execute the second command, giving that output to the new process as input.
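
As a sketch of that idea, the helper below (pipe is an illustrative name) does
the equivalent of first | second without a shell. The example commands printf
and sort are assumed to be installed, as they are on most Linux systems.

```python
import subprocess


def pipe(first_argv, second_argv):
    # Run the first command and keep its output in memory.
    first = subprocess.run(first_argv, capture_output=True, check=True)
    # Feed that output to the second command's standard input.
    second = subprocess.run(
        second_argv, input=first.stdout, capture_output=True, check=True
    )
    return second.stdout


if __name__ == "__main__":
    # Equivalent of: printf '%s\n' b a | sort
    print(pipe(["printf", "%s\\n", "b", "a"], ["sort"]))
```

This holds the whole intermediate output in memory, which is fine for small
outputs but worth keeping in mind for large ones.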

File redirection (>, and others) can be done in the same way by storing the
output and then writing it to a file in Python.
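
A minimal sketch of the same approach for >, where run_to_file is an
illustrative name and echo is assumed to be available as an executable:

```python
import subprocess
from pathlib import Path


def run_to_file(argv, path):
    # Capture the command's standard output, then write it out ourselves,
    # which is what 'command > path' would do in a shell.
    result = subprocess.run(argv, capture_output=True, check=True)
    Path(path).write_bytes(result.stdout)
    return len(result.stdout)
```

Calling run_to_file(["echo", "hello"], "out.txt") leaves a file containing
hello followed by a newline. As an alternative, subprocess.run also accepts an
open file object for its stdout argument, which avoids buffering the output in
Python.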

For most command line utilities you would normally combine with these
operators, you can either trivially implement them in Python or find a library
on PyPI that gives you the data directly, instead of trying to parse the output
of ip, ifconfig, or others in a shell.

What if I really really need a shell?

You could use Python's shlex module, which tries to implement the proper
escaping rules for shells. Specifically, you could use shlex.quote on each
argument you fill in. Be aware that reasoning about what is 'safe' or 'unsafe'
becomes very difficult in this context.
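
A hedged sketch of what that could look like, assuming a POSIX shell (which is
what shlex.quote is documented for). The name shell_print is illustrative, and
printf stands in for whatever command you actually need.

```python
import shlex
import subprocess


def shell_print(untrusted):
    # shlex.quote wraps the value so a POSIX shell treats it as one word.
    command = "printf '%s\\n' " + shlex.quote(untrusted)
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # The ';' stays inside the quoted word and is printed, not executed.
    print(shell_print("nginx;cat /etc/passwd"))
```

Every value you interpolate has to be quoted this way, and forgetting a single
one brings the injection back, which is why avoiding the shell is the better
default.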