Paranoid Penguin - Limitations of shc, a Shell Encryption Utility

shc is a popular tool for protecting shell scripts that
contain sensitive information such as passwords.
Its popularity was driven partly by auditors' concern
over passwords in scripts. shc encrypts shell
scripts using RC4, makes an executable binary out
of the shell script and runs it as a normal shell
script. Although the resulting binary contains the
encryption password and the encrypted shell script,
both are hidden from casual view.

At first, I was intrigued by the shc utility (www.datsi.fi.upm.es/~frosal/sources/shc.html) and
considered it a valuable tool in maintaining
security of sensitive shell scripts. However, upon
further inspection, I was able to extract the original
shell script from the shc-generated executable
for version 3.7. Because the encryption key is
stored in the binary executable, it is possible for
anyone with read access to the executable to recover
the original shell script. This article details the
process of extracting the original shell script
from the binary generated by shc.

shc Overview

shc is a generic shell script compiler.
Fundamentally, shc takes as its input a shell script,
converts it to a C program and runs the compiler to
compile the C code. The C program contains the original
script encrypted by an arbitrary key using
RC4 encryption. RC4 is a stream cipher designed
by Ron Rivest of RSA Security in 1987. This
cipher is used widely in commercial applications,
including Oracle SQL and SSL. Listing 1
demonstrates running shc.
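
For readers unfamiliar with RC4, the following is a minimal sketch of the textbook algorithm in Python. It is not shc's own code, which is written in C and differs in detail; the function name rc4 is chosen here purely for illustration. The sample key and plaintext are a widely published RC4 test vector.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with textbook RC4 (the operation is symmetric)."""
    # Key-scheduling algorithm (KSA): permute 0..255 under control of the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation algorithm (PRGA): XOR each input byte
    # with the next keystream byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


if __name__ == "__main__":
    secret = rc4(b"Key", b"Plaintext")
    print(secret.hex())          # bbf316e8d940af0ad3
    print(rc4(b"Key", secret))   # b'Plaintext'
```

Because encryption and decryption are the same XOR operation, anyone who holds both the key and the ciphertext can recover the plaintext.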

The two new files, named with the .x and .x.c
extensions to the name of the source shell script,
are the executable and an intermediate C version.
Upon executing pub.sh.x, the original shell source
is executed. shc also provides a relax option, -r,
which makes the executable portable. Without it,
shc uses the contents of the shell interpreter
itself, such as /bin/sh, as a key. If the shell
binary were to change, for example, due to system
patching or by moving the binary to another system,
the shc-generated binary neither decrypts nor
executes.

I inspected the shell executable using strings and
found no evidence of the original shell script.
I also inspected the intermediate C source code and
noted that it stores the shell script in encrypted
octal characters, as depicted in Listing 2.

Listing 2. The original shell script becomes an
RC4-encrypted string in the C version.
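
The octal notation itself is easy to reproduce. The sketch below is an illustration only and not part of shc; the helper names to_c_octal and from_c_octal are hypothetical. It shows how arbitrary bytes can be written as three-digit octal escapes suitable for a C string literal and read back again.

```python
import re


def to_c_octal(data: bytes) -> str:
    """Render each byte as a three-digit octal escape, e.g. 0x41 -> \\101."""
    return "".join(f"\\{b:03o}" for b in data)


def from_c_octal(text: str) -> bytes:
    """Recover the raw bytes from a run of three-digit octal escapes."""
    return bytes(int(digits, 8) for digits in re.findall(r"\\([0-7]{3})", text))


if __name__ == "__main__":
    escaped = to_c_octal(b"\x01A\xff")
    print(escaped)                 # \001\101\377
    print(from_c_octal(escaped))   # b'\x01A\xff'
```

The encoding hides nothing by itself; it is simply a way to embed non-printable bytes in C source.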

The C source code also includes as arrays the password as well
as other encrypted strings. Therefore,
anyone with access to the source code easily can
decrypt and view the contents of the original shell
script. But what about the binary executable
generated by shc? Is it possible to
extract the original shell script from nothing but
the binary executable? The answer to this question
is explored in the next section.

Extraction Approach

I generated and reviewed the C source code for several
shell scripts to better understand how the shell
source is encrypted and decrypted. Fundamentally,
shc uses an implementation of RC4 that was posted
to a Usenet newsgroup on September 13, 1994. I set
off by first identifying the encryption key and
the encryption text. The objdump utility came in
handy for this. objdump, part of GNU binutils,
displays information about object files. First, we
use objdump to retrieve all static variables, for this
is where the encryption key and the encrypted shell
text are stored. Listing 3 provides a brief overview
of objdump.

The first column of the output in Listing 3 specifies
the starting addresses in hexadecimal, followed by the
stored data in the next four columns. The last column
represents the stored data in printable characters.
So somewhere in the first four columns of the output is
the array of characters that form the encryption
key (password) and the encrypted shell script.
Comparing the original C source code and Listing
3, you can see that the password most likely
begins at address 0x804a540. After comparing other
executables, I determined that the first address after
the zeroes leading the “Please contact your provider”
text usually is the starting address. To retrieve
these arrays, such as the one depicted in Listing 2,
we also need to look at the disassembled code.
We use objdump again here, except this time with the
-d option, for disassemble, as shown in Listing 4.
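
The hex dump layout described above (an address, up to four columns of data and a printable-character column) can be converted back to raw bytes with a few lines of Python. This is a sketch that assumes the layout printed by objdump -s and contiguous rows; the function name parse_hexdump and the sample bytes are made up for illustration, and only the address 0x804a540 comes from the text.

```python
def parse_hexdump(dump: str):
    """Return (base_address, data) from rows in `objdump -s` layout.

    Assumes rows are contiguous and each holds up to 16 bytes in four
    space-separated groups of eight hex digits.
    """
    base = None
    data = bytearray()
    for line in dump.splitlines():
        parts = line.lstrip().split(" ", 1)
        if len(parts) != 2:
            continue
        addr_text, rest = parts
        try:
            addr = int(addr_text, 16)
            # The hex columns occupy a fixed 35-character field
            # (4 groups * 8 digits + 3 separators); the printable
            # column follows and is ignored.
            row = bytes.fromhex("".join(rest[:35].split()))
        except ValueError:
            continue  # not a data row, e.g. "Contents of section .data:"
        if base is None:
            base = addr
        data += row
    return base, bytes(data)


if __name__ == "__main__":
    sample = (
        "Contents of section .data:\n"
        " 804a540 48656c6c 6f2c2073 68632100 00000000  Hello, shc!.....\n"
    )
    base, data = parse_hexdump(sample)
    print(hex(base), data)
```

Slicing on the fixed-width hex field rather than splitting on whitespace avoids mistaking printable-column text that happens to look like hex for data.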

The last two columns represent assembly instructions.
The movl instruction is used to move data—movl
Source, Dest. The Source and Dest are prefixed with
$ when referencing a C constant. The push
instruction takes a single operand, the data source,
and stores it at the top of the stack.

Now that we have the basics of objdump, we can proceed
to extract the encryption password and eventually the
shell code.

In the intermediate C code produced by shc,
about nine arrays are referenced by the variables pswd,
shll, inlo, xecc, lsto, chk1, opts, txt and chk2.
The pswd variable stores the encryption key, and the
txt variable stores the encrypted shell text. shc
hides the useful information as smaller arrays within
these variables. Thus, obtaining the actual array
involves two steps. First, identify the length of the array.
Second, identify the starting address of the array.

The objdump output needs to be looked at in detail
to obtain the actual array length and the starting
address. My first hint here is to look for all
addresses that are within the data section (Listing 3)
of the disassembled object code. Next, seek out all
the push and movl instructions in Listing 4.
Addresses will be different for different scripts, but
when you encrypt a few scripts and read the resulting
C code, the patterns become familiar.

The 804a540 address seems to correspond to the pswd
variable, the encryption key. The length of the
useful portion of the encryption key is represented
by 0x128, or 296 in decimal form. Similarly, the next
variables, shll and inlo, have useful lengths of 0x8 and
0x3 and starting addresses of 804a672 and 804a68a,
respectively. This way, we are able to obtain the
starting addresses and lengths of all nine variables.
Next, we need to be able to decrypt the original shell
script using only the binary as input.
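
Once a start address and a length are known, pulling an array out of the dumped data section is simple offset arithmetic. The sketch below uses the three address and length pairs quoted above. The section base address and the section contents are placeholders invented for the example, and extract_array is a hypothetical helper name.

```python
def extract_array(section: bytes, base: int, start: int, length: int) -> bytes:
    """Slice `length` bytes beginning at virtual address `start`.

    `base` is the virtual address of the first byte of `section`.
    """
    offset = start - base
    if offset < 0 or offset + length > len(section):
        raise ValueError("array lies outside the dumped section")
    return section[offset:offset + length]


if __name__ == "__main__":
    # Start addresses and useful lengths taken from the text.
    arrays = {
        "pswd": (0x804A540, 0x128),  # 0x128 == 296 bytes
        "shll": (0x804A672, 0x8),
        "inlo": (0x804A68A, 0x3),
    }
    # Placeholder values: a made-up base address and dummy section contents.
    base = 0x804A500
    section = bytes(range(256)) * 2

    for name, (start, length) in arrays.items():
        chunk = extract_array(section, base, start, length)
        print(name, hex(start), len(chunk))
```

Note that pswd ends at 0x804a540 + 0x128 = 0x804a668, ten bytes before shll begins, which is consistent with the useful data being embedded inside larger arrays.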

In shc, before the shell script itself is encrypted,
many other pieces of information are encrypted.
Furthermore, the RC4 implementation maintains state
between encrypting and decrypting each individual
piece of information. This means that the order in
which shc encrypts and decrypts information must
be maintained. Failure to do so results in
illegible text. To extract the original shell
script, we need to perform several decryptions.
For this step, I wrote a small program called deshc,
using the existing code from one of the intermediate
C files. The program reads two files as its input,
the binary executable and an input
file that specifies the array lengths and addresses.
deshc executes the following four steps:

1. Reads the binary executable.
2. Extracts the data section from the disassembled output.
3. Retrieves the individual arrays based on the input file.
4. Decrypts the individual arrays in order, so that the RC4 state is maintained.
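
Why the order matters can be shown with a small sketch of a stateful RC4 stream. This is not the deshc source, and it does not reproduce shc's exact key handling; the class name RC4Stream, the key and the sample strings are all invented for the example. The point is only that one cipher object carries its keystream position from one piece of data to the next.

```python
class RC4Stream:
    """Textbook RC4 that keeps its state between calls to crypt()."""

    def __init__(self, key: bytes) -> None:
        # Key scheduling: permute 0..255 under control of the key.
        self.s = list(range(256))
        self.i = 0
        self.j = 0
        j = 0
        for i in range(256):
            j = (j + self.s[i] + key[i % len(key)]) % 256
            self.s[i], self.s[j] = self.s[j], self.s[i]

    def crypt(self, data: bytes) -> bytes:
        """XOR data with the next keystream bytes, advancing the state."""
        s = self.s
        out = bytearray()
        for byte in data:
            self.i = (self.i + 1) % 256
            self.j = (self.j + s[self.i]) % 256
            s[self.i], s[self.j] = s[self.j], s[self.i]
            out.append(byte ^ s[(s[self.i] + s[self.j]) % 256])
        return bytes(out)


if __name__ == "__main__":
    key = b"example key"                      # made-up key
    pieces = [b"/bin/sh", b"-c", b"echo hello\n"]  # made-up plaintext pieces

    enc = RC4Stream(key)
    encrypted = [enc.crypt(p) for p in pieces]

    # Same key, same order: every piece decrypts correctly.
    dec = RC4Stream(key)
    print([dec.crypt(c) for c in encrypted])

    # Same key, but the last piece decrypted first: the keystream
    # position is wrong, so the result is not the original text.
    print(RC4Stream(key).crypt(encrypted[2]))
```

Decrypting only the script text with a freshly keyed cipher fails for the same reason: the keystream bytes consumed by the earlier arrays have to be consumed again first.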

Based on the objdump output, I have arrived at the following array
lengths and addresses for the pub.sh.x executable:

All of these parameters are used in an input file to deshc, which
then decrypts and prints the original shell script.

Conclusion

This article demonstrated an approach for extracting
the shell source code from a binary executable
generated by shc version 3.7. The pub.sh script was
used for illustrative purposes only; I have also
tested the deshc program on executables that I did
not create, without access to the source code or the
original shell script.

Francisco García, the author of shc, recently released
version 3.8. It uses somewhat different
data structures and improves upon the security of
the previous version. Nevertheless, I believe that
embedding the encryption password within the binary
executable is dangerous and prone to extraction as
discussed in this article.

Nalneesh Gaur, CISSP, ISSAP, works at Diamond Cluster International as a
BS7799 Lead Auditor.