Licensing Using Symmetric and Asymmetric Cryptography

By Ilan Shamir, August 01, 2003

Algorithms and enlightenment for license key encryption.

Licensing is a good way to prevent misuse of your software, to enable temporary
use of it, or to enable the creation of a complex pricing model. The best approach
to secure and reliable licensing is through cryptography. In this article, I
will present two options for implementing a software licensing system: one option
uses symmetric (secret) key cryptography, and the other option uses asymmetric
(public) key cryptography. I will describe the algorithms and provide a sample
implementation based on OpenSSL. In closing, I will compare the two solutions
and evaluate their strengths and weaknesses.

The Licensing System

In this article, the term licensing system refers to a system for authenticating
access to a software product through the use of a key string supplied by the vendor
and input by the user. The license key is typically input at installation, but
the same technique could apply to updates or upgrades. Even casual computer users
are familiar with the prompt from an installation wizard to enter a string printed
somewhere on the package. In this article, I'll discuss where that license string
comes from, how it works, and how to implement your own licensing system using
OpenSSL.

The licensing system consists of a licensing authority, which issues the license
keys, and an application, which consumes them. The inputs to the licensing authority
are:

a unique identifier of the specific instance of the licensed software

a list of licensed features

an expiration time

The license authority outputs a 20-digit string (for example, A7BF4-87FE3-1D4A7-8D451)
that encodes the list of licensed features, the expiration time, and a piece of
information that enables the application to verify the authenticity of the license.
This piece of information is a cryptographic Message Authentication Code, also
known as the MAC. The user types the license key into the application, which decodes
the list of licensed features and the expiration time, then verifies the authenticity
of the license using the MAC.

In many ways, a license key is the electronic equivalent of a physical key
that "unlocks" the software. As such, it should be easy to issue and use licenses
but very difficult to forge them. The license key should be linked to a specific
instance of the software and should not be transferable or shared between different
instances. The license should also include a time limitation feature, which
will enable things like temporary licenses or a complex pricing model using
yearly licenses.

Once the application authenticates the license and extracts the encoded information,
the application can then take any actions necessary to ensure that the software
protected by the license is properly configured. The application, in this case,
is typically an Install application used to set up and configure the licensed
software. Note, however, that the license key encodes a list of licensed features,
so this solution lends itself to situations in which a single CD holds all the
necessary software for various licensing models and a user upgrades to a higher
license level by obtaining a new key from the vendor.

Licensing and Cryptography

Cryptography has many objectives, two of which are most applicable to licensing:
Data Integrity and Data Origin Authentication. Data Integrity addresses the unauthorized
alteration of data. Data Origin Authentication addresses the identification of
the entity that created the data.

The cryptographic MAC mentioned in the previous section ensures both the Data
Integrity and the Data Origin Authentication. A valid MAC means that the data
provided in the license has not been modified by an unauthorized party and that
the licensing authority known to the application has issued the license.

It is not impossible to guess a valid MAC, but it is difficult. The level
of difficulty (which is known as the level of security) is often given in terms
of the number of operations required to defeat the protection. Typically, the
level of security is defined by an upper bound on the number of operations necessary
in order to break it.

For example, if a system is "40-bit secure" it means that breaking into it
will take up to 2^40=1,099,511,627,776 operations. In our case, if every license
check takes 10 msec, then guessing a license will take up to 348 years, which
seems quite safe to me.
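
The arithmetic behind that estimate can be checked with a few lines of Python. This is only a sketch; the function name worst_case_years and the 365.25-day year are assumptions made for the example.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def worst_case_years(security_bits: int, seconds_per_check: float) -> float:
    """Upper bound on the time needed to try every possible value."""
    attempts = 2 ** security_bits
    return attempts * seconds_per_check / SECONDS_PER_YEAR


print(f"{2 ** 40:,} attempts")                      # number of possible values
print(f"{worst_case_years(40, 0.010):.0f} years")   # at 10 msec per license check
```

On average an attacker would need about half that many attempts, so the upper bound is the pessimistic figure for the attacker, not the expected one.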

MAC Calculation

There are several known ways to generate a MAC. I will examine two options: one
option uses a symmetric key, and the other option uses an asymmetric key. In symmetric
key cryptography, encryption and decryption are performed using the same key.
In the asymmetric case, there is a pair of related keys, private and public.
Encryption and signature verification are done using the public key, while decryption
and signing are done using the private key (signing is the operation that binds
information to an entity). Both implementations are based on well-known and well-tested
cryptographic algorithms. It is important not to modify or invent new cryptographic
algorithms. The topic is complex, and it is easy to make naïve mistakes.

Symmetric Key MAC

My symmetric key implementation is based on RFC 2104 [1] and FIPS-198a [2]. Both
documents describe a mechanism for message authentication using cryptographic
hash functions. This mechanism is called HMAC. To make it easier to describe the
HMAC algorithm, let's define the following:

H -- a cryptographic hash function

B -- the block size processed in each iteration of H

L -- the size of the hash output

K -- a secret key shared between the originator and the receiver.
The key size should be greater than or equal to L/2

M -- the message to be processed

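IPAD -- the byte 0x36 repeated B times

OPAD -- the byte 0x5c repeated B times

(X, Y) -- Y appended to X

The HMAC algorithm takes M and K as input and generates a
MAC as follows:

1. K0 = K padded with zero bytes to length B (if K is longer than B bytes, use H(K) in place of K first)
2. Inner = H(K0 XOR IPAD, M)
3. MAC = H(K0 XOR OPAD, Inner)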

In general, any cryptographic hash function can be used in the HMAC algorithm.
I think it is better to use a function that is FIPS approved [2], like SHA-1
[3]. The strength of the MAC depends on the strength of the underlying hash
function. Therefore, it is best to avoid hash functions with known security
issues like MD2, MD4 and MD5.
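
As an illustration, the HMAC construction defined in RFC 2104 can be sketched in Python with SHA-1 as the hash function. This is not the article's Listing 1; the key and message below are made-up values, and the result is cross-checked against Python's standard hmac module.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # B: SHA-1 processes 64-byte blocks


def hmac_sha1(key: bytes, message: bytes) -> bytes:
    # A key longer than one block is hashed first, then padded with zeros to B bytes.
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha1(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")
    inner_key = bytes(b ^ 0x36 for b in key)  # key XOR inner pad
    outer_key = bytes(b ^ 0x5C for b in key)  # key XOR outer pad
    inner = hashlib.sha1(inner_key + message).digest()
    return hashlib.sha1(outer_key + inner).digest()


key = b"example-secret-key"          # hypothetical key, for illustration only
msg = b"features+expiry+system-id"   # hypothetical message
mac = hmac_sha1(key, msg)
assert mac == hmac.new(key, msg, hashlib.sha1).digest()
print(mac.hex())
```

In production code it is safer to call a library implementation (such as the one in OpenSSL) than to maintain a hand-written one.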

SHA-1 output is 160 bits and is considered to be 80-bit secure. This is due
to the so-called birthday attack. Any hash function, when supplied with a
random input, returns one of k equally likely values. By repeatedly hashing
different inputs, we expect to obtain the same output twice after about k^0.5 trials.
For SHA-1, k = 2^160, so a collision is expected after about 2^80 trials.
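
The birthday effect is easy to observe on a deliberately weakened hash. The sketch below truncates SHA-1 to 24 bits (an assumption chosen so the search finishes quickly) and counts how many inputs are hashed before two of them collide; the count is typically on the order of k^0.5 = 2^12 rather than k = 2^24.

```python
import hashlib
from itertools import count


def truncated_sha1(data: bytes, bits: int) -> int:
    """Return the leftmost `bits` bits of the SHA-1 digest as an integer."""
    digest = int.from_bytes(hashlib.sha1(data).digest(), "big")
    return digest >> (160 - bits)


def find_collision(bits: int):
    """Hash the inputs "1", "2", ... until two share the same truncated digest."""
    seen = {}
    for trials in count(1):
        data = str(trials).encode()
        tag = truncated_sha1(data, bits)
        if tag in seen:
            return seen[tag], data, trials
        seen[tag] = data


first, second, trials = find_collision(24)
print(f"k = 2^24, k^0.5 = {2 ** 12}; collision after {trials} trials")
```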

It is a common practice to truncate the output of the MAC algorithm and
use only the t leftmost bits as the MAC. You should remember that the probability
of guessing a correct MAC is (1/2)^t. Therefore, it is recommended to keep
t >= L/2 and never use a t that is less than 32 bits. In our case, since
each guess takes a while (the user has to somehow type it in and wait for
the application response), I think t=48 will do.
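
A minimal sketch of truncation, assuming t is a multiple of 8 so that whole bytes can be kept. The helper names truncated_mac and guess_probability are invented for this example.

```python
import hashlib
import hmac


def truncated_mac(key: bytes, message: bytes, t_bits: int = 48) -> bytes:
    """Keep only the t leftmost bits of the HMAC-SHA1 output."""
    if t_bits % 8 != 0 or not 0 < t_bits <= 160:
        raise ValueError("t_bits must be a multiple of 8 between 8 and 160")
    full = hmac.new(key, message, hashlib.sha1).digest()
    return full[: t_bits // 8]


def guess_probability(t_bits: int) -> float:
    """Chance that a single random guess matches a t-bit MAC."""
    return 0.5 ** t_bits


tag = truncated_mac(b"example-secret-key", b"license data")  # hypothetical inputs
print(tag.hex(), len(tag) * 8, "bits")
print(guess_probability(48))
```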

Listing 1 provides an implementation of
hmac_sha1(). OpenSSL also includes an implementation of hmac_sha1;
see hmac_openssl_sha1() in the downloadable code archive. Note that the
MAC calculation and MAC verification use the exact same algorithm and the same
key.

Asymmetric Key MAC

The RSA signature scheme as described in RFC 2437 [4] and FIPS-186-2 [5] can
be used to authenticate the content of a message. As opposed to the scheme described
above, this scheme employs an RSA key pair instead of a single secret key. The
data is signed using a private key and the signature is verified using a matching
public key.

To make it easier to describe the RSA signature generation and verification
algorithms, let's define the following:

H -- a cryptographic hash function

Encode -- encoding function (e.g., PKCS#1 v1.5)

PrivK -- RSA private key

PubK -- RSA public key

M -- the message to be signed

S -- the resulting signature

The RSA signature generation algorithm takes M and PrivK as input
and generates a signature as follows:

1. Mhash = H(M)
2. EM = Encode(Mhash)
3. S= RSA-Sign(EM, PrivK)

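The RSA signature verification algorithm takes M, S, and PubK
and generates a binary result (valid/invalid) as follows:

1. Mhash = H(M)
2. EM = Encode(Mhash)
3. EM' = RSA-Verify(S, PubK)
4. The signature is valid if EM' = EM; otherwise it is invalid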

For the same reasons described in the symmetric key discussion, the recommended
hash function is SHA-1 or one of its siblings SHA-256, 384 and 512.

The RSA signature is as secure as RSA encryption assuming you use a good
hashing function (like SHA). Using an RSA private key of 1024 bits gives 80
bits of security. The signature size is 128 bytes (the same as the key size).

It is impossible to truncate the RSA signature. In the symmetric case, since
both sides go through the exact same steps and use the same key, we could
easily truncate the MAC and compare only the t leftmost bits. In this case,
the verification algorithm requires the complete signature.
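
The sign and verify flow, and the reason a truncated signature is useless, can be sketched with a toy RSA key in plain Python. Everything here is an assumption made for illustration: the key is built from two well-known Mersenne primes (trivially breakable), and encode() is a simplified stand-in for PKCS#1 v1.5, not the real encoding. Real code should use a vetted library such as OpenSSL.

```python
import hashlib

# Toy key from two Mersenne primes: for illustration only, NOT secure.
P = 2 ** 127 - 1
Q = 2 ** 89 - 1
N = P * Q                               # public modulus (216 bits)
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent (Python 3.8+)
SIG_BYTES = (N.bit_length() + 7) // 8   # 27 here; 128 for a 1024-bit key


def encode(message: bytes) -> int:
    # Simplified stand-in for Encode(H(M)): a 0x01 byte followed by the SHA-1 digest.
    return int.from_bytes(b"\x01" + hashlib.sha1(message).digest(), "big")


def sign(message: bytes) -> bytes:
    """S = EM^D mod N, computed with the private exponent."""
    return pow(encode(message), D, N).to_bytes(SIG_BYTES, "big")


def verify(message: bytes, signature: bytes) -> bool:
    """Recover EM' = S^E mod N with the public key and compare it to EM."""
    s = int.from_bytes(signature, "big")
    return s < N and pow(s, E, N) == encode(message)


msg = b"features+expiry+system-id"      # hypothetical message
sig = sign(msg)
print(verify(msg, sig))                 # complete signature
print(verify(b"tampered data", sig))    # different message
print(verify(msg, sig[:9]))             # truncated signature
```

Only the first call is expected to succeed: the verifier has to raise the whole signature to the public exponent, so it cannot compare a prefix the way the HMAC verifier can.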

HMAC Licensing System

The licensing authority (see Listing 3) associates
a set of features and an expiration date with a specific instance of the licensed
software (target system). The set of features is represented as a bit-mask and
allows any combination of up to 16 features. The expiration date is represented
as days (not seconds) since 1/1/1970. This way it only takes two bytes to represent
the date and more room remains for the MAC. The target system is assumed to have
a 64-bit unique identifier. You can embed one in your software or create one on
the fly based on the system components (such as CPU ID, network MAC address, or IP address -- somewhat
like WinXP [6]).
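
A short sketch of the two compact fields described above. The feature names and their bit positions are hypothetical; only the 16-bit mask and the days-since-1/1/1970 encoding come from the text.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)
FEATURES = {"reports": 0, "export": 1, "api": 2}  # hypothetical feature -> bit position


def feature_mask(names) -> int:
    """Combine feature names into a 16-bit mask, one bit per feature."""
    mask = 0
    for name in names:
        mask |= 1 << FEATURES[name]
    return mask


def days_since_epoch(day: date) -> int:
    """Expiration date as whole days since 1/1/1970; must fit in two bytes."""
    days = (day - EPOCH).days
    if not 0 <= days <= 0xFFFF:
        raise ValueError("date does not fit in two bytes")
    return days


print(bin(feature_mask(["reports", "api"])))
print(days_since_epoch(date(2003, 8, 1)))
print(EPOCH + timedelta(days=0xFFFF))   # latest date two bytes can express
```

Counting days rather than seconds keeps the field at two bytes while still covering well over a century of expiration dates.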

The licensing authority assembles all this information, calculates the MAC,
and outputs the license string. The license string is a user-friendly representation
of the following information:

Feature set -- two bytes

Expiration time -- two bytes

Leftmost bits of the MAC -- six bytes

The above ten bytes are converted into a 20-character string.

While processing the licensing string, the application extracts the feature
set and the expiration date from the license string. It then concatenates
the system-unique id and calculates the MAC. The leftmost bits of the MAC
are compared to the bits provided in the license string. If they are identical,
the license is valid. Note that the secret key used by the licensing authority
must be identical to the key embedded in your application.
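
The issue and check steps for the HMAC-based license can be sketched as follows. This is not the code from Listing 3: the secret key, the big-endian field layout, and the order in which the fields are fed to the MAC are assumptions made for the example.

```python
import hashlib
import hmac
import struct

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical shared secret


def _mac(features: int, expiry_days: int, system_id: int) -> bytes:
    # MAC over features (2 bytes), expiration (2 bytes), and system id (8 bytes);
    # only the six leftmost bytes (48 bits) are kept.
    message = struct.pack(">HHQ", features, expiry_days, system_id)
    return hmac.new(SECRET_KEY, message, hashlib.sha1).digest()[:6]


def issue_license(features: int, expiry_days: int, system_id: int) -> str:
    """Licensing authority: ten bytes rendered as 20 hex characters in groups of five."""
    raw = struct.pack(">HH", features, expiry_days) + _mac(features, expiry_days, system_id)
    text = raw.hex().upper()
    return "-".join(text[i:i + 5] for i in range(0, 20, 5))


def check_license(license_key: str, system_id: int, today_days: int):
    """Application: return the feature mask if the license is valid, else None."""
    try:
        raw = bytes.fromhex(license_key.replace("-", ""))
    except ValueError:
        return None
    if len(raw) != 10:
        return None
    features, expiry_days = struct.unpack(">HH", raw[:4])
    expected = _mac(features, expiry_days, system_id)
    if not hmac.compare_digest(raw[4:], expected):
        return None  # forged, mistyped, or issued for another system
    if today_days > expiry_days:
        return None  # expired
    return features


system_id = 0x1122334455667788               # hypothetical 64-bit identifier
key = issue_license(0b101, 12265, system_id)
print(key)
print(check_license(key, system_id, today_days=12000))
```

The system identifier is never part of the license string; the application mixes in its own identifier, which is what ties the key to one installation.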

I would like to discuss the pros and cons of the above system, but first I will
describe the other system which uses asymmetric keys.

RSA Licensing System

The licensing authority in this case (Listing
4) does a similar job to the one described above, but with a few minor twists.
The license string is too short to contain the RSA signature (128 bytes). In the
previous case, I used the 48 leftmost bits of the HMAC result, and since the process
is symmetric, it was good enough. Since the RSA method is asymmetric, the application
must receive the complete signature to verify it.

To solve this problem, I will distribute a part of the signature with the
application and the rest will come from the license string. The application
will combine the two to form a complete signature that can be verified. This
can be considered a very primitive form of "secret sharing" (more about this
topic in [7]).

Given a target machine, every combination of licensed features and expiration
date has a unique signature. Since part of every signature will be shipped
with the application, it is necessary to limit the number of combinations.

Therefore, in the RSA based licensing system, the set of features is represented
by a smaller bit-mask and allows any combination of up to six features. The
expiration date has only four possible values: one week since activation, three months,
one year, or lifetime. As before, the target system is assumed to have a 64-bit
unique identifier.

The licensing authority assembles all this information, signs it using the
private key, and outputs a license string. The license string is a user-friendly
representation of the following information:

Feature set -- six bits

Expiration code -- two bits

Most significant bits of the signature -- nine bytes

The above ten bytes are converted into a 20-character string.

While processing the license string, the application recreates the original
data that was signed by extracting the feature set and the expiration code from
the license string and concatenating the system unique id. Then it iterates
through all of the partial signatures it has, reconstructs the complete signature
by appending them to the signature bits extracted from the license, and tries
to verify the message using each one of them (Listing
5). If any of them is valid, the license is valid. Note that the private
key used by the licensing authority must be the counterpart of the public key embedded
in your application.
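
The split-signature idea can be sketched end to end with a toy RSA key. This is not the code from Listings 4 and 5: the key (two Mersenne primes, trivially breakable), the simplified encode() function, and the byte layout are all assumptions made so the example runs with the Python standard library alone. With a 1024-bit key the signature would be 128 bytes instead of the 27 used here.

```python
import hashlib
import struct

# Toy RSA key: for illustration only, NOT secure.
P, Q = 2 ** 127 - 1, 2 ** 89 - 1
N, E = P * Q, 65537                      # public key, embedded in the application
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent, licensing authority only
SIG_BYTES = (N.bit_length() + 7) // 8    # 27 bytes for this toy key
HEAD_BYTES = 9                           # signature bytes carried in the license


def encode(message: bytes) -> int:
    # Simplified stand-in for Encode(H(M)); not real PKCS#1 v1.5 padding.
    return int.from_bytes(b"\x01" + hashlib.sha1(message).digest(), "big")


def signed_data(code: int, system_id: int) -> bytes:
    # One byte (six feature bits + two expiration bits) followed by the 64-bit id.
    return struct.pack(">BQ", code, system_id)


def sign(code: int, system_id: int) -> bytes:
    s = pow(encode(signed_data(code, system_id)), D, N)
    return s.to_bytes(SIG_BYTES, "big")


def build_partial_signatures(system_id: int) -> list:
    # Vendor side: the signature tail for each of the 256 possible combinations.
    return [sign(code, system_id)[HEAD_BYTES:] for code in range(256)]


def issue_license(features: int, expiry_code: int, system_id: int) -> str:
    code = (features << 2) | expiry_code
    raw = bytes([code]) + sign(code, system_id)[:HEAD_BYTES]
    text = raw.hex().upper()
    return "-".join(text[i:i + 5] for i in range(0, 20, 5))


def check_license(license_key: str, system_id: int, partial_signatures):
    # Application side: needs only N, E, and the shipped signature tails.
    try:
        raw = bytes.fromhex(license_key.replace("-", ""))
    except ValueError:
        return None
    if len(raw) != 1 + HEAD_BYTES:
        return None
    expected = encode(signed_data(raw[0], system_id))
    for tail in partial_signatures:
        s = int.from_bytes(raw[1:] + tail, "big")
        if s < N and pow(s, E, N) == expected:
            return raw[0] >> 2, raw[0] & 0b11   # (features, expiration code)
    return None


system_id = 0x1122334455667788                  # hypothetical 64-bit identifier
shipped = build_partial_signatures(system_id)   # distributed with the application
key = issue_license(0b000101, 2, system_id)
print(key, check_license(key, system_id, shipped))
```

The sketch makes the cost of the scheme visible: the vendor has to precompute one signature per combination for every target machine, which is why the feature mask and expiration code are squeezed into a single byte.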

Cryptographic Considerations

The symmetric-key-generated license described here is at most 48-bit secure, because the license carries only 48 bits of the MAC.
Adding characters to the license string can raise this to as much as 80 bits.
But the main security problem lies in the fact that the symmetric key is stored
somewhere in the application; if found, it can be used to forge licenses.

The asymmetric-key-generated license described here is even less secure. If the
license contained the complete signature, it would have been 80-bit secure.
Since I suggest embedding most of the signature in the application and including
only some of it in the license, the solution cannot be considered to be more
than a few bits secure (evaluating the strength is outside the scope of this
article). Nevertheless, it is good enough to make a brute-force attack very
difficult, and more importantly, breaking one system does not enable the attacker
to break the others (as opposed to the symmetric case).

Cryptographic strength aside, you should remember that the license check
provides procedural security -- in almost every case there is a procedural
way to bypass the cryptographic protection. For example, in many cases the
license check is a single if statement in the code. If that check is
found and removed, the need for a license is gone.

I ignored the topics of key generation and key management. Good recommendations
can be found in FIPS 140-2 [8].

Implementation Considerations

Not all platforms are capable of performing RSA calculations in a reasonable
time. Asymmetric key cryptography is hard to implement on systems with limited
resources (e.g., an 8-bit CPU with no hardware support for RSA). Some platforms
(such as desktops) are more susceptible to attacks. Using a symmetric key solution
in such an environment is probably not a prudent idea. In most embedded systems,
attacks are harder, which means using symmetric keys is relatively safe. Using
different symmetric keys for each version, or even for each distribution batch,
will reduce the exposure to attacks on the key.

Regarding the reduced security of the RSA-based licensing solution: if you
are not limited by the license format, it is much better (and easier) to distribute
the complete signature as part of the license. The license data can then contain
more information, and forging a license would require breaking the RSA signature itself.

If your platform can support RSA calculations and you can easily distribute
the complete RSA signature with the license, you are better off using the
asymmetric licensing solution.

About the Author

Ilan Shamir has more than 20 years of programming experience. He spent many
years with Intel Corp., was a visiting scientist at MIT, and is now working
for Decru Inc., which develops wire-speed encryption appliances. He welcomes comments
at ilan_e_shamir@yahoo.com.
