Strong Distribution HOWTO

This document describes the protocol and methods for the cryptographically
strong distribution of free software on the GNU/Linux platform. It gives an
explanation of the underlying technologies, provides step by step instructions,
and provides answers to frequently asked questions.

A strong distribution model is a model of software distribution which is
cryptographically strong. In such a distribution model software archives
and source code are protected against alteration, damage and replacement
through the science of cryptography. Specifically, a strong distribution
model is a distribution model which makes use of public key cryptographic
technology to make attack, fraudulent presentation, compromise and alteration
theoretically hard problems.

Let's start by either gaining an understanding of how modern cryptography works,
or by running through a quick review of that subject. Cryptography is the
mathematical science of creating and using cryptosystems. A cryptosystem is a
system of protecting information. Encryption is the action of protecting, or
encoding or scrambling, data through cryptography. Decryption is the
action of decoding, or unscrambling, that data so that it can be read. Encryption
and decryption are done with numeric values which are called keys. There are
two types of cryptography which we'll discuss here, symmetric key cryptography
and public key cryptography. After defining symmetric key cryptography, we'll
shift our focus to public key cryptography. Public key cryptography is the
technology which the strong distribution model is based upon.

Symmetric key cryptography is a type of cryptosystem which uses a single key,
or secret, for both the encryption and the decryption of data. Typically, it is very easy to decode
the secret message and reveal the hidden information if you are in possession of
the secret key. If you do not have the secret key, the hope is that it is very
difficult, almost impossible, to decrypt the secret message and gain access to
the data.

Public key cryptography is a different type of cryptosystem which uses two
complementary keys, which are interrelated in a special way, to protect data. We
know from our knowledge of symmetric cryptography and encryption that a key can be
used to decrypt or encrypt a message. One of the interrelated cryptographic keys
in a public key cryptosystem can be used to encrypt information and, due to
their interrelation, the complementary key may subsequently be used to decrypt
that information. In some instances of public key
cryptography, both keys may be used for encryption and decryption. The
interrelated cryptographic keys in a public key cryptosystem are called a
"key pair". The two keys of a key pair are separated into a private key, which
is kept secret, and a public key, which is broadcast. The basis of the public key
cryptography which is used in the strong distribution model is that data
encrypted with the private (secret) key can be decrypted with the public key,
and data encrypted with the public key can be decrypted with the secret key.

A digital signature is the electronic equivalent of a traditional paper
based signature with some additional benefits and features. A digital signature
is a combination of hashing technology and public key encryption. A digital
signature can be used to certify and timestamp a document, or in our case a
software archive. A solid understanding of hash technology is critical in
establishing an understanding of digital signatures and therefore, the strong
distribution model.

A hash algorithm is a function which can be run on a variable length piece of
data in order to produce a fixed length representation of that data. This
fixed length representation is called a digest. It is computationally infeasible to
reconstruct a message from its digest. Hashing is deterministic: given
the same input data, the same digest will always be produced. Data integrity can
be established by comparison of hash digests. Hash algorithms are designed to
exhibit a property called the avalanche effect. The avalanche effect is
the chaining of subsequent computations to the computations which preceded
them in a way that a minor change in any piece of data will avalanche, or produce
an extreme effect in all following calculations. Hash algorithms are also designed
in a way that makes collisions extremely unlikely. A collision occurs when two
different pieces of data produce the same digest when hashed.
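
The properties described above can be observed from the command line. The
following is a minimal sketch which assumes the sha256sum utility from GNU
coreutils is installed; SHA-256 and the sample strings are my own choices for
illustration and are not prescribed by this document.

```shell
# Hash two inputs that differ in a single character.
digest_a=$(printf 'strong distribution' | sha256sum | cut -d' ' -f1)
digest_b=$(printf 'strong distributioN' | sha256sum | cut -d' ' -f1)

# The digest length is fixed (64 hex characters for SHA-256),
# whatever the length of the input.
echo "a: $digest_a (${#digest_a} hex characters)"
echo "b: $digest_b (${#digest_b} hex characters)"

# Hashing the same input a second time reproduces the same digest.
digest_a_again=$(printf 'strong distribution' | sha256sum | cut -d' ' -f1)
[ "$digest_a" = "$digest_a_again" ] && echo "same input, same digest"

# A one character change produces a different digest (avalanche effect).
[ "$digest_a" != "$digest_b" ] && echo "changed input, different digest"
```

Comparing the two printed digests by eye shows that they have little in
common even though the inputs differ by a single character.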

In the openPGP based strong distribution model, a digital signature is created by
creating a hash of the data and then encrypting the digest with a given private key.
Software digitally signed with your secret key can be verified by anyone who acquires
your public key and performs a verification operation with openPGP software.
Verification takes place when a hash is recomputed on a software archive, or a piece
of data, and the digest matches the digest encrypted in the digital signature. Since
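the signature is in part made up of hashing technology (historically SHA-1 or MD5, although
current openPGP software generally prefers the SHA-2 family because practical collision
attacks are known against both older algorithms), even small changes to the software archive
or source code, due to the avalanche effect,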
will be immediately apparent when they cause verification to fail. In the encryption
portion of the digital signature, the tight integration of the key pair means that a
digital signature decrypted by a public key which is not the public key linked with
the secret key of the signer will produce a value which does not match the recalculated
digest. Through this feature, verification of the digital signature not only verifies
the contents of the archive through hashing technology, but also verifies the identity
of the author through public key technology - and that summarizes the basis of the strong
distribution model.

Many developers currently make use of MD5 hash technology in the distribution
of their software. Most of us are familiar with the use of MD5 to protect
against the alteration of computer files through the use of programs like
tripwire which create a database of the MD5 hashes for later verification.
The use of a cryptographic signature to protect data is very similar to the
use of a hash like MD5. The difference, again, is that the encryption of the
resulting digest value by a key in a known key pair allows identity to be
asserted and proven to the extent that the key pair is trusted.
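
The hash database approach can be sketched with standard tools. This example
assumes GNU coreutils sha256sum in place of MD5; the file names and the
MANIFEST.sha256 name are hypothetical. Note that a check like this establishes
only that the files are unchanged. It says nothing about who produced them,
which is the gap a digital signature fills.

```shell
# Work in a throwaway directory so nothing on the real system is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'int main(void) { return 0; }\n' > main.c
printf 'all: main\n' > Makefile

# Record a digest for each file (the "database").
sha256sum main.c Makefile > MANIFEST.sha256

# Later: recompute the digests and compare them with the recorded values.
if sha256sum -c MANIFEST.sha256 >/dev/null 2>&1; then
    before=unchanged
else
    before=modified
fi

# Simulate tampering with one file, then check again.
printf '/* unexpected change */\n' >> main.c
if sha256sum -c MANIFEST.sha256 >/dev/null 2>&1; then
    after=unchanged
else
    after=modified
fi

echo "first check: $before, second check: $after"

# Clean up the throwaway directory.
cd / && rm -rf "$workdir"
```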

The possession of an openPGP key pair also provides a software project with
the ability to post digitally signed messages or to send digitally signed email.
The maintainers of a software project can digitally sign release announcements,
security advisories and even support email. In addition, it is also possible
to use an openPGP key pair to send encrypted email to the maintainers of a
software package. All of these abilities can contribute greatly to the
security of a free software project.

Since the identity of the author and the contents of the archives are
protected, the strong distribution model protects you from having your
software replaced on your server with a trojan horse. It also protects
you from having your software infected with a virus while on your server
and it even protects you from bit rot (data corruption) and data
transmission errors.

In the case of a free software project not using a strong distribution
model, the worst possible scenario in a compromise is that the server the
project is using for distribution is compromised. However in the case of
a free software project which is using a strong distribution model, the
worst possible case is that the key pair is compromised. Theoretically,
if the secret key is not stored on the server which is used for distribution
and that server is compromised it should have only mildly significant
security consequences for the project. The integrity of the software
being distributed would be preserved. On the other hand, compromise of the
secret key has dire consequences which are detailed in section 1.6.

The best thing about embracing a strong distribution model is that it gives
you security that is independent of the security of your server. Since
the security of a strong distribution model rests in the security of the key pair
rather than the security of the server, it's no longer necessary to rely on a
centralized model of distribution. This means that the free software community
can begin to take advantage of peer to peer (p2p) technology to distribute its
software. The use of p2p technology can drastically reduce the resources needed
for a large software project. The technology can therefore reduce the barrier to
entry in areas such as Linux distributions or heavily used and frequently
downloaded applications.

The user benefits not only from the protection afforded by cryptography
against crackers, but also by the protections afforded against unscrupulous
distributors. Cryptographic signatures provide some degree of nonrepudiation,
or the inability to deny later that the distributor authored the contents
of an archive. If a software distributor distributed a software archive
which was infected with a virus or which contained questionable code and
was using a strong distribution model, it would be impossible for the
distributor to claim that infection took place on the client system or
that the questionable code was not official code without also claiming
that their key pair was compromised.

In the event of a compromise of a client system which your software
is running on, the user must reinstall all the software on the system
from read only media, or re-download the software from the internet.
In the case of free software, the user may not have read only media.
If you are using the strong distribution model, the user may be spared
the need to re-download all the software if they lack read only media
copies. The user would just be able to reinstall their openPGP package
and basic operating system, then install your software from archives
which were cryptographically protected from modification.

The compromise of a strong distribution system takes place when the
key pair used by the distributor is compromised. Most importantly, if a
cracker gains access to your secret key, he can digitally sign software
archives and messages which will appear to have come from you directly.
Since digital signatures made by the cracker will be indistinguishable
from the digital signatures made by the people with legitimate access
to your project's secret key, it will be impossible to tell which
archives are valid and which are false. Secondarily, the cracker will
also be able to decrypt all messages encrypted with the public key
of the key pair, including both past and present messages.

The best way to protect yourself from such compromises is to use
great caution in the protection of your secret key. The best
practice is to store your secret key on removable media such as a floppy disk, store a backup
of your secret key and revocation certificate in a secure place, to
choose an excellent passphrase, and to have only one individual have
access to the secret key. This individual can function as the project's
release master. To some extent, you can also limit the damages from
such a compromise by using key pairs with reasonable expiration times
and by using a signature only key. The use of short expiration times
limits the number of releases that a key pair compromise would involve.
The generation of a key pair without an encryption subkey would
eliminate the compromise of encrypted communications with the
compromise of the release signature key pair.

Minor attacks which do not result in full compromises are also
possible. These attacks would most likely take the form of false key
pair introduction usually with the end goal of the release of trojan
horses or the interception of encrypted communications. For example,
an attacker could create a fake public key for your project and use
that fake key to sign a trojan horse which could then be distributed.
The same type of attack could be made against developer keys.

The best way to protect yourself from this type of attack is to
have your key pair deeply integrated into the web of trust. Web of
trust integration is discussed in section 2.3.

One final type of attack worthy of mention would be the circulation
of a compromised revocation certificate. This would have the effect
of disabling your key pair. Recovery from this type of attack would
involve the generation of a new key pair and the integration of that key pair
into the web of trust.

An interesting ramification of being able to certify an archive, or piece of
software, as having certain contents is that you can make that certification
and subsequently make assertions about those contents. For example, it now
becomes possible for a distributor to make the claim that a given package of
software has been reviewed and is valid and safe to run. The distributor
can sign the archive with their openPGP key to produce a new version of the
archive which differs from the original only in that it has been signed
with the distributor's key. An unlimited number of signatures can be made
on a single software archive. Therefore, the distributor can use digital
signatures to identify software packages as officially being part of their
distribution, and perhaps as supported versions of software.

As a real world example of where third party testimonials would be
useful, take the case of the Free Software P2P Network. The FSPN can
digitally sign software archives which are distributed through the network.
These digital signatures can provide significant additional security for
end users of the FSPN and can even allow the achievement of the level of
security necessary to make the FSPN possible in the absence of the
adoption of the Strong Distribution Model by a large number of free
software projects.

As another example of where third party testimonials would be useful, take
a value added model of software distribution. Company X wants to
make money by distributing GPL'd software in a Linux distribution. Company X
can perform a security review on the software and if Company X finds no
security flaws in the archive Company X can digitally sign the archive
with an openPGP key which is used to assert that a given software archive
is free from security flaws. Company X can then provide some form of warranty on their
signed archives. Company X now has a value added Linux distribution.

With third party testimonials, companies can charge an audit fee for
their auditing and subsequent signature of a software package. This model
in effect allows scarcity to be reintroduced into the market place of
free software. The identity can only be asserted and the assertion about the
software archive can only be made by the holders of the given key pair.
Therefore, the reputation of the key pair holder can be developed and can
be made valuable. While a diff could be run in the case of GPL'd software,
this may not necessarily be the case with alternative licensing models.

This section covers the steps necessary to establish a strong
distribution model for your own free software projects. In this
section, we make use of the GnuPG openPGP implementation on the
GNU/Linux platform. These steps should be easily transferable to
other platforms.

The first step in the implementation of a strong distribution model is the
generation of a key pair. If you're currently working with a key pair which
was generated before openPGP version 4, you should regenerate a new openPGP
key pair with openPGP software that uses the version 4 format. The version 4
format has many improvements in design which make it more resistant to
attacks and more expandable for long term use. In the case of GnuPG,
version 1.0.6 was the minimum needed for protection from all security
flaws known when this was written; a currently maintained release should
be used today. I strongly suggest that you revoke any version 2
or version 3 public keys and replace them with version 4 keys unless you have
good reason not to do so.

Please note that GnuPG does allow you to generate a key pair in which
the public key of the pair contains only a public key without the standard
additional encryption subkey. This type of key is, in effect, a signature
only key. Some developers choose to generate signature only keys because
while the compromise of a signature key may compromise the integrity
of the web of trust and the validity of signatures made with the key,
it will only indirectly result in the compromise of the secrecy of
data. On the other hand, upon the compromise of an encryption subkey,
the secrecy of the data encrypted with that subkey is compromised as
well as any future data protected by that subkey. If you have an
interest in understanding this further, I suggest you read the
openPGP standards document, RFC 2440 (since superseded by RFC 4880). For
most users, the generation
of a key pair with an associated subkey is adequate and not cause for
concern.

Due to the way in which keys can be linked in a web of trust, it
is possible to generate multiple key pairs for a project. This makes
it possible to generate a signature only key pair and an encryption
key pair for the same project and link them together through digital
key signatures. This also allows the generation of additional key
pairs for sub projects.

With GnuPG, a key pair can be generated with the following command.

bash$ gpg --gen-key

I'd suggest that you generate at least a 1024 bit key pair. That was a
reasonable minimum when this was written; 2048 bits or more is the usual
recommendation today. Depending
on the scope and direction of your project you may want to set an
expiration date on the key of between one and ten years. I personally use
three year expiration dates on my software publication keys. The user id
on project keys is usually that of a security officer or a certificate
authority (for example: security@yourdomain.tld or ca@yourdomain.tld).
A step by step guide to the generation of a key pair with gpg, which
includes a more in depth explanation of the steps, is available
in section 3.5 of the GnuPG Keysigning Party HOWTO.
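
The interactive command above prompts for each setting. As a sketch of a
non-interactive alternative for the signature only key described earlier, the
helper below assumes a recent GnuPG 2.x release which provides the
--quick-generate-key command (this postdates the GnuPG versions discussed in
this document). The function name, the rsa3072 algorithm choice, and the
default three year expiration are my own assumptions for illustration.

```shell
# Hypothetical helper: create a primary key limited to signing, with no
# encryption subkey. Assumes a recent GnuPG 2.x release.
#   $1  user id, for example "Example Project <security@yourdomain.tld>"
#   $2  expiration (optional, defaults to 3y)
generate_release_key() {
    # Naming an explicit algorithm and usage makes gpg create only the
    # primary key, so no encryption subkey is added.
    gpg --quick-generate-key "$1" rsa3072 sign "${2:-3y}"
}
```

For example, generate_release_key "Example Project <security@yourdomain.tld>"
would request a signing key which expires in three years. gpg will still ask
for a passphrase through its usual mechanism.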

In the event of a compromise of your openPGP key pair by an attacker, you'll
want to revoke, or invalidate, the key pair to let the public know that it
is no longer trustworthy. The mechanism of action that allows this in
openPGP is called a revocation certificate. It is recommended that after you
generate your key pair, you generate a generic revocation certificate for that
key pair. The generation and storage of a revocation certificate will allow you
to revoke your public key even in the event that you lose access to your
private key due to compromise, seizure, forgotten passphrase, or media failure.
Store the revocation certificate in a secure and safe place. You should also
print out a copy of your revocation certificate so that it can be entered and
used in the event of the failure of the media the revocation certificate is
stored on.

If your revocation certificate is compromised, the individual who compromises
your revocation certificate will be able to circulate the certificate thereby
disabling your key. However, the individual will not be able to compromise your
secret key through access to your revocation certificate. Therefore, they will
not be able to generate fake signatures, decrypt messages encrypted with your
key pair, or otherwise misrepresent themselves as the owner of your key pair. Since
the only negative outcome possible from the compromise of a revocation certificate
is the disabling of your key pair, generating one in advance is generally a safe
and good thing to do.
For more information on revocation certificates, and instructions on
how to generate one, see section 2.8 of this document.

If you have chosen to assign an expiration date to your key pair, you should
generate a replacement key pair before your key pair expires. You should then use
your older key pair to form a trust link with your new key pair before it expires
to provide a sense of continuity to your keys.

It is very important to make your public key available and readily
accessible. If a user cannot gain access to your public key they
cannot verify your signatures. If the end user does not verify your
signatures, the benefits of the strong distribution model are
diminished significantly.

There are three steps that I recommend that you take in order to
circulate your public key. First, you should post your public key on
the website where the software is distributed from. You should place
the ASCII armored public key in a conspicuous place where people can
easily find and download it. Second, you should post your public key
to the various networks of public keyservers. A list of keyserver
networks, and keyservers, can be found at the end of this document in
section 4.1. Finally, when the signatures on your
key change be sure to update the copies of your key on your web site and
the keyserver networks. Keyservers within the same network (CryptNET,
keyserver.net, etc) will automatically synchronize the various copies
of your key. You'll only need to send your updated copy of your key to
one node on each network. Work on achieving cross synchronization
between keyserver networks is being done.
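
The first two steps can be sketched as a small shell helper. The function
name is hypothetical, and the key id, keyserver host name, and output file
name are placeholders to be replaced with your own values.

```shell
# Hypothetical helper: publish a project public key.
#   $1  key id of the project key
#   $2  host name of one node in a keyserver network
#   $3  file to write the ASCII armored public key to (for the web site)
publish_public_key() {
    # Export the ASCII armored public key for posting on the web site.
    gpg --armor --output "$3" --export "$1" || return 1
    # Send the same key to one node of the keyserver network.
    gpg --keyserver "$2" --send-keys "$1"
}
```

Running the helper again after the key has collected new signatures covers
the third step, since both the exported file and the keyserver copy are
refreshed from your keyring.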

You may choose to distribute your keys by linking to them on the keyserver.
This will allow your signature information to be displayed by the keyserver and
alleviate the need to re-upload your key to your web site when you have additional
signatures added to the key. As an example of this, here are the PGP keys for
CryptNET and myself.

I do not recommend that you include your public key inside your software
archive. While there are no technical security problems with this, it does
encourage the end user to accept the public key on the basis of trust in the location
of the key rather than integrity imparted upon the key by signatures. Encouraging
such habits in the end users will make them more susceptible to trojan horse
attacks against the Strong Distribution Model in which fake archives and fake keys
are distributed.

If your public key is not integrated into the web of trust it is of significantly
less security value than a public key which is integrated into the web. The openPGP
keys can be used to verify that the archive has not been modified since the owner
of the private key signed it. But, what assurance is there that the owner of the
private key is actually the primary developer of the software package? The web
of trust provides an answer to this question. In the scope of the key pair used
to sign the software archives in a strong distribution model, the web of trust
is a collection of statements about the validity of the key pair the maintainer
of the software package provides. Most often, these statements about validity
are made by the developers of the software package.

Often software distributors see fit to rely on central distribution to
guarantee the integrity of their published public key. This is a mistake
because it not only allows easier compromise of their distribution model,
but also because it ties the assurance of the validity of the key pair to
the existence of a centralized distribution point of the public key of that
pair. If software is distributed over a p2p network, such a centralized
point may not exist.

In the case of a software distribution key, practicality must be taken
into consideration. Unlike a personal key on which it is in most cases
advantageous to collect as many signatures as possible, a distribution key
only really needs signatures from web of trust integrated developer keys.
When presented with a public key for a software distribution, the question
which should be asked is "Is this the key the developers have issued to
secure their releases?". This question can be answered by looking for the
presence of developer signatures on the key. Obviously, the question of the
validity of the developer signatures, should they exist, is the next logical
step.
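
As a sketch of how a user might look for those signatures with GnuPG, the
helper below prints the fingerprint of a key and then lists and checks the
signatures on it. The function name is hypothetical and the key id is a
placeholder. Signatures made by keys which are not in your own keyring cannot
be checked until those keys have been imported.

```shell
# Hypothetical helper: inspect a project key that is already in the keyring.
#   $1  key id of the project key
inspect_project_key() {
    # Show the key and its fingerprint for comparison with a trusted source.
    gpg --fingerprint "$1" || return 1
    # List the signatures on the key and verify each one that can be checked.
    gpg --check-sigs "$1"
}
```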

Aside from the cryptographic validity of the developer signature, the
validity of a developer signature must be determined in two ways. In
looking at the validity of the developer signature, what we are really
looking at is the relationship between the developer key and the project
key, and the strength of the developer key. The relationship between
the project key and the developer key should be reciprocal. The
project key should be signed by the developer and the developer key
should be signed by the project key. The strength of the developer
key can be judged by the presence of other signatures on that key.
The key for CryptNET can be examined as an example of what the web of trust
for a Strong Distribution Project key should look like. The cryptnet.net key
has reciprocal links with its developers. The developer's key in the case
of CryptNET is strongly integrated into the web of trust, as can be seen
from the reciprocal trust links with that key and the overall web of trust
integration of that key.

In order to link the keys of the developers on the project into
the web of trust you must first sign them with the project's key.
After you have performed that signature operation to complete the
reciprocal trust path between the project key and the key of the
developer, the developer's key should be linked into the normal
web of trust. In order to do this, a keysigning party should be
held. Instructions on how to conduct a keysigning party are
provided in the GnuPG Keysigning Party HOWTO. Additional information
about the web of trust can be found in that document.
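
The first of these steps, signing a developer's key with the project's key,
might look like the following sketch. The function name is hypothetical and
both key ids are placeholders. gpg asks for confirmation and for the
passphrase of the project key before it makes the signature.

```shell
# Hypothetical helper: sign a developer key with the project key.
#   $1  key id of the project key (the signer)
#   $2  key id of the developer key (the key being signed)
sign_developer_key() {
    # --local-user selects which secret key makes the signature.
    gpg --local-user "$1" --sign-key "$2"
}
```

The developer completes the reciprocal link by running the same operation
with the key ids reversed, using their own key as the signer.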

This section consists of some naming and operating conventions I suggest to you.
These are not yet standardized and there is not an imperative need to follow them,
but I highly suggest that you do. If we can agree on common standards for naming and
distribution, we can make things much easier on distributors, packagers, maintainers,
programmers, and especially the end users. These standards were taken primarily from the
practices of the GnuPG developers and the Linux Kernel Developers. Therefore, they
already have been deployed and somewhat established. Due to their use by the Linux
Kernel Developers in the Linux Kernel Archive and the GnuPG developers, these methods
should be somewhat familiar to at least some of your product's user base.

The best way to sign a binary file, be it a tarball, an archive, or an iso image,
is to generate a detached ASCII armored signature. The use of a detached ASCII
armored signature keeps the main binary file from having anything appended to it.
The use of an ASCII armoring for the detached signature makes it easy for the
user to look at the file and determine if the transmission was most likely
complete and unflawed. ASCII armoring also protects the signature from damage
during transmission through limited channels which are not safe for raw binary
data.

Assuming we're working with archives which conform to the GNU standards, generating
a detached ASCII signature would result in an archive, for example, with the name
gnu_software-1.0.0.tar.gz and an ASCII armored signature file with the
name gnu_software-1.0.0.tar.gz.sig. Both files would then be distributed
separately through traditional means such as ftp, http, or p2p transmission. Patch files
would be signed in the same way with detached ASCII signatures. The signature of patch
files would produce files with the names gnu_software-1.0.0-1.0.1.diff.gz and
gnu_software-1.0.0-1.0.1.diff.gz.sig.

When such a signature is generated with gpg, the -a argument directs ASCII
armoring (radix encoding), the -o argument specifies an output file (--output
may be used instead), and the --detach-sig argument (an abbreviation of
--detach-sign) specifies that the signature should be stored in
the output file rather than appended to the archive.
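
Put together, these arguments give a command along the following lines. This
is a sketch which uses the long option names; the helper function is
hypothetical and simply applies the .sig naming convention described above to
whatever file it is given.

```shell
# Hypothetical helper: create a detached ASCII armored signature.
#   $1  archive or patch file to sign; the signature is written to "$1.sig"
sign_release() {
    # --armor is -a, --output is -o, --detach-sign keeps the signature
    # separate from the file being signed.
    gpg --armor --output "$1.sig" --detach-sign "$1"
}
```

For example, sign_release gnu_software-1.0.0.tar.gz would leave the archive
untouched and write gnu_software-1.0.0.tar.gz.sig next to it.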

In order to verify a signature, you must be able to access the public key of the signer.
You can either download the key and import it into your keyring before you attempt verification,
or you can rely on your PGP software to download the key when it is needed for verification.
There is great advantage to downloading the key by hand from a keyserver which lists
signatures on the key because it makes your personal verification and estimation of the
validity of the key slightly easier.
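
A sketch of both steps with GnuPG follows. The function names are
hypothetical, the key id and keyserver host name are placeholders, and the
.sig file name follows the naming convention used in this document.

```shell
# Hypothetical helper: fetch the signer's public key from a keyserver.
#   $1  key id of the signer
#   $2  host name of a keyserver
fetch_signer_key() {
    gpg --keyserver "$2" --recv-keys "$1"
}

# Hypothetical helper: verify an archive against its detached signature.
#   $1  archive; the signature is expected in "$1.sig"
verify_release() {
    # The signature file is named first, the signed file second.
    gpg --verify "$1.sig" "$1"
}
```

A successful verification shows only that the archive matches the signature
made by that key. Whether the key really belongs to the project is a separate
question which the signatures on the key help to answer.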

The patch program makes use of the dash ('-') character to represent lines
of source code which are to be deleted. This can be problematic for PGP users
because openPGP and other PGP implementations make use of the dash ('-')
as a meta character. GnuPG provides a special feature for the signature of patches,
either inline in an email message or inline in a file. If you use a detached
signature to sign your patch file, you do not need to use the --not-dash-escaped
feature of gnupg.

In order to sign a message with an inline patch and take advantage of gnupg's special
feature, the --not-dash-escaped option is combined with a cleartext signature (--clearsign).
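
As a sketch of that invocation, the helper below assumes a text file which
holds the message and the inline patch. The function name and the .asc output
name are my own choices for illustration.

```shell
# Hypothetical helper: clearsign a message containing an inline patch.
#   $1  text file to sign; the signed copy is written to "$1.asc"
sign_inline_patch() {
    # --not-dash-escaped leaves lines beginning with a dash unmodified, so
    # the patch can still be applied from the signed text.
    gpg --not-dash-escaped --output "$1.asc" --clearsign "$1"
}
```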

The worst thing that can possibly happen in a strong distribution model is the compromise
of the private key used to sign software releases. If a hacker gets control of your private
key, the hacker is then able to issue software releases which can be verified as authentic
with your public key. If a compromise does happen, here's how to deal with it - revoke your
public key.

In the event that you suspect that your secret key has been compromised, you should revoke
your public key immediately. Key revocation takes place by the addition of a Revocation
Signature to a public key. The revocation of a key suggests that a key is no longer valid
(secure) and should not be used. Once a revocation certificate is issued, it cannot be
withdrawn.

Since your openPGP key is distributed (read circulated) to people rather than distributed from a
central point every time it is accessed, you must circulate (or distribute) your revocation
certificate in the same manner that you distributed your public key. The circulation of the
revocation certificate in the same manner as the distribution of your public key would usually
mean uploading the revocation certificate to the keyserver networks and posting it in a
conspicuous place on your website. You should continue to make your revoked public key
available from your web site long after you have stopped using your public key. This ensures
that someone does not mistakenly think that your secret key is valid and is still producing
valid signatures. Best practice would be to keep links to all of the public keys your
project has ever used on your website, including expired and revoked public keys.
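
The circulation steps can be sketched as follows. The function name is
hypothetical; the certificate file, key id, and keyserver host name are
placeholders. Importing the revocation certificate into your own keyring
marks your copy of the key as revoked, and it is that revoked copy which is
then uploaded and exported.

```shell
# Hypothetical helper: apply and circulate a revocation certificate.
#   $1  revocation certificate file, for example revcert.asc
#   $2  key id of the key being revoked
#   $3  host name of one node in a keyserver network
publish_revocation() {
    # Merge the revocation certificate into the local copy of the key.
    gpg --import "$1" || return 1
    # Upload the now revoked key to the keyserver network.
    gpg --keyserver "$3" --send-keys "$2" || return 1
    # Export the revoked key for posting on the web site.
    gpg --armor --output "revoked-$2.asc" --export "$2"
}
```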

In review, the gpg command to generate a revocation certificate is:

bash$ gpg --output revcert.asc --gen-revoke <key_id>

If you have an idea about how or when your key was compromised and you generated a revocation
certificate during key generation, you will still likely want to generate a new revocation
certificate to revoke your key pair. This is the case because the openPGP standard will allow
you to specify the reason why you are revoking your key pair and even provide some free text
comments about the reason for revocation. The circulation of a revocation certificate with
this type of information is likely advantageous and preferable to the circulation of a generic
certificate created during key generation.

This section provides information related to openPGP
signature integration with the various package formats. If you
would like to submit information about using PGP with a
packaging format not listed here, please submit it to the
maintainer of this HOWTO for inclusion.

If you have difficulty working with the integrated openPGP signatures
of a packaging format, or if the packaging format of your choice does
not support openPGP signature integration, you can still use the strong
distribution model for your project. Simply perform the openPGP signature
and verification operations as if your packaged software
were a tarball.

Key - One or more bits of data used in the encryption or decryption process.

Key Fingerprint - In PGP, a value used to identify a key which is generated
by performing a hash of key material.

Key Pair - In public key cryptography, a pair of keys consisting of a public
and private, or secret, key which interrelate.

Keyring - A collection of keys. Most often this term is used in relation to PGP,
where a keyring consists of a collection of one or more key packets.

Key Server - A system which stores key material in a database. These servers may
be queried by users who wish to acquire the public key of a recipient they have not had
prior contact with.

Keysigning Party - A get-together of people who use the PGP
encryption system with the purpose of allowing those people to sign each
other's public keys. Keysigning parties serve to extend the web of trust.

openPGP - An open standard which defines a version of
the PGP security system.

Pretty Good Privacy [PGP] - Privacy software developed by Phil Zimmermann,
which includes public key cryptography, a standard packet and key format, and symmetric encryption
as well.

Public Key - In public key cryptography, the key of a key pair which is
shared.

Public Keyring - A keyring consisting of Public Keys. This term is most often used in
relation to PGP.

Radix - A method of encoding data so that it may be transmitted
over a channel which is not safe for raw binary data, such as one limited
to 7 bit ASCII characters. For example, such a channel could be
email or the Usenet.

Secret Key - In public key cryptography, the key of a key pair which
is kept secure.

Secret Keyring - A collection of secret keys. Most often this term is used in relation
to PGP where it defines a collection of secret key packets.

Trust Path - A route by which trust is extended from one entity to
another. In PGP, this is a link of trust between two public keys.

Web of Trust - The collection of signatures upon keys and resultant
trust paths in a user centric trust model which provide for authentication.
Collectively, the trust relationships between a group of keys.