Hashing, Encryption and Encoding

Posted on 2018-02-16 by
Mark McDonnell
14 mins read

Introduction

I’ve written previously (and in-depth) on the subject of security basics, using tools such as GPG, OpenSSH, OpenSSL, and Keybase. But this time I wanted to focus in on the differences between encryption and hashing, whilst also providing a slightly more concise reference point for those already familiar with these concepts.

Hashing vs Encryption

In essence:

hashing: provides integrity.

encryption: provides confidentiality.

Often cryptographic primitives need to be combined. For example, a typical public-key system uses RSA (a slow, but very secure algorithm) to exchange a shared key securely, then uses AES (a faster, but in one sense less secure algorithm †) to encrypt the data with that shared key, and finally uses a hash function to generate a message digest so both parties can verify the integrity of the payload sent/received.

† less secure only in the sense that you have to share a secret key with the person you wish to communicate with, but that’s what public-key cryptography helps to secure. AES itself is not a weaker cipher.

Why use a hash function?

Hash functions (or more specifically their output: digests) can be used for many things, like indexing data in a hash table, fingerprinting (i.e. detecting duplicate data or uniquely identifying files), or as a checksum (i.e. detecting data corruption).

Message authentication (i.e. message integrity) involves hashing the message to produce a digest and encrypting the digest with the private key to produce a digital signature.

In order to verify this ‘signature’ the recipient of the message would need to compute a hash of the message, then decrypt the signature using the signer’s public key and compare the computed digest against the decrypted digest sent with the message.

If the digest you generated is the same as the decrypted digest, then you can be sure the message was not modified whilst in transit (e.g. by a ‘man-in-the-middle’).
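The sign-then-verify flow described above can be sketched in Python. This is only a toy: the key is built from two well-known Mersenne primes, there is no padding scheme, and the names sign, verify and digest_as_int are made up for illustration. Real signatures should come from a tool such as GPG or OpenSSL.

```python
import hashlib

# Toy RSA parameters built from two known Mersenne primes.
# Illustration only: real keys use random primes and a padding scheme.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce the digest so it fits below the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """'Encrypt' the digest with the private exponent to form a signature."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recover the digest with the public exponent and compare it to a fresh one."""
    return pow(signature, E, N) == digest_as_int(message)


signature = sign(b"hello world")
print(verify(b"hello world", signature))   # the untouched message verifies
print(verify(b"hello w0rld", signature))   # a modified message does not
```

Only the holder of the private exponent can produce a signature that verifies, while anyone with the public exponent and modulus can check it.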

Base64 Encoding

Base64 is a way of taking binary data and transforming it into a text-based format. It is commonly used when there is a need to transfer the binary data over a medium that only supports textual data (e.g. you can Base64 encode images so they can be inlined into HTML).

How it works: Base64 encoding takes three bytes, each consisting of eight bits, and represents them as four printable characters in the ASCII standard.
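As a quick illustration of the three-bytes-to-four-characters mapping, here is a small sketch using Python’s standard base64 module (the input strings are arbitrary examples):

```python
import base64

# Three bytes (24 bits) are regrouped into four 6-bit values,
# and each 6-bit value maps to one printable character.
encoded = base64.b64encode(b"foo")
print(encoded)                      # b'Zm9v'

# Input that is not a multiple of three bytes is padded with '='.
print(base64.b64encode(b"fo"))      # b'Zm8='

# Decoding reverses the transformation; nothing is secret or lossy.
print(base64.b64decode(encoded))    # b'foo'
```

Note that this is an encoding, not encryption: anyone can decode the output without a key.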

MAC vs HMAC

A ‘MAC’ (Message Authentication Code) uses symmetrical cryptography (i.e. a secret key shared by both parties) to verify the integrity and authenticity of a message. Some MACs are built on an encryption algorithm (such as AES), whereas an ‘HMAC’ will use a hash function (such as SHA256) internally instead of an encryption algorithm.
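A minimal HMAC sketch using Python’s standard hmac and hashlib modules is shown below. The key, the messages and the helper names make_tag and is_authentic are hypothetical; in practice the key would be a random value that both parties already share.

```python
import hashlib
import hmac

# Hypothetical shared secret; both parties must already hold it.
SECRET_KEY = b"shared-secret-key"


def make_tag(message: bytes) -> str:
    """Compute an HMAC-SHA256 tag for the message."""
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def is_authentic(message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(message), tag)


tag = make_tag(b"transfer 10 coins")
print(is_authentic(b"transfer 10 coins", tag))   # True
print(is_authentic(b"transfer 99 coins", tag))   # False
```

Unlike a plain hash, an attacker who changes the message cannot recompute a matching tag without the key.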

Random Password Generation

Generating random passwords that are complex enough to make automated attacks difficult can be a bit tedious, yet important. But if you install a program such as pwgen (brew install pwgen) you’ll be able to generate random and complex passwords very easily.

Once installed, add the following alias to your shell:

alias psw="pwgen -sy 20 1"

Now when you execute psw you’ll get output that looks something like the following:

|93<3(M;r?~40c$A@>{\
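If pwgen isn’t available, something similar can be sketched with Python’s standard secrets module. The alphabet and the generate_password helper below are assumptions chosen to roughly mirror the 20-character, symbols-included output above; they are not how pwgen itself works.

```python
import secrets
import string

# Letters, digits and punctuation, roughly what `pwgen -sy` draws from.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Pick each character with a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


password = generate_password()
print(password)
```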

Hash Functions

There are many different ways of accessing a hash function. The two options we’ll look at are the executable shasum (provided by macOS) and the hashlib package provided by the Python programming language.

shasum

Let’s generate a hexadecimal digest of the message foobar using the SHA512 hash algorithm:
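Using the hashlib package mentioned above, a minimal sketch of the same SHA512 digest looks like this. The message is passed as bytes with no trailing newline, so a shell pipeline that appends a newline (as a plain echo does) would hash different input and produce a different digest.

```python
import hashlib

message = b"foobar"   # bytes, with no trailing newline
hex_digest = hashlib.sha512(message).hexdigest()

print(hex_digest)
print(len(hex_digest))   # 128 hex characters, i.e. 512 bits
```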

cksum

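Remember hash functions generate a digest of some message input, and one such use of that digest output is detecting data corruption (i.e. a checksum).

macOS also provides a cksum command which lets you generate a checksum (cksum computes a CRC rather than a cryptographic hash, so it guards against accidental corruption, not deliberate tampering), like so: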

echo foobar | cksum

Which outputs:

857691210 7

The first number is the checksum and the second number is the amount of data in bytes (seven here, because echo appends a newline to foobar).
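The idea of using a checksum to spot corruption can be sketched with Python’s zlib.crc32. This is an assumption-laden stand-in: zlib uses a different CRC variant from cksum, so its values will not match the number shown above.

```python
import zlib

original = b"foobar\n"
checksum = zlib.crc32(original)   # computed by the sender

# Simulate corruption in transit: the last letter has changed.
received = b"foobaz\n"

print(zlib.crc32(original) == checksum)   # True: data is intact
print(zlib.crc32(received) == checksum)   # False: corruption detected
print(len(original))                      # 7 bytes, including the newline
```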

OpenSSH

OpenSSH provides secure and encrypted tunneling capabilities and is typically used to enable secure shell connections from your machine to external servers.

In order to generate a cryptographically secure key pair, execute the following command:

ssh-keygen -t rsa -b 4096 -C "your.email@domain.com"

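This uses the RSA algorithm (which was the default at the time of writing, so the -t could be omitted; recent OpenSSH releases default to Ed25519) along with a key size of 4096 bits (larger than the RSA default, which was 2048 at the time of writing).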

The output of this command will be a public and private key pair.

It’s usually best to generate these keys (or at least move them when generated) within the ~/.ssh directory.

SSH Agent

One thing that catches me out all the time is when I open a new terminal tab or shell instance and I go to push up some code changes to a remote server only to discover an error saying I’m not authenticated. This is because the new terminal/shell instance doesn’t have the SSH agent running which is what makes my SSH key pair available.

This happens so often I’ve created an alias to make starting up the SSH agent and loading my SSH private key very quick and easy:

Note: the -K flag is macOS specific; it adds the key to the macOS keychain program.

OpenSSL

OpenSSL is designed to provide a method for securing web based communication (think HTTPS/TLS/SSL).

Note: for a full list of commands see: openssl -h and openssl <command> -h.

Key Exchanges

There are two popular key exchange algorithms:

RSA

Diffie-Hellman

For the specific details of each I recommend you read this post on the differences. In short RSA uses the person’s public key to encrypt the secret, while Diffie-Hellman uses a mathematical function to ensure only those two people communicating can calculate the secret based on the information that’s publicly available.
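To make the Diffie-Hellman half of that summary concrete, here is a toy sketch. The prime, the base and the variable names are illustrative assumptions; real deployments use standardised groups or elliptic curves and authenticate the exchange.

```python
import secrets

# Public parameters that both sides agree on in the open.
# Toy values: 2**127 - 1 is a known prime, chosen only for readability.
P = 2**127 - 1
G = 3

# Each side keeps a private value and publishes G**private mod P.
alice_private = secrets.randbelow(P - 2) + 1
bob_private = secrets.randbelow(P - 2) + 1
alice_public = pow(G, alice_private, P)
bob_public = pow(G, bob_private, P)

# Each side combines its own private value with the other's public value.
alice_secret = pow(bob_public, alice_private, P)
bob_secret = pow(alice_public, bob_private, P)

print(alice_secret == bob_secret)   # True: both arrive at the same secret
```

An eavesdropper sees only P, G and the two public values, and the secret itself is never sent over the wire.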

Generating a key pair

In order to generate an RSA based public/private key pair, execute the following commands:

Encrypting and Decrypting

The following examples use symmetric encryption, and so you’ll be asked for a secret key when encrypting and decrypting (although you could also use the -pass flag like so -pass pass:<your_password>, yeah the syntax is odd and it’s the same for decrypting):

Note: .enc is a commonly used extension to indicate a file is encrypted (.asc usually indicates ASCII-armored output, as produced by tools such as GPG).

I’m passing in the message via stdin (when encrypting), but specifying a file for the output (when decrypting), but you could use a file for both by explicitly specifying the -in and -out flags to provide a text file instead.

Annoyingly with openssl the same thing can be done a million different ways, so (for example) you might also find that you can do the above without the enc portion of the command (and thus removing the - prefix from the selected algorithm):

Encoding

Salts

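It’s also worth mentioning that the default behaviour for OpenSSL is to use a ‘salt’ when encrypting the message. A salt is random data that is combined with a secret (in OpenSSL’s case the passphrase used to derive the encryption key) before it is hashed, so that the same secret produces a different result each time. The same idea is used when storing passwords, where in pseudo-code it would look like this: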

$pwd = hash(hash($password) + salt)

You would then store the value of $pwd in your database along with the salt itself.

The security doesn’t come from obfuscating the salt, but from the fact that a rainbow table attack can no longer simply look up each stored hash in its precomputed collection of hashed passwords. An attacker would need to incorporate your (per-user) unique salt value into their check and rebuild their list of hashes for every user (they also wouldn’t necessarily know if the salt was prefixed or suffixed to the password itself), which makes the attack computationally very expensive and time consuming to attempt.
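The pseudo-code above can be sketched in Python as follows. The hash_password helper, the example password and the choice of SHA256 are assumptions made for illustration; for real password storage a deliberately slow function such as hashlib.pbkdf2_hmac is a better fit than a single fast hash.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> str:
    """Mirror the pseudo-code: hash(hash(password) + salt)."""
    inner = hashlib.sha256(password.encode("utf-8")).digest()
    return hashlib.sha256(inner + salt).hexdigest()


# A fresh random salt per user, stored alongside the resulting digest.
salt_a = os.urandom(16)
salt_b = os.urandom(16)
stored_a = hash_password("hunter2", salt_a)
stored_b = hash_password("hunter2", salt_b)

print(stored_a == stored_b)   # False: same password, different salts

# Verification re-runs the function with the stored salt.
print(hmac.compare_digest(hash_password("hunter2", salt_a), stored_a))   # True
```

Because the two users end up with different digests for the same password, a single precomputed table can no longer match both of them.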

You can also see that a salt is used by trying to read an encrypted file (cat message.enc):

Salted__MJin¨MàÍ£?è,random¡:~randomW!5µõ

Asymmetrical Encryption

If you need to, you can encrypt data with a public key (i.e. asymmetrical encryption) by utilising the openssl rsautl command, which stands for “RSA Utility” and is commonly used to sign, verify, encrypt and decrypt data using the RSA algorithm. Note that OpenSSL 3.0 deprecates rsautl in favour of openssl pkeyutl.

In the following example we have a file plaintext.txt we encrypt using a public key. It will now only be possible to decrypt the secret.enc file if you have the corresponding private key:

GPG

GPG is a tool which provides encryption and signing capabilities. It supports both symmetrical and asymmetrical encryption, as well as digital signing of your encrypted content to ensure its integrity.

Generating a key pair

To generate a new GPG key pair you would execute the following command and interactively fill in the details:

gpg --gen-key

Automate

If you prefer to automate this you can create a file to contain the details and pass that into the command-line instead. The following code generates a new batch_file that will contain the information we would otherwise have to enter manually:

Revocation

When you generate a new key pair, if you intend on publishing your public key online, then you’ll want to generate a revocation certificate. Doing this will mean you can revoke your original key pair if your private key becomes compromised (or you just want to decommission it):

gpg --output revocation.cert --gen-revoke your.email@domain.com

When you’re ready to decommission it, just import the certificate into your keyring:

gpg --import revocation.cert

You can then also push up your key identifier to a key server to force it to recognise the key has been revoked:

gpg --keyserver pgp.mit.edu --send-keys <key_id>

Asymmetrical Encryption and Decryption

In order to encrypt some data using someone else’s public key (i.e. so only they can decrypt the data) you first need access to their public key and have it imported to your gpg keyring:

gpg --import public.key

If you want to verify the integrity of the public key you have acquired, then you should speak securely with the recipient who owns the public key and ask them to give you their digital ‘fingerprint’. You can then verify it matches what you have using the following command:

gpg --fingerprint <pub_key_id>

You’ll then look for the fingerprint in the gpg output. The fingerprint should look something like this:

FDFB E9B5 24BA 6972 A3AA 44B9 A1B1 7E6F DD86 E7F5

The command for encrypting a file plaintext.txt using their public key would be:

As you’ve encrypted the file using that person’s public key, it means they can decrypt the file simply with:

gpg -d plaintext.txt.gpg

Symmetrical Encryption and Decryption

By default gpg uses the AES algorithm for its symmetrical encryption. The command to use is (you’ll be asked to provide a passphrase):

gpg --symmetric plaintext.txt

You can specify a different algorithm, as the default isn’t as secure as it could be. Let’s use a 256-bit encryption key:

gpg --symmetric --cipher-algo AES256 plaintext.txt

Note: see gpg --version for all available ciphers

Signing keys

If you want to explicitly trust a public key you have imported, you can ‘sign’ it. You do this using the --sign-key flag. Doing this can also be beneficial for the owner of that public key (Bob), because if a friend of yours (Alice) trusts you and they see you’ve signed Bob’s public key, then Alice is more likely to trust Bob as well.

In order for Bob to benefit from this ‘web of trust’ you need to send him back his public key which you signed. Bob would need to import that version of his public key back into his gpg keyring, so that he can then republish it online for others to see that you trust him.

The following example demonstrates how you would export Bob’s public key, which you previously imported and signed:

gpg --export --armor bob@example.org

Note: --armor simply outputs the binary data as ASCII

Signing encrypted files

It can be useful to sign a file that you encrypt, so that the person who will decrypt the file can verify it was you who sent it to them, and also check that the integrity of the file is still intact.

Note: this provides a combination of authenticity and integrity (as defined within the terminology section)

You do this by using the --sign flag:

gpg --local-user Bob --encrypt --recipient Alice --sign plaintext.txt

Note: I’m using --local-user because I have many different key pairs setup for testing.

This will generate a plaintext.txt.gpg encrypted file.

The recipient (Alice) can decrypt the file using her own private key, and as long as she has Bob’s public key in her keyring this will both decrypt the file and verify the signature, or Alice could just use the --verify flag if she only wanted to check the signature.

Keybase

Keybase is a public-key directory that maps social media identities to encryption keys in a publicly auditable manner. Keybase offers an end-to-end encrypted chat and cloud storage system, called Keybase Chat and the Keybase filesystem.

In order to use the command-line tool keybase you’ll need to register for an account on their website.

To install keybase on macOS:

brew install keybase

Once installed you’ll need to log in:

keybase login

At this point you can either generate a fresh key pair or select an existing gpg key pair:

If you receive an encrypted file (info.txt.gpg) that was encrypted using your keybase public key, but the sender isn’t using keybase (e.g. they’ve signed the file using their own gpg private key), then you’ll need to have their public key in your gpg keyring: