Recently I needed to transfer data between entities, but I needed to keep the data secure from prying eyes, and its integrity intact from busy little fingers on the wire.

I needed the solution to be simple, and support a high-performance environment. Seeing that I could exchange a secret key over a secure channel out-of-band (OOB), I opted for using symmetric-key cryptography.

Let me start off by saying that cryptography is a vast, fascinating and complex subject. I'll discuss some of the high-level key concepts related to the subject matter to provide some background, but that's about it. If you're interested, I recommend the above link as well as Applied Cryptography by Bruce Schneier.

What is symmetric encryption (and asymmetric, for that matter)?

In a nutshell, symmetric-key cryptography refers to encryption methods in which both the sender and the receiver share the same key. This requires establishing a secure channel for secret key exchange, which also presents a considerable and practical chicken-and-egg problem.

This is where asymmetric-key (or public-key) cryptography comes in. Whitfield Diffie and Martin Hellman first proposed the notion of public-key cryptography in 1976, in which two different but mathematically interrelated keys (public and private) are used. The public key (freely distributed) is typically used for encryption, while the private key is used for decryption.

Enough background for now, let's get to it!

Choosing the crypto library

In my search for the most lightweight, flexible, powerful yet simple cryptographic Python library, I came across close to a dozen options. After reading their documentation (cough!) and interfacing with their APIs, I decided on python-crypto.

The library has a wide collection of cryptographic algorithms and protocols, seemed like the best fit for what I was looking for, and is packaged by the major distributions.

apt-get install python-crypto

Next up, choosing the cipher

Without going into detail, there are block ciphers and stream ciphers. They differ in how large a chunk of plaintext is processed in each encryption operation.

A block cipher operates on a fixed-length group of bits (a block), for example 128 bits. In contrast, a stream cipher operates on much smaller units, typically single bits or bytes, and how each unit is encrypted depends on the cipher's current state, which changes as the stream is processed.
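To make the block idea concrete, here is a small sketch that only uses plain Python (no cipher involved). The names `split_blocks` and `padding_needed` and the sample message are made up for illustration; the 16-byte size corresponds to the 128-bit block mentioned above.

```python
BLOCK_SIZE = 16  # bytes, i.e. a 128-bit block


def split_blocks(data, size=BLOCK_SIZE):
    """Cut data into consecutive chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def padding_needed(data, size=BLOCK_SIZE):
    """Number of bytes missing to reach the next multiple of `size`."""
    return -len(data) % size


message = b"attack at dawn, bring coffee"  # 28 bytes
blocks = split_blocks(message)             # one full block, one 12-byte remainder
missing = padding_needed(message)          # 4 bytes short of two full blocks
```

The short last block is the reason a plain block cipher needs padding, whereas a byte-oriented mode or a stream cipher can handle the 28 bytes as they are.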

I chose to use a block cipher, in particular the Advanced Encryption Standard (AES), which has been adopted by the US government.

Because a block cipher needs the plaintext to be a multiple of the block size, we specify Cipher FeedBack (CFB) mode so we don't need to deal with padding. The secret, too, needs to be a valid length: an AES key must be 16, 24, or 32 bytes. For ease of use, we can use a lazysecret to achieve this.
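As a rough sketch of this setup, the following uses python-crypto's `AES.new` with `AES.MODE_CFB`. The `lazysecret` helper here is only a guess at what such a function might do (padding a short secret up to the next valid key length with a filler byte). The random IV prepended to the ciphertext, the function names, and the Python 3 bytes handling are likewise assumptions made for illustration.

```python
import os

VALID_KEY_SIZES = (16, 24, 32)  # AES key lengths in bytes
IV_SIZE = 16                    # AES block size in bytes


def lazysecret(secret, filler=b"}"):
    """Pad a short secret up to the next valid AES key length (hypothetical helper)."""
    if len(secret) > VALID_KEY_SIZES[-1]:
        raise ValueError("secret is longer than 32 bytes")
    target = next(size for size in VALID_KEY_SIZES if size >= len(secret))
    return secret + filler * (target - len(secret))


def encrypt(plaintext, secret):
    """Encrypt bytes with AES in CFB mode; the random IV is prepended to the result."""
    from Crypto.Cipher import AES  # python-crypto, imported only when needed
    iv = os.urandom(IV_SIZE)
    cipher = AES.new(lazysecret(secret), AES.MODE_CFB, IV=iv)
    return iv + cipher.encrypt(plaintext)


def decrypt(ciphertext, secret):
    """Split off the IV and decrypt the remainder with the same secret."""
    from Crypto.Cipher import AES
    iv, body = ciphertext[:IV_SIZE], ciphertext[IV_SIZE:]
    cipher = AES.new(lazysecret(secret), AES.MODE_CFB, IV=iv)
    return cipher.decrypt(body)
```

With the library installed, `decrypt(encrypt(b"my data", b"my secret"), b"my secret")` should give back the original bytes. Padding a passphrase with a fixed byte keeps things simple but adds no strength; deriving the key with a key-derivation function is a common alternative.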

To detect tampering, we append a 4-byte CRC of the data to the data prior to encryption. After decryption, the CRC is recalculated and matched against the attached CRC. If they match, all is good in the world. If they don't, well, not good...
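The CRC step on its own can be sketched with the standard library's `zlib.crc32` and `struct`. The function names `add_crc` and `check_crc`, the network byte order, and the choice of `ValueError` are assumptions for illustration; `add_crc` would run just before encryption and `check_crc` just after decryption.

```python
import struct
import zlib

CRC_SIZE = 4  # a CRC-32 value packed as 4 bytes


def add_crc(data):
    """Append the CRC-32 of data as 4 bytes (network byte order)."""
    crc = zlib.crc32(data) & 0xFFFFFFFF
    return data + struct.pack("!I", crc)


def check_crc(blob):
    """Recompute the CRC-32 and compare it with the attached one."""
    if len(blob) < CRC_SIZE:
        raise ValueError("data too short to contain a checksum")
    data, attached = blob[:-CRC_SIZE], blob[-CRC_SIZE:]
    (expected,) = struct.unpack("!I", attached)
    if (zlib.crc32(data) & 0xFFFFFFFF) != expected:
        raise ValueError("checksum mismatch")
    return data
```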

Update:
Thanks to tptacek on Hacker News for pointing out that a CRC is a non-secure hash function designed to detect accidental changes, and should not be used as a security check. Instead, it's recommended to use a secure hashing function such as SHA1.
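A minimal sketch of that swap, using the standard library's `hashlib` and `hmac` modules, might look like this. The names `tag`, `seal`, and `unseal` are hypothetical. The optional `mac_key` branch, which switches to a keyed HMAC-SHA1, goes beyond what the update describes and is included only as an assumption about a common stronger variant.

```python
import hashlib
import hmac

DIGEST_SIZE = hashlib.sha1().digest_size  # 20 bytes for SHA-1


def tag(data, mac_key=None):
    """SHA-1 digest of data, or HMAC-SHA1 when a separate MAC key is supplied."""
    if mac_key is None:
        return hashlib.sha1(data).digest()
    return hmac.new(mac_key, data, hashlib.sha1).digest()


def seal(data, mac_key=None):
    """Append the digest to the data (done before encryption)."""
    return data + tag(data, mac_key)


def unseal(blob, mac_key=None):
    """Recompute the digest after decryption and compare it with the attached one."""
    if len(blob) < DIGEST_SIZE:
        raise ValueError("data too short to contain a digest")
    data, attached = blob[:-DIGEST_SIZE], blob[-DIGEST_SIZE:]
    # compare_digest avoids leaking the mismatch position through timing
    if not hmac.compare_digest(tag(data, mac_key), attached):
        raise ValueError("digest mismatch")
    return data
```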

Cryptography is so complex with all these algorithms, public keys and private keys. I remember pulling out quite a few hairs while mugging it up. I stumbled across your post recently and it shed a lot of light on the mysteries of cryptography. But no matter how we encrypt our data, won't hackers come up with new ways to decrypt it? There are algorithms which take a lot of time to decode, but eventually everything is decodable! What is the safest one, your personal favorite?

The never-ending arms race and cat-and-mouse games are more relevant for viruses/antivirus and exploits/patches. Cryptography is a different ballgame (read Schneier's book and you'll understand what I'm talking about).