Hello World! FUBSWRJUDSKB

By Jeff Kazmierski, Copy Editor

The semester’s almost done, so for our final Hello World! of spring I thought we’d turn to a slightly more practical application, one that Python is extremely well suited for. In fact, this week’s topic is one of the oldest uses for computers: cryptography.

That’s right. Modern computers can trace their origins to World War II and the British government’s attempts to crack the German “Enigma” machines. They didn’t have Python back then (or Java, or C, or C++, for that matter), just rooms filled with spinning, whirring, clicking devices called bombes, designed by mathematician Alan Turing and operated largely by the women of the Women’s Royal Naval Service. But through brute force, mathematical skill, and good old English stubbornness, they managed to break the German cipher and help turn the tide of the war.

Don’t worry, we’re not going to be doing anything quite that intensive today. We will, however, be exploring one of the oldest known ciphers, the Caesar Cipher.

If you’re not familiar with cryptography, the Caesar Cipher was used by Julius Caesar about 2,000 years ago and is very simple and easy to code. It’s a “monoalphabetic substitution cipher,” which means it uses a single ciphertext alphabet to stand in for the plaintext alphabet. In Caesar’s version, the ciphertext alphabet is just the regular alphabet shifted by an offset value. The offset is called the “key” and is used to both encrypt and decrypt the text.

Here’s an example:

offset = 3

Plain: A B C D E … Z

Cipher: D E F G H … C

To encrypt the message, just replace each letter with the one three steps to its right. ‘A’ becomes ‘D’, ‘C’ becomes ‘F’, and so on. When you reach the end of the alphabet, just roll over to the beginning (in this case, ‘X’ becomes ‘A’, ‘Y’ becomes ‘B’, and ‘Z’ becomes ‘C’).
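
To see the whole cipher alphabet at once, here’s a quick sketch that builds it by slicing the regular alphabet. The names offset, plain, cipher, and table are my own for illustration; they don’t come from the sidebar program.

```python
import string

offset = 3  # the key from the example above

plain = string.ascii_uppercase            # 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
# Slide the alphabet left by `offset`, wrapping the first letters to the end.
cipher = plain[offset:] + plain[:offset]  # starts at 'D', ends at 'C'

print("Plain: ", " ".join(plain))
print("Cipher:", " ".join(cipher))

# Pair each plain letter with its cipher letter so we can look a few up,
# including the ones that wrap around the end of the alphabet.
table = dict(zip(plain, cipher))
print(table["A"], table["C"], table["X"], table["Z"])  # D F A C
```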

It’s not a very strong cipher, being prone to repetition and easily broken, but it worked pretty well for the Roman Empire, since most of their enemies were illiterate anyway.
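
How easily broken? With only 26 possible keys, you can simply try every one and scan the list for something that looks like English. Here’s a rough sketch of that idea. The unshift() helper is my own stand-in, not a function from the sidebar program, and it assumes the message is all uppercase letters.

```python
import string

ALPHABET = string.ascii_uppercase

def unshift(text, key):
    """Move each uppercase letter in text back by key places, wrapping at A."""
    return "".join(ALPHABET[(ALPHABET.index(ch) - key) % 26] for ch in text)

secret = "FUBSWRJUDSKB"  # the jumble from this week's headline

# Undo every possible key; one of the 26 lines will be readable.
for key in range(26):
    print(key, unshift(secret, key))
```

Run it and read down the output. It shouldn’t take long to spot the line that makes sense.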

The code is in the sidebar, but it needs a little explanation before we delve into it.

You may have heard of another code, the American Standard Code for Information Interchange, or ASCII. It’s a character encoding scheme that’s been used since the 1960s to represent text in computer applications. In its original form it consists of 33 ‘control’ characters and 95 printable characters, for 128 total values in the original table. Python has two functions that relate to characters and their ASCII values: ord() and chr().

The ord() function takes a character as an argument and returns its ASCII equivalent.

>>> ord('A')

65

>>>

The chr() function does the opposite.

>>> chr(65)

'A'

>>>

To see all 128 symbols in the original ASCII table, just use a script:

>>> for a in range(0, 128):
...     print(a, "\t", chr(a))

Try it. You should see each number from 0 to 127, followed by a character. You should also notice that many of them appear to be blank. That’s because values from 0 to 31 are reserved for “special” characters like line feeds (10) and tabs (9). ASCII value 32 is a space, and 127 is DEL (delete).
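
If you’d rather count than squint at blanks, a sketch like this tallies the two groups for you. It leans on Python’s built-in str.isprintable() method, which treats the space as printable and the control codes as not. The list names are just mine.

```python
# Sort the 128 original ASCII values into printable and non-printable groups.
printable = [a for a in range(128) if chr(a).isprintable()]
control = [a for a in range(128) if not chr(a).isprintable()]

print(len(printable), "printable, from", printable[0], "to", printable[-1])
print(len(control), "control characters")
```

The totals should match the 95 and 33 mentioned earlier.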

Fine, I hear you say, but what’s this got to do with the program? Take a look at the encrypt() function in the program:

def encrypt(plaintext, key):
    ctext = ""
    for letter in plaintext:
        ctext += chr((((ord(letter)-65)+key)%26)+65)
    return ctext

The function takes two arguments, a string (plaintext) and an integer (key), does some math and sends back a ciphertext. It does this by looking at each letter in the plaintext (the for loop) and adding the key value to the ASCII value of each letter. Let’s break down the fourth line in the function:

ctext += chr((((ord(letter)-65)+key)%26)+65)

This looks complicated, but really it’s not. Look at it like a math problem, from the inside out.

First, the ord() function converts the letter to its ASCII equivalent. Then we subtract 65 from the value. Why? The convert() function of the program has already turned the message into all uppercase letters with no punctuation or spaces, so every character falls between ‘A’ (ASCII 65) and ‘Z’ (ASCII 90). There are 26 letters in the English alphabet and we want numbers in the range of 0 to 25, so we subtract 65 from the result of ord().

The third step is adding the key to the resulting value. If our key = 3, this will produce a result from 3 to 28. Because this will exceed our specified range of 0 to 25, we do a little modulus math to get it back on track (28 mod 26 = 2).

Next, we need the character equivalent of the new number. But because ASCII codes 0 to 25 are non-printable control characters, we add 65 back to the number before handing it to chr().

The last step is to add the new character to the end of the ctext string.
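
If it helps, here’s the same arithmetic unrolled into one step per line for a single letter. The intermediate variable names are mine; the program itself does all of this in one expression.

```python
letter = "Y"
key = 3

value = ord(letter)      # 89, the ASCII value of 'Y'
index = value - 65       # 24, position in the alphabet counting from A = 0
shifted = index + key    # 27, past the end of the 0-to-25 range
wrapped = shifted % 26   # 1, rolled back around to the start
code = wrapped + 65      # 66, back in the ASCII range for capital letters

print(chr(code))         # B
```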

The decrypt() function works the same way, but instead of adding the key value it subtracts it.
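
The real decrypt() and convert() are in the sidebar. What follows is only my guess at their shape, based on the descriptions in this column, paired with the encrypt() function shown earlier so you can watch a full round trip. Treat both as sketches, not as the program’s actual code.

```python
def convert(message):
    # Assumed behavior: uppercase everything, then keep only the letters A-Z,
    # dropping spaces and punctuation.
    return "".join(ch for ch in message.upper() if "A" <= ch <= "Z")

def encrypt(plaintext, key):
    ctext = ""
    for letter in plaintext:
        ctext += chr((((ord(letter) - 65) + key) % 26) + 65)
    return ctext

def decrypt(ciphertext, key):
    # Assumed mirror image of encrypt(): subtract the key instead of adding it.
    ptext = ""
    for letter in ciphertext:
        # Python's % gives a non-negative result for a positive modulus,
        # so (0 - 3) % 26 wraps around to 23 ('X') on its own.
        ptext += chr((((ord(letter) - 65) - key) % 26) + 65)
    return ptext

message = convert("Hello, World!")
secret = encrypt(message, 3)

print(message)             # HELLOWORLD
print(secret)              # KHOORZRUOG
print(decrypt(secret, 3))  # HELLOWORLD
```

Notice that the subtraction can dip below zero, and the modulus quietly fixes it. That’s a Python convenience; some other languages would hand you a negative remainder instead.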

Go ahead and try the program. When you’re done playing with it, try out some of these challenges:

The encrypt() and decrypt() functions do pretty much the same thing, meaning one of them isn’t needed. Try rewriting the program so it only uses one encoding function.

Another cipher, the affine cipher, uses the formula y = (ax+b) mod 26, where y is the ciphertext, x is the plaintext, and a and b are a key pair. Try writing code to implement it.

The program as written only uses uppercase letters. Rewrite it so it handles lowercase letters and spaces as well. Then see if you can code it to use the entire printable ASCII set (ASCII values 32 through 126).