To avoid spam, all mail addresses on this page have the "@" replaced by
"#".

Breaking the Cipher (Cryptanalysis)

One or two persons, level of ambition accordingly. No prior knowledge of
Cryptography is required.

In this project you will be given a set encrypted texts (ciphertexts). Your
task is to decrypt them to find the corresponding plaintext.

1. Caesar Cipher

This is one of the oldest and most known ciphers. It is a substitution
cipher where each letter of the alphabet is shifted some c steps to the
right.

Example: say we want to encrypt the plaintext: "THIS IS NOT VERY
SECURE". First, pick an integer 0<c<26 (assuming we have 26 letters in
the alhpabet). Next, shift all letters c steps to the right. Doing this for
c=7 yields:

Plaintext: THIS IS NOT VERY SECURE
Ciphertext: AOPZ PZ UVA CLYF ZLJBYL

To decrypt we simply shift the ciphertext c steps to the left and we get the
plaintext.

Your task is to write a program in F# that decodes a text file that
has been encrypted using a Caesar cipher. You can assume that only the ASCII
letters (A-Z) are shifted in the encrypted text, and that the text contains
only capital letters. (Hint: consider how you can find c. How many
different c's are there? Is it feasible to use brute force?)

Obviously your solution must involve the user. Your program should offer the
user to check proposed decodings of the text, and to pick the correct one.

Test your solution using the file TEXT_1.txt.
In this file, the text has been encrypted by shifting letters only: no
other characters are encrypted.

2. Frequency analysis of mixed alphabet

The security of the Caesar cipher in 1 can be increased by instead of simply
shifting the alphabet c positions, each letter l is replaced by the
letter p(l) according to the permutation p. (A permutation is
a function from a set to itself that is 1-1.) This means that an encrypted
letter e is decoded into p-1(e),
where p-1 is the inverse of p.

Example: we substitute letters in the regular alphabet (plaintext
alphabet) according to the permutation (ciphertext alphabet) specified by
the table below:

Using this ciphertext alphabet we encrypt the plaintext: "THIS IS A BIT MORE
SECURE":

Plaintext: THIS IS A BIT MORE SECURE
Ciphertext: RXGQVGQVHVCGRVWNPBVQBIAPB

The problem with this kind of encryption is that, although it scrambles the
letters, it doesn't change the frequencies of which they appear. Therefore,
if we analyze the frequency of each letter and compare it to the known
letter frequency of the English language, we are given good hints as to the
character mapping between the plaintext alphabet and the ciphertext
alphabet. For instance, using knowledge about how often whitespace tends to
appear in English texts, one could figure out from the above ciphertext that
whitespace most likely is mapped to the letter 'V'.

Your task is to write a program that decrypts a text that has been
encrypted using a mixed alphabet of the type described above. (Hint: search
the web for information about the relative frequencies of letters in the
English language).

Test your solution using the file TEXT_2.txt, which
contains a cipher text that is encrypted using a mixed alphabet. In
this file the text has been encrypted by permuting all characters in
the file (not just the letters), including parentheses, periods,
commas, etc.

Be aware: there can be random variations in the character frequencies
in a given text, that make them deviate from them in a table. Also,
frequency tables can be slightly different depending on what mix of texts
they are based upon. Thus, one cannot expect a solution that is based just
on execution frequencies to return a perfect decryption. Rather, one will
have to work in an iterative manner like the following:

Make a first guess of the encryption based on the frequencies. Most
likely, the encryption will be wrong for some characters.

Make guesses about the encryption of single characters. For instance, it
may be that you can spot common English words that are "almost"
right. You can then make a guess which characters to fill in, and you can
change the assumed encryption for these and apply to the text to see
whether the decryption became better. This should become increasingly
simple as more and more of the encryption becomes known.

You can also try to use frequency tables of character sequences to
gather more information about the encryption. Such tables exist at least
for length 2 and 3.

Thus, the software that provides your solution will work as a support for
this process rather than providing the decrypted message fully
automatically. In your report you should give a brief description how you
worked your way to obtain the final clear text, and how your software
supported this process.

The above is sufficient for one person. For two persons also implement:

3. Beating simple frequency analysis

Both the Caesar cipher in 1, and the mixed alphabet in 2 are vulnerable to
simple frequency analysis. The Vigenère cipher tries to combat this by
combining several Caesar ciphers. Recall the right shift offset c used in
the Caesar cipher above. In the Vigenère cipher, we construct a 'key' K by
selecting a sequence of these offsets. Let |K| be the key length (how many
Caesar ciphers that are combined).

To encrypt plaintext we apply the first Caesar cipher to the first |K|
characters. The next Caesar cipher to the next |K| characters, etc. After
applying the last cipher the process restarts with the first.

Example: we select the key K = {4 1 7} of length 3. For each offset in
the key, we construct a ciphertext alphabet by shifting the plaintext
alphabet that many steps to the right.

Say we want to encrypt the plaintext: "THIS IS THE MOST SECURE YET",
using the above key. The process would be:

The 3 first characters "THI" are offset by 4.

The next 3 characters "S I" are offset by 1.

"S T" are offset by 7.

"HE " are offset by 4.

"MOS" are offset by 1.

...

Offsets: 444111777444111777444111777
Plaintext: THIS IS THE MOST SECURE YET
Ciphertext: XLMTAJZG LIDNPT GZIGYSFAEL

Your task is to implement this cipher (not to break it). That is,
implement the encrypt and decrypt functions. The input to the encryption
function is a plaintext and a key. It outputs a ciphertext. The input to the
decryption function is a ciphertext and a key. It outputs the plaintext.

Note: there are several variations of the Vigenère cipher. You are
free to implement one of your own choice. However, you need to describe your
reasoning (in the report) if you decide on another variation than the above.