HW4: Cryptography

(b) (5 points) What method did you use to decrypt it? Describe why you
chose to use that method and how you proceeded with the decryption.

Problem 2 (20 points)

Using a shift cipher is very easy to break with a brute force attack
because there
are only 25 different keys for encoding. An alternative is to allow any
abitrary mapping of
each letter occurring in the plain text to a different letter in the
cipher text. This makes it
quite hard to remember which letter maps to which, so it was common to
have a keyphrase
that produced the key. For example, using the keyphrase “JULIUS
CAESAR”, we need to
remove any spaces and duplicate letters, which gives us “JULISCAER”
and then reproduce
the rest of the remaining letters in the alphabet, in alphabetic order
(and starting from the
last letter of the key phrase). This produces the following key:

(a) (8 points) If one wants to use the keyphrase “THE UNIVERSITY
OF TEXAS AT AUSTIN”, what is the key? Give it in the same
format as above, with the plain alphabet written on top and the cipher
alphabet on the bottom.

(c) (5 points) Even though the use of keyphrases permits
the use of a much greater number of keys than using a shift cipher, it
is not secure. What is the major weakness of simple substitution
ciphers? Use details of the above message and its encoding as an
example.

Problem 3 (30 points)

The following is cipher text that was encoded using a monoalphabetic
cipher:

Note that all spaces and punctuation in the original plaintext have been
removed. Also, numbers were removed, so there will be a few odd looking
phrases where something like “for 1 in 10 individuals” comes out as
“for in individuals”. (Note: this examples does not come from the text,
so it won't be helpful as a clue.)

In this problem, you'll exploit the major weakness of monoalphabetic
ciphers to decode this message using frequency analysis. The text is
somewhat long. Don't be intimidated by this – it actually makes the
problem easier. You will only need to write down the decoded plaintext
for the first two lines. (But, you get to use the rest of the text as
part of cracking the encoding.)

The relative frequency of a character is the number of tokens of that
character divided by the total number of character tokens. For example,
in ABCADEAFG the relative frequency of A is
3/9 = 1/3 = .3333 = 33.33%. In the ciphertext above, A has a
relative frequency of 2.54%. The following is the relative
frequency distribution for all the characters in the ciphertext.

Cipher
character

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

X

Y

Z

Relative frequency

2.54

10.99

3.48

7.18

6.8

3.31

4.83

0.99

0.14

0.14

6.69

4.31

1.77

2.62

1.82

0.08

0.06

9.31

7.85

3.07

1.46

3.9

5.61

1.35

0.8

8.9

Here is the same information graphically, ordered alphabetically and
ordered by frequency:

(a) (20 points) Decode the first two lines of the cipher text. Look at
the relative frequencies of letters in the English language
provided in Wikipedia. Compare the relative frequency distribution
given above to the distribution given in Wikipedia, using it to generate
hypotheses about which ciphertext characters might correspond to which
plaintext characters. Test those hypotheses on small segments of the
ciphertext to see if they make sense; then using the process of
elimination and some educated guesses based on the words you are
starting to see appear, determine the rest of the key and decode the
message. Write down the decoded plaintext for the first two lines of the
cipher text.

TIP: You should put the cipher text into an editor like
Microsoft Word or Google Docs and use Find and Replace as you test
different guesses. It will be especially helpful if you can replace the
capital letters of the cipher with lowercase plain text characters.

If you are able to use Unix, use the 'tr' function like I did in class,
e.g. if you think cipher text X is plaintext a:

$ echo 'XYZ' | tr 'X' 'a'aYZ

(b) (5 points) Describe the major aspects of the process you went
through to decode the message. Which strategies did you use? How did
your knowledge of English help in the decoding?

(c) (5 points) Write down the key showing which plaintext
character corresponds to which ciphertext character.

(d) (Extra credit, 4 points) The key was generated in a systematic way. How was it done? (This is not easy.)

(e) (Extra credit, 1 point) Can you find the web page that contains the
plaintext?

Problem 4 (20 points)

The following message is encrypted with the Vigenere cipher, using the
key phrase “LANGUAGEANDCOMPUTERS”:

Note: there are tools on the Internet for encoding and
decoding with the Vigenere cipher.
You are welcome to use them to check your work – but make sure you
understand the principle to what is going
on (keep in mind that these tools won’t be available during the exam).

Problem 5 (20 points)

Give two reasons why the Navajo code talkers were so effective for
secure
communication in WWII and explain one strategy that they had to employ
to avoid
being “decoded”.

You are encouraged to use outside sources for answering this question,
including Wikipedia's
code
talker page.