================================================================================
Course: Math 311w-01
Laboratory 5: The RSA cipher and public-key cryptography, part 1
Date: 2011-11-11
================================================================================

Introduction
================================================================================

Today, we are starting a two-part laboratory on RSA encryption. RSA encryption
is a form of asymmetric public-key cryptography. Traditionally, ciphers have
relied on a symmetric private-key system. A special code is created using a
secret key. The key is used to encrypt the message. Then, the message can only
be decrypted by a person with the key. The cipher is symmetric because the same
key is used to both encrypt and decrypt the message. As long as the key stays
"private", the message is secure (well, at least that's what's supposed to
happen).

An asymmetric public-key cryptography system uses two different keys, one to
encrypt the message, and a different one to decrypt the message. Because the
two keys are different, it is okay to make one of them public, while keeping the
other private -- even if a spy knows how the message was encrypted, they still
cannot decrypt it. This means that you can publish your public key, and anybody
who wants to send you a secure message can, as long as you keep your private key
private. You can learn more about the details at

http://en.wikipedia.org/wiki/Public-key_cryptography

My explanation simplifies matters a little more than it should. It turns out
that the security of the RSA cipher relies on certain mathematical properties
(chiefly, that factoring large integers is hard) that nobody has yet been able
to establish the truth of. Thus, there is a fear that some day, a smart person
will figure out how to read all our secrets.
The 1992 movie
"Sneakers" is based on the fundamental dilemma of the RSA encryption system --
it gives us vital flexibility in creating new communication channels with people
we have never met, but we don't actually know if these communication channels
are secure. Bitcoin currency is a technology developed using related ideas, and
suffering from similar potential limitations.

Goals
================================================================================

This lab will probably be too long to complete in a single sitting. In the
first part of the lab, we will put together the core number-theory functions we
will need. In the second part of the lab, we will implement the RSA cipher, and
see how RSA lets you encode a secret message that will stay secret EVEN IF SPIES
KNOW THE CODE YOU ARE USING. Because the second part makes use of the first
part, you will need the first part working by next Friday.

As you can gather from our progress in textbook section 1.6 so far, RSA makes
use of modular arithmetic. If you flip ahead, you can see that RSA is rather
simple in appearance, just using exponentiation and modulus operations, except
for a few things related to GCD's. Thus, we need a few functions from basic
number theory to help us as we go.

- We need to be able to calculate gcd(x,y).
- We need to be able to tell when an integer is prime.
  * It will also be convenient to have a way to search for new primes.
- We need to be able to calculate the inverse of [a]_n.

We will use python to write code for encrypting and decrypting a message. While
the RSA algorithm is (relatively) simple, it repeats certain calculations many
times, and the numbers used are much larger than the numbers we usually work
with by hand. The numbers are even larger than calculators can work with -- 32
bit calculators can only represent integers as large as 2**31 ~= 2e9. In 2011,
RSA keys use integers about as large as 2**2000.
However, some sophisticated
desktop calculator programs like "dc", "bc", and Python know how to represent
larger integers, limited only by the computer memory. While you can understand
the RSA algorithm, you would never want to do an encryption or decryption by
hand.

Part 1: Number theory tools
================================================================================

First, we want to make a library of the number-theory functions we will need.
Open a new file called "libnumbertheory.py". This is the file where we will
write our basic mathematical functions. We will use the import command to
include these functions in our cipher code.

To speed things up, I have posted a script called "util.py" which contains some
basic utility functions that will eventually be useful, but which are not
particularly important or interesting. Download this file and put it in the
same directory as libnumbertheory.py. Make the second line of
"libnumbertheory.py" read "from util import * "

http://www.math.psu.edu/treluga/311w/lab5/util.py

GCD's
--------------

To calculate gcd(x,y), we can use Euclid's algorithm, like we learned in class.
Write such a function. It doesn't need to be very long. My implementation is
6 lines, and you can probably do it in fewer. Use the following code
to test your implementation.

##<start code>##
TEST = False   # set to True to run the test
if TEST:
    for (x,y,g) in [(2*3*5, 3*5*7,3*5),(2*89,7*89,89)]:
        print "#Test gcd: gcd(%d,%d)=%d=%d"%(x,y,gcd(x,y),g)
##<end code>##

Primality testing
-------------------

Now, we need to be able to find prime numbers. We will use
a variant of Eratosthenes's Sieve (http://www.youtube.com/watch?v=9m2cdWorIq8).
The catch is that finding prime numbers is expensive, particularly as
numbers get large, so we do not want to waste more effort on this than
we have to. Write a function isprime(x) that returns True if x is prime
and False if x is composite.
Use the following code to test.

##<start code>##
TEST=False
if TEST:
    for x in range(2,20):
        if isprime(x):
            print "# %d is prime."%x
        else:
            print "# %d is composite."%x
##<end code>##

Once your isprime(x) function is working, we can use it to find new prime
numbers with the following function.

##<start code>##
def nextprime(x):
    while not isprime(x):
        x += 1
    return x
##<end code>##
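
The nextprime function above assumes a working isprime. As one possible
starting point, here is a minimal sketch that uses plain trial division by odd
numbers up to the square root, rather than a full sieve. The names
isprime_trial and nextprime_trial are hypothetical, chosen so they do not clash
with your own functions, and the code avoids the print statement so it should
run under either Python 2 or Python 3.

```python
def isprime_trial(x):
    """Return True if the integer x is prime, using trial division.

    Hypothetical helper for illustration; not the lab's official isprime.
    """
    if x < 2:
        return False
    if x < 4:
        return True            # 2 and 3 are prime
    if x % 2 == 0:
        return False           # even numbers above 2 are composite
    d = 3
    while d * d <= x:          # a composite x has a divisor no larger than sqrt(x)
        if x % d == 0:
            return False
        d += 2                 # only odd candidates remain
    return True


def nextprime_trial(x):
    """Return the smallest prime >= x, mirroring nextprime above."""
    while not isprime_trial(x):
        x += 1
    return x


print(nextprime_trial(100))    # smallest three-digit prime: 101
```

Trial division is quick for the three-digit primes this lab needs, but its cost
grows like the square root of x, so it is a reasonable baseline to time your
own implementation against.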

Note the speed of your function. For lab purposes, we will need to work with
prime numbers having at least 3 digits. What's the largest prime number your
algorithm can quickly find?

Modular Arithmetic inverses
-----------------------------

Now, we need to calculate an inverse of [a]_n. We can check if this inverse
exists using the gcd implementation above and the condition 1==gcd(a,n), but we
need to do the matrix calculation (the extended form of Euclid's algorithm) to
find the inverse. This is a complicated calculation, where it is easy to
introduce bugs and hard to find them, so rather than writing your own function,
use the code below. Note that this code uses the numpy library to facilitate
matrix and vector calculations.

##<start code>##
import numpy

def inverse(a,b):
    # This first part of the function
    # is a set of tests and conversions
    # that we use to set things up before
    # the main algorithm starts.
    assert isinstance(a,int) or isinstance(a,long)
    assert isinstance(b,int) or isinstance(b,long)
    if (a < 0): a = -a
    if (b < 0): b = -b
    a = a % b
    assert not 0==a
    assert 1 == gcd(b,a)
    # Now, we start the main algorithm.
    # Each row [r,s,t] satisfies r = s*a + t*b throughout.
    M = [ numpy.array([a,1,0]), numpy.array([b,0,1])]
    i_big = 0
    if ( a < b ):
        i_big = 1
    while M[i_big][0] > 0:
        i_big = 1 - i_big
        # subtract a whole multiple of the other row (integer division)
        M[i_big] -= M[1-i_big]*(M[i_big][0]//M[1-i_big][0])
    # The algorithm has finished. Now, we make sure the
    # result is in the form we need, and we return it.
    inv = long((M[1-i_big][1]+b)%b)
    return inv

TEST = False
if TEST:
    for a, n in [ (5,26), (11,23), (27,128) ]:
        y = inverse(a,n)
        print "#Check inverse: [%d][%d] mod %d = %d"%(a,y,n,(a*y)%n)
##<end code>##

Part 2: RSA encryption
===========================================

Now, we have all the parts we need to implement the RSA cipher. You may
discover that the functions you have written are not "good enough" because they
are slow or buggy, but that's something we'll assess as we make progress.

RSA.py
--------------

We will implement the RSA cipher as a pair of classes in python. A class is a
concept from the somewhat obsolete object-oriented programming paradigm.
It is a
set of data variables and some functions that operate on them in specific ways.
Here, we want to associate encryption and decryption routines with specific
public and private key pairs.

Open up a new file "RSA.py" and fill in the following code. Note that gcd,
isprime, and inverse all appear at least once. The symbol "**" appears for
exponentiation, while "%" appears for modulus.

##<start code>##
from libnumbertheory import *

class RSAKey:
    def __init__(self,n,k):
        self.n = n
        self.k = k
    def crypt(self,m):
        return [ long((m_i**self.k)%self.n) for m_i in m ]
    def __str__(self):
        return str( (self.n,self.k) )

class RSACipher:
    def __init__(self,p,q,a):
        assert isprime(p)
        assert isprime(q)
        p,q = long(p),long(q)
        n = p*q
        totient = (p-1)*(q-1)
        # the exponent a must be invertible modulo the totient
        assert 1==gcd(a,totient)
        x = inverse(a,totient)%n
        self.public_key = RSAKey(n,a)
        self.private_key = RSAKey(n,x)
        # extra things that should be forgotten
        self.primes = (p,q)
        self.totient = totient
    def encrypt(self,m):
        return self.public_key.crypt(m)
    def decrypt(self,b):
        return self.private_key.crypt(b)
##<end code>##

Using RSA.py
-------------------

I've provided two scripts to show you how RSA.py can now be used
to encrypt and decrypt messages. First,

http://www.math.psu.edu/treluga/311w/lab5/test_with_fixed_keys.py

shows how you can make your own keys using prime numbers.
Second, the script

http://www.math.psu.edu/treluga/311w/lab5/test_with_rand_keys.py

shows how random keys can be generated.
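
To see the arithmetic behind RSACipher at a glance, here is a self-contained
sketch of one encrypt/decrypt round trip with small primes. It does not use
RSA.py or libnumbertheory.py and is not one of the posted test scripts. The
helper names egcd and modinv, and the values p = 61, q = 53, a = 17, are
assumptions chosen for illustration. Python's built-in three-argument pow
stands in for (m**k) % n, and the code should run under either Python 2 or
Python 3.

```python
def egcd(a, b):
    """Extended Euclid: return (g, s, t) with g = gcd(a, b) = s*a + t*b."""
    s0, t0, s1, t1 = 1, 0, 0, 1
    while b != 0:
        q = a // b
        a, b = b, a - q * b
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return a, s0, t0


def modinv(a, n):
    """Return the inverse of [a]_n, or raise ValueError if none exists."""
    g, s, _ = egcd(a % n, n)
    if g != 1:
        raise ValueError("no inverse: gcd(a, n) != 1")
    return s % n


# Illustrative key material (far too small for real use).
p, q = 61, 53
n = p * q                      # public modulus, 3233
totient = (p - 1) * (q - 1)    # 3120
a = 17                         # public exponent, coprime to the totient
x = modinv(a, totient)         # private exponent, 2753

message = [65, 66, 67]         # each entry must be smaller than n
cipher = [pow(m, a, n) for m in message]   # encrypt with the public key
plain = [pow(c, x, n) for c in cipher]     # decrypt with the private key

print(cipher)
print(plain)                   # [65, 66, 67]
```

The three-argument pow reduces modulo n at every step, so it stays fast even
when m**k would be an enormous integer; that is one thing to try if the crypt
method in RSAKey turns out to be slow for larger keys.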