Evolutionary Computation Methods in Modern Cryptography

ABSTRACT

Meaningful attempts to produce practical solutions to the problem of an increasingly hostile environment are emerging in evolutionary computation. The problems cryptography faces lie in the number and variety of ways it can be broken. One response is to build stronger defenses against increasingly powerful attacks. But what if the most powerful attacks came from the authorities who created the defense mechanisms you thought were protecting you? Powerful cryptanalysis tools and deceptive practices by authorities require re-imagining the threat landscape. Existing solutions to this problem focus on collaboration between researchers and builders to ensure the implementation details of existing security protocols are effectively applied. Other courses of action suggest better engagement with both software review and the creation of standards. These solutions carry little drawback, and they would complement a parallel effort to increase the diversity of encryption algorithms. Ensuring better reliability in these new algorithms means examining the reliability of the building blocks of complex systems and protocols. New algorithms, inspired by evolutionary computation, that contribute to these building blocks are an approach that holds promise against this new threat.

1. INTRODUCTION

Secure communication was as much a concern in ancient times as it is in our modern civilization [11]. Many things depend on information being secure, including the protection of some human rights. Article 12 of the UN Declaration of Human Rights and Section 8 of the Canadian Charter of Rights and Freedoms protect privacy, for instance. The lack of any privacy at all is central to Bentham’s ‘Panopticon’ prison, described as a form of power imbalance between the observer and the observed [10]. Fortunately, there are Provincial and Federal laws in Canada, like PIPEDA and FIPPA, that outline ways in which we are legally obliged to protect privacy. Government agencies such as the OIPC are tasked with oversight and enforcement of privacy laws should these obligations be overlooked or dismissed. Yet, despite these compelling ethical and legal reasons, private communication remains a challenge.

Cryptography has an important role to play in securing private communications [11]. The internet is supported by cryptographic systems and, increasingly, these systems are threatened. The availability of computational power has amplified the effectiveness of cryptanalysis techniques [6]. This trend is likely to continue as brute-force attacks become trivial with the advancement of quantum computing.

What is also becoming a disturbing trend is cryptosystems coming under attack by the very organizations tasked with ensuring these systems work [3]. As demonstrated by the U.S. government’s development of the Clipper Chip in 1993 with a built-in backdoor, and more recently by the NSA creating and promoting the flawed Dual EC DRBG algorithm, the concern that the organizations that create these cryptographic systems do not have our best interests in mind is more than reasonable [2]. Once deliberate attempts are made to weaken encryption algorithms by the people who make encryption algorithms, what happens to the trust placed in those people and the algorithms they create [5]? What is the cost of relying on algorithms created by governments that benefit from their exploitation? What are the practical alternatives? In contrast, the ability of non-governmental agents to create strong, cryptographically sound algorithms is perhaps best demonstrated by Phil Zimmermann’s creation of PGP (Pretty Good Privacy) in 1991. Similarly, and as a result of this breakdown in trust, Blackledge et al. [5] explore new personalized encryption algorithms using Evolutionary Computing methods. The hope is that this approach to cryptography offers better protection against cryptanalysis by state-level actors with access to powerful computational resources.

In this paper I present the ways in which evolutionary algorithms have contributed to cryptography, with a focus on how they can be used to create cryptographic primitives such as pseudorandom number generators and block ciphers. These primitives, or building blocks, are an important part of creating new encryption algorithms.

This paper is organized chronologically to give a sense of linear progression, and grouped by specific cryptographic areas of focus. The first part looks at evolutionary computation in block ciphers and the second part looks primarily at pseudorandom number generators.

2. OVERVIEW OF CHALLENGES

Predictability in the generation of random numbers, or in the sequence of random numbers, is a serious flaw that can be exploited, as was the case with the Dual EC DRBG vulnerability [2]. Especially important in generating random numbers for cryptography is passing certain statistical tests that measure the randomness of pseudorandom number sequences. An essential question to ask when designing encryption algorithms is: can we produce new algorithms with enough complexity, chaos and randomness [5]?

2.1 What is the perfect system?

True randomness is essential to security systems, readily available in the physical world (like atmospheric noise), and difficult to replicate in a digital system [13]. With enough computational resources aimed at detecting patterns in computer-generated ‘randomness’, or pseudorandomness, a serious vulnerability can result. Cryptographically secure pseudorandom number generators (CSPRNG) are therefore an essential building block of a secure system.

3. REVIEW OF PAST APPROACHES

If previous work in the area of cryptography could be expressed in broad terms, research has been bound by the Kerckhoffs-Shannon principle, which states that ‘a cryptosystem should be secure even if everything about the system, except the key, is public knowledge’ [10]. The effect on past research is a focus on key exchange methods like Diffie-Hellman, on increasing key length, and on proving the cryptographic strength of both symmetric and asymmetric algorithms [5].

Most of the research focus has been on improving known encryption algorithms, yet those algorithms are increasingly vulnerable to known-algorithm attacks. For instance, the computing power necessary to carry out precomputation attacks on Diffie-Hellman is available to state-level players, and cryptographic ‘front doors’ such as those available in the DHE_EXPORT ciphers are known to be exploited by governments invested in mass surveillance [7]. Past approaches to cryptography have not been able to protect us from these types of attacks.

4. PRELIMINARY

4.1 Modern Cryptography

The categories of functionality that together make up modern cryptography are hash functions, symmetric cryptography, and asymmetric or public-key cryptography [12]. Classic cryptography can be thought of as substitution and transposition and is considered broken [10].

4.2 Cryptographic primitives

Lower-level algorithms that reliably perform one specific task are used as building blocks in the design of cryptographic systems [16]. Block ciphers and PRNGs are two examples of cryptographic primitives discussed in this paper.

4.2.1 Block Ciphers

A block cipher breaks a piece of plain text into fixed sizes, or blocks, then transforms those blocks into an encrypted form using an accompanying symmetric key [15].
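As an illustration only (not a secure cipher), the block-splitting step can be sketched as follows; the block size, zero-byte padding and XOR ‘round function’ are assumptions for demonstration, standing in for the far stronger transformations of a real block cipher such as AES:

```python
BLOCK_SIZE = 8  # bytes; illustrative choice

def to_blocks(plaintext: bytes) -> list:
    """Zero-pad, then split into fixed-size blocks."""
    plaintext += b"\x00" * ((-len(plaintext)) % BLOCK_SIZE)
    return [plaintext[i:i + BLOCK_SIZE]
            for i in range(0, len(plaintext), BLOCK_SIZE)]

def transform_block(block: bytes, key: bytes) -> bytes:
    """Placeholder 'round': XOR with the shared symmetric key.
    Applying it twice with the same key recovers the block."""
    return bytes(b ^ k for b, k in zip(block, key))

key = b"8bytekey"  # shared symmetric key (illustrative)
message = b"attack at dawn"
ciphertext = b"".join(transform_block(blk, key) for blk in to_blocks(message))
recovered = b"".join(transform_block(blk, key) for blk in to_blocks(ciphertext))
```

Because the toy transform is its own inverse, applying it a second time with the same key recovers the (padded) plain text, mirroring the symmetric-key property described above.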

4.2.2 PRNG

A pseudorandom number generator (PRNG) is a process that produces a sequence of numbers that appears to follow no pattern [12]. A deterministic system such as a computer does not have the ability to create something truly random, so a number of mathematical measurements can be applied to the sequence to determine the level of random qualities it possesses. A cryptographically secure pseudorandom number generator (CSPRNG) is an algorithm that produces a number sequence whose random qualities can be verified by standardized tests, such as those available from the National Institute of Standards and Technology (NIST) [13].
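One such standardized check is the NIST SP 800-22 frequency (monobit) test, which can be sketched in a few lines; the pass threshold of p ≥ 0.01 follows the NIST recommendation:

```python
import math

def monobit_test(bits: str) -> float:
    """NIST SP 800-22 frequency (monobit) test: returns the p-value.
    Sequences with p < 0.01 are rejected as non-random."""
    n = len(bits)
    s = sum(1 if b == "1" else -1 for b in bits)  # map bits to +/-1 and sum
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

balanced = "01" * 500  # perfectly balanced sequence: p-value of 1.0
constant = "1" * 1000  # all ones: p-value near 0, fails the test
```

A full CSPRNG evaluation runs a whole battery of such tests; the monobit test is only the first, catching sequences whose 0/1 proportions are obviously skewed.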

4.3 Symmetric Algorithms

Encryption methods where the sender and receiver share the same key, and the key must remain secret, are considered symmetric algorithms [12]. These algorithms generate a cipher text and a secure key. Diversity between the plain text, key and cipher text is a critical component of improving the desired outcome. The security of the system also relies on keeping the key secret. Key exchange mechanisms, such as Diffie-Hellman, are important, as is making sure that the key cannot be brute-forced or predicted from previously generated keys. The quality of key generation depends on the quality of the random numbers generated. Examples of symmetric algorithms are block ciphers and stream ciphers [12].

4.4 Evolutionary Computation

A familiar pattern followed by most evolutionary algorithms is: (i) initialize or seed a population; (ii) apply a fitness function to evaluate each individual; (iii) select the best individuals; (iv) from those selections, produce a new population by applying crossover and mutation; (v) iterate back to step (ii) until a condition is met [15]. Of all the methodologies that fall under evolutionary computation, Genetic Algorithms (GA) receive the most focus in this paper.
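The five steps above can be sketched as a minimal loop; the bitstring representation, truncation selection, parameter values and ‘OneMax’ fitness are illustrative assumptions, not drawn from any surveyed paper:

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def evolve(fitness, length=32, pop_size=20, generations=100,
           crossover_rate=0.9, mutation_rate=0.01):
    # (i) initialize a random population of bitstrings
    pop = [[random.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # (ii) evaluate each individual with the fitness function
        scored = sorted(pop, key=fitness, reverse=True)
        # (iii) select the best half as parents
        parents = scored[:pop_size // 2]
        # (iv) produce a new population via crossover and mutation
        pop = []
        while len(pop) < pop_size:
            a, b = random.sample(parents, 2)
            if random.random() < crossover_rate:
                cut = random.randrange(1, length)   # single-point crossover
                child = a[:cut] + b[cut:]
            else:
                child = a[:]
            child = [1 - g if random.random() < mutation_rate else g
                     for g in child]                # bit-flip mutation
            pop.append(child)
        # (v) loop back to step (ii) until the generation budget is spent
    return max(pop, key=fitness)

# Example: maximize the number of 1-bits ("OneMax")
best = evolve(fitness=sum)
```

Swapping the fitness function is the essential move in the papers reviewed below: the same loop searches for cipher components or random-looking sequences depending entirely on what the fitness function rewards.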

4.4.1 Genetic Algorithms

Good quality pseudorandom numbers, those suitable for cryptographic applications, depend on passing a number of tests, which is similar to the ‘tests’ of survival in a Darwinian context. This closely resembles the selection operator of Genetic Algorithms, which chooses the best solutions in a range of possibilities according to a predefined test. Since the problem of generating a good pseudorandom number can be defined as searching for the best possible answer within a range of possible answers, genetic algorithms are well suited for CSPRNG as long as the tests applied are of a certain standard. The suitability of the algorithm can also be attributed to the non-linear way in which it arrives at a solution. Linear methodologies for producing randomness cannot guarantee the unpredictability of the sequence of numbers generated and are unsuitable for use in secure applications [13].

5. CRYPTOGRAPHY AND GENETIC ALGORITHMS

5.1 Block Ciphers

5.1.1 (2011) Improved Cryptography Inspired by Genetic Algorithms

Picek and Golub [12] present an overview of areas in cryptography to which evolutionary algorithms have contributed. Identifying the level of applicability to real-world scenarios is the main goal of their survey, conducted in 2011. Acknowledging a renewed interest in the area, they introduce six general applications: block ciphers, hardware design, Boolean functions, hash functions, S-box design and the design of pseudorandom sequences. In some specific areas the authors imply these algorithms have had a limited effect on real-world applications. For instance, in one evolutionary approach to creating block ciphers, only parts of Genetic Algorithms are used and no testing or verification of the security level of the ciphers is done. In Improved Cryptography Inspired by Genetic Algorithms (ICIGA), only crossover and mutation operations are used in the ciphering of plain text, and no fitness function is applied. The authors, Tragha, Omary and Mouloudi, do claim that the algorithm is faster than other block cipher systems such as AES, DES or IDEA. While this has value with respect to efficiency, its use in the real world would be limited without proper testing for cryptographic strength. Picek and Golub suggest, generally, that more research into modern evolutionary algorithms will produce better results in cryptology.

5.1.2 (2014) Evolutionary Genetic Algorithm for Encryption

Similarly, in the strategy that Alsharafat [1] uses on block ciphers, no fitness score is applied to any of the binary sequence populations. The technique involves converting a plaintext message to an ASCII matrix, then into binary, in order to leverage efficiencies in crossover and mutation. The size of the plaintext message matrix determines the size of the key matrix, which is used in the decryption process. Once in binary format, a multi-point crossover is performed on the message to create new ‘children’ binary sequences. Mutation of those children is described as a ‘flip-flop’ of text: randomly exchanging 0s and 1s. The final step is to convert that binary sequence into something readable. The benefit, beyond efficiency, of the technique Alsharafat introduces is that it reduces vulnerability to a cipher text attack by increasing the level of diversity between the cipher text and plain text. No recommendation was made to evaluate the effectiveness of this technique in the context of improving overall security, only that the algorithm would be enhanced by applying fuzzy logic to the crossover and mutation operators. Efficiencies are a valuable contribution, but without assurance of effectiveness in security, their applicability in cryptographic systems is limited.
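A rough sketch of the binary conversion, multi-point crossover and ‘flip-flop’ mutation described above might look like the following; the crossover points, mutation rate and second parent are illustrative assumptions, and Alsharafat’s key-matrix construction is not reproduced here:

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def text_to_bits(text: str) -> str:
    """ASCII plaintext -> binary string (8 bits per character)."""
    return "".join(format(ord(c), "08b") for c in text)

def bits_to_text(bits: str) -> str:
    """Binary string -> readable text (the final conversion step)."""
    return "".join(chr(int(bits[i:i + 8], 2))
                   for i in range(0, len(bits), 8))

def multipoint_crossover(a: str, b: str, points: int = 2) -> str:
    """Swap alternating segments between two equal-length binary strings."""
    cuts = sorted(random.sample(range(1, len(a)), points))
    child, prev, src = [], 0, 0
    for cut in cuts + [len(a)]:
        child.append((a if src == 0 else b)[prev:cut])
        src ^= 1
        prev = cut
    return "".join(child)

def flip_flop_mutation(bits: str, rate: float = 0.05) -> str:
    """Randomly exchange 0s and 1s ('flip-flop' mutation)."""
    return "".join(b if random.random() > rate else str(1 - int(b))
                   for b in bits)

msg = text_to_bits("SECRET")
child = flip_flop_mutation(multipoint_crossover(msg, msg[::-1]))
```

Operating on flat binary strings is what yields the efficiency the paper claims: crossover and mutation become cheap slicing and bit-flipping rather than character-level text manipulation.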

5.2 Pseudorandom Number Generators

5.2.1 (2004) Introduction to the Applications of Evolutionary Computation in Computer Security and Cryptography

Highlighting the growing interest in Evolutionary Computation for solving both cryptanalytic and cryptographic problems, and recognizing some of the challenges that were faced, Isasi and Hernandez [9] provide a brief introduction and history of papers drafted between 1979 and 2003. At the time of writing, Evolutionary Computation was playing a more significant role in many cryptographic areas, so it was important to get a sense not only of its successes but of the unresolved problems. Genetic Algorithms primarily concerned with the cryptanalysis of substitution and transposition ciphers were identified in research conducted between 1993 and 1995. Genetic Algorithms were first used in what is considered modern cryptography in 1997, when a GA produced a ‘ciphertext-only attack over a simplified version of an ENIGMA rotor machine’ [9]. Later, in 2002, a GA was employed along with simulated annealing to cryptanalyze a pseudorandom number generator and public key exchange created by a neural network. These were valuable contributions that strengthened cryptographic techniques.

Isasi and Hernandez also note that Evolutionary Computation has a history of being well suited to pseudorandom number generation (PRNG). While the first papers to tackle PRNG in this way were written in 1986, the most promising results came from a 2003 study by Sheng-Uei and Shu Zhang [14]. Building on the previous use of cellular automata (CA) to generate pseudorandom numbers, their approach improved the entropy of the numbers generated by employing a Genetic Algorithm to evolve the structures of the cellular automata. The problem they were trying to solve was how to build new algorithms that produce numbers with comparable random qualities without the hardware requirements of a two-dimensional (2-D) CA PRNG. As a measure of fitness, tests were used to evaluate the randomness of each of the generated numbers. The tests measured average values of entropy, serial correlation coefficient and chi-square, with emphasis placed on the results of the chi-square test. The value of this optimization technique is that it measures fitness not just in one way but in a multitude of ways, which they termed Evolutionary Multi-Objective Optimization (EMOO). The efficiency of the three algorithms produced is discussed in the context of each other and of previous CA PRNGs, so it was clearly an objective equal to producing effective random numbers. Performance, as measured by cycle length and efficiency of output, is considered a parameter to include as an optimization objective in future research. This study demonstrates attention to both the quality of random numbers and the efficiency of implementation. However, Isasi and Hernandez note that applying a fitness function to measure the effectiveness of any random number generated proves difficult because of the many ways in which randomness can be measured.
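The multi-objective fitness idea can be sketched as follows; the individual measures are standard, but the way they are combined and weighted (chi-square emphasized, per the paper) is an assumption for illustration:

```python
import math
from collections import Counter

def entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (maximum 8.0)."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def serial_correlation(data: bytes) -> float:
    """Lag-1 serial correlation coefficient (near 0 for random data)."""
    n = len(data)
    mean = sum(data) / n
    num = sum((data[i] - mean) * (data[(i + 1) % n] - mean) for i in range(n))
    den = sum((v - mean) ** 2 for v in data)
    return num / den if den else 1.0

def chi_square(data: bytes) -> float:
    """Chi-square statistic against a uniform byte distribution."""
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts.get(b, 0) - expected) ** 2 / expected
               for b in range(256))

def fitness(data: bytes) -> float:
    # Combine the objectives; chi-square is down-weighted into a penalty
    # and emphasized relative to the others (assumed weights).
    return entropy(data) - abs(serial_correlation(data)) - 0.001 * chi_square(data)
```

A single scalar combination like this is the simplest way to fold multiple objectives into one score; a fuller EMOO treatment would keep the objectives separate and select along a Pareto front.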

5.2.2 (2004) On the Design of State-of-the-Art Pseudorandom Number Generators by Means of Genetic Programming

Indeed, the problem of measuring an elusive concept like randomness in a fitness function is highlighted in the approach taken in 2004 by Hernandez et al. to generate a PRNG using Genetic Programming (GP) [8]. The authors acknowledge that the most common and successful means of creating a PRNG is with a cellular automata algorithm, as mentioned previously. The problem for the authors, however, is that too few techniques are both fast and able to pass rigorous tests for randomness. So problematic was the combination of an unclear definition of randomness and the narrow focus of the statistical tests in most fitness functions at the time, that the measurement of randomness in this paper was replaced by a measurement of non-linearity, expressed as the ‘avalanche effect’. When the generated numbers were tested for randomness against the DIEHARD test suite and the GCD, Gorilla, Birthday Spacings, Frequency and Collisions tests, they demonstrated strong pseudorandom properties. Interestingly, though, the results of this approach did not satisfy the researchers enough to recommend using their generator for cryptographic purposes. Nevertheless, the approach demonstrated a new and viable way to generate pseudorandom numbers using a method of Evolutionary Computation.
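The avalanche effect itself is straightforward to measure: flip a single input bit and count how many output bits change, with roughly 50% being the ideal. In this hedged sketch, SHA-256 stands in for the evolved generator (an assumption, since the paper’s GP-evolved function is not reproduced here):

```python
import hashlib
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def avalanche(func, data: bytes, trials: int = 100) -> float:
    """Average fraction of output bits that change per single-bit input flip."""
    out_bits = len(func(data)) * 8
    total = 0.0
    for _ in range(trials):
        i = random.randrange(len(data) * 8)
        flipped = bytearray(data)
        flipped[i // 8] ^= 1 << (i % 8)  # flip exactly one input bit
        diff = (int.from_bytes(func(data), "big")
                ^ int.from_bytes(func(bytes(flipped)), "big"))
        total += bin(diff).count("1") / out_bits
    return total / trials

sha = lambda b: hashlib.sha256(b).digest()
score = avalanche(sha, b"seed material")  # close to 0.5 for a strong function
```

A score near 0.5 indicates strong non-linearity: no single input bit has a predictable, localized influence on the output.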

Poorghanad et al. [13] address the problem of creating a PRNG with acceptable cryptographic strength by introducing a methodology that employs known statistical tests created by the National Institute of Standards and Technology (NIST) in the fitness function. In this way, there is a level of assurance of cryptographic strength as defined by a standard. The problem addressed is the potential predictability inherent in the numbers produced by the Linear Feedback Shift Register (LFSR), a previously common way to generate pseudorandom bits. On its own, an LFSR will not guarantee the unpredictability necessary in a secure system. The solution uses a GA to evolve, in a non-linear way, the initial population of numbers, which is the product of a linear function. The fitness function of the GA uses a suite of NIST tests to evaluate the best sequences of bits within the population. The selection phase of the algorithm uses a tournament methodology: two numbers are compared at a time and, based on fitness level, the weaker one is discarded and the ‘stronger’ one moves to the next round of comparison. The results of the experiment would appear satisfactory were it not for the recommendations made to modify the selection and mutation methods to achieve better outcomes; clearly there was some dissatisfaction with the quality of the numbers generated. The value of this study, however, is that it presented a new algorithm design and that, by using a standardized suite of tests to evaluate the quality of generated numbers, the researchers were able to make informed decisions for future research efforts.
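The pairwise tournament described above can be sketched as follows; the bitstring candidates and sum-of-bits fitness are illustrative assumptions standing in for the NIST-scored bit sequences:

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def tournament_select(population, fitness, n_winners):
    """Repeatedly pit two random candidates against each other;
    the weaker is discarded, the stronger advances."""
    winners = []
    while len(winners) < n_winners:
        a, b = random.sample(population, 2)
        winners.append(a if fitness(a) >= fitness(b) else b)
    return winners

pop = [[random.randint(0, 1) for _ in range(16)] for _ in range(30)]
parents = tournament_select(pop, fitness=sum, n_winners=10)
```

Tournament selection keeps selection pressure mild and local: a candidate only needs to beat its immediate opponent, so moderately fit sequences still occasionally advance, preserving diversity in the population.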

5.2.4 (2013) Cryptography using Evolutionary Computing

Blackledge et al. [5] also use a suite of NIST tests in a fitness function, but use a different, non-linear seeding mechanism than the LFSR. Their approach seeds the initial population with ‘natural noise’ in order to force the GA to output something that resembles (through inheritance) that expression of ‘true randomness’. This approach also differs in that it aims to address the broader problem of relying on algorithms created by a central authority. Along with the historical events that have eroded the trust placed in governments to create reliable encryption algorithms, and the forecast of increased internet traffic (with the subsequent need for data security that goes along with it), the research paper focuses on PRNG as a building block for generating personalized algorithms. In this scenario, a personal/private encryption key generated by a known algorithm, and presumed to be vulnerable to state-level players with access to large computational power, is less desirable than a personalized encryption algorithm using a PRNG with good pseudorandom qualities. The security of a one-time pad, for instance, relies on the randomness of the key. The framework for the proposed design of personalized algorithms consists of first seeding the initial population with a source of natural noise, which in this case was a sample of numbers from RANDOM.ORG. The best numbers were selected after being subjected to five tests: Uniformly Distributed Power Spectral Density Function (PSDF), Uniformly Distributed Statistics, Positive Lyapunov Exponent, Acceptable CPU Time and Acceptable Cycle Length. Iteration occurs until the numbers pass the tests, approximating the statistical properties of natural noise. The results were positive: within the year prior to the paper being published, over 300 ciphers had been produced using this evolutionary methodology. A problem identified by the researchers, though not unlike other applications of chaos to cryptography, was that their approach pays no attention to algorithmic complexity.

5.2.5 (2015) Cryptography Using Artificial Intelligence

Two years later, two of the same researchers investigated a technique for cryptographic PRNG using an Artificial Neural Network (ANN) [4]. Of interest to Evolutionary Computation is that they compared the results of the GA approach and the ANN approach, highlighting a potential disadvantage of using Evolutionary Computation. In terms of the quality of the pseudorandomness generated, both techniques produced similar results; the main difference lies in the time it takes to generate those numbers. What takes a GA hours to produce could be done in a matter of minutes with the ANN.

6. SUMMARY OF THE STATUS OF THE FIELD

Using Evolutionary Computation to produce cryptographic primitives such as PRNGs and block ciphers has demonstrated some positive results. Both effectiveness and efficiency were among those results, but the limitation in the majority of papers reviewed is that it is one or the other; rare is the combination of effective security and efficient computation. There also remain many competing tests that measure randomness.

7. DESCRIPTION OF FUTURE CHALLENGES

Evolutionary computation has proven to benefit cryptographic applications, and while the quality of the results can sometimes be excellent, it comes with an efficiency cost that ultimately affects its applicability [8]. The challenge for future research in evolutionary computation for cryptology is to find a way to produce effective cryptographic primitives efficiently. Clarity on which tests are the most effective would also prove beneficial.

8. CONCLUSION

New algorithms using Evolutionary Computation are being created in a number of different ways, contributing to a growing diversity in algorithms created by non-governmental agents. Cryptographic primitives are a crucial first step in creating secure cryptographic systems and while they are singular in their purpose, they are far from simple to create. Evolutionary Computation strategies such as Genetic Algorithms can be an effective means to create these building blocks. Attention to both the effectiveness and efficiency of the algorithms will be a focus for future research endeavors.

Brad Payne is currently the lead developer for the Open Textbook Project whose work focuses on open source software using PHP (LAMP).
When not contributing to other developers’ projects on GitHub, he builds his own. Through exploiting APIs and with a penchant for design patterns, he helps BCcampus implement new technologies for post-secondary institutions. Prior to his current position at BCcampus, Brad worked in IT at Camosun College and the BC Ministry of Education.