[Crypto Algorithm Review] KB15: A stream cipher with rolling keys.

Before I begin: I've pretty recently fallen for crypto in a hard way, and while I've always enjoyed the novelty of simple ciphers, the nuts and bolts are now more like jewels and diamonds. To that end, I'd like to extend my thanks to you all for this forum, for the information you've posted here, and for the links to information hosted elsewhere. Thank you.

With that said, I'd like to dump my brain about this cipher. My apologies if anything written from here on out is blatantly obvious, elementary, just-plain-wrong, or otherwise. My thanks for any feedback or criticisms which contribute towards improvement...

I'd like to submit for your dissection, digestion and review, an output-feedback mode stream cipher that employs pseudo random rolling keys based on a master key seed. I call it KB15 and have implemented it in ... (is anyone looking?) ... Javascript. It is inspired by Bruce Schneier's "Solitaire" in the same way that tea is inspired by sugar, yet includes a few of my own discoveries. Of course none of these discoveries were truly my own, as they had already been long discovered in the crypto world.

The Stream Cipher:
Like all stream ciphers, KB15 works by assigning numeric values to a fixed set of letters and symbols, aka the stream. In this way, a stream cipher is both defined and limited by it's stream since any symbol unaccounted for cannot be encrypted and, therefore, will be dropped from the final message. KB15 has a total of 96 symbols in it's stream, these include: All English letters and symbols, as they appear on a standard QWERTY keyboard, spaces, line-breaks and tabs.

To encrypt a symbol, X, using a key, Y, we'll convert both X and Y into numbers, derived from their position in the stream, then add them together. The total of this addition will become the value of our encrypted character, Z.

If Z exceeds the number of characters in our stream, we'll subtract the number of characters in our stream from it, in effect forcing it back to the beginning of the stream. Decryption follows the same process, but reversed.

For this reason, in the Javascript implementation above, our stream begins with an unused symbol in position 0. This is because any number plus or minus 0 is itself and would defeat the algorithm.

Pseudo Random and Rolling Keys:
By default, KB15 generates a pseudo random, 62 character long key of alphanumeric symbols. If the key is not the same length as the text it is meant to encrypt, it will repeat itself until it is.

However, this is a security concern. If an attacker were to map out all of the possible values of each encryption key symbol against each clear text symbol, not only could they break the message, but a string of values that kept repeating themselves would be a dead giveaway.

To this end, KB15 has several additional measures in place. The first of these measures is what I call "Paranoia." Paranoia, P, is not an uncommon concept in the crypto world: put simply, it just says "Hey, take the encrypted output you just made and encrypt that again." KB15 will do this as many times as specified by P.

Code:

for ( var a = 0; a < p; a++ ) {

That in itself is not enough, however. Say your Paranoia level were 2, aka encrypt the encrypted output once. If you used the same key for both stages of the encryption, what would be the point? The first layer of the onion would reveal the second layer without further analysis. This won't do as KB15 seeks to make any attacker work, and work hard, for that onion.

Queue the second measure: rolling keys. Rolling keys are, again, not an uncommon concept in the crypto world.

KB15 uses them as such: the user enters a master key. This master key can be any alphanumeric string of any finite length. The master key is parsed, one symbol at a time, and each symbol is converted into it's corresponding value in the stream. That value then undergoes the following equation: Rounding up to the nearest integer, take the number of the Paranoia phase that we're at plus Pi, and multiply it by the natural logarithm of 10, then take that total and add it to the square root of the number of the Paranoia phase that we're at.

Code:

var p = Math.round(((p + Math.PI) * Math.LN10) + Math.sqrt(p));

Finally, if that number is greater than 48, aka halfway through our keystream, subtract the value of the Paranoia number we're at from it. If it is less than 48, add the value of the Paranoia number that we're at to it.

The result of this will be a new, pseudo random key based upon the master seed key but ultimately different. KB15 will repeat this process for each phase of Paranoia, so that each layer of the onion is encrypted with a different key; a key the user could compute but, practically, does not know.

This also introduces a security risk, however, because the equation, while pseudo random in appearance, is not truly random: it will follow a fixed and mathematically significant order. This means that, if an attacker were to examine all possible key possibilities, they could potentially discern the unique pattern created by this equation and discover the keys used for encryption.

To prevent this, the stream as defined in the charStream variable is arranged in a pseudo random order. This means that there is probably more/less linguistic correlation between any set of number values and therefore, theoretically, a higher number of possible keys which fit the profile of the equation - many of which are dead ends.

To add a further layer of complication, as a final measure, KB15 will reverse the order of every other pseudo random key that it uses.

The net result of all of this, then, is that not only is our onion pretty securely encrypted at multiple levels, but the end user can also utilize a dictionary based key without quite as much concern since that word won't actually be used to encrypt the text.

Of course, the best practice is to use a pseudo random master key as well. Also, any key used is ideally as long as, or very close to it, the length of the text it will encrypt. And, of course, do not use the same key twice. :p

Output:
The output modes programmed into KB15 are not cryptographically important. They are just formatting for human eyes. The first, and default, will return a constant string of alphanumeric symbols as the cipher text. This is called "Stream." The second will split the cipher text into blocks of 4 symbols each, separated by spaces, i.e. AAAA 0000 BBBB 1111. This is called "Block-4." The final mode will output the stream into lines consisting of 8 blocks, with each block containing 4 symbols, i.e. like PNG encoding. This is called "PNG-like."

I would be very honored if you tried it out and let me know your thoughts. This demo includes a 'console' of sorts which should give some glimpse into the inner workings of the algorithm. You can access the commented code directly at: http://nullsnode.nfshost.com/KB15/kb15.js

Use:
In the event that anyone finds any portion of my code or implementation useful, you have my sincere gratitude and encouragement to use it on two conditions: 1. that it not be used for profit, and 2. that it not be used academically/professionally without attribution. Not that I presume either of these scenarios will happen, but I just want to put it out there.

Thank you for your time, I know this was long-winded. I welcome any feedback, questions, concerns, comments or otherwise - especially if I have missed something obvious.

As well, let me know if you have an algorithm that you'd like me to review, or any links or reading material that you think I should research. I would love to learn more.

- Null

Last edited by null.if.ied; February 12th, 2013 at 07:47 PM.
Reason: More formatting...

Danke sehr, Karl-Uwe, and cheers for the links! So I've had a look at the material you've linked and also reviewed, per your suggestion, some info on transposition ciphers:

I'm not sure that this is a proper transposition cipher since the order of the cleartext remains unmodified through the encryption/decryption process, although it certainly does use fractionation techniques to achieve some level of diffusion (glad to learn what that is)... I fear I'm being a bit thick here, though. Would you mind elaborating a little on that?

I'd like to mention that I saw the source code and abstract/doc for your zx8 (ps) cipher. My sincere compliments on what has been, thus far, a fantastic read: it's a very impressive piece of work. The threads/dialogue regarding your pqRNG have been most informative, as well.

My thanks again for your responses; they've shown me a great deal already.

So I've had a look at the material you've linked and also reviewed, per your suggestion, some info on transposition ciphers:

I'm not sure that this is a proper transposition cipher since the order of the cleartext remains unmodified through the encryption/decryption process, although it certainly does use fractionation techniques to achieve some level of diffusion (glad to learn what that is)... I fear I'm being a bit thick here, though.

Of course you are using fractionated encryption as far as I understand after a second quick glance, but it's still some kind of transposition cipher I suppose

http://en.wikipedia.org/wiki/Transposition_cipher#Fractionation

Edit: I need to correct myself, because your algorithm is in fact a substitution and not a transposition cipher

Originally Posted by null.if.ied

Would you mind elaborating a little on that?

Well just a very hasty and simplistic analyse reveal strong correlation between key and ciphertext

As you may notice, there occur not such a strong correlation between ciphertext and key, not even between keystream (Rolling Key in your algorithm) and the actual keyword.

Originally Posted by null.if.ied

I'd like to mention that I saw the source code and abstract/doc for your zx8 (ps) cipher. My sincere compliments on what has been, thus far, a fantastic read: it's a very impressive piece of work.

Many thanks for your compliments. But still need to cryptanalyse and verify that it's impossible the roll back (reverse) to any former keystream even when Eve know the complete keystream or even worst, the whole internal state at any time. As you can see, it's one thing building a cipher algorithm which can disguise the plaintext, but the other way round in thinking as Eve and what she would try to break the code is a far more difficult but also a extremely important task.

Originally Posted by null.if.ied

The threads/dialogue regarding your pqRNG have been most informative, as well.

And as you mention it. pqRNG in it's basic form is a extremely weak RNG, in fact it's nothing else than a alternating pseudo-random permutation generator and should never be used in any serious cryptographic solution.

And there is something else that you perhaps like to take care of in your algorithm, because your key generation function uses a weak random number generator.

Originally Posted by null.if.ied

My thanks again for your responses; they've shown me a great deal already.

The substitution is clearly visible when you analyze what is produced. In addition to the weak number generator which you mentioned, it's also apparent that my equation for creating the rolling keys has a strong amount of bias as well with 7, 4 and 2 being the most common 'steps' between keys.

Of course the unfortunate effect of this is that the key is not particularly secure as it could be cracked using all of the attacks suitable for substitution ciphers. A more 'random' key might buy time, but it would not prevent an adversary from breaking the encryption. Drats.

However, when we examine the final result, I think that - overall - the algorithm is still a stream cipher:

So, from your comments and the above, I can see now that both my random number generator and rolling key generator are rather weak and the rolling keys are a large flaw in this algorithm right now.

Originally Posted by Karl-Uwe Frank

And as you mention it. pqRNG in it's basic form is a extremely weak RNG, in fact it's nothing else than a alternating pseudo-random permutation generator and should never be used in any serious cryptographic solution.

Of course I also read the google groups discussion on the number generator, and saw where it's weakness was demonstrated but was impressed nevertheless and learned what little my intellect could about RNG's. Considering the weakness of my own (the native JavaScript RNG, btw, for those interested, which I'd always suspected of being rather weak), it had a good bit of info to digest...

With that said, I'm curious to know whether you've since worked on any more secure solutions? Is pqRNGr2 an attempt at that?

Thank you very much for testing the algorithm and posting your results. In fact, I feel a bit daft now to see such strong correlation between the key and cipher text.

...

So, from your comments and the above, I can see now that both my random number generator and rolling key generator are rather weak and the rolling keys are a large flaw in this algorithm right now.

You did very well and a good job in analysing your algorithm.

You see, from my point of view right now the core of any good stream cipher is clearly and simply a strong PRGA (Pseudo-Random-Generation-Algorithm) in combination with an equally strong KSA (Key-Schedule-Algorithm or Key-Generation-Algorithm).

And I am totally in favour for stream ciphers, because they are easy to implement and therefore reducing so many failures which you can find in many crappy programs and/or implementations. Just have a look at Matthew Green's last articlehttp://blog.cryptographyengineering....e-cbc-mac.html
and others of him which shed a light on some common mistakes.

Originally Posted by null.if.ied

Of course I also read the google groups discussion on the number generator, and saw where it's weakness was demonstrated but was impressed nevertheless and learned what little my intellect could about RNG's. Considering the weakness of my own (the native JavaScript RNG, btw, for those interested, which I'd always suspected of being rather weak), it had a good bit of info to digest...

With that said, I'm curious to know whether you've since worked on any more secure solutions? Is pqRNGr2 an attempt at that?

It is just an extension of the same alternating formulae, using only more integer values as seed. It should also not be used in any cryptographic context. There is another algorithm based on pqRNG, call xqRNG32 which for sure passes all statistical tests, but still is useless when it comes to cryptography.

And of course there is RC4, the masterpiece of all known stream ciphers until today. It has a clear and easy to understand algorithm, it is fast, it is perhaps the best analysed cryptographic algorithm until today and you can even memorise the complete code. That of course fascinated me. Also Bob Jenkins work has some strong similarities with RC4 - and all it's derivatives.

But after all the important point is that the PRGA, which generates the keystream, has to be cryptographically secure. Whatever other diffusion levels one might integrate, it all falls back on the CSPRNG. Consider that Eve can strip and peel off any kind of diffusion, but should not be able to reverse the PRGA and/or reconstruct the key.

So the main focus has to be on a strong PRGA in the first place.

My opinion is that the algorithm should be simple to understand, straight and portable to implement and also easy to analyse.

This clearly left me with RC4.

Finally my favourite way of encryption is the several hundred years old idea of Giambattista della Porta using a polyalphabetic substitution cipher - but with the help of the computer and by using a self modifying alphabet - like RC4.

And of course this is what I can see in your approach too. You are trying to implement a way of modifying the keystream over several rounds. Unfortunately currently it seems to be too linear, lacking the unpredictability.

I like to add that you should really keep up the good work and try to improve your idea!

Cheers,
Karl-Uwe

P.S.: And of course, of course, of course I have made so many mistakes and wrong assumptions if you look at all my previous ideas ;-)

Thank you again for another fantastic reply, and for your encouragements.

Originally Posted by Karl-Uwe Frank

And I am totally in favour for stream ciphers, because they are easy to implement and therefore reducing so many failures which you can find in many crappy programs and/or implementations.

My sentiments exactly. Stream ciphers, like ARC4, as you mentioned, are so easy to implement and keep in your head all at once - but are still quite fun to experiment with. Especially in the case of ciphers like Porta which is just so cool.

Originally Posted by Karl-Uwe Frank

You see, from my point of view right now the core of any good stream cipher is clearly and simply a strong PRGA (Pseudo-Random-Generation-Algorithm) in combination with an equally strong KSA (Key-Schedule-Algorithm or Key-Generation-Algorithm).

You are exactly right. Unfortunately, aside from using a BigInt library, I'm not sure if there are many ways using native JavaScript to implement a CSRNG. The methods I've attempted so far utilize exponents which create exceedingly large integers. JavaScript does not seem suited to handle those, so a shift of language may be in order at any rate.

Needless to say, my research into RNG's, which I'm sure could span a lifetime, continues. In the meantime, I've been focused on a PRKGA to replace the heavily flawed algorithm KB15 currently employs over the past few days.

As you mentioned, any amount of diffusion does not matter when we consider that Eve is advanced enough to strip it away. In this regard, and from reading several of the sources you've posted, I've shifted my attention to a strong initial pass. As well, the output of the algorithm has changed. Where before we matched the length of the seed (which resulted in, basically, a simple substitution cipher), we now always generate a 32 Char (256 bit) keystream, expressed as 64 digits. My efforts have resulted in the following:

seed = A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A
seed raw = 1111111111111111111111111111111111111111111111111111111111111111
key raw = 0325476980123456789321076541110230076541110456701231213547610321

As you can see, there is MUCH less bias now when using a keystream with identical symbols!

It's still not entirely perfect, but definitely moving towards the goal of shifting keystreams through multiple iterations. I haven't dropped this into KB15 itself yet, because I'd still like to perform some more analysis on it, but with a bit more tweaking I'm wondering if this wouldn't be a good start towards a stronger cipher.

I'd like to thank you again for your feedback, time, and the links - they are always teaching me so much more. If you're so inclined, I welcome you (or anyone) to critique this new algorithm or point me towards any additional resources. Likewise please point out anything rather obviously wrong, for I have most likely missed it.

I haven't dropped this into KB15 itself yet, because I'd still like to perform some more analysis on it

Well, I got to do some analysis this morning and my initial results were rather upsetting. As it turns out, the first 33 chars of the keystream were PR enough...but from char 34 through 64 a LARGE amount of bias surfaced again. So much so, in fact, that the same pattern would emerge across multiple seeds:

Not very good. I got to thinking about why this might be the case and realized that, at the 34th char we're computing seedXOR with an x value of ... 34. From this point of the loop onwards it seems the same values are being fed into the algorithm...resulting in duplicated output. Not only that, but the IF statement set to catch double digit INTs (which I originally devised as a way to prevent excessive "1's" from entering the keystream) was really just making it worse.

To simplify, I rewrote the seedXOR function as a single equation, removed the IF statement, and added some simple multiplication, modulo 9. This modified the output slightly:

Still, not quite enough though as you can see from the sections that I've highlighted.

Examining the equation again, I removed the second reference to i and replaced it with x. Then modified to modulo 3 since, I'm assuming, that will provide the greatest variation of remainders given the first part of the equation. I need to verify this, however. As a result:

Still SOME bias though not has heavy as before. Well, Mathematically the bias is there, as you can see, with values only being a few "steps" away from one another - however I wonder how this would translate as the stream is chopped to form character based streams - it would depend on which integer the chop fell, I suppose.

EDIT: Here's a slight modification which seems to be producing much better results: