I repeated the same test as the author. 10000 16 character alphanumeric strings from random.org. The following was output as the summary in sequencer:

"The overall quality of randomness within the sample is estimated to be: poor. At a significance level of 1%, the amount of effective entropy is estimated to be: 26 bits."

Now to explain this. First of all, the character-level analysis looked ok, a few anomalies in the 'transitions' but overall not too bad. The first thing I noticed looking at the bit-level analysis is Burp was only reporting up to 79 bits, even though the tokens were 16 characters long.

After I RTFM [1], I found that Burp creates a custom encoding for tokens on the fly, based on the size of the character set used at each character position. I had 16 character positions, each with a charset size of 62 characters. Since 62 is not a power of 2, there will be 'information loss' (according to the Sequencer manual [1]): rather than use a 6-bit encoding (the number of bits needed to encode 62 unique items), Burp uses just 5 bits, so some of our data can't be represented in the bit-level analysis.

Extrapolating a bit, I'm guessing that internally it drops the most significant bit of each 6-bit character encoding. The result is that 60 of our 62 original characters share a 5-bit value with one other character while the remaining 2 do not, so the encoded bits are no longer uniformly distributed. It is no surprise that such an encoding would fail a randomness test.
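To make the guess concrete, here is a minimal sketch of what dropping the top bit would do to a 62-character charset. The ordering of the charset and the "keep the low 5 bits" rule are my assumptions, not anything confirmed about Burp's internals.

```python
import string
from collections import Counter

# The 62-symbol alphanumeric charset, indexed 0..61 (an assumed ordering).
charset = string.ascii_lowercase + string.ascii_uppercase + string.digits

# Hypothetical lossy encoding: keep only the low 5 bits of each 6-bit index.
codes = Counter(index & 0b11111 for index in range(len(charset)))

doubled = sorted(code for code, n in codes.items() if n == 2)
single = sorted(code for code, n in codes.items() if n == 1)
print(len(doubled), "codes are shared by two characters")
print(len(single), "codes belong to a single character:", single)
```

If the guess is right, 30 of the 32 possible 5-bit values show up twice as often as the other two, which is exactly the kind of bias a bit-level test will flag.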

Being lazy and not having access to the source code, I decided to take a bit of a shortcut to pseudo confirm this hypothesis. I generated a new set of 10000 16 character tokens, this time from the charset "a-zA-Z0-9#!", giving a total of 64 characters (a power of 2). Here is the glorious result in the summary:

"The overall quality of randomness within the sample is estimated to be: excellent. At a significance level of 1%, the amount of effective entropy is estimated to be: 86 bits."

EDIT: Repeated the test for a 16 character charset and got the following, as expected: "The overall quality of randomness within the sample is estimated to be: very good. At a significance level of 1%, the amount of effective entropy is estimated to be: 71 bits."

tl;dr: If the size of your charset is not a power of 2, don't put too much stock in the summary or bit-level analysis in Burp Sequencer. In fact, the closer to a power of 2 it is without being one, the worse this will be. Instead, just use the character-level analysis, which will remain reliable and informative.

Monday, October 15, 2012

This post is pretty much a straight copy/paste from my notes on attacking application encryption. For the original source at the wiki I posted on Google Code (and some semi-complete code that implements these attacks in python), go here http://code.google.com/p/webapp-cryptotest/wiki/AttackingApplicationEncryption.

Identify the cipher and look for attacks on the mode of encryption:

Q: Is it a block Cipher or Stream cipher?

A: Is the output always a multiple of common block sizes (usually 128bits = 16 bytes)? Does pushing the input length over the block size cause another block to be added to the output?

If neither of these is true, it's probably a stream cipher, or a block cipher being used in CTR or some other stream mode. Try modifying a single byte of input; if you see a single byte change in the output, it is a broken implementation of a stream cipher (more details to follow).
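The length checks above are easy to script. This is a rough sketch only: guess_cipher_type, fake_stream and fake_block are hypothetical names, and the two fake functions are stand-ins for whatever request actually gets the application to encrypt your input.

```python
def guess_cipher_type(encrypt, max_len=48):
    """Guess 'block' or 'stream' from how ciphertext length grows with input."""
    lengths = [len(encrypt(b"A" * n)) for n in range(max_len + 1)]
    steps = {b - a for a, b in zip(lengths, lengths[1:])}
    if steps == {1}:
        return "stream"                      # grows one byte at a time
    if all(length % 16 == 0 for length in lengths) and steps <= {0, 16}:
        return "block"                       # grows a whole block at a time
    return "unknown"

# Hypothetical stand-ins for the application; only the output length matters.
def fake_stream(data):
    return bytes(len(data) + 5)              # 5 bytes of surrounding data

def fake_block(data):
    total = len(data) + 5
    return bytes(total + 16 - total % 16)    # padded up to a 16-byte boundary

print(guess_cipher_type(fake_stream), guess_cipher_type(fake_block))
```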

Attack - Block Shuffling

Attack - Chosen Boundary Attack - Also works on CBC with static IV

1) Find out about how many bytes precede our input in the CT - look for where the repeated blocks start

2) Manipulate input so that the LAST block of our repeated input lines up with the block boundary. Do this by starting with large input and removing characters until you LOSE a repeated block. Then REMOVE the last character (byte) from the input so that the first byte of the unknown text that follows our input is picked up within our last block, e.g.:

AAAAAAAAAAAAAAAA (we have a full block of our repeated input)

AAAAAAAAAAAAAAA? (we remove one character and ‘pick up’ the first character of the unknown text)

3) Perform a dictionary attack to find ? using a previous repeated block. We know we have found ? when the ciphertext in the 2 blocks matches, e.g.:

Suppose the last block that contains our input (AAAAAAAAAAAAAAA? where ? is unknown) encrypts as:

AAAAAAAAAAAAAAA? ==> JDJDJEIDLELEMSE32

Then we can use one of the blocks where we control all the characters to brute force ?:

AAAAAAAAAAAAAAAa ==> ZHEKJDKDLENAKEJJ3

AAAAAAAAAAAAAAAb ==> DKMDJOWEKLJDSKNN1

AAAAAAAAAAAAAAAc ==> JDJDJEIDLELEMSE32 MATCH

4) To find the next character, remove another character from the last block and modify your block where you control all the characters so that its second from last character is the one found in step 3. Brute force the last character.

5) Repeat.
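Steps 2 to 5 can be sketched roughly as below. Big assumptions: toy_block_encrypt is an HMAC-based stand-in for AES (it is deterministic per block, which is all the attack relies on), KEY and SECRET are made up, and for brevity nothing precedes our input in the plaintext, so step 1 is skipped.

```python
import hashlib
import hmac

BLOCK = 16
KEY = b"demo-key"                      # hypothetical; the attacker never sees this
SECRET = b"session=s3cr3t;admin=0"     # unknown text that follows our input

def toy_block_encrypt(block: bytes) -> bytes:
    # Stand-in for AES: a deterministic keyed function of one 16-byte block.
    return hmac.new(KEY, block, hashlib.sha256).digest()[:BLOCK]

def oracle(user_input: bytes) -> bytes:
    # ECB-style: every block is encrypted independently with the same key.
    data = user_input + SECRET
    data += b"\x00" * (-len(data) % BLOCK)
    return b"".join(toy_block_encrypt(data[i:i + BLOCK])
                    for i in range(0, len(data), BLOCK))

def recover_secret(encrypt, secret_len: int) -> bytes:
    known = b""
    for i in range(secret_len):
        # Shrink our input so exactly one unknown byte ends a block.
        pad = b"A" * (BLOCK - 1 - i % BLOCK)
        start = (i // BLOCK) * BLOCK
        target = encrypt(pad)[start:start + BLOCK]
        # Dictionary attack: fully controlled input, one candidate last byte each.
        for guess in range(256):
            trial = pad + known + bytes([guess])
            if encrypt(trial)[start:start + BLOCK] == target:
                known += bytes([guess])
                break
    return known

recovered = recover_secret(oracle, len(SECRET))
print(recovered)
```

Against a real target, encrypt would be a function that sends your input to the application and returns the ciphertext, and you would also need to account for any bytes that come before your input.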

Q: Is it CTR Mode?

A: Does the ciphertext grow byte-at-a-time as input is added? If so, probably AES w/ CTR, could be RC4 as well, attacks below work for both.

Does a single byte PT change cause a single byte CT change? Indicates broken implementation of a stream cipher (reusing the same NONCE).

Are there repeated strings in the ciphertext? Indicates CTR wrapping and can be attacked.

Must understand how CTR works - Wikipedia/Schneier's book

Vulnerable if we can induce the same keystream to be used for multiple blocks OR multiple messages.

The way the keystream works is there is one unique NONCE per message, and a message can be of any length. The counter function C(x) is usually just a simple counter, e.g. C(0) = 0, C(1) = 1 ... However, some implementations will screw this up and have a wrapping or repeating counter. To encrypt the nth block of the message, we AES encrypt NONCE+C(n) to get 16 bytes of keystream, then XOR this with block n of the plaintext to get the encrypted block. Decryption is done by XORing the ciphertext with the relevant part of the keystream (E(NONCE+C(n))) that was used to encrypt it.
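A minimal sketch of that construction, to show that encryption and decryption are the same XOR. keystream_block uses HMAC-SHA256 as a stand-in for the AES encryption of NONCE+C(n), and the key, nonce and message are made up for the demo.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical key

def keystream_block(nonce: bytes, counter: int) -> bytes:
    # Stand-in for AES_encrypt(KEY, NONCE + C(n)); this is NOT real AES.
    msg = nonce + counter.to_bytes(8, "big")
    return hmac.new(KEY, msg, hashlib.sha256).digest()[:16]

def ctr_crypt(nonce: bytes, data: bytes) -> bytes:
    # XOR each 16-byte block of data with the keystream block for its counter.
    out = bytearray()
    for offset in range(0, len(data), 16):
        ks = keystream_block(nonce, offset // 16)
        out += bytes(d ^ k for d, k in zip(data[offset:offset + 16], ks))
    return bytes(out)

message = b"attack at dawn, bring the coffee"
ct = ctr_crypt(b"nonce-01", message)
print(ctr_crypt(b"nonce-01", ct))   # running it again decrypts
```

Note that the ciphertext is exactly as long as the plaintext, and that anyone who reuses the nonce gets the same keystream again.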

If we can induce the condition where some part of the keystream is repeated (used to encrypt two different blocks of plaintext or 2 different messages) we can possibly recover the content of those plaintext blocks even if they are both unknown. If one is known and one is not, we can DEFINITELY recover it (see the next ‘simple’ attack)

Conditions - Must find 2+ blocks that satisfy:

Same AES key (almost always true)

Same NONCE (Should be unique per message, sometimes can be influenced (parameter), often ZEROED or reused by insecure implementations)

Same Counter (Usually the ith block in all messages will have the same counter value, sometimes a counter will wrap within the same message (look for repeated CT sections, if 3+ bytes of CT repeat, good chance the counter wrapped and we can attack it), sometimes it is timestamp based, sometimes it will count up from a random value...)

Some of these conditions may be able to be induced (i.e. Counter wrapping by providing long input or influencing the NONCE somehow). To know if you’ve gotten it right, you should see multiple blocks of the same plaintext coming out as the same ciphertext. If this is the case, you can break the encryption.

If all you have is a long string of ciphertext (or a bunch of unknown messages), look for repeated strings of bytes. It is VERY unlikely that long strings of bytes will repeat at random, chances are the counter wrapped or nonce was reused. You can use this information to attack the ciphertext as described below.
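Looking for repeats is much easier with a script than by eye. A rough sketch (find_repeats is a hypothetical helper, and the sample bytes are constructed so the first 16 bytes repeat 64 bytes later):

```python
def find_repeats(ct: bytes, min_len: int = 3):
    """Return (first_offset, repeat_offset) pairs for repeated byte runs."""
    seen = {}
    hits = []
    for i in range(len(ct) - min_len + 1):
        chunk = ct[i:i + min_len]
        if chunk in seen:
            hits.append((seen[chunk], i))
        else:
            seen[chunk] = i
    return hits

# Constructed sample: bytes 0..63, then the first 16 bytes again.
sample = bytes(range(64)) + bytes(range(16))
hits = find_repeats(sample)
distances = {repeat - first for first, repeat in hits}
print(hits[0], distances)
```

The distance between the two offsets is the interesting number: if it is consistent, it is a good candidate for where the counter wrapped or the keystream was reused.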

Attack - Simple Keystream Attack with SOME known Plain Text and Reused NONCE

1) Find the location of some known PT in the CT by trial/error. Supply a long string of input, get the CT. Change the first/last character of your input, look for which bytes in the CT change. If only 2 bytes change and they correspond to the length of your input, this attack will work. This is because they are reusing the nonce between messages.

2) XOR the PT you provided and the CT it encrypted to, this will give you the keystream used to encrypt it. This means we can get arbitrary amounts of keystream starting from where your input begins in the CT.

3a) Now shorten your input to a single character. XOR the keystream you got in step (2) with the ciphertext, starting at the byte in the CT where your input starts. You should get back the plain text!

3b) XOR the same part (byte offset) of the CT in a different message, one you may not control the input to, with the result from (2), may get back plaintext if they are reusing the NONCE which is likely

4) Try XOR'ing key with the correct bytes in more messages from other parts of the application... Any ASCII you find, if you can PREDICT more PT, you can get more keystream. EG if we find “rsonal Inf”, we can try to guess "Personal Information" as the PT, which can give us 10 more bytes of keystream if correct! Go back to step 2 and get more keystream! In this way we can recover the keystream used to encrypt things that occur BEFORE our input in the ciphertext.
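Here is the whole attack in miniature. Everything in it is made up for illustration (the keystream, the offsets, the secret message); the only thing that matters is that both messages were XORed with the same keystream.

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Hypothetical application state: one keystream reused for every message.
keystream = bytes((37 * i + 11) % 256 for i in range(32))
secret_pt = b"Personal Information: 555-0100"
ct_secret = xor(secret_pt, keystream)

# Steps 1-2: suppose our input covers offsets 2..11 of a message, so
# PT XOR CT at those offsets hands us 10 bytes of keystream.
our_input = b"A" * 10
ct_ours = xor(b"??" + our_input, keystream)   # "??" is text we do not control
ks_partial = xor(our_input, ct_ours[2:12])

# Step 3b: use those keystream bytes at the same offsets in another message.
fragment = xor(ct_secret[2:12], ks_partial)
print(fragment)                                # b'rsonal Inf'

# Step 4: guess the surrounding text and turn the guess into more keystream.
crib = b"Personal Information"
ks_more = xor(crib, ct_secret[:len(crib)])
print(len(ks_more) - len(ks_partial), "extra keystream bytes")
```

In the demo we can compare ks_more against the real keystream; in the wild, you confirm a guess by checking that the new keystream bytes decrypt other messages to something readable.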

Attack - Keystream Attack with known Plain Text and Looping CTR

1) Induce and detect looping in the CTR. Provide a long stream of repeated input, look for where repeated strings start to appear in the CT. This is best done with a script, difficult to 'eye' it. The distance between these repeating strings indicates when the counter wraps.

2) Same principle as the attack above, but you are using it WITHIN a single message. You recover keystream by XORing CT with the corresponding known plain text. You know the keystream repeated later in the message because you found out where the CTR wrapped, so XOR that keystream with the CT after the CTR wrap and you will get back the plain text.
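A sketch of both steps against a deliberately broken toy cipher. broken_ctr, PERIOD and SECRET are all hypothetical: the toy keystream simply repeats every 64 bytes, the way a counter that wraps after 4 blocks would.

```python
PERIOD = 64                         # hypothetical wrap point, unknown to attacker
SECRET = b"card=4111-XXXX;pin=0000"  # unknown text after our input

def broken_ctr(data: bytes) -> bytes:
    # Toy keystream that repeats every PERIOD bytes (a wrapping counter).
    ks = bytes((i * 73 + 5) % 256 for i in range(PERIOD))
    return bytes(b ^ ks[i % PERIOD] for i, b in enumerate(data))

def find_period(ct: bytes, known_len: int):
    # Step 1: smallest shift at which the CT of our repeated input lines up.
    for d in range(1, known_len // 2 + 1):
        if ct[:known_len - d] == ct[d:known_len]:
            return d
    return None

known_len = 192
ct = broken_ctr(b"A" * known_len + SECRET)

period = find_period(ct, known_len)
# Step 2: known PT XOR CT gives one full period of keystream...
keystream = bytes(c ^ ord("A") for c in ct[:period])
# ...which also decrypts everything after the wrap.
plain = bytes(c ^ keystream[i % period] for i, c in enumerate(ct))
print(period, plain[known_len:])
```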

If you have NO knowledge on what the original plaintext was but you have a bunch of encrypted strings and you suspect the NONCE or CTR repeat, you can apply the above two attacks still by using character frequency analysis.

1) Remember -> ciphertext XOR ciphertext = plaintext XOR plaintext

2) Use statistics about the text you expect to get back (e.g. which one produces the most letter e's, since it's the most common English letter)

3) You can attack this cipher byte-at-a-time

Take all bytes that are suspected to have been encrypted with the same part of the keystream. For example, if you were able to fix the nonce or suspect it is fixed, take the ith byte of each message.

If the nonce is not fixed but you suspect CTR repetition, look for repeated strings in the CT (use a script, read the article at the top of this attack section). Use the distance between repeated strings to estimate where the CTR wraps. This tells you where the keystream starts to repeat. Knowing this, you can collect bytes encrypted with the same keystream.

If the NONCE is not fixed and there doesn't seem to be any repeated strings in any of the messages you have, try concatenating all of the messages into a single, long string and looking for repetitions in that. This will allow you to detect any NONCE or CTR repetition. Brute force the corresponding key byte by checking which value for the key byte produces the most common characters according to the character frequency you expect (e.g. English has the highest frequency letter as ‘e’, so look for the most ‘e’s if you think the PT was English). Again, this should be scripted and you can use more complex frequency analysis. E.g. suppose you have five 16-byte blocks of unknown ciphertext encrypted with the same part of the keystream:

-Byte 1 of each block will have been encrypted (XOR’d) with byte 1 of the key

-Brute force byte 1 by XORing the CT with all possible values for that byte - look for highest frequency of the letter 'e' across your 5 bytes at this position

-Repeat for remaining bytes, hopefully you get something intelligible

Q: Is it CBC Mode?

A: Blocks don't repeat, small PT changes cause full or large CT changes

-Do you see static text in fixed locations of each CT (usually beginning)? Probably an Initialization Vector

-Do you see repetition across messages but not blocks? The beginning of long strings encrypts to the same ciphertext, but blocks are not repeated? Probably CBC with a static IV. This will be vulnerable to the chosen boundary attack described for ECB.

A CBC implementation is vulnerable to padding oracle attacks under the following conditions:

1) When valid CT is submitted to the application and it decrypts to valid data you get a distinguishable response (eg: 200 OK)

2) When CT is submitted that decrypts to invalid data but has valid padding you get a distinguishable response (eg: 200 OK with a custom error or something)

3) When invalid CT is submitted and the padding is incorrect you get a distinguishable response (eg: 500 Error due to crypto libraries crapping out on the backend)

An example of how this can be satisfied is (1) giving the 200 OK response and fulfilling a request, (2) gives an application level error because the decrypted data is invalid in the application context, (3) gives a server error (exception is thrown when the decryption fails). As long as errors (2) and (3) are different, we can use a padding oracle attack to decrypt the ciphertext that is being submitted to the application.
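To show why being able to tell (2) from (3) is enough, here is a sketch of the attack on a single block. The block cipher is a toy 4-round Feistel network built on HMAC (a stand-in for AES, not the real thing), the key and plaintext are made up, and padding_oracle plays the part of the server: True means "padding was fine", False means "padding error".

```python
import hashlib
import hmac

KEY = b"demo-key"   # hypothetical server-side key
BLOCK = 16

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _f(half, rnd):
    return hmac.new(KEY, bytes([rnd]) + half, hashlib.sha256).digest()[:8]

def toy_encrypt_block(block):       # 4-round Feistel, stand-in for AES
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, xor(left, _f(right, rnd))
    return left + right

def toy_decrypt_block(block):
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(left, rnd)), left
    return left + right

def padding_oracle(prev, block):
    """Server side: True when the block decrypts to valid PKCS#7 padding."""
    plain = xor(toy_decrypt_block(block), prev)
    n = plain[-1]
    return 1 <= n <= BLOCK and plain.endswith(bytes([n]) * n)

def decrypt_block(prev, block, oracle):
    """Attacker side: recover one plaintext block using only the oracle."""
    inter = bytearray(BLOCK)                 # D(block), learned byte by byte
    for pad in range(1, BLOCK + 1):
        pos = BLOCK - pad
        forged = bytearray(BLOCK)
        for j in range(pos + 1, BLOCK):
            forged[j] = inter[j] ^ pad       # force the known tail to `pad`
        for guess in range(256):
            forged[pos] = guess
            if not oracle(bytes(forged), block):
                continue
            if pad == 1:                     # rule out an accidental \x02\x02 etc.
                forged[pos - 1] ^= 0xFF
                still_valid = oracle(bytes(forged), block)
                forged[pos - 1] ^= 0xFF
                if not still_valid:
                    continue
            inter[pos] = guess ^ pad
            break
    return xor(inter, prev)

iv = bytes(BLOCK)
plaintext = b"user=admin" + bytes([6]) * 6    # one block with PKCS#7 padding
ciphertext = toy_encrypt_block(xor(plaintext, iv))
print(decrypt_block(iv, ciphertext, padding_oracle))
```

The attacker code never touches the key or the decrypt function directly; every bit of information comes from the yes/no padding answer, at a cost of at most 256 requests per byte.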

It is not explained, but what is happening here is really cool. If an application is decrypting some part of the encrypted value and displaying it, and you want to know some OTHER part of the encrypted value, plug the block+previous block in where the stuff is being displayed from. In this case, he wanted to know what the first 16 bytes of ciphertext were so he used IV+cipher[:16]. What happens here is on decryption, the PT is xored with the previous block of CT, so the IV gets jumbled on decryption but the cipher[:16] block is decrypted correctly.

Hashes:

Q: Does the application validate messages based on the hash of the message with a secret key? This is not a valid way to use CURRENT hashing algorithms (MD5, SHA1, SHA256) but some applications do it anyways (Flickr compromise and Stripe CTF) - There is now a great resource and new tool for attacking this http://www.skullsecurity.org/blog/2012/everything-you-need-to-know-about-hash-length-extension-attacks.

E.G. Suppose X is a secret and M is the message, h(X | M) is used to validate M comes from a trusted source. We can modify M (lengthen it) to M’ and compute h(X | M’) without knowing X because of how current hash functions work.

h(X | M) is actually an intermediate value when computing h(X | M'), so we seed the hash function with h(X | M) which is known, then we compute the hash for our extension. The trick is to pad the extension message at the beginning so that our entire extension falls into a new block AND doesn't modify anything in the previous block. To do this, we prepend our extension with the padding the hash algorithm uses. This has the form:
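For MD5, SHA1 and SHA256 (64-byte blocks) the padding is a 0x80 byte, then zero bytes until 8 bytes short of a block boundary, then the original message length in bits as a 64-bit integer (big-endian for SHA1/SHA256, little-endian for MD5). A small sketch that computes it, with md_padding being a hypothetical helper name:

```python
import struct

def md_padding(message_len: int, big_endian: bool = True) -> bytes:
    """Padding the hash appends to a message of message_len bytes."""
    pad = b"\x80" + b"\x00" * ((55 - message_len) % 64)
    fmt = ">Q" if big_endian else "<Q"       # SHA1/SHA256 vs MD5 byte order
    return pad + struct.pack(fmt, message_len * 8)

# Example: a 3-byte message is padded out to exactly one 64-byte block.
padding = md_padding(3)
print(len(padding), padding[:1].hex(), padding[-8:].hex())
```

For the attack, message_len is len(X) + len(M), so you have to know or guess the length of the secret; the forged message is then M, followed by this padding, followed by your extension.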

Thursday, October 11, 2012

After beating my head off the wall for an hour or so I finally figured out an interesting way to do this. Extensive Googling didn't turn anything up so I decided to post here for future reference and to save others the headache.

This is a pretty common and necessary task. Some good examples of when you may need to do it are to deal with CSRF tokens that update on every request or to test for SQL injection in a multistep process. BURP macros and Session Handling can deal with these scenarios, but for some reason sqlmap doesn't like to be proxied for HTTPS URLs. I think it's probably because of the certificate that BURP uses.

Anyways doing it is quite easy once you figure it out. Just enable your proxy and under Proxy -> Options -> Request Handling, select "Force Use of SSL". Then in sqlmap, feed it a plain http url rather than https. BURP will translate this to HTTPS when it receives requests.

Now you can let BURP work its macro and session handling magic on the sqlmap requests!

Wednesday, October 3, 2012

While reading Michal Zalewski's (lcamtuf) "The Tangled Web", I was inspired to play with an example from the book. It highlighted an interesting variant on cross site script inclusion attacks that doesn't actually require JSONP, or for that matter, any JavaScript to be included in the application's response.

After contacting lcamtuf to see if my modified method of executing this attack was something that was well known or that he had seen, he told me that it looked like a bypass in one of the security features that Mozilla implements in E4X and that it might be worthwhile to contact them. After contacting Mozilla, I was informed that they probably won't patch it because they want to kill E4X anyways, although one of the developers was pretty technical and really cool about it. So no bounty for me, but the bug is still interesting and as a bonus should work on all versions of Firefox prior to 17 (which is a few releases away)!

For those unfamiliar with cross site script inclusion attacks, I was going to try and 'borrow' someone else's explanation but I wasn't able to find any good ones. So bear with me because this could get ugly:

Traditional XSSI Description

Traditionally XSSI occurs when JSONP is used by an application and the response includes sensitive data. So suppose there was some ridiculous web service exposed by an application that would show you your password if you had already logged in and had a valid session. Let's assume this crazy web service resides at the following URL:

http://example.com/showPassword.php

And since we are assuming this crazy web service responds with JSONP, an example response could look like this:

showPassword({"password":"mybadpassword123"})

The call to the service could be as simple as:

<script src="http://example.com/showPassword.php"></script>

The reason people use JSONP is exactly what makes it vulnerable. It is useful if you want javascript on non same origin sites to be able to access data from your web service. When the response is returned to this JSONP call, the showPassword function on the non same origin site is executed using the data that came from the source.

It should be obvious now that you don't want to use JSONP where sensitive data is involved since ANY external site can force the victim to make the request and steal the sensitive data!

(Ab)using E4X In Firefox for XSSI

Now here's where things get interesting. The Firefox JavaScript parser has an extension called ECMAScript for XML (E4X) -- basically it allows simple, automatic conversion from XML to JSON, which sounds great. For example, you could have the following which would be equivalent:

var a = {"test":"123","test2":"567"}

var a = <test>123</test><test2>567</test2>

In fact, any well formed XML that doesn't begin with the <html> tag seems to pass right through the JavaScript parser! For example, the following script will go through the interpreter no problem:

<script><b>Some other html and stuff</b><p>This is supposed to be secret!</p><b>so is this</b><i>More html after</i></script>

Well suppose we have some page that echoes some user controlled data into it, for example it takes some query string parameters and includes them in the response. For example, consider the following Django template where {{paramX}} gets replaced by the appropriate query string parameter:

Now consider what happens when we set param1 and param2 as follows:

param1 = {x=

param2 = }

The application response will look like this:

<b>Some other html and stuff</b>
{x=

<p>This is supposed to be secret!</p>

<b>so is this</b>}

<i>More html after</i>

If an attacker includes this page in a script, as in an XSSI attack like the following example, he will have access to a global variable "x" which contains potentially sensitive data, everything occurring between param1 and param2!
<script src="http://192.168.1.135:8000/?param1={x%3d&param2=}"></script>

I've actually not tested the exact attack above; it is simpler than my original (tested) idea and probably works. The problem is Firefox has a security restriction that throws an error if an entire script is composed of XML, but I think the above bypasses that:
"SyntaxError: XML can't be the whole program"

My original idea which definitely works uses the following URL for the attack, it accomplishes the same thing in a more complicated way:

<script>
<h1>This is the page title, it occurs before our injection point</h1>
var a = (<r><![CDATA[xxxThis line is injected...
<p> This is part of the page between the first injection point and second </p>
<a href='xxx'>So is this </a>
yyyThis line is the second injection point]]></r>).toString();alert(a);
</script>

The CDATA section wrapped in <r> tags makes it so newline characters within the stolen data don't screw up the javascript parser, but otherwise it's the same idea.

Tuesday, October 2, 2012

First post! Been putting this off for a while. Basically this is just a place for me to collect all my random research and experiences in one place. As of now they are scattered all over Google docs, Reddit, my hard disk, my brain (very volatile storage) and various other places. The writing may be terrible, but hopefully the content makes up for it. So here we go...

Just had an amazing first experience at a conference, DerbyCon! Even if I only made two talks. The CTF was a blast; the rest of team JollyAndFriends and I owned it, although it was really tight, right down to the last 15 minutes, and we only won by 10 points. It was amazing to meet and work with the whole team and guys like mubix who hung around and ground away at a couple of the challenges with us for a while. Props to the organizers and attendees of the con.

This was actually my second CTF and I think I may be addicted. The Stripe web CTF was also amazingly well done, although I participated in that one solo. At some point in the near future I'll probably give it its own dedicated blog post describing my approach to some of the more interesting challenges.

Pretty happy I actually got to put some of my research into practice in the form of a really basic crypto attack for the Derby CTF. I plan on posting my crypto notes/research/attack code some time later, but the simple attack used in the CTF was the following:

We were provided with 3 files, "plain1_encrypted", "plain2_encrypted", "plain2"; they represent exactly what their names imply, some encrypted and plain text files. We were also provided with the binaries used for the encryption and decryption routines, but that is actually irrelevant, they didn't contain keys and you don't really need them.

Notice all of the highlighted values are repeated bytes that occur in both encrypted files. The last 16 bytes (1 block) are all consecutive and all the same, this was probably some sort of IV. The repeated bytes that occur within the cipher text are more interesting...

Since the repeated bytes are separated by non repeated sections, this suggests that it could not be AES in CBC mode. Since the repetition occurs at the byte level rather than in full blocks at a time, that rules out ECB mode. We will make an educated guess at this point that the encryption algorithm in use is probably AES in CTR or OFB mode, with a static IV (bad!).

The way that AES in a stream cipher mode like OFB or CTR works is that the AES algorithm is used to encrypt the IV concatenated with some other data to generate a "keystream". In CTR mode, the "other data" is a simple counter like 1,2,3,4... In OFB mode, there is feedback from previously encrypted data. I'll post far more detail in my crypto notes. This keystream is then XOR'd with the plaintext to produce the ciphertext. The danger with this approach is that if the keystream EVER repeats, this can be detected because repeated sequences of bytes will be observed in the ciphertext when the plain text contains repetition (just like we saw in the CTF).

Due to the following relationship (easy to deduce by taking a quick look at the truth table for the XOR operation) it is trivial to decrypt text given some known plain text when the keystream is repeated:

keystream XOR plaintext = ciphertext -- therefore

ciphertext XOR plaintext = keystream

And once you have keystream, to recover the plaintext from unknown ciphertext that uses the same keystream is easy!

ciphertext XOR keystream = plaintext

So let's see that in action! I used the python interpreter to do the XOR operation:

First we do plain2 XOR plain2_encrypted to get keystream:

>>> hex(0x4b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b4335374b433537 ^ 0x0bb7b2efebc76a8e5ac651d6507d82af09800e1c201aaf1ca2e27959558a89d78c1aeefc98c7207961693101cdbe996103eb8910face67966092d3d7d1a6e15541d7353c8dc5ad424f65bfff7fdd91359556d0f9effaf1bc6742da6992822ec4ba36ca122c9060bba0addfbc60f5dad29b2e0d9412925c43f8de8e4fa494644d)
'0xbb7b2efebc76a8e5ac651d6507d82af42c33b2b6b599a2be9a14c6e1ec9bce0c759dbcbd384154e2a2a043686fdac5648a8bc27b18d52a12bd1e6e09ae5d4620a94000bc686987504268ac8349ea402de15e5cea4b9c48b2c01ef5ed9c11bf3f175ff2567d3558cebeeea8b2bb6efe5d06d38a359d16974b39dbb78efd7517aL'

THIS IS KEYSTREAM

Now we do keystream XOR plain1_encrypted to get plain1:

>>> hex(0xbb7b2efebc76a8e5ac651d6507d82af42c33b2b6b599a2be9a14c6e1ec9bce0c759dbcbd384154e2a2a043686fdac5648a8bc27b18d52a12bd1e6e09ae5d4620a94000bc686987504268ac8349ea402de15e5cea4b9c48b2c01ef5ed9c11bf3f175ff2567d3558cebeeea8b2bb6efe5d06d38a359d16974b39dbb78efd7517aL ^ 0x0bb7b2efa5835abe5ac651d61e39b29f09800e1c6e5e9f2ca2e279591bceb9e781159a8ceec0702d58537442d295c91024c9db73deca37d57fb983abff9c980d6694000bc686987504268ac8349ea4029556d0f9a1bec18c6742da69dcc61ef4ba36ca1262d4508ba0addfbc2eb1eae29b2e0d9412925c43f8de8e4fa494644d)
'0x4e443030000000004e4430304b433537050705074b43353705070507464c41473d44656372797074546865466c6167546f4765745468654b65794c6f6c0000000000000000000000000000004b433537050705074b433537050705074b433537050705074b433537050705074b4335374b4335374b4335374b433537L'

If we hex decode the green highlighted hex which SHOULD be plain text, we get this:

ND00ND00KC57 KC57 FLAG=DecryptTheFlagToGetTheKeyLolKC57 KC57 KC57 KC57 KC57KC57KC57KC57
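The hex decode itself is a one-liner in Python 3. This sketch only decodes the printable run from the middle of the XOR result above (the bytes on either side include nulls and other non-printable values); flag_hex is just my name for that slice.

```python
# The printable section copied from the middle of the recovered plaintext hex.
flag_hex = "464c41473d44656372797074546865466c6167546f4765745468654b65794c6f6c"

flag = bytes.fromhex(flag_hex).decode("ascii")
print(flag)   # FLAG=DecryptTheFlagToGetTheKeyLol
```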

Success! So it was as easy as XORing the bytes of plain2_encrypted with the bytes of plain2 to recover keystream, then XORing those bytes with plain1_encrypted to yield the plaintext!