Hi people. I am trying to reverse engineer an encryption algorithm in order to write an open source version of a piece of software. I think they might be using an existing algorithm, but I am not sure. Are there any techniques for identifying the algorithm used for encryption? If not, where can I find technical information on encryption algorithms? If you need more info, let me know.

First thing you need to do is check that you are not going to fall foul of any local laws. In many countries it is illegal to reverse-engineer protection mechanisms such as encryption. In particular, if they use an inbuilt 'key' to communicate with the server, you could then be committing fraud... it could be seen that you are gaining access to their servers in an unauthorised fashion using some fraudulent token... etc. So, CHECK.

Of course, if they issue a session key, then you are probably in the clear since you are 'requesting' a key and by responding with one they cannot claim you have broken their security. But CHECK.

If you take a look at the binaries (DLLs, EXEs etc.) in a hex editor you may find strings that indicate a specific algorithm. You may also find strings that indicate a specific 3rd party encryption library. Also, the names of DLLs should be checked - if, for example, ssleay.dll/libeay.dll are present, then it's a cinch that they are using SSL... certainly for 'some' aspect of their comms.

This can often provide quick answers before turning to debuggers and such.
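If you'd rather script the string search than scroll through a hex editor, something like the following is a minimal sketch of the idea. The hint list and the function names are my own assumptions for illustration, not anything taken from the software in question, so extend the list to suit your target.

```python
import re

# Hypothetical, non-exhaustive list of substrings that hint at a crypto
# library or algorithm. Extend it for your own target.
CRYPTO_HINTS = ["ssleay", "libeay", "openssl", "crypt32", "rijndael",
                "blowfish", "twofish", "md5", "sha1"]


def extract_strings(data, min_len=4):
    """Yield (offset, text) for each run of printable ASCII of min_len or more bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


def find_crypto_hints(data, hints=CRYPTO_HINTS):
    """Return (offset, hint, text) for every extracted string containing a hint."""
    hits = []
    for offset, text in extract_strings(data):
        lowered = text.lower()
        for hint in hints:
            if hint in lowered:
                hits.append((offset, hint, text))
    return hits
```

To try it on a real file, read the file in binary mode and pass the bytes to find_crypto_hints. Note that this only catches plain ASCII strings; Windows binaries often store text as UTF-16, which this sketch will miss.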

If, as is increasingly common, they use a 3rd party library, you should look up the functions exported by that library and see what parameters they take and how they are used (the vendor's site will probably document this). If it's an external library, run a dependency checker such as the one that comes with Microsoft's Visual Studio environment... this will tell you the exports from that library. Often there are clues in the naming of such exported functions.

You can then use API hooking to trap the calls you are interested in. For example, with libeay/SSLeay you'd be looking at SSL_read and SSL_write. This will give you access to the plaintext - and thus you can then start dumping whole sessions and examining the raw protocol. You're going to need to get familiar with this : )

If the encryption appears to be built into the executable (perhaps statically linked) or they appear to be using their own code, then you're going to have to start poking around with a debugger, following where data goes after socket reads (watch the recv buffer and halt when it is accessed). This should help you locate the decryption routines. Help with that is beyond the scope of this forum.

I've spent plenty of time cracking almost all of the online poker rooms in order to build monitoring apps for third-party clients who were previously using crappy 'screen-scraping' techniques. Most are very simply cracked and, whilst I must stress that this doesn't give any immediate advantages, most can be broken with simple API hooking. I'd say only a quarter of these required any extensive debugging. If this is in any way reflective of other applications then I'd say the odds are firmly stacked in your favour even if you are not familiar with debugging or reverse engineering.

So, don't be disheartened. Check the legalities and then go for it.

If it's an interesting project I might be able to find the time to assist you if you do hit a brick wall with the above approaches.

This has been discussed previously, in one context. A quick search may reveal other useful threads that discuss the aspects of a distinguishing attack; that is, a cryptanalytical technique for distinguishing between an actual, realized primitive and the theoretical, ideal model (e.g., a saturation attack (chosen-plaintext attack) on six rounds of Rijndael, due to being able to distinguish three rounds from a random permutation). Basically, if the primitive is secure, you shouldn't be able to distinguish it. That's the hope, at least, and if the software is implementing a secure primitive, then this route probably won't get you anywhere; this isn't the appropriate context.

However, implementations of cryptographic primitives paint a different picture. It really depends on the details (e.g., headers, parameters, et cetera), but it may be possible to identify, for example, the round function, S-boxes, and subkeys for an unknown block cipher in tamper-proof hardware, given that the adversary has the ability to induce faults (differential fault analysis); in another case, it may be possible to identify cryptographic code within compiled binaries. An interesting paper on that very thing can be found here. So, I imagine this is more along the lines of what is going to be useful for you.

If I may ask, what is the software in question? Is it legal for you to reverse engineer and develop an open-source derivative? What is the primary application and environment of that software, and your open-source variation? Are you bound by compliance issues with the existing closed-source software, or could you not just implement something like applying CMAC-AES to the ciphertext of AES in CTR mode (or some other set of good primitives and modes)? That would most likely be sufficient.

(I had typed up the responses to several posts up in a text editor, partially, as I'm strapped for time, but wanted to reply. Pay attention to M3DU54's comments, as the implementation level is where you'll most likely find indicators of which cryptographic primitives were used, if there are any such indicators. It's mostly outside of the scope of cryptographic theory.)

OK, the common question both of you asked was whether it is legal to reverse engineer the software and build an open source derivative. First, let me give you details about the software and the reason why I am trying to reverse engineer it.

I am primarily a GNU/Linux user, so software that is "Windows only" annoys me. Some ISPs here use custom protocols and custom software to allow users to "login" to their servers in order to use their broadband connections. I was their customer but had to switch due to their Linux-unfriendly attitude. Anyway, the login client that they used until now was very simple. It used to send usernames and passwords all in clear text. But after several complaints, they have changed their client and started using encryption. As I am not their customer, I haven't yet read the EULA on the software, and I don't have access to their client.

I would like to mention that these guys do have a Linux client. But it needs to be run strictly as root (which is unnecessary). And the binary is so old that on recent distributions it just crashes most of the time. I don't think they have updated it in a long time. So, many people prefer the Java client. But that client needs to be kept minimized all the time in the taskbar, which annoys most people. Even the Windows version is so badly put together that essential features are missing. Hence the need for alternative clients...

I am pretty sure that the EULA won't prohibit me from releasing an Open Source version of their software as there are 2 alternative clients which supported their older protocol. None of them had to face any legal action. So I assume that they are ok with the idea of Open Source clients. Also, the newest version of their software (win32) forces the user to run it as admin and adds / edits your IPSec policies without your knowledge or permission. This is done in good faith I guess. For example, they try to disable NetBIOS and block certain ports commonly used by worms. But at the same time, these "features" sometimes cripple the machine. Personally, I feel the user should have a choice to use either their clients or alternative ones.

What I am NOT trying to do here is crack any proprietary software which will give me any personal benefits or cause any loss in revenue for the company. And certainly cracking their encryption isn't for any malicious intent.

Here are two topics on a different forum where we were discussing the possibility of making a new Open Source client.

I originally posted the question here to explore the possible options to analyze the encryption. Truthfully, I have no experience in identifying / breaking encryption algorithms. Just wanted to know that I wouldn't be going on a wild goose chase! Both of you have given me a wide variety of options. I will obtain packet dumps of the login / logoff sessions and start analyzing them!

P.S.: Their software comes with a file called crypt.dll. I googled for this DLL and found that it is generally used by Windows SSH clients. So could these guys be using SSL? After all, they are POSTing a form to a PHP script to authenticate the user.

A quick search may reveal other useful threads that discuss the aspects of a distinguishing attack; that is, a cryptanalytical technique for distinguishing between an actual, realized primitive and the theoretical, ideal model.

I am pretty sure that the EULA won't prohibit me from releasing an Open Source version of their software as there are 2 alternative clients which supported their older protocol.

Of course, that's if they can determine that the developers were ever subject to their EULA : )

max-in wrote:

None of them had to face any legal action. So I assume that they are ok with the idea of Open Source clients.

That doesn't always mean that they ain't gonna stir up a fuss later. But this does mean you may not have been subject to their EULA. It also means that 'client' code is available in the public domain - so you may not have reversed their code at all.

Perhaps this is a better place to start. Being open-source it should serve you well as an outline. You should also be able to read the source to find both the encryption methodology AND the proprietary protocol.

Try compiling this open source client and placing liberal fprintf statements around the sections handling server communications. You should be able to get a lovely dump of unencrypted data going in and out of the server. This will give you a live example of a session to base your protocol parser around and test against. A few more fprintfs and not only will you see the unencrypted data that came in, but also which parts of the code were hit.

An example of the type of information you can get with a few liberal fprintf's. This will help you clearly understand the link between protocol and action.

max-in wrote:

What I am NOT trying to do here is crack any proprietary software which will give me any personal benefits or cause any loss in revenue for the company. And certainly cracking their encryption isn't for any malicious intent.

Oh, please don't misunderstand me - I wasn't suggesting that you were doing anything 'wrong' - I was only concerned that you should be aware of any future legalities.

max-in wrote:

I originally posted the question here to explore the possible options to analyze the encryption. Truthfully, I have no experience in identifying / breaking encryption algorithms. Just wanted to know that I wouldn't be going on a wild goose chase! Both of you have given me a wide variety of options. I will obtain packet dumps of the login / logoff sessions and start analyzing them!

A packet dump is unlikely to help you if the traffic is encrypted. Try looking at the source for the open clients; this will almost certainly present all the information you need if you study it a little.
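If you do capture packets anyway, one rough way to judge whether a payload is encrypted (or compressed) is to estimate its byte entropy. The following is only a sketch of that check under my own assumptions; the function name is made up for illustration, and the result is a hint rather than proof.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Return the Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # iterating over bytes yields integers 0..255
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Plain-text protocol data such as HTTP form fields usually scores well below 8 bits per byte, while encrypted or compressed data sits close to 8. Bear in mind that a short payload cannot score high whatever it contains, so judge on a reasonably large sample.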

max-in wrote:

P.S.: Their software comes with a file called crypt.dll. I googled for this DLL and found that it is generally used by Windows SSH clients. So could these guys be using SSL? After all, they are POSTing a form to a PHP script to authenticate the user.

It sounds VERY likely. It sounds like you shouldn't have any problems breaking this. You can use either ssleay.dll or crypt32.dll in your code to provide your SSL, and it may be possible to use an existing 'web control' component rather than a socket - which would save you reimplementing all of the HTTP protocol and allow you to concentrate on parsing the POST data and the returned results.

But, spend some time looking at the open-source clients. They did it, and they provided source ... seems like that's the best documentation you could hope for !

M3DU54, I think you didn't get me. The reason I am trying to write my own Open Source client is because ALL other clients have been rendered useless by the _new_ protocol that these guys have introduced. This is the _first_ time these guys are using encrypted authentication. Hence, I require help to determine what kind of authentication scheme they are trying to use. I have tried contacting the authors of the Open Source clients but I am yet to get a reply from any one of them.

If the Open Source clients did exist and they supported the _new_ protocol then I wouldn't have asked for help in identifying the authentication scheme / encryption used by their client.

Thanks for your other ideas. BTW I am thinking of using OpenSSL in my code.

M3DU54, I think you didn't get me.
--8<-- snip --8<--
If the Open Source clients did exist and they supported the _new_ protocol then I wouldn't have asked for help in identifying the authentication scheme / encryption used by their client.

OK, here's an update. The software isn't using SSL. In fact, the latest version isn't even using crypt.dll. I searched the DLLs that came with the software and found out that they had these functions. Why are they jumbled up? I used dumpbin.exe, which comes with Microsoft VS.NET 2003, to extract these names. I also got the same result using the "strings" command in Linux. Most other DLLs show up fine. Some do give a similar output. Are they in different formats? Any explanation for it? I checked whether the DLL I had was corrupt, but it's fine.

Of use to you is the function syntax lookup. This will tell you what parameters each function takes - This is useful for API hooking any interesting functions in order to monitor what this client software is actually doing.

Also, have you tried searching with a hex editor for any of the 'magic numbers' Justin mentioned? They should appear somewhere in one of the binaries if that particular cryptographic technique is used.

I'd have thought they'd be glad to release the protocol to someone interested in designing an OSS client for their networks. They clearly value a Linux user base because they wrote a client for it before. Now someone's coming along and wants to do it for free, so why object?

M3DU54, I haven't tried asking them for the protocol specs. I will try and ask them for the encryption algorithm info. Anyway, I tried searching for magic numbers to get the algorithm. But Google wasn't much of a help. Where will I get the magic numbers for the most common algorithms?

BTW I have already read up on some articles about API hooking. I will try to get info on these 'decorated' function names. Thanks for your time!

M3DU54, I haven't tried asking them for the protocol specs. I will try and ask them for the encryption algorithm info. Anyway, I tried searching for magic numbers to get the algorithm. But Google wasn't much of a help. Where will I get the magic numbers for the most common algorithms?

Justin mentioned some common ones in his post (above) - They should appear in one of the binaries if that particular cryptographic technique is used.

Search all of the binaries using a hex editor for portions of those numbers. Bear in mind that a match doesn't mean the crypto is used, merely that it is present. Some DLLs may include multiple cryptographic functions, not all of which will actually be called by the application.
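The same search can be scripted. The sketch below is only an illustration: the constants are a handful of widely published ones that I am sure of (they are not the list from the earlier post, which isn't reproduced in this thread), and the function names are made up. Each 32-bit word is searched for in both byte orders, because how a constant is laid out in the file depends on the platform and on how the code stores it.

```python
import struct

# A few widely published 32-bit constants. This table is an assumption for
# illustration and is far from complete.
WORD_CONSTANTS = {
    "MD5/SHA-1 initial state word": 0x67452301,
    "SHA-1 initial state word h4": 0xC3D2E1F0,
    "SHA-256 first round constant": 0x428A2F98,
    "Blowfish first P-array word": 0x243F6A88,
    "TEA/XTEA delta": 0x9E3779B9,
}

# Tables that are normally stored as raw bytes.
BYTE_PATTERNS = {
    "AES (Rijndael) S-box start": bytes.fromhex("637c777bf26b6fc5"),
}


def find_all(data, needle):
    """Return every offset at which needle occurs in data."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


def scan_for_constants(data):
    """Return sorted (offset, name, encoding) tuples for every constant found."""
    hits = []
    for name, value in WORD_CONSTANTS.items():
        for label, fmt in (("little-endian", "<I"), ("big-endian", ">I")):
            for offset in find_all(data, struct.pack(fmt, value)):
                hits.append((offset, name, label))
    for name, pattern in BYTE_PATTERNS.items():
        for offset in find_all(data, pattern):
            hits.append((offset, name, "byte sequence"))
    return sorted(hits)
```

Four-byte matches can turn up by chance in a large binary, so a hit is only a lead to follow up. The offsets it reports are file offsets, which you would still need to map to addresses in the loaded module yourself.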

If you find data indicating multiple encryption types, set a 'READ' breakpoint on each magic number using SoftICE, OllyDbg or some other debugger of your choice. If any of the magic numbers is 'read' by the program at run-time then it is a safe bet that it is being used. If it is read many times during run-time then its use is certain.

When your debugger halts the code because a magic number was read, the instruction pointer will probably be inside an encrypt or decrypt function or subfunction. If your debugger allows you to backtrace you will find the entry point for the operation (the exposed interface which the application calls)... Setting a breakpoint on the entry point of the cryptographic function should halt the program at a point where you can read the parameter buffer, step over the function, and read the returned buffer.

This should give you not only the cryptographic method, but also the plaintext protocol (also damned useful). If you do require a specific encryption key this should be easy to find from here.

Alternatively, you can work the other way... Break on access to winsock functions. When encrypted data is read from the socket into a buffer, you can simply track what happens to the buffer as the code executes. At some point it will be passed to a routine which takes the encrypted buffer and returns a plaintext buffer. Setting your breakpoint here will let you watch the plaintext protocol during a session. Knowing where this entry point is can be a great help in locating keys, determining the method, and learning the underlying proprietary protocol.

If you are not familiar with debuggers then you've got a steep learning curve ahead - if you want faster results enlist someone with some debugging/reverse-engineering experience. They should be able to locate the routines using those 'magic numbers' quite easily.

Also, looking for copyright strings and other static text in the binary may help you locate the author/version of the cryptographic module. Then you should be able to get hold of documentation regarding the API. This is of greatest help when the cryptographic library is statically linked into the main executable and hence the external interface is not visible to utilities like Microsoft's 'depends' - knowing 'what' you are looking for is invaluable in these instances.

max-in wrote:

BTW I have already read up on some articles about API hooking. I will try to get info on these 'decorated' function names. Thanks for your time!

Sorry for the late reply, had some exams. I searched for the magic numbers but didn't find them. And yes, I am relatively new to debuggers. Anyway, I searched for copyright strings but none turned up in either the binary or the DLLs. My next step is to identify the functions and their parameters in those DLLs. Once I am able to identify those, I plan to isolate the encryption and decryption functions. Next I will use the API to pass some test data to those functions and use a debugger to follow their execution to identify the algorithm used to encrypt the data. I will then implement my own version of the algorithm.