License

Publication of this article as a whole or in part(s) is prohibited without authorization from Kamal Shankar.

Commercial exploitation/distribution of the software(s) in conjunction with this article is allowed provided an arrangement exists between Kamal Shankar and the licensee.

The software(s) packaged with this article comes with NO WARRANTIES EXTENDED. Use at your own risk.

Anybody interested in building a PE Loader just for the sake of it - Thanks. Read on..

Background

This is intended to be a free, open software protection system. It rests on a simple assumption that makes it viable only for small software companies, or really for a single programmer.

The philosophy of this program is different from almost ALL the currently available (commercial) software protection systems, and it is simple: We will NOT be depending on preventive measures (which can ALWAYS be bypassed by a clever computer student) but rather upon the first and most basic idea behind ALL encryption algorithms - you have the key, you have the data (otherwise try to break down the titanium vault!)

The USP of this protection system will be this: anybody having the software can make infinite copies of it, but the copies will just crash on an unauthorized machine. They will, however, run transparently on the correct machine as if the software was not protected at all!

In this way, it's not really a copy protection system in the true sense, but more of a one machine - one software kind of protection. Those using this protection do NOT need to use protection calls or APIs in their program, nor will the net executable code be changed in any way - no code obfuscation, IAT modification, section merging, entry point hiding or whatever. In short, what you compiled is what is finally loaded into the user's computer (provided it's the correct machine).

As the great +ORC had once said - "Anything that runs can be cracked".

What we do is fully leave the user to run our program(s) on his system. The program is encrypted using a machine dependent key and only the correct key will produce the correct decoded program file(s), else it will be just plain garbage.

The encryption/decryption key will be calculated dynamically at run time by a program, let's call it HardwareID. The user/customer will be required to run HardwareID on his machine. The resultant key (its length depending upon the particular encryption algorithm) will be sent in plaintext to the software publisher, who will encrypt all of the program files, or just the important ones, using that key and send the distribution to the customer along with the program loader (let's call the loader Loader32). The loader will again calculate the key dynamically and just decrypt and run the files from primary memory. We will NOT be decrypting the files to secondary memory (HDD) and this will make our distribution about .1% secure from being cracked.

[Anything which runs can be cracked. And if you have the correct tools, you just need to click].

Stumbling Block

We can approach the problem in two ways:

Decrypt data directly to memory, and try to make the Win32 program loader run the image.

Now, as far as I am aware, none of the SDK APIs allow us to run an image directly from memory; they ALL want a disk image, i.e. a valid filename. What we could do is decrypt to memory, and then write our own Win32 loader (never mind that the memory could be dumped). (Please refer to Open Source Software protection system - Part 2 to see the implementation.)

Write our own Win32 loader to read the encrypted file directly from disk, decrypt it and then load it

(Embarrassed Grin) Well, the heart of the idea - the PE loader - is probably beyond me :( I just cannot think of rewriting something which will be able to handle relocations, function lookups and ordinal lookups and translation, image handling, CS:EIP management, the stack and ...

The program packaged with this article will extract the encrypted data to HDD and run it from there.

Anybody with a working idea/implementation of anything able to decrypt an encrypted PE file directly to memory and run it from there - welcome if you wish! So actually what I am asking for is the code of a program packer. Obviously the open UPX comes to mind, but why can we not use it? Because:

I cannot think of writing CAST or TWOFISH in assembly.

The external modules used by me will be too much of a pain in the you-know-what to port to be NASM compatible.

The code for UPX Win32PE loader is BEYOND my easy comprehension (I do program in assembly but this?)

Why not use one of the many commercial packers ?

First of all, I wanted to do it to LEARN rather than EARN

I am sure that this initiative will be looked upon NOT as my own, but all. Thank you and happy coding!

And then, the biggest disadvantage of almost all current packers is that they implement preventive measures.

From my experience, preventive measures succeed in keeping out the beginners. Real crackers just need to find the locations where the preventive measures are operating, and can bust them easily!

This system is different - it has no preventive measures for the cracker to crack in the first place!

To Make it Simple - outlines for fabricating such a system

First of all, this system must satisfy the following clauses:

It must not employ any preventive measures (antidebugging, blacklists, dead code..).

Must be as transparent as possible to the end user.

Must not require source level modification. Ideally should work with just binaries.

For the time being, let's limit ourselves to Win32 and PE images, protected mode and 32 bit addressing. Code should be Visual C++, but if ASM does come in, the algorithm should be mentioned and the code well documented. No other language. [If you come across something similar but in another language, it's welcome anyway, but try to stick to C++.] [Might I suggest that we try to use existing code which has already been tried and tested? Has anybody got suggestions about this?]

Of course, Win32 APIs (as well as debugging APIs) are allowed.

Something tells me that perhaps referring to the implementation of the Win32 loader of WINE will help us out, but they have implemented EVERYTHING themselves - actually, many APIs written by them behave VERY differently from the original Microsoft APIs. In fact, as far as I can tell, their code implements page aligned relocations and addressing rather than the byte alignment used on Microsoft Windows. I will gladly stand corrected on this aspect if I got it wrong.

I might take this opportunity to make you all aware that this is just a beginning; we will soon be having more features once the Stumbling Block (the Loader) is done. I promise.

The Source

As you will see, I have uploaded a source package. It contains the HashLibProper files. Include the header file and link to the .lib. Put SysInfo.dll and HashLibProper.dll with the program and you are done.

The HashLibProper package is a simple Win32 DLL which uses SysInfo classes to get the processor name, speed, OS and installed RAM. It uses a singly linked list to hold logical drive information, their labels and serial numbers. It concatenates all this info and returns a C type string. It also returns an MD5 of this string.
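To give a feel for the concatenate-then-hash step, here is a minimal sketch. It is not the HashLibProper source (which is not published here). The `MachineInfo` and `DriveInfo` structs, their field names, and the `|` separator are all hypothetical, and a 64-bit FNV-1a hash is swapped in for MD5 purely to keep the example short. FNV-1a is not a cryptographic hash and should not be used for a real key.

```cpp
#include <cstdint>
#include <forward_list>
#include <string>

// Hypothetical record for one logical drive; field names are illustrative.
struct DriveInfo {
    std::string letter;    // e.g. "C:"
    std::string label;     // volume label
    std::uint32_t serial;  // volume serial number
};

// Hypothetical container for the values the article says are collected.
struct MachineInfo {
    std::string cpu_name;
    unsigned cpu_mhz;
    std::string os_name;
    unsigned ram_mb;
    std::forward_list<DriveInfo> drives;  // singly linked list of drives
};

// Concatenate every field into one string. The '|' separator (an assumption)
// keeps adjacent fields from running together ambiguously.
std::string fingerprint_string(const MachineInfo& m) {
    std::string s = m.cpu_name + '|' + std::to_string(m.cpu_mhz) + '|' +
                    m.os_name + '|' + std::to_string(m.ram_mb);
    for (const DriveInfo& d : m.drives) {
        s += '|' + d.letter + '|' + d.label + '|' + std::to_string(d.serial);
    }
    return s;
}

// Stand-in digest: 64-bit FNV-1a. NOT MD5 and NOT cryptographically secure.
std::uint64_t digest64(const std::string& s) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV-1a offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;  // FNV-1a prime
    }
    return h;
}
```

Changing any single field (say, the installed RAM) changes the string and therefore the digest, which is the property the whole scheme relies on.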

You will find ALL these detailed in the ReadMe.

NOW BEFORE YOU START CRYING OUT LOUD - "Where is the source code for HashLibProper package?", I will release the source once the Stumbling Block materializes and we get a concrete solution. [The code for the library is easy and small anyway, and I just want to see a little contribution to this project - if that happens I WILL release the code]

Current Implementation

Software packaged with this article.

This is best understood by going through the source package. In short we put ALL the obtained system info into a C type string and do an MD5 on it. We may use the MD5 digest as the encryption key itself but later we will do a little more math to increase security - BUT AFTER the Stumbling Block has seen some light. This key will also be produced dynamically on the user's system and the encrypted files will be decrypted using this key.

Needless to say, the person with the correct system will create the correct MD5 and thus only he gets the program to run (on other systems the decrypted output should be un-executable garbage).

So select a program file, copy it to the directory to which you extracted all our files and rename it to OriginalData.dat; run Encrypt.exe and click on 'OK'. The application will encrypt the chosen file to EncryptedData.dat. Then run Encrypt.exe again and select 'Cancel'. You will see a file DecryptedData.dat created - it's the program you chose. Rename it to an EXE (or whatever extension it had) - it will run correctly. Now copy these files to a machine of a different configuration and see what happens!

This is a VERY ROUGH implementation of my idea, but without getting the Stumbling Block cleared, I just cannot see an alternative solution!

History

15th May 2003 : Updated article to reflect implementation of Stumbling block I. Please refer to Part 2 of this article to clear up any confusion

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

Comments and Discussions

I find your idea interesting but I think it is easily breakable. Let's assume our hacker has access to a working installation of the protected program. He can trace the HID calculation and get the final value. Then he patches the EXE file - instead of detecting hardware parameters, it will use a static pre-computed value, and then the protected file will work on any system. Usually the software is "stolen"/copied from a system where it is working...

Not a bad solution, but then what's stopping the cracker from dumping the process after it's running (using ReadProcessMemory)? All that would be required then is to patch the IAT back to its original form. PEs compiled with the latest Visual C++ compilers also use the same file alignment as the memory alignment, so there would be no need to change any of that (unless the PE was compiled specifically using a file alignment different from the memory alignment).

Hello. I am trying to compile the exeprot.exe program, and the compiler is complaining that it cannot find md5.h, which is #included in exeprotdlg.cpp. Can anyone tell me where I may obtain this header file?

What about wrapping the core of the program to be "protected" into a DLL and then using an EXE front-end, like rundll32 already does for some Windows system processes?

You could do virtually ALL of your encryption / decryption from within the EXE and simply call LoadLibrary() on the DLL and enter its main process loop. The same HardwareID encoding could still apply, as once the DLL was encrypted only the proper machine could run it.

It's good to see kids turning their minds to wholesome activities such as programming, instead of wasting their lives in the hedonistic disciplines of Sex, Drugs, & Rock & Roll... or Sex with Drugs, or Sex with Rocks while Rolling in Drugs, or whatever new-fangled perversions you little monsters have thought up now...

Hi Pierre,
The acronym PE stands for "Portable Executable", which is the file format Microsoft designed to store executables (or programs). This file format is used by all Win32-based systems.
If you want to have a better idea of the internal structure of this file format, take a glance at the following link (although it is not so "fresh"... 1994, it provides the essentials):

I think your idea is pretty good - the loader part is complicated. What about going some other way:

a) "RAM disk": You dynamically create some sort of "RAM disk" within the decryption routine and put the whole decrypted file into it. Then you can give the path into your "RAM disk" to CreateProcess... After the file is loaded... there would be some way to destroy the "RAM disk"...

b) Interception: Depending on how deep your knowledge of the OS is, you perhaps are even able to intercept the ReadFile calls made by CreateProcess and deliver exactly the (decrypted) bytes it is asking for.

My knowledge is not deep enough to provide some source code... BUT I hope the ideas help somehow.

Your idea has been well taken - actually I had considered BOTH options that you have suggested, bless you !

1.) RAM Disc - I REALLY do NOT know how to implement one ! Any ideas anybody ?
But anyway, as anybody could simply dump the contents of the RAM Disc, so ..

2.) Binary code/API Interception - Now that's a VERY good suggestion, and I had thought of it.

You see Yahia, API hooking is VERY interesting, and OFTEN used in protection, and with Microsoft(R) Detours it has never been so easy!
Unfortunately, I could successfully implement Detours on WinNT based kernels only (Win2k, WinXP), but it just fails to work on Win9x!
Most of Detours DOES NOT work on Win9x.
Currently, I am writing the correct code directly into memory after the system dependent decryption.

And PLEASE do NOT underestimate the power of giving great ideas - not everybody can code, and neither can everybody come up with good ideas!

So till then, learn, work and sleep

"God then made two great lights; the greater light to rule the day, and the less light to rule the night"
- Genesis 47:3

I am currently reading in the .text or .code section of the executable to be protected using PE Header information and dumping it from the application entry point onwards into an external file "code_sec.dat" which is encrypted using Blowfish using the HardwareID MD5 as the key. The .code section of the original EXE from the application entry point onwards is overwritten using binary NULLs.
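The offset arithmetic behind that split can be sketched roughly as follows. This is not the author's code: `SectionInfo` and `cut_code_from_entry` are hypothetical names, the three fields mirror the `VirtualAddress`, `SizeOfRawData` and `PointerToRawData` members of a PE section header, and the sketch assumes the entry point lies inside the raw data of the code section. Encrypting the returned bytes is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// The three section-header values this sketch needs (hypothetical struct).
struct SectionInfo {
    std::uint32_t virtual_address;      // RVA at which the section is mapped
    std::uint32_t size_of_raw_data;     // bytes the section occupies in the file
    std::uint32_t pointer_to_raw_data;  // file offset of the section
};

// Cut everything from the entry point to the end of the section out of the
// file image, leave zero bytes behind, and return the removed bytes.
std::vector<std::uint8_t> cut_code_from_entry(std::vector<std::uint8_t>& file,
                                              const SectionInfo& text,
                                              std::uint32_t entry_rva) {
    if (entry_rva < text.virtual_address ||
        entry_rva - text.virtual_address >= text.size_of_raw_data) {
        throw std::out_of_range("entry point is not inside the section");
    }
    // Distance of the entry point from the start of the section.
    const std::uint32_t skip = entry_rva - text.virtual_address;
    const std::size_t begin = std::size_t(text.pointer_to_raw_data) + skip;
    const std::size_t count = text.size_of_raw_data - skip;
    if (begin + count > file.size()) {
        throw std::out_of_range("section extends past the end of the file");
    }
    const auto first = file.begin() + static_cast<std::ptrdiff_t>(begin);
    const auto last = first + static_cast<std::ptrdiff_t>(count);
    std::vector<std::uint8_t> cut(first, last);  // bytes to encrypt and store
    std::fill(first, last, std::uint8_t{0});     // blank them in the EXE
    return cut;
}
```

The returned buffer is what would be encrypted into the external file, and the modified `file` is what ships as the blanked EXE.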

The idea is that the "loader" does a CreateProcess() on the protected file with the suspend flag on. This, as far as I know, stops code execution before the application entry point is reached. Then, using a single WriteProcessMemory() call, I write the contents of "code_sec.dat" (after decrypting them using the current HardwareID) into the process memory, and resume execution after this has been done.

Believe me, it is working perfectly - the only problem being that the Blowfish code that I have requires 8 byte (64 bit) blocks - just what the Blowfish algo is all about.

Now, the Blowfish reference code from Counterpane offers us some help in the form of a function, quoted from the doc itself:

"void BF_cfb64_encrypt(unsigned char *in,
unsigned char *out,
long length,BF_KEY *schedule,
unsigned char *ivec,
int *num,int encrypt);
This is one of the more useful functions in this Blowfish library, it implements CFB mode of Blowfish with 64bit feedback. This allows you to encrypt an arbitrary number of bytes, you do not require 8 byte padding. Each call to this routine will encrypt the input bytes to output and then update ivec and num."
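To show why CFB mode removes the padding requirement, here is a minimal sketch of CFB with 64-bit feedback. It is not the library code quoted above. Because Blowfish needs large S-box tables, the 64-bit block cipher XTEA is swapped in as a stand-in, and `Block`, `Key`, `encrypt_block` and `cfb64` are names made up for this example. The mode logic is the same for any 8-byte block cipher.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Block = std::array<std::uint8_t, 8>;
using Key = std::array<std::uint32_t, 4>;

// Stand-in 64-bit block cipher: XTEA with 32 cycles, encrypt direction only.
// CFB never needs the decrypt direction of the block cipher.
void encrypt_block(Block& b, const Key& k) {
    std::uint32_t v0 = (std::uint32_t(b[0]) << 24) | (std::uint32_t(b[1]) << 16) |
                       (std::uint32_t(b[2]) << 8) | std::uint32_t(b[3]);
    std::uint32_t v1 = (std::uint32_t(b[4]) << 24) | (std::uint32_t(b[5]) << 16) |
                       (std::uint32_t(b[6]) << 8) | std::uint32_t(b[7]);
    std::uint32_t sum = 0;
    const std::uint32_t delta = 0x9E3779B9u;
    for (int round = 0; round < 32; ++round) {
        v0 += (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + k[sum & 3]);
        sum += delta;
        v1 += (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + k[(sum >> 11) & 3]);
    }
    for (std::size_t i = 0; i < 4; ++i) {
        b[i] = static_cast<std::uint8_t>(v0 >> (24 - 8 * i));
        b[4 + i] = static_cast<std::uint8_t>(v1 >> (24 - 8 * i));
    }
}

// CFB with 64-bit feedback, processed byte by byte. The output always has
// exactly the same length as the input, so no padding is required.
std::vector<std::uint8_t> cfb64(const std::vector<std::uint8_t>& in,
                                const Key& key, Block iv, bool encrypt) {
    std::vector<std::uint8_t> out(in.size());
    std::size_t n = 0;  // position inside the current 8-byte feedback block
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (n == 0) encrypt_block(iv, key);  // produce a fresh keystream block
        const std::uint8_t x = static_cast<std::uint8_t>(in[i] ^ iv[n]);
        out[i] = x;
        iv[n] = encrypt ? x : in[i];  // the feedback is always the ciphertext byte
        n = (n + 1) & 7;
    }
    return out;
}
```

With this shape, a 145679-byte code section encrypts to exactly 145679 bytes, so nothing spills into the neighbouring section. The IV here is a plain parameter; how it is chosen and stored is left open.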

However, the EXE fails to run, saying that "Blowfish.DLL" is required. The .LIB file supplied fails to be recognised as valid by my VC++ 6.0 linker and there's no source which I can adapt.

So the option is to either shift to another algo which does not operate on blocks (and I am not aware of any good one) or find out another Blowfish implementation which can encrypt an arbitrary number of bytes, not requiring 8 byte padding.

Can anybody help?

Why not process everything as normal up to the last complete block and then do the following:

1. copy the remaining bytes into a new buffer
2. fill the rest of the buffer with 0 (or some other constant value)
3. encode

for decode you do the same but only copy the required bytes to the destination
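The steps above can be sketched as follows, assuming an 8-byte block size and zero as the fill value. The names `padded_size`, `pad_zero` and `unpad` are made up for this example. Zero padding cannot be told apart from real trailing zero bytes, so the sketch assumes the original length is stored somewhere alongside the data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kBlockSize = 8;  // block size of a 64-bit cipher

// Round a length up to the next multiple of the block size.
std::size_t padded_size(std::size_t n) {
    return (n + kBlockSize - 1) / kBlockSize * kBlockSize;
}

// Steps 1 and 2: copy the data and fill the rest of the last block with zeros.
// The result is ready for step 3, the block-wise encode.
std::vector<std::uint8_t> pad_zero(const std::vector<std::uint8_t>& data) {
    std::vector<std::uint8_t> out(data);
    out.resize(padded_size(data.size()), 0);
    return out;
}

// After decoding: copy only the required bytes to the destination.
std::vector<std::uint8_t> unpad(const std::vector<std::uint8_t>& padded,
                                std::size_t original_size) {
    assert(original_size <= padded.size());
    return std::vector<std::uint8_t>(
        padded.begin(),
        padded.begin() + static_cast<std::ptrdiff_t>(original_size));
}
```

For instance, a buffer of 13 bytes is padded to 16, and `unpad(padded, 13)` gives back the original 13.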

PS: don't overwrite the zeroed values with the decoded ones. In case of relocation that won't work. Simply add the decoded data to the code segment, because the loader (CreateProcess) has already applied the relocations to the zeroed code/data segment.
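A small model of why adding works where overwriting does not, under some assumptions: every fixup is a 32-bit little-endian value, the fixup offsets are known and do not overlap, and the region was all zeros before the loader relocated it. The names `relocate` and `merge_decoded` are made up for this sketch. The addition has to be done per 32-bit fixup rather than byte by byte, otherwise carries would be lost.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Read and write a 32-bit little-endian value at a byte offset.
std::uint32_t load32(const Bytes& b, std::size_t at) {
    return std::uint32_t(b[at]) | (std::uint32_t(b[at + 1]) << 8) |
           (std::uint32_t(b[at + 2]) << 16) | (std::uint32_t(b[at + 3]) << 24);
}

void store32(Bytes& b, std::size_t at, std::uint32_t v) {
    for (std::size_t i = 0; i < 4; ++i) {
        b[at + i] = static_cast<std::uint8_t>(v >> (8 * i));
    }
}

// Model of what the OS loader does for each 32-bit fixup: add the load delta.
void relocate(Bytes& image, const std::vector<std::size_t>& fixups,
              std::uint32_t delta) {
    for (std::size_t at : fixups) store32(image, at, load32(image, at) + delta);
}

// Merge decoded code into a region that was zero-filled and then relocated.
// Fixup sites are added as 32-bit values; every other byte is simply copied,
// which is the same as adding it to zero.
void merge_decoded(Bytes& image, const Bytes& decoded,
                   const std::vector<std::size_t>& fixups) {
    assert(image.size() == decoded.size());
    std::vector<std::uint32_t> sums;
    for (std::size_t at : fixups) {
        sums.push_back(load32(image, at) + load32(decoded, at));
    }
    image = decoded;
    for (std::size_t i = 0; i < fixups.size(); ++i) {
        store32(image, fixups[i], sums[i]);
    }
}
```

After the loader has relocated the zeroed region, each fixup site holds just the delta, so adding the decoded value gives the same result as relocating the original code. A plain overwrite would throw the delta away.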

What you have just described is actually the process of padding - it has already been implemented in my code.

You see, what I mean to say is this: say the code segment is 145679 bytes long. That's not divisible by 8, so I add 1 byte of padding by appending a NULL to it, and then we have 8 byte blocks which we can easily encode/decode using Blowfish.

The problem is that the code section was originally 145679 bytes long, and considering that this size IS the length of the .text (code) section of the EXE file, the 145680th byte in the EXE file is actually part of another section, so I just cannot overwrite it.

That's why I have to put the encrypted code section in a separate file named "code_sec.dat", which I am dynamically writing into the EXE memory process using Read/WriteProcessMemory()

HOLY sh*t ! I just realized - what I have been speaking about is the new code that I wrote and forgot to upload to CodeProject !

You don't need to write the decrypter in asm. You place all the code in another segment (say ".etext" - encrypted text) and in the WinMain/DllMain you simply decrypt this block. You can get the size using extern "C" variables that live in a simple asm file:

I forgot one important thing:
Before you try to decrypt the code you must undo all relocations (they can be easily found with the imagehelper library).
Then decode the block and compare the checksum. If the checksum is wrong, re-encode with the wrong password (to restore the original data) or quit the application.
At last, reapply the relocation data. Relocation data is simple:
Header:
DWORD start_offset;
DWORD size_of_block; (including header)
Entry:
WORD Value;