Search This Blog

Saturday, November 26, 2011

Advanced DLL Injection

It has been a while since my last article. Special thanks to those who decided to stay with me despite the long break and welcome to new readers!

In this article I am going to cover such a trivial (as it may seem) subject as DLL injection. For some reason, most of the tutorials on the web only give us a brief coverage of the topic, mostly limited to invocation of LoadLibraryA/W Windows API function in the address space of another process. While this is not bad at all, it gives us the least flexible solution. Meaning that all the logic MUST be hardcoded in the DLL we want to inject. On the other hand, we may incorporate all the configuration management (loading config files, parsing thereof, etc) into our DLL. This is better, but still fills it with code which is only going to run once.

Let us try another approach. What we are going to do, is write a loader (an executable what will inject our DLL into another process) and a small DLL, which will be injected. For simplicity, the loader will also create the target process. Being a Linux user, I used Flat Assembler and mingw32 for this task, but you may adjust the code for whatever environment you prefer.

A short remark for nerds before we start. The code in this article does not contain any security checks (e.g. checking correctness of the value returned by specific function) unless it is needed as an example. If you decide to try this code, you'll be doing this at your own risk.

So, let the fun begin.

Creation of target process

Let's assume, that the loader has already passed the phase of loading and parsing configuration files and is ready to start the actual job.

Windows provides us with all the tools we need to start a process. There are more then one way of doing that, but let us use the simplest and use CreateProcess API function. Its declaration looks quite frightening, but we'll make it as easy as possible:

BOOLWINAPI CreateProcess(

__in_optLPCTSTR lpApplicationName,

__inout_optLPTSTR lpCommandLine,

__in_optLPSECURITY_ATTRIBUTES lpProcessAttributes,

__in_optLPSECURITY_ATTRIBUTES lpThreadAttributes,

__inBOOL bInheritHandles,

__inDWORD dwCreationFlags,

__in_optLPVOID lpEnvironment,

__in_optLPCTSTR lpCurrentDirectory,

__inLPSTARTUPINFO lpStartupInfo,

__outLPPROCESS_INFORMATION lpProcessInformation

);

We only have to specify half of the parameters when calling this function and set all the rest to NULL. This function has two variants CreateProcessA and CreateProcessW as ASCII and Unicode versions respectively. We are going to stick with ASCII all way long, so, our code would look like this (due to the fact that "CreateProcess" is rather a macro then function name, we should explicitly specify A version as some compilers tend to default to W versions):

Don't forget to set the cb field of startupInfo to (DWORD)sizeof(STARTUPINFO), otherwise it would not work.

If the function succeeds, we get all the information about the process (handles and IDs) in the processInformation structure, which has the following prototype:

typedefstruct_PROCESS_INFORMATION

{

HANDLE hProcess; //Handle to the process

HANDLE hThread; //Handle to the main thread of the process

DWORD dwProcessId; //ID of the new process

DWORD dwThreadId; //ID of the main thread of the process

}PROCESS_INFORMATION, *LPPROCESS_INFORMATION;

By now, the process has been created, but it is suspended. Meaning that it has not started its execution yet and will not until we call ResumeThread(processInformation.dwThreadId) telling the operating system to resume the main thread of the process, but this is going to be the last action performed by our loader.

Lancet

One may call it a shellcode, but it has nothing to do with the viral payload or any other malicious intent (unless, someone would say that breaking into address space of another process is malicious by definition). It is the code, that we are going to inject into the target process. It, theoretically, may be written in any language as long as it may be position independent and compiled into native instructions (in our case x86 instructions), but I prefer to do such things in Assembly language.

It is always a good idea, to think of what your code is intended to do before writing a single line of it, in this case it is a golden idea. The code needs to be small, preferably fast and stable as it is a bit of a headache to debug once it has been injected.

There are two basic tasks that you would want to assign to this code:

Load our DLL

Call the initialization procedure exported by our dLL

and one unavoidable condition - it has to be a function declared as ThreadProc callback, due to the fact that we are going to use the CreateRemoteThread function in order to launch it. The prototype of a ThreadProc callback function looks like this:

DWORDWINAPI ThreadProc( __inLPVOID lpParameter);

which means that it has to return a value of type DWORD (which is actually unsigned int). It accepts one parameter, which may either be an actual value (but you have to cast it to LPVOID type) or a pointer to an array of parameters. One more thing about this function (the last but not the least!) it is an stdcall function - WINAPI macro is defined as __declspec(stdcall). This means that our function has to take care of cleaning the stack before return. In our case it is quite easy, simply use ret 0x04 (assuming that size of LPVOID is 4 bytes).

Another important thing to mention - you will, obviously need to know how many bytes your function occupies in order to correctly allocate memory in the address space of the target process and move your code there. In addition to allocation of one block of executable memory for our function, you will also need to allocate one block for data - configuration settings to be passed to the injected DLL. It is easy to pass the address of the parameters as an argument to our ThreadProc.

The skeleton of the function would look like this:

lancet:

push ebp

mov ebp, esp

sub esp, as_much_space_as_you_need_for_variables

push registers_you_are_planning_to_use

;function body

pop registers_you_used

mov esp, ebp

pop ebp

ret 0x04lancet_size = $-lancet

The last line gives us the exact size of the function in bytes. The following is the source file template:

So, what are we going to insert into the "function body"? First of all, as our code, once it is injected, has no idea of where in the memory it is, we should save our "base address" and calculate all the offsets relative to that address. This is done in a simple manner. We call the next address and pop the return address into our local variable.

call @f @@: pop dword [ebp-4] sub dword [ebp-4], @b-lancet

that's it. Now the variable at [ebp-4] contains our "base address". Each time we want to call another function or access our data (strings with names, remember?) we should do the following:

Now about the execution itself. Although, we now know where we are, we have no idea of what the address of LoadLibraryA is. There are, at least, two ways to get that address nicely. First has been described in my "Stealth Import of Windows API" article. The second is also interesting - PEB. Yes, we are going to access the Process Environment Block, find the LDR_MODULE structure which refers to KERNEL32.DLL and get its base address (which is also a handle to the library). Some may say that this way is not reliable, not stable and even dangerous, but I will say, that statements like these are not serious. We are not going to change anything in those structures. We are only going to parse them.

How do we find the PEB? This is quite simple. It is located at [FS:0x30]. Once we have it, we are on our way to PEB_LDR_DATA address, which is at PEB+0x0C. In order to parse the PEB_LDR_DATA structure, we should declare the following in our Assembly code:

I leave the implementation of the module list parsing function up to you. You just have to keep in mind that the string you are going to check are represented by the UNICODE_STRING structure (described in the article referenced above). Another thing to remember, is that it is better to implement case insensitive string comparison function.

Once you find the LDR_MODULE wich baseDllName is "kernel32.dll" you have its handle (simply in the baseAddress field). You may use the _get_proc_address function from the same article (mentioned above) in order to get the address of the LoadLibraryA function. Having that address, you are ready to load your DLL (do the actual injection). Personal suggestion - do not put lots of code into the DllMain function.

LoadLibraryA returns a handle to the newly loaded DLL, which you can use in order to locate you initialization function (remember it has to be exported by your DLL and preferably use the stdcall convention). After you _get_proc_address of your initialization function, call it and pass the address of the data block as a parameter (it was passed to our lancet function as a parameter on stack):

push dword [ebp+8] ;parameter passed to lancet is here call dword [ebp-12] ;assume that you stored the address of the initialization ;function here

That's it. Your code may now return. The DLL has been injected and initialized.

Injectionsomehow, we have missed the exciting process of injection of our lancet code. Don't worry, I have not forgotten about it.

As I have mentioned above, we have to allocate two blocks - for code and data. This can be done by calling the VirtualAllocEx function, which allows memory allocations in the address space of another process.

Use MEM_COMMIT as flAllocationType and PAGE_EXECUTE_READWRITE and PAGE_READWRITE for allocation of code and data block respectively. This function returns the address of allocated block in the address space of the specified process or NULL.

The WriteProcessMemory API function is used to copy your code and data into the address space of the target process.

Once you have copied both the data and the code, you will want to call your thread function. The only way to call a function which resides in the memory of another process is by calling the CreateRemoteThread API.