Wave

Guest

In this thread I am going to discuss a short bit about anti-rootkit development, and I will be providing some basic code examples to help you on your way (should you be interested in anti-rootkit development). Throughout the thread I will be discussing both kernel-mode and user-mode methods.

Notes:

I recommend you zoom out one from the default zoom on your browser as it may make reading this thread (or any other large threads) much easier.

Please use any code examples presented in this thread responsibly and not for malicious purposes – all code examples presented within this thread are for nothing but educational purposes and I am not responsible for any misuse of them.

This thread is mainly focused on termination of rootkit-protected processes as opposed to detecting concealment by a rootkit (e.g. detecting hidden objects – I can save this for a future thread).

Part 1 – terminating rootkit-protected processes
The interesting thing about rootkits is that they do not always just conceal malicious activity on the system (e.g. hiding malicious processes, registry keys and files); they may also protect malicious software from being removed. More often than not, the rootkit will protect the malicious processes against termination, either via kernel-mode hooks or via user-mode hooks. Kernel-mode callbacks are less common, but on x64 systems a rootkit with kernel-mode components may well use them, since they are an alternative to SSDT hooking, which cannot be done there without a PatchGuard (Kernel Patch Protection) bypass.

Rootkits tend to hook the important functions we need in order to open a handle to the malicious processes and terminate them (or their threads), such as NtTerminateProcess, NtOpenProcess, NtSuspendProcess, NtTerminateThread, NtAllocateVirtualMemory, NtWriteVirtualMemory and NtUnmapViewOfSection; with kernel-mode patching they may also hook functions closer to the kernel, like PsLookupProcessByProcessId. That leaves us a few options to try:
- We can attempt to circumvent the hook by using a function which the rootkit has not hooked but which can still terminate the "protected" processes (e.g. if a basic user-mode rootkit hooks Kernel32.dll!TerminateProcess then we can circumvent the hook with a call to NtTerminateProcess).
- If we fail to terminate the "protected" processes, we can attempt to identify which functions are hooked (e.g. SSDT, IAT/EAT) and then repair the hooked entry or function prologue to remove the hook.
- If user-mode functions are hooked, we can attempt to perform a direct NTAPI system call.
- We can attempt to bypass the access checks from kernel-mode using kernel-only functions (most effective on x64 systems, where SSDT hooking cannot be performed by default, so the rootkit cannot simply patch those functions to block this method).

More sophisticated rootkits may hook lower-level routines from user-mode, such as KiFastSystemCall, X86SwitchTo64BitMode or Wow64 functions (e.g. Wow64SystemServiceEx). On x86 systems a kernel-mode rootkit may prefer to hook the SYSENTER entry point instead, so that one single hook intercepts all NTAPI calls coming from user-mode processes; it can then filter on the function being called (via its system service number) and on the caller process (e.g. via the IoGetCurrentProcess function).

I have decided to share a method of terminating rootkit-protected processes, which can be pretty beneficial to security software developers who want to make their malware termination much more powerful. It relies on two functions, PsLookupProcessByProcessId and ObOpenObjectByPointer, and it is most effective on x64, since on x86 a kernel-mode component can hook both of them. The reason ObOpenObjectByPointer is useful in this scenario is that, when called with KernelMode as the access mode, it skips the access checks, so it will generally get past protection that is enforced only through kernel-mode callbacks registered with ObRegisterCallbacks. The downside is that you cannot do this from user-mode, since ObOpenObjectByPointer and PsLookupProcessByProcessId are exported by ntoskrnl.exe rather than ntdll.dll. However, security software tends to be given admin privileges (since the user provides consent for its installation), and therefore it can install a device driver containing this functionality.

The code is pretty self-explanatory, however I will explain it briefly. When the device driver loads it sets the unload routine (even though nothing is done in it, it's still good practice to have one so the driver can be stopped and unloaded without a reboot), and afterwards it calls the terminate_process function with one parameter: the PID (Process Identifier) of the process we want to terminate. The terminate_process function starts off by setting up some variables with empty values, and then uses PsLookupProcessByProcessId to receive a pointer to the EPROCESS structure of the target process. Once we have that pointer, we call ObOpenObjectByPointer to open the target object and have a handle with our desired access rights returned whilst bypassing access checks (in this scenario, we want a handle to the target process). To finish off the job we call ZwTerminateProcess and pass in the newly acquired handle. Since we have the handle we can now use it for all sorts of things (e.g. termination, suspension, injection, etc.). That being said, you cannot use ZwSuspendProcess from kernel-mode without manually acquiring its address, but you can still perform injection from kernel-mode.

The driver source code is shown below (it's written in C, not C++). It's a basic demonstration; however, if you change the target PID, re-compile it and load it (e.g. with OsrLoader for testing purposes), it should work as-is:

You do not need to modify the linker settings since /integritycheck is not required to use PsLookupProcessByProcessId or ObOpenObjectByPointer.

It is a basic method of terminating a rootkit protected process however it can be very effective on x64 systems especially, and much more sophisticated than just calling NtOpenProcess to obtain the process handle.

Please remember that with great power comes great responsibility.

Part 2 – detecting and repairing user-mode hooks
In this part I will be discussing how you can go about identifying and then repairing hooked functions present in the NTAPI. Before I get started I need to tell you that there are many different types of user-mode API hooking; popular methods include IAT/EAT hooking and inline (prologue) hooking.

IAT stands for Import Address Table. If you perform an IAT hook, you overwrite the table entry for an imported function with the address of your own function, so whenever the target program calls that function through its imports (the normal way), your hook is executed instead.

EAT stands for Export Address Table. If you perform an EAT hook (most effective early on, before callers have resolved the function) then whenever a program looks up the address of a specific function it wants to use (e.g. via GetProcAddress from Kernel32.dll), it receives the address of your callback function as opposed to the genuine address of the function stub.
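
One common way to spot an IAT or EAT hook is to check whether the resolved address still points inside the module that is supposed to implement the function. The sketch below is portable and works on plain numbers; the ModuleRange struct and both function names are made up for illustration, and on a real system the base and size would come from the loaded module's PE headers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical description of a loaded module's address range.
struct ModuleRange {
    std::uintptr_t base;  // load address of the module
    std::size_t size;     // size of the mapped image in bytes
};

// True if addr lies inside [base, base + size).
bool points_into(const ModuleRange& m, std::uintptr_t addr) {
    return addr >= m.base && (addr - m.base) < m.size;
}

// An import/export entry is suspicious when the address it resolves to
// lies outside the module that should contain the function.
bool entry_looks_hooked(const ModuleRange& expected, std::uintptr_t resolved) {
    return !points_into(expected, resolved);
}
```

Bear in mind that this is only a heuristic: forwarded exports legitimately resolve into a different module, so a real scanner would need to account for those before flagging an entry.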

Hooking can be as simple as obtaining the address of the target function you wish to hook, modifying the memory protection at this address so you can alter the bytes at the start of the function prologue, placing a JMP instruction followed by the relative offset to your callback (the proxy that execution flow is redirected to once the hook is triggered), and then restoring the memory protection to how it originally was.

The below code is a very basic example of an x86 hook which doesn't even use a trampoline; it simply uses the logic mentioned above (C++):

The above code is limited because it doesn't use a trampoline, so the hook has no way to call the original function, but it's fine for an example demonstration. When MessageBoxW is called, the callback function is triggered and it calls MessageBoxA with different parameters instead, meaning I don't see "Hello Wave!" in the message box but the new value I passed to MessageBoxA from my hook function. If I try to use MessageBoxW from within the callback, it'll cause a crash: the call just triggers my hook over and over again until the stack overflows, since there is no trampoline through which to call the original function.

You can easily make a trampoline by allocating executable memory, copying the original bytes over, appending a jump back to the rest of the original function, and then using that trampoline address to call the original function.
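
As a rough sketch of what such a trampoline contains for a 5-byte x86 hook: the overwritten prologue bytes, followed by a JMP rel32 back to the target function just past the patch. The code below only builds the bytes in an ordinary array so it stays portable; build_trampoline and its parameters are illustrative names, and the addresses are passed in as plain 32-bit numbers.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

constexpr std::uint32_t kPatchSize = 5;        // JMP opcode + rel32
constexpr std::uint32_t kTrampolineSize = 10;  // stolen bytes + JMP back

// Builds the trampoline bytes: the stolen prologue followed by a JMP rel32
// that lands on target_addr + 5, the first instruction the hook left intact.
std::array<std::uint8_t, kTrampolineSize> build_trampoline(
        const std::uint8_t (&stolen)[kPatchSize],
        std::uint32_t trampoline_addr,
        std::uint32_t target_addr) {
    std::array<std::uint8_t, kTrampolineSize> out{};
    std::memcpy(out.data(), stolen, kPatchSize);
    out[kPatchSize] = 0xE9;  // JMP rel32
    // rel32 is measured from the end of the JMP, i.e. trampoline_addr + 10.
    const std::uint32_t rel =
        (target_addr + kPatchSize) - (trampoline_addr + kTrampolineSize);
    for (std::uint32_t i = 0; i < 4; ++i) {
        out[kPatchSize + 1 + i] = static_cast<std::uint8_t>(rel >> (8 * i));
    }
    return out;
}
```

This only works as-is when the stolen bytes are whole instructions that do not depend on their own address; otherwise a length disassembler and fix-ups are needed. On Windows the resulting bytes would be copied into memory allocated with VirtualAlloc as executable.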

Now moving onto basic hook detection, I will first have to explain how the above code worked with some better demonstrations.

As we already know, the code will start off by getting the address of the target function I want to hook (in this case, it'll use GetProcAddress to find the address of User32.dll!MessageBoxW). After it has gotten the address, it will modify the memory protection so we can write to the memory at that address, and then it will proceed to place a JMP instruction at the very start of the function prologue, followed by the relative offset from the end of that JMP to our callback function.

If I check the disassembler I can see that the start of the function prologue (first 5 bytes before the hook) is as follows:

Code:

8B FF mov edi,edi
55 push ebp
8B EC mov ebp,esp

Therefore, we overwrite the first 5 bytes with a JMP instruction: the opcode byte 0xE9 and then, starting +1 byte from it, the 4-byte relative offset to our callback. The first 5 bytes of the function after we have placed our hook are as follows:

Code:

E9 80 86 3C 8A jmp MessageBoxW_Callback (0AD1235h)

As we can see, the byte sequence 8B FF 55 is no longer at the function prologue; instead there is E9 80 86 3C 8A (JMP rel32), which is 5 bytes in size: one opcode byte plus a 4-byte relative offset. This is the patch size for the function we are hooking, which is why we only modify the memory protection for the first 5 bytes of the function prologue.
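
To make the encoding concrete, here is a small portable sketch that builds the 5 patch bytes from two 32-bit addresses. The function name make_jmp_rel32 is made up for illustration. The thread does not state the address of MessageBoxW; the value 0x76708BB0 used in the usage note is only what the posted bytes and the callback address 0x0AD1235 imply when worked backwards.

```cpp
#include <array>
#include <cstdint>

// Encodes "JMP rel32" placed at hook_addr so that it lands on callback_addr.
// The offset is relative to the end of the 5-byte instruction.
std::array<std::uint8_t, 5> make_jmp_rel32(std::uint32_t hook_addr,
                                           std::uint32_t callback_addr) {
    const std::uint32_t rel = callback_addr - (hook_addr + 5u);
    std::array<std::uint8_t, 5> patch{};
    patch[0] = 0xE9;  // JMP rel32 opcode
    for (std::uint32_t i = 0; i < 4; ++i) {
        // rel32 is stored little-endian
        patch[1 + i] = static_cast<std::uint8_t>(rel >> (8 * i));
    }
    return patch;
}
```

Calling make_jmp_rel32(0x76708BB0, 0x00AD1235) gives E9 80 86 3C 8A, matching the bytes in the disassembly above: 0x00AD1235 - 0x76708BB5 wraps around to 0x8A3C8680, which is stored little-endian.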

As we can see from the above information, the function is not meant to start with a JMP instruction (0xE9); the first byte of the function should be 0x8B (the start of a MOV instruction). Therefore, to make a simple detector for this hook (i.e. for functions which have been hooked via a JMP instruction and should not start with one), all we need to do is check whether the first byte of the function prologue is equal to 0x8B or not. If it is not 0x8B then the function has been hooked (this will not hold for every function in the Win32/NTAPI, but it works fine for the specific function we are dealing with).

If I call this function after using the code from earlier, it will return as true because the function will detect that the start of the function prologue for MessageBoxW is not 0x8B (since it was replaced with 0xE9).
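
The first-byte check can be sketched in a portable way by running it over a byte buffer instead of a live function address. The name is_prologue_hooked is hypothetical; on Windows the pointer would be the address returned by GetProcAddress.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::uint8_t kExpectedFirstByte = 0x8B;  // start of "mov edi, edi"

// Returns true when the prologue no longer starts with the expected byte.
// Only meaningful for functions known to begin with 0x8B.
bool is_prologue_hooked(const std::uint8_t* prologue, std::size_t len) {
    if (prologue == nullptr || len == 0) {
        return false;  // nothing to inspect
    }
    return prologue[0] != kExpectedFirstByte;
}
```

A single-byte test is easy to fool (a hook could be placed a few bytes further in), so comparing the whole prologue against known-good bytes is the sturdier variant.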

Now that we can identify whether the start of the function prologue is how it should be, we can repair the function when the first byte is not 0x8B. Below is some example code which uses the above check but repairs the function if it has been hooked:

Therefore, after hooking the function and then using the above function, the function prologue will be restored to how it originally was, as if we never hooked it in the first place. You may have noticed the VirtualProtect usage in the above code: just as when hooking the function earlier, we have to change the memory protection before we can modify the bytes at that address.
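
The repair step boils down to writing the known original bytes back when the check fails. Here is a minimal portable sketch on a byte buffer, assuming the original prologue is 8B FF 55 8B EC; repair_prologue and kOriginalPrologue are illustrative names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed original first 5 bytes: mov edi,edi / push ebp / mov ebp,esp.
const std::uint8_t kOriginalPrologue[5] = {0x8B, 0xFF, 0x55, 0x8B, 0xEC};

// Restores the prologue if its first byte is not 0x8B.
// Returns true when a repair was performed.
bool repair_prologue(std::uint8_t* prologue, std::size_t len) {
    if (prologue == nullptr || len < sizeof(kOriginalPrologue)) {
        return false;  // not enough room to inspect or patch
    }
    if (prologue[0] == 0x8B) {
        return false;  // looks intact, nothing to do
    }
    std::memcpy(prologue, kOriginalPrologue, sizeof(kOriginalPrologue));
    return true;
}
```

On a live function the memcpy would be wrapped in a pair of VirtualProtect calls (make the page writable, then restore the old protection), typically followed by FlushInstructionCache.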

If you wish to detect hooks on NTAPI functions then bear in mind that, on x86, most NTAPI functions start with MOV EAX, SystemID, where SystemID is the system service number for that particular function. This form of MOV EAX is encoded with the opcode 0xB8, therefore you could detect hooks the same way as shown in this thread but compare the first byte against 0xB8 instead, and then adapt the repair code for the particular function you are scanning and repairing. Also bear in mind that it depends on whether the function was hooked inline or not, and on the hook method used: the above hook repair code won't detect e.g. an EAT hook.
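
For NTAPI stubs the same idea can be sketched as follows: check for the 0xB8 opcode and, if it is there, read the 32-bit service number that follows it. The function name is made up, and the 0x1234 in the usage note is a placeholder rather than a real service number.

```cpp
#include <cstddef>
#include <cstdint>

// Reads the system service number from an x86 stub that begins with
// "mov eax, imm32" (opcode 0xB8). Returns false if the stub does not
// start that way, which suggests an inline hook or a different layout.
bool read_service_number_x86(const std::uint8_t* stub, std::size_t len,
                             std::uint32_t* number_out) {
    if (stub == nullptr || number_out == nullptr || len < 5) {
        return false;
    }
    if (stub[0] != 0xB8) {
        return false;
    }
    // imm32 is stored little-endian right after the opcode.
    *number_out = static_cast<std::uint32_t>(stub[1]) |
                  (static_cast<std::uint32_t>(stub[2]) << 8) |
                  (static_cast<std::uint32_t>(stub[3]) << 16) |
                  (static_cast<std::uint32_t>(stub[4]) << 24);
    return true;
}
```

Given the bytes B8 34 12 00 00 this reports 0x1234; given a stub that starts with E9 it returns false.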
-------------------------------------------------------------

Hopefully someone found this thread useful, I know it is not the most detailed or longest thread out there but it may still be beneficial since there is some useful information contained within this thread.

Wave

Guest

I was meant to write this up yesterday but I decided to just do it today as I had all day to write it up and post it. I had a friend a while back who was good with API hooking, so he taught me the very basics (e.g. a basic x86 hook like the one in this thread) and I went on from there... After a long time practising I'm much better, and have expanded to device driver development, etc. So why not share the knowledge?

Wave

Guest

As an addition to the original thread:
You can copy ntdll.dll (or any other Windows API library like kernel32.dll, user32.dll, etc.), re-name the copy to something else, load that into memory and use the functions from there. A rootkit which hooks the functions of the original module may be tricked into not hooking the functions within the re-named copy which you loaded into the address space of the process. I can do a code example if someone would like me to.

The reason this may work in some cases is because the rootkit may use GetProcAddress (or a custom function wrapper to do the work that GetProcAddress would have done) on specifically the target module, and may not be able to identify the module copy version which exports the same functions.

As for unhooking, if you can detect the hook then you can read the function prologue from a mapped copy of ntdll.dll into memory and copy the bytes across to the address in memory at your own program, as opposed to manually going through and restoring the instructions in memory as presented in this thread. Unless it has been patched on disk through file infection, of course.
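
The "copy the bytes across from a clean copy" approach can be sketched like this, with both the live bytes and the clean bytes represented as ordinary buffers. restore_from_clean_copy is an illustrative name; on Windows the clean buffer would come from a mapping of ntdll.dll read from disk.

```cpp
#include <cstddef>
#include <cstdint>

// Compares live bytes against a clean reference and writes back any byte
// that differs. Returns the number of bytes restored (0 means no patch found).
std::size_t restore_from_clean_copy(std::uint8_t* live,
                                    const std::uint8_t* clean,
                                    std::size_t len) {
    if (live == nullptr || clean == nullptr) {
        return 0;
    }
    std::size_t restored = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (live[i] != clean[i]) {
            live[i] = clean[i];
            ++restored;
        }
    }
    return restored;
}
```

This assumes the compared range holds no relocated addresses; if it does, the clean copy has to be relocation-adjusted first or every relocated pointer will show up as a difference.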

Level 26

I'm really enjoying these posts, very educational. Even though I've used Windows for the majority of my computing life, I have always dual-booted a Linux distro and ran various distros exclusively for a few years, and a lot of my knowledge now lies in hardening Linux distros in various ways. So, having posts such as these is fantastic learning. Thank you yet again @Wave

Level 26

@Wave - For some of us, like myself, maybe this is like reading a book in a foreign language that we studied only briefly. Anyone in that situation and interested in coding will surely find your coding dialogs extremely helpful. Thank you.

Wave

Guest

I did read your PM and I didn't mean to ignore it by not responding - I was on my phone at the time. However, I'm not going to provide any source code examples on things like system calls... This thread felt a bit risky for me with the device driver part alone, and other threads I've written, and by posting a working solution on system calls I feel it would be on the verge of me helping malware authors more than security software developers, really.

Just check the function prologue by opening up ntdll.dll in IDA Pro (or by getting the address dynamically, setting a breakpoint and going to the disassembly at the address of that function as mapped into memory), then hard-code the bytes into your code and copy them into some newly allocated memory (VirtualAlloc and memcpy will do)... Then use it like an API hook trampoline. Alternatively, do the other method where you make a copy of ntdll.dll -> rename it -> load it with LoadLibraryA and then use the functions from that version, or map ntdll.dll from disk into memory, copy the bytes across to new memory and use them from there... Which would make it dynamic and not hard-coded for each OS version.

Anyway, regarding different function prologues, they should be more or less the same across the NTAPI... If they were suddenly changed then previous apps wouldn't work correctly on newer versions of Windows. It's a critical module in the Windows OS which allows the transition to the kernel for Native API functions, which the Win32 functions end up calling and which are carried out via a system call. This is how NTDLL works: it's a wrapper for system calls.

Therefore, the shape of the function prologue for NtTerminateProcess on Windows 10 is the same as it is on Windows 7, for example, although the system service number embedded in it can differ between versions, which is why hard-coded bytes have to be picked per OS version.

If a function's behaviour or parameters need to change then Microsoft make a new function name by adding the Ex suffix to it; this is why we have functions like CreateRemoteThread, but also another one called CreateRemoteThreadEx (there are differences)...

Regarding the EAT questions, I'm pretty sure you'll go far if you use Google and put inurl:stackoverflow at the end of the search query. I am certain someone asked a question like the one in your PM on that site before, because I used to read them all when I was learning about it.

Level 1

First of all, thank you for your attention and for the quick and enlightening response.
Usually NTAPI functions start with mov eax,000000XX, I believe,
but if we compare 64-bit Windows 7 with 32-bit XP (nowadays a lot of people are still using XP) this can be a problem, because a lot of new functions have been added.

Wave

Guest

No, nowadays a lot of people (home users) are using Windows 10, although it is true that the number of Windows XP users did spike in the past few months. There are still a lot of enterprises using it too, but I doubt your Anti-Cheat engine will matter to them at all. You could always check the OS version and choose which bytes you copy over to the newly allocated memory depending on it; it's not the end of the world.

Yes, the function usually starts with mov eax, <id> (replace <id> with the system service number of the function being called; it is placed in EAX so that it can be read from kernel-mode after the syscall instruction).

Instead of me answering this, try it with both and see how it goes. That way you'll gain knowledge through experience, and it's a really good habit to get into.

Yes, I do recommend re-naming the DLL on disk. However, for further concealment you should manually map the DLL yourself rather than loading it through the loader (LoadLibraryA, or LdrLoadDll exported by ntdll.dll), since a module loaded that way is linked into the PEB module lists, whereas a manually mapped one is not.

Deleted member 65228

Guest

I was going through this section and the threads and noticed this part was unanswered, therefore I will explain the differences between the modules present in SysWOW64 and System32.

WOW64 stands for "Windows On Windows 64". To make it more clear, it is "Windows 32-bit on Windows 64-bit". It is an emulation implementation which is present only on 64-bit versions of Windows to provide compatibility for 32-bit compiled software. Without WOW64, 32-bit compiled applications would simply not work there because they would be unable to perform any calls to the NT functions (exported by NTDLL; the real functions exist in kernel-mode memory, and NTDLL is just a trampoline for the user-mode to kernel-mode switch).

The SysWOW64 folder ("\\??\\C:\\Windows\\SysWOW64\\") is used to hold the 32-bit copies of Windows modules on a 64-bit environment, and the System32 folder ("\\??\\C:\\Windows\\System32\\") is used to hold the 64-bit copies of Windows modules on a 64-bit environment. However, on a 32-bit system the SysWOW64 folder will not exist and the 32-bit compiled modules will be placed under the System32 folder.
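
The folder rules above can be summed up in a few lines. This is just an illustrative sketch: system_module_dir is a made-up helper, it assumes Windows is installed on C:, and it returns ordinary Win32 paths without the \\??\\ prefix.

```cpp
#include <string>

// Returns the folder that holds the system modules matching the process
// bitness, following the rules described in the post.
std::string system_module_dir(bool os_is_64bit, bool process_is_64bit) {
    if (os_is_64bit && !process_is_64bit) {
        // 32-bit process on 64-bit Windows (running under WOW64)
        return "C:\\Windows\\SysWOW64";
    }
    // Native case: 64-bit process on 64-bit Windows,
    // or 32-bit process on 32-bit Windows.
    return "C:\\Windows\\System32";
}
```

So system_module_dir(true, false) gives the SysWOW64 path, while both native cases give System32.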

The reason there are 32-bit compiled copies of Windows modules on a 64-bit environment (such as ntdll.dll, kernel32.dll, user32.dll, sechost.dll, etc.) is that they are used for emulation. WOW64 is the emulation system and there are also modules specifically related to it, however some of the Wow64 functions are exported by ntdll.dll and kernel32.dll as well. A 32-bit process running on a 64-bit environment is running under WOW64, whereas a 64-bit process on a 64-bit environment is not.

If a process is running under WOW64 (a 32-bit process on a 64-bit OS, also known as an x86-x64 process), the SysWOW64 modules will be loaded for Windows modules instead of the System32 modules. However, the 64-bit version of ntdll.dll will ALSO be loaded alongside the 32-bit copy. The reason for this is that WOW64 will change the code segment selector to 0x33 (which is for 64-bit world), and the 64-bit copy of ntdll.dll contains the NTAPI function stubs which perform the system calls (via the SYSCALL instruction). So after WOW64 has been passed through, execution flow moves to the 64-bit stub for the NT function being called and then returns back.

There is a way to use the 64-bit NTDLL exported functions from an x86-x64 process without passing through WOW64, and it is via a technique known as Heaven's Gate. This works by manually entering 64-bit world... You can intercept WOW64 by setting detours on various WOW64 functions, too.

Summary:
- WOW64 is an emulation implementation to provide support for 32-bit software on 64-bit versions of Windows.
- 32-bit software will pass through the emulation implementation internally so the OS can execute 64-bit code for functionality to be maintained (behind the scenes).
- 32-bit programs on a 64-bit environment load 32-bit Windows modules as well as a 64-bit copy of ntdll.dll (alongside the 32-bit copy).

Deleted member 65228

Guest

Yes and No. It really depends on how the process protection is enforced for the target process.

Windows process protection is enforced on an even deeper level (e.g. csrss.exe) - Windows Vista enforces at PspOpenProcess at least. Functions like ObOpenObjectByPointer will lead back to pass through non-exported functions. It isn't impossible for a protected process to be manually enforced through detouring a routine like ObOpenObjectByPointer, too. By 'detouring' I am referring to byte-patching the function (for 32-bit systems only of course due to PatchGuard).

Generally speaking though, a process that is protected only through a kernel-mode callback registered with ObRegisterCallbacks can probably be terminated by acquiring a handle with ObOpenObjectByPointer.
