Now that the program has stopped at the beginning of NtSetContextThread, let’s let it execute until the return (windbg: pt). If everything works, the kernel should change the context of the thread, and execution should never reach the return instruction.

0:000> pt
ntdll!NtSetContextThread+0xc:
771782fc c20800 ret 8

Obviously it didn’t work as well as we had hoped. The kernel did not change the context of the thread and we did reach the return instruction.

Let’s take a look at the NTSTATUS that the syscall returned by displaying the contents of the eax register (windbg: r):

0:000> r eax
eax=c000000d

Oh no, STATUS_INVALID_PARAMETER.
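To see why 0xC000000D reads as an error, it helps to split the value into the standard NTSTATUS bit fields (severity in the top two bits, then a customer flag, a facility, and a 16-bit code). The following is a minimal sketch; the function name `decode_ntstatus` and the dictionary keys are my own illustrative choices, not part of any Windows API.

```python
def decode_ntstatus(status: int) -> dict:
    """Split a 32-bit NTSTATUS value into its bit fields."""
    status &= 0xFFFFFFFF  # treat the value as an unsigned 32-bit integer
    severity_names = ("SUCCESS", "INFORMATIONAL", "WARNING", "ERROR")
    return {
        "severity": severity_names[status >> 30],  # bits 31-30
        "customer": bool((status >> 29) & 1),      # bit 29
        "facility": (status >> 16) & 0xFFF,        # bits 27-16
        "code": status & 0xFFFF,                   # bits 15-0
    }


if __name__ == "__main__":
    # The value WinDbg showed in eax after the syscall returned.
    print(decode_ntstatus(0xC000000D))
```

For the value above, the severity field is ERROR, the facility is 0, and the code is 0xD, which is the combination defined as STATUS_INVALID_PARAMETER.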

A Quick Look at the Kernel

Let’s take a look at the internals of NtSetContextThread (by looking at its implementation in the Windows kernel).

NtSetContextThread calls PsSetContextThread:

Figure 5: call PsSetContextThread

PsSetContextThread calls PspSetContextThreadInternal:

Figure 6: call PspSetContextThreadInternal

PspSetContextThreadInternal calls KeVerifyContextRecord:

Figure 7: call KeVerifyContextRecord

KeVerifyContextRecord calls CFG’s RtlGuardIsValidStackPointer. This function will check that the value of ESP in the CONTEXT structure passed to NtSetContextThread is valid.

RtlGuardIsValidStackPointer then loads EAX with the address of the current thread’s ETHREAD structure, and uses EAX to load ECX with the address of that thread’s TEB:

Figure 10: RtlGuardIsValidStackPointer – Store TEB in ECX

If ESI, which holds the stack pointer value being checked, is located “below” the stack limit (the TEB field at ecx+8), the check fails:

Figure 11: RtlGuardIsValidStackPointer – Check Stack Limit

The check also fails if ESI is located “above” the stack base (the TEB field at ecx+4).

To put it simply, RtlGuardIsValidStackPointer returns TRUE if the value passed to it lies between the stack limit and the stack base, and FALSE otherwise. In other words, it checks whether the ESP value in the CONTEXT structure passed to NtSetContextThread really points into the stack of the current thread.
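The range check can be modeled in a few lines. This is a simplified sketch of the logic described above, not the kernel’s actual code: the `Teb32` class, the function name, and the example addresses are hypothetical, and treating the two boundary values themselves as valid is an assumption based on the “below”/“above” wording.

```python
from dataclasses import dataclass


@dataclass
class Teb32:
    """Only the two TEB fields the check reads (32-bit layout)."""
    stack_base: int   # field at TEB+4: high end of the thread's stack
    stack_limit: int  # field at TEB+8: low end of the thread's stack


def is_valid_stack_pointer(teb: Teb32, esp: int) -> bool:
    """Model of the check: is esp inside [stack_limit, stack_base]?"""
    if esp < teb.stack_limit:  # "below" the stack limit
        return False
    if esp > teb.stack_base:   # "above" the stack base
        return False
    return True


if __name__ == "__main__":
    teb = Teb32(stack_base=0x00300000, stack_limit=0x002FC000)  # made-up values
    print(is_valid_stack_pointer(teb, 0x002FF000))  # inside the stack
    print(is_valid_stack_pointer(teb, 0x00400000))  # e.g. a heap or RW cave address
```

A pivoted ESP that points anywhere outside the thread’s own stack range fails the second or first comparison, which is what produced the STATUS_INVALID_PARAMETER seen earlier.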

After the call to RtlGuardIsValidStackPointer returns, we have some bitwise manipulation that will cause KeVerifyContextRecord to return STATUS_SUCCESS (0x00000000) if RtlGuardIsValidStackPointer returns TRUE (0x1), and STATUS_INVALID_PARAMETER (0xC000000D) if RtlGuardIsValidStackPointer returns FALSE (0x0).

Figure 15: call RtlGuardIsValidStackPointer
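The exact instructions are not reproduced here, so the following is only an assumption about what such bitwise manipulation can look like. Compilers commonly turn a boolean into one of two constants without branching, using a sequence like `neg al / sbb eax, eax / and eax, 3FFFFFF3h / add eax, 0C000000Dh`. The sketch below emulates that idiom on 32-bit values; the function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF
STATUS_SUCCESS = 0x00000000
STATUS_INVALID_PARAMETER = 0xC000000D


def status_from_bool(is_valid: int) -> int:
    """Emulate a branchless TRUE/FALSE -> NTSTATUS mapping (assumed idiom)."""
    # neg al: the carry flag is set when the operand is nonzero.
    carry = 1 if (is_valid & 0xFF) != 0 else 0
    # sbb eax, eax: eax - eax - CF, i.e. 0 or 0xFFFFFFFF.
    eax = (0 - carry) & MASK32
    # and eax, 0x3FFFFFF3: keep the two's complement of 0xC000000D, or 0.
    eax &= 0x3FFFFFF3
    # add eax, 0xC000000D: wraps around to 0 for TRUE, stays 0xC000000D for FALSE.
    return (eax + STATUS_INVALID_PARAMETER) & MASK32


if __name__ == "__main__":
    print(hex(status_from_bool(1)))  # TRUE from the stack pointer check
    print(hex(status_from_bool(0)))  # FALSE from the stack pointer check
```

The trick works because 0x3FFFFFF3 + 0xC000000D equals 2^32, so the addition overflows to STATUS_SUCCESS exactly when the check returned TRUE.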

One Last Hurdle

We have two options to bypass this protection mechanism:

1. The TEB is stored in the user-mode part of the virtual address space of the target process, and it has read/write (RW) protection. We can change its StackBase and StackLimit values so that our ROP chain appears to be on the stack.

2. We can query the StackLimit of the target thread and place our ROP chain close to it. This way we won’t corrupt the actual data stored on the stack (which will be used once we restore execution to the hijacked thread), and CFG (along with other security products and mitigation solutions) will not flag our stack pivot.

I decided to go with option 2. I used GetThreadSelectorEntry to query the address of the target thread’s TEB, and then ReadProcessMemory to read the StackBase and StackLimit.

Generally speaking, stack pivoting is quite prone to detection by anti-exploitation tools. Moving the ROP chain to the stack makes this attack much stealthier and harder to detect. A fully weaponized injection will also construct its ROP chain to bypass EMET-like mitigations.

In order to avoid malicious use of this technique, we will not publish this code.