Friday, April 25, 2014

0x50 Debugging - When it's not the antivirus!

0x50, the bug check we all love because it's so easy to say 'Remove avast!, AVG, Kaspersky, McAfee, Norton, ESET, etc.' Most commonly this
bug check is caused by an antivirus corrupting the file system, by
interceptors conflicting when anti-malware and antivirus active
protections are running together (or two antiviruses running at once), and so on.
Lots of different possibilities. However, what if we're not so quick to
blame the antivirus, and instead find that it's faulty RAM?
Well, let's talk about it!

---------------------------

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced. This cannot be protected by try-except,
it must be protected by a Probe. Typically the address is just plain bad or it
is pointing at freed memory.
Arguments:
Arg1: fffffa806589b700, memory referenced.
Arg2: 0000000000000000, value 0 = read operation, 1 = write operation.
Arg3: fffff803fd7133d4, If non-zero, the instruction address which referenced the bad memory address.
Arg4: 0000000000000002, (reserved)

^^
Here we of course have the basic output of the bug check. As we can
see, parameter 1 is the memory that was referenced, and parameter 3 (if
non-zero) is the instruction address which referenced the bad memory
address (parameter 1). So, we can so far say that address fffffa806589b700 was referenced by the instruction at address fffff803fd7133d4 (parameter 2 is 0, so the dump records the access as a read). Pretty easy so far!
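To make the four parameters a little easier to read at a glance, here is a minimal Python sketch that labels them. The function name `describe_bugcheck_50` and the dictionary keys are made up for illustration, and the read/write mapping simply follows the text of the dump above (0 = read, 1 = write); anything else is reported as "other" rather than guessed at.

```python
def describe_bugcheck_50(arg1, arg2, arg3, arg4):
    """Label the four PAGE_FAULT_IN_NONPAGED_AREA (0x50) parameters.

    Hypothetical helper for illustration only; the meanings are taken
    from the bug check text shown in the dump above.
    """
    if arg2 == 0:
        operation = "read"
    elif arg2 == 1:
        operation = "write"
    else:
        # The dump text only documents 0 and 1, so do not guess.
        operation = f"other ({arg2:#x})"

    return {
        "referenced_address": f"{arg1:016x}",
        "operation": operation,
        # Arg3 is only meaningful when it is non-zero.
        "faulting_instruction": f"{arg3:016x}" if arg3 else None,
        "reserved": arg4,
    }


info = describe_bugcheck_50(
    0xFFFFFA806589B700, 0x0, 0xFFFFF803FD7133D4, 0x2
)
for key, value in info.items():
    print(f"{key}: {value}")
```

With the values from this dump, the operation comes out as a read and the faulting instruction is fffff803fd7133d4.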

6: kd> r cr2
cr2=fffffa806589b700

^^
We can see above that the 1st parameter address was stored in cr2 prior
to calling the page fault handler. This doesn't tell us anything we
don't already know about the bug check, just a confirmation, if you
will.

---------------------------

Now that we know all of this, let's go ahead and run !pte on the 1st parameter address. !pte
displays the page table entry (PTE) and page directory entry (PDE) for the specified address.
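For a feel of what a page table walk looks at, the sketch below splits a 64-bit virtual address into the four 9-bit table indexes and the 12-bit page offset used by standard x64 4-level paging. This is only an illustration of the address layout, not how !pte is implemented, and the function name `split_virtual_address` and the key names are assumptions of mine.

```python
def split_virtual_address(va):
    """Split an x64 virtual address into 4-level paging indexes.

    Assumes standard 4-level paging with 4 KB pages: four 9-bit
    indexes (bits 47-12) followed by a 12-bit page offset.
    """
    return {
        "pxe_index": (va >> 39) & 0x1FF,  # PML4 entry
        "ppe_index": (va >> 30) & 0x1FF,  # page directory pointer entry
        "pde_index": (va >> 21) & 0x1FF,  # page directory entry
        "pte_index": (va >> 12) & 0x1FF,  # page table entry
        "page_offset": va & 0xFFF,        # byte offset inside the page
    }


parts = split_virtual_address(0xFFFFFA806589B700)
for name, value in parts.items():
    print(f"{name}: {value:#x}")
```

The debugger does the real work of reading each entry and checking whether it is valid; the sketch only shows which slots in each table an address selects.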

^^
On the instruction we failed on, address fffff803`fd7133d4 dereferenced r13+10h, where r13 is 0000000000000000. This would result in a memory write to the address 00000000`00000010. Let's go ahead and run !pte on 00000000`00000010 to see whether or not it's a valid address.

Right, so the code wanted to write to 00000000`00000010, which as we can see above is a completely invalid and/or bogus address. The 1st parameter and cr2, however, note that we failed on address fffffa806589b700. This does not make sense, and is essentially not logically possible.
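The mismatch can be spelled out as simple arithmetic. The sketch below computes the effective address of a base-plus-displacement operand (r13+10h with r13 = 0) and compares it with the address the CPU reported in cr2. The helper names are hypothetical, and the check assumes a plain base + displacement operand with no index register or segment base involved.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # keep results within 64 bits


def effective_address(base, displacement):
    """Effective address of a [base + displacement] memory operand."""
    return (base + displacement) & MASK64


def fault_address_matches(base, displacement, cr2):
    """True if the reported fault address is the one the operand targets."""
    return effective_address(base, displacement) == cr2


r13 = 0x0000000000000000
cr2 = 0xFFFFFA806589B700

expected = effective_address(r13, 0x10)
print(f"instruction targets: {expected:016x}")
print(f"cr2 reports:         {cr2:016x}")
print("consistent" if fault_address_matches(r13, 0x10, cr2) else "MISMATCH")
```

On a healthy system the two values should agree for a faulting instruction like this one; here they do not.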

MiAgeWorkingSet told the hardware to write to 00000000`00000010 (which, again, is a completely invalid address), and the hardware came back and said 'I cannot write to fffffa806589b700'. I like ntdebug's analogy on this, which can be read (here).
The way I like to explain it in this specific scenario: you kindly ask your waiter
for some delicious hot lava water (doesn't exist, of course! :']), he writes it down, but comes back and says 'I'm
sorry, but we're all out of coffee'.

Ultimately, the hardware was told to write to a completely invalid address, and then came back with an entirely different invalid address. Seems very fishy on the hardware's part, doesn't it?

We can also very likely confirm that this is a hardware issue beyond the analysis alone: this specific crash dump had Driver Verifier enabled, yet it failed to flag a 3rd party driver as the culprit (because there isn't one):