Locreate: An Anagram for Relocate
skape
12/2006
mmiller@hick.org
1) Foreword
Abstract: This paper presents a proof of concept executable packer
that does not use any custom code to unpack binaries at execution time. This
is different from typical packers which generally rely on packed executables
containing code that is used to perform the inverse of the packing operation
at runtime. Instead of depending on custom code, the technique described in
this paper uses documented behavior of the dynamic loader as a mechanism for
performing the unpacking operation. This difference can make binaries packed
using this technique more difficult to signature and analyze, but only when
presented to an untrained eye. The description of this technique is meant to
be an example of a fun thought exercise and not as some sort of revolutionary
packer. In fact, it's been used in the virus world many years prior to this
paper.
Thanks: The author would like to thank Skywing, spoonm, deft,
intropy, Orlando Padilla, nemo, Richard Johnson, Rolf Rolles, Derek Soeder,
and Andre Protas for their discussions and feedback.
Challenge: Prior to reading this paper, the author recommends that
the reader attempt to determine the behavior of the packer that was used on
the binary included in the attached code sample. The binary itself is
innocuous and just performs a few simple printf operations.
Previous Research: This technique has been used in the virus world far in
advance of this writing. Examples that apply this technique include
W95/Resurrel and W95/Silcer. Further research indicates that Peter Szor did a
write-up on this technique entitled ``Tricky Relocations'' in the April 2001
edition of Virus Bulletin[2,3].
2) Locreate
Executable packers, such as UPX, are commonly employed by malware as a means
of delaying or otherwise thwarting the process of static analysis. Packers
also have perfectly legitimate uses, but these uses fall outside of the scope
of this paper. The reason packers make static analysis more difficult is
because they alter the form of the binary to the point that what appears on
disk is entirely different from what actually ends up executing in memory.
This alteration is typically accomplished by encapsulating a pre-existing
binary in a ``host'' binary. The algorithm used to encapsulate the
pre-existing binary in the host binary is what differs from one packer to the
next. In most cases, the host binary must contain code that will perform the
inverse of the packing operation in order to decapsulate the original binary.
The code that is responsible for performing this operation is typically
referred to as an unpacker. The process of unpacking the original binary is
usually done entirely in memory without writing the original version out to
disk. Once the original binary is unpacked, execution control is transferred
to the original binary which begins executing as if nothing had changed.
This general approach represents an easy way of altering the form of a binary
without changing its effective behavior. In fact, it's pretty much analogous
to payload encoders that are used in conjunction with exploits to alter the
form of a payload in order to satisfy some character restrictions without
changing the payload's effective behavior. In the case of payload encoders,
some arbitrary code must be prefixed to the encoded payload in order to
perform the inverse of the encoding operation once the payload is executed.
However, like payload encoders, the use of custom code to perform the inverse
of the packing or encoding operation can lead to a few problems.
The most apparent of these problems has to do with the fact that while the
packed form of an executable may be entirely different from its original, the
code used to perform the unpacking operation may be static. In the event that
the unpacker consists of static code, either in whole or in part, it may be
possible to signature or otherwise identify that a particular packing
algorithm has been used to produce a binary and thus make it easier to restore
the original form of the binary. This ability is especially important when it
comes to attempting to heuristically identify malware prior to allowing a user
to execute it.
The use of custom code can also make it possible for tools to be developed
that attempt to identify unpackers based on their behavior. Ero Carrera has
provided some excellent illustrations relating to the feasibility of this type
of attack against unpackers[1]. An understanding of an unpacker's behavior may
also make it possible to acquire the original binary without allowing it to
actually execute by simply tracing the unpacker up until the point where it
transfers execution control to the original binary. In the case of malware,
this weakness means that benefits gained from packing an executable can be
completely nullified.
Both of these problems are meant to illustrate that even though custom unpacking
code is often a requirement, its mere presence exposes a potential point of
weakness. If it were possible to eliminate the custom code required to unpack
a binary, it could make the two problems described previously much more difficult
to realize. To that point, the technique described in this paper does not
rely on the presence of custom code in a packed binary in order to unpack
itself. Instead, documented behavior of the dynamic loader is used to perform
the unpacking whenever the packed binary is executed. While this approach has
its benefits, there are a number of problems with it that will be discussed
later on. In the interest of brevity, the packer described in this paper will
simply be referred to as locreate. As was already mentioned,
locreate leverages a documented feature of most dynamic loaders in order to
perform its unpacking operation. Given that the process of unpacking
typically involves transforming the original binary's contents back into its
original form, there are only a finite number of dynamic loader features that
might be abused. Perhaps the feature that is best suited for transforming the
contents of a binary at runtime is the dynamic loader feature that was
designed to do just that: relocations.
In the event that a binary is unable to be loaded at its preferred base
address at runtime, the dynamic loader is responsible for attempting to move
the binary to another location in memory. The act of moving a binary from its
preferred base address to a new base address is more commonly referred to as
relocating. When a binary is relocated to a new base address, any references
the binary might have to addresses that are relative to its preferred base
address will no longer be valid. As such, references that are relative to the
preferred base address must be updated by the dynamic loader in order to make
them relative to the new base address. Of course, this presupposes that the
dynamic loader has some knowledge of where in the binary these address
references are made. To satisfy this presupposition, binaries will typically
include relocation information to provide the dynamic loader with a map to the
locations within the binary that need to be adjusted. When a binary does not
include relocation information, it's classified as a non-relocatable binary.
Without relocation information, a binary cannot be relocated to an alternate
base address in an elegant manner (ignoring position independent executables).
The structures used to convey relocation information differ from one binary
format to the next. For the purpose of this paper, only the structures used
to describe relocations of Portable Executable (PE) binaries will be
discussed. However, it should be noted that the approaches described in this
paper should be equally applicable to other binary formats, such as ELF. In
fact, other binary formats make the technique used by locreate even easier.
For example, ELF supports applying relocation fixups with an addend. This
addend is basically an arbitrary value that is used in conjunction with a
transformation.
The PE binary format conveys relocation information through
one of the data directories that is included within the optional header
portion of the NT header. This data directory is symbolically referred to
by the IMAGE_DIRECTORY_ENTRY_BASERELOC index. The base relocation
data directory consists of zero or more IMAGE_BASE_RELOCATION structures which
are defined as:
typedef struct _IMAGE_BASE_RELOCATION {
    ULONG VirtualAddress;
    ULONG SizeOfBlock;
    // USHORT TypeOffset[1];
} IMAGE_BASE_RELOCATION, *PIMAGE_BASE_RELOCATION;
The base relocation data directory is a little bit different from most other
data directories. The IMAGE_BASE_RELOCATION structures embedded in the data
directory do not occur immediately one after the other. Instead, there are a
variable number of USHORT sized fixup descriptors that separate each
structure. The SizeOfBlock attribute of each structure describes the entire
size of a relocation block. Each relocation block consists of the base
relocation structure and the variable number of fixup descriptors. Therefore,
enumeration of the base relocation data directory is best performed by using
the SizeOfBlock attribute of each structure to proceed to the next relocation
block until none are remaining. The VirtualAddress attribute of each
relocation block is a page-aligned relative virtual address (RVA) that is used
as the base address when processing its associated fixup descriptors. In this
manner, each relocation block describes the relocations that should be applied
to exactly one page.
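The enumeration described above can be sketched in portable C. This is a minimal illustration rather than the loader's actual code: the reloc_block_header type and the count_fixups helper are hypothetical names, the directory is assumed to be available as a raw byte buffer on a little-endian host, and a SizeOfBlock smaller than the 8-byte header is simply treated as malformed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the fixed 8-byte header of IMAGE_BASE_RELOCATION (hypothetical name). */
typedef struct {
    uint32_t virtual_address;   /* page-aligned RVA of the block */
    uint32_t size_of_block;     /* header plus all fixup descriptors */
} reloc_block_header;

/*
 * Walks a raw base relocation directory by hopping from block to block
 * using SizeOfBlock. Returns the total number of 16-bit fixup descriptors,
 * or -1 if a block header is malformed.
 */
static long count_fixups(const uint8_t *dir, size_t dir_size)
{
    size_t pos = 0;
    long total = 0;

    while (dir_size - pos >= sizeof(reloc_block_header)) {
        reloc_block_header hdr;
        memcpy(&hdr, dir + pos, sizeof(hdr));

        /* A block must hold its own header and fit in the directory. */
        if (hdr.size_of_block < sizeof(hdr) ||
            hdr.size_of_block > dir_size - pos)
            return -1;

        total += (long)((hdr.size_of_block - sizeof(hdr)) / sizeof(uint16_t));
        pos += hdr.size_of_block;   /* advance to the next block */
    }
    return total;
}
```

The number of descriptors in a block falls out of the layout: (SizeOfBlock - 8) / 2.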
The fixup descriptors contained within a relocation block describe the address
of the value that should be transformed and the method that should be used to
transform it. The PE format describes about 10 different transformations that
can be used to fix up an address reference. These transformations are conveyed
through the top 4 bits of each fixup descriptor. The bottom 12 bits are used
to describe the offset into the VirtualAddress of the containing relocation
block. Adding the bottom 12 bits of a fixup descriptor to the VirtualAddress
of a relocation block produces the RVA that contains a value that needs to be
transformed. Of the transformation methods that exist, the one most commonly
used on x86 is IMAGE_REL_BASED_HIGHLOW, or 3. This transformation dictates that
the 32-bit displacement between the original base address and the new base
address should be added to the value that exists at the RVA described by the
fixup descriptor. The act of adding the displacement means that the value
will be transformed to make it relative to the new base address rather than
the original base address. To better understand how all of these things tie
together, consider the following source code example:
#include <stdio.h>
int main(int argc, char **argv)
{
    printf("Hello World.\n");
    return 0;
}
When compiled down, this function appears as the following:
sample!main:
00401010 55 push ebp
00401011 8bec mov ebp,esp
00401013 6800104200 push offset sample!__rtc_tzz (sample+0x21000) (00421000)
00401018 e80c000000 call sample!printf (00401029)
0040101d 83c404 add esp,4
00401020 33c0 xor eax,eax
00401022 5d pop ebp
00401023 c3 ret
At address 0x00401013, main pushes the address of the string that contains
``Hello World.'':
0:000> db 00421000 L 10
00421000 48 65 6c 6c 6f 20 57 6f-72 6c 64 2e 0a 00 00 00 Hello World.....
In this case, the push instruction is referring to the string using an
absolute address. If the sample executable must be relocated at runtime, the
dynamic loader must be provided with the relocation information necessary to
fixup the reference to the absolute address. The dumpbin.exe utility from
Visual Studio can be used to confirm that this information exists. The first
requirement is that the binary must have relocation information. By default,
all DLLs will contain relocation information, but executables typically do
not. Executables can be linked with relocation information by using the
/fixed:no linker flag. When a binary is linked with relocations, the
presence of relocation information is simply indicated by a non-zero
VirtualAddress and Size for the base relocation data directory. These values
can be determined through dumpbin.exe /headers:
26000 [ EE8] RVA [size] of Base Relocation Directory
Since relocation information must be present at runtime, there should also be
a section, typically named .reloc, that contains the virtual mapping
information for the relocation information:
SECTION HEADER #5
  .reloc name
    1165 virtual size
   26000 virtual address (00426000 to 00427164)
    2000 size of raw data
   24000 file pointer to raw data (00024000 to 00025FFF)
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
42000040 flags
         Initialized Data
         Discardable
         Read Only
In order to validate that this executable contains relocation information for
the absolute address reference made to the ``Hello World.'' string, the
dumpbin.exe /relocations command can be used:
File Type: EXECUTABLE IMAGE
BASE RELOCATIONS #5
    1000 RVA,       A8 SizeOfBlock
      14  HIGHLOW            00421000
      2C  HIGHLOW            00420350
      ...
This output shows the first relocation block which describes the RVA 0x1000.
Each line below the relocation block header describes the individual fixup
descriptors. The information displayed includes the offset into the page, the
type of transformation being performed, and the current value at that location
in the binary. From the disassembly above, the location of the address
reference that is being made is 0x00401014. Therefore, the very first fixup
in this relocation block provides the dynamic loader with the information
necessary to change the address reference to the new base address when the
binary is relocated. If this binary were to be relocated to 0x50000000, the
HIGHLOW transformation would be applied to 0x00401014 as follows. The
displacement between the new base address and the old base address would be
calculated as 0x50000000 - 0x00400000, or 0x4fc00000. Adding 0x4fc00000 to
the existing value of 0x00421000 produces 0x50021000 which is subsequently
stored in 0x00401014. This causes the absolute address reference to become
relative to the new base address.
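The worked example above can be reproduced with a few lines of C. This is a sketch under stated assumptions: the macro and function names are hypothetical, the page is held in an ordinary byte buffer on a little-endian host, and only the HIGHLOW type (3) is handled.

```c
#include <stdint.h>
#include <string.h>

#define FIXUP_TYPE(d)     ((uint16_t)(d) >> 12)      /* top 4 bits */
#define FIXUP_OFFSET(d)   ((uint16_t)(d) & 0x0fff)   /* bottom 12 bits */
#define REL_BASED_HIGHLOW 3

/*
 * Applies a single HIGHLOW fixup to the page stored in `page`.
 * Returns 0 on success, or -1 if the descriptor is not HIGHLOW or the
 * 32-bit value would not fit inside the page buffer.
 */
static int apply_highlow(uint8_t *page, size_t page_size, uint16_t descriptor,
                         uint32_t old_base, uint32_t new_base)
{
    uint32_t delta = new_base - old_base;   /* wraps modulo 2^32 */
    size_t offset = FIXUP_OFFSET(descriptor);
    uint32_t value;

    if (FIXUP_TYPE(descriptor) != REL_BASED_HIGHLOW)
        return -1;
    if (page_size < sizeof(value) || offset > page_size - sizeof(value))
        return -1;

    memcpy(&value, page + offset, sizeof(value));
    value += delta;                          /* the HIGHLOW transformation */
    memcpy(page + offset, &value, sizeof(value));
    return 0;
}
```

With the descriptor 0x3014 (type 3, offset 0x14), an old base of 0x00400000, and a new base of 0x50000000, the stored value 0x00421000 becomes 0x50021000, matching the arithmetic in the text.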
Based on this basic understanding of how relocations are processed, it's now
possible to describe how a packer can be implemented that takes advantage of
the way the dynamic loader processes relocation information. As has been
illustrated above, relocation information is designed to make it possible to
fixup absolute address references at runtime when a binary is relocated.
These fixups are applied by taking into account the displacement between the
new base address and the original base address. More often than not, this
displacement isn't known ahead of time, thus making it impossible to reliably
predict how the content at a specific location in the binary will be altered.
But what if it were possible to deterministically know the displacement in
advance? Knowing the displacement in advance would make it possible to alter
various locations of the binary in a manner that would permit the original
values to be restored by relocations at runtime. In effect, the on-disk
version of the binary could be made to appear quite different from the
in-memory version at runtime. This is the basic concept behind locreate.
In order for locreate to work it must be possible to predict the displacement
reliably. Since the displacement is calculated in relation to the preferred
base address and the expected base address, both values must be known.
Furthermore, the binary must be relocated every time it executes in order for
the relocations to be applied. As it happens, both of these problems can be
solved at once. Since a binary is only guaranteed to be relocated if its
preferred base address is in conflict with an existing address, a preferred
base address must be selected that will always lead to a conflict. This can
be accomplished by setting the preferred base address to any invalid user-mode
address (any address at or above 0x80000000). This assumes that the machine
that the executable will run on is not running with /3GB. If it is, a higher
address would have to be used. Alternatively, the base address can be set to
SharedUserData which is guaranteed to be located at 0x7ffe0000 in every
process. Setting the binary's preferred base address to any of these
addresses will force it to be relocated every time it executes. The only
unknown is what address the binary is expected to be relocated to.
Determining the address that will be relocated to depends on the state of the
process' address space at the time that the binary is relocated. If the
binary that's being relocated is an executable, then the process' address
space is generally in a pristine state since the executable is one of the
first things to be mapped into the address space. As such, the first
available address will always be 0x10000 on default installations of Windows.
If the binary is a DLL, it's hard to predict what the state of the address
space will be in all cases. When a conflict does occur, the kernel searches
for an available address region by traversing from lowest to highest address.
For the purposes of this paper, it will be assumed that an executable is being
packed and that the address being relocated to is 0x10000. Further research
may provide insight into how to better control or alter the expected base
address.
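The two known quantities can be captured in a short sketch. The constant and function names below are hypothetical, and the values simply restate the assumptions made in the text: no /3GB, SharedUserData at 0x7ffe0000, and an executable that lands at 0x10000.

```c
#include <stdint.h>

#define KERNEL_SPACE_START 0x80000000u  /* assumes the machine is not using /3GB */
#define SHARED_USER_DATA   0x7ffe0000u  /* always occupied in every process */
#define EXPECTED_EXE_BASE  0x00010000u  /* first free address assumed by the text */

/* Nonzero if a preferred base of this value can never be honored. */
static int base_forces_relocation(uint32_t preferred_base)
{
    return preferred_base >= KERNEL_SPACE_START ||
           preferred_base == SHARED_USER_DATA;
}

/* Value the loader adds for every HIGHLOW fixup, computed modulo 2^32. */
static uint32_t displacement(uint32_t preferred_base, uint32_t actual_base)
{
    return actual_base - preferred_base;
}
```

For a preferred base of 0x80000000 and an actual base of EXPECTED_EXE_BASE, the displacement wraps around to 0x80010000. The wrap is harmless because the loader's addition is also performed on 32-bit values.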
With both the preferred base address and the expected base address known, the
only thing that remains is to perform the operations that will transform the
on-disk version of the binary in a manner that causes custom relocations to
restore the binary to its original form at runtime. This process can be both
simplistic and complicated. The simplest approach would be to enumerate over
the contents of each section in the binary, altering the value at each
location by subtracting the displacement and then creating a relocation fixup
descriptor that will ensure that the contents are restored to the expected
value at runtime. This is how the proof of concept works. A more complicated
approach would be to create multiple relocation fixup descriptors per-address.
This would mean that the displacement would need to be subtracted once for
each fixup descriptor. It should also be possible to apply relocations to
individual bytes within a four byte span rather than applying relocations in
four byte increments. Even more interesting would be to use some fixup types
other than HIGHLOW, although this could be seen as something that might make
generating a signature easier.
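The simplest approach described above might look roughly like the following. This is not the proof of concept's source, only a sketch of the same idea: pack_page and relocate_page are hypothetical names, a single page is processed in four-byte steps, and relocate_page stands in for the dynamic loader so the round trip can be checked.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REL_BASED_HIGHLOW 3
#define PAGE_BYTES        0x1000u

/*
 * Mangles up to one page in place: each aligned 32-bit value has the
 * displacement subtracted, and a matching HIGHLOW descriptor is emitted.
 * Returns the number of descriptors written to `fixups`.
 */
static size_t pack_page(uint8_t *page, size_t len, uint32_t displacement,
                        uint16_t *fixups, size_t max_fixups)
{
    size_t count = 0;
    size_t off;

    if (len > PAGE_BYTES)
        len = PAGE_BYTES;

    for (off = 0; off + 4 <= len && count < max_fixups; off += 4) {
        uint32_t value;
        memcpy(&value, page + off, sizeof(value));
        value -= displacement;               /* undone by the loader later */
        memcpy(page + off, &value, sizeof(value));
        fixups[count++] = (uint16_t)((REL_BASED_HIGHLOW << 12) | off);
    }
    return count;
}

/*
 * Stand-in for the loader: adds the displacement back at every offset.
 * The descriptors are trusted to come from pack_page, so no bounds checks.
 */
static void relocate_page(uint8_t *page, uint32_t displacement,
                          const uint16_t *fixups, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        size_t off = fixups[i] & 0x0fffu;
        uint32_t value;
        memcpy(&value, page + off, sizeof(value));
        value += displacement;
        memcpy(page + off, &value, sizeof(value));
    }
}
```

Creating multiple fixup descriptors per address, as suggested above, would amount to subtracting the displacement once per descriptor in pack_page.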
The end result of this whole process is a functional proof of concept that
packs a binary in the manner described above. To get a feel for how different
the binary looks after being packed, consider what the implementation of main
from earlier in this paper looks like. Notice how the first two instructions
are the same as they were previously. This has to do with the fact that base
addresses must align on 64KB boundaries, and thus the lower two bytes of each
four-byte value are not changed. This could be further improved through
strategies such as those described above:
.text:84011000 loc_84011000:
.text:84011000 push ebp
.text:84011001 mov ebp, esp
.text:84011003 in al, dx
.text:84011004 add [eax+0], dh
.text:84011006 add [edi+edi*8+1209C15h], eax
.text:8401100D test [ebx-3FCCFB3Ch], al
.text:84011013 loope near ptr 84010FD8h
.text:84011015
.text:84011015 loc_84011015:
.text:84011015 push (offset off_8401139C+1)
The locreate proof of concept has been tested on Windows XP and Windows 2003
Server. Initial testing on Windows Vista indicates that Vista does not
properly alter the entry point address after relocations have been applied
when an executable is packed. Even though the proof of concept implementation
works, there are a number of more fundamental problems with the technique
itself.
The first set of problems has to do with techniques that can be used to
signature locreate packed executables. Since locreate relies on injecting a
large number of relocation fixups, it may be possible to heuristically detect
an increased number of relocation fixups with relation to the size of
individual segments. This particular attack could be solved by decreasing the
number of relocation fixups injected by locreate. This would have the effect
of only partially mangling the binary, but it might be enough to make people
wonder what's going on without giving things away. Even if it weren't
possible to heuristically detect an increased number of relocation fixups,
it's definitely possible to detect the fact that an executable packed by
locreate will have an invalid preferred base address that will always result
in a conflict. This fact alone makes it mostly trivial to at least detect
that something odd is going on.
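Both detection ideas are easy to express as a heuristic. The sketch below is purely illustrative: the image_summary structure, the flag names, and the one-fixup-per-eight-bytes threshold are assumptions chosen for the example, and the density check uses the whole image size rather than individual sections for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the header fields a scanner would collect. */
typedef struct {
    uint32_t image_base;      /* preferred base from the optional header */
    uint32_t size_of_image;   /* total mapped size of the image */
    size_t   highlow_fixups;  /* HIGHLOW descriptors found in the directory */
} image_summary;

#define SUSPICIOUS_BASE    0x1u   /* preferred base can never be honored */
#define SUSPICIOUS_DENSITY 0x2u   /* unusually many fixups for the size */

static unsigned locreate_indicators(const image_summary *img)
{
    unsigned flags = 0;

    /* Kernel-range or SharedUserData bases always conflict. */
    if (img->image_base >= 0x80000000u || img->image_base == 0x7ffe0000u)
        flags |= SUSPICIOUS_BASE;

    /*
     * A fully packed image carries about one fixup per four bytes.
     * The threshold of one per eight bytes is an arbitrary assumption.
     */
    if (img->size_of_image != 0 &&
        img->highlow_fixups > img->size_of_image / 8)
        flags |= SUSPICIOUS_DENSITY;

    return flags;
}
```

A packer that injects fewer fixups would slip under the density check, but the base address check would still fire.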
Detection is only the first problem, however. Once a locreate packed
executable has been detected, the next logical step is to attempt to figure
out some way of obtaining the original executable. Since locreate relies on
relocation fixups to do this, the only thing one would have to do in order to
obtain the original binary would be to relocate the executable to the expected
base address that was used when the binary was packed, such as 0x10000. While
it's trivial to develop tools to perform this action, the Interactive
Disassembler (IDA) already supports it. When opening an executable, the
``Manual Load'' checkbox can be toggled. This will cause IDA to prompt the
user to enter the base address that the binary should be loaded at. When the
base address is entered, IDA processes relocations and presents the relocated
binary image. The mitigating factor here is that the user must know the
expected base address, otherwise the binary will still appear completely
mangled when it's relocated to the wrong base address.
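A standalone tool that performs the same action could be built around a routine like the one below. It is a sketch, not a description of how IDA works: relocate_image is a hypothetical name, the image is assumed to be laid out in memory so that an RVA is a direct buffer index, the host is assumed to be little-endian, and only HIGHLOW fixups (plus ABSOLUTE padding entries, type 0) are handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Applies every HIGHLOW fixup in a raw base relocation directory to an
 * RVA-indexed image buffer, as if the image had been moved from
 * preferred_base to new_base. Returns 0 on success, -1 on malformed input.
 */
static int relocate_image(uint8_t *image, size_t image_size,
                          const uint8_t *dir, size_t dir_size,
                          uint32_t preferred_base, uint32_t new_base)
{
    uint32_t delta = new_base - preferred_base;
    size_t pos = 0;

    while (dir_size - pos >= 8) {
        uint32_t rva, block_size;
        size_t i, count;

        memcpy(&rva, dir + pos, 4);
        memcpy(&block_size, dir + pos + 4, 4);
        if (block_size < 8 || block_size > dir_size - pos)
            return -1;

        count = (block_size - 8) / 2;
        for (i = 0; i < count; i++) {
            uint16_t desc;
            uint32_t value;
            uint64_t target;

            memcpy(&desc, dir + pos + 8 + i * 2, 2);
            if ((desc >> 12) == 0)          /* ABSOLUTE: padding, skip */
                continue;
            if ((desc >> 12) != 3)          /* only HIGHLOW is handled */
                return -1;

            target = (uint64_t)rva + (desc & 0x0fffu);
            if (target + 4 > image_size)    /* refuse writes outside the image */
                return -1;

            memcpy(&value, image + target, 4);
            value += delta;
            memcpy(image + target, &value, 4);
        }
        pos += block_size;
    }
    return 0;
}
```

Calling this with the base address used at packing time (0x10000 in this paper) recovers the original contents. Any other base yields a different, still mangled, image.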
In the author's opinion, these problems make locreate a sub-par packer. At
best it should be viewed as an interesting approach to the problem of packing
executables, but it should not be relied upon as a means of thwarting static
analysis. Anyone who reads this paper will have the tools necessary to unpack
executables that have been packed by locreate. With that said, it should be
noted that there is still an opportunity for further research that could help
to identify ways of improving locreate. For instance, a better understanding
of differences in the way the dynamic loader and existing static analysis
tools process relocation fixups could provide some opportunity for
improvement. Results from some of the author's initial tests of these ideas
are included in appendix A. Here's a brief list of some differences that could
exist:
1. Different behaviors when processing fixups
It's possible that the dynamic loader and static analysis tools such as IDA
may not support the same set of fixup types. Furthermore, they may not
process fixup types in the same way. If differences do exist, it may be
possible to create a packed executable that will work correctly when used
against the dynamic loader but not render properly when relocated using a
static analysis tool such as IDA.
2. Relocation blocks with non-page-aligned VirtualAddress fields
It's unknown whether or not the dynamic loader and static analysis tools are
able to properly handle relocation blocks that have non-page-aligned
VirtualAddress values. In all normal circumstances, VirtualAddress will be
page aligned.
3. Relocation blocks that modify other relocation blocks
An interesting situation that may lead to differences between the dynamic
loader and static analysis tools has to do with relocation blocks that modify
other relocation blocks. In this way, the relocation information that exists
on disk is not what is actually used, in its entirety, when relocating an
image during runtime.
Even if research into these topics doesn't yield any direct improvements to
locreate, it should nonetheless provide some interesting insight into the way
that different applications handle relocation processing. And after all,
gaining knowledge is what it's really all about.
Appendix A) Differences in Relocation Processing
This appendix attempts to describe some tests that were run on different
applications that process relocation entries for binary files. Identifying
differences may make it possible to have a binary that will work correctly
when executed but not when analyzed by a static analysis tool such as IDA. To
test out these ideas, the author threw together a small relocation fuzzing
tool that is aptly named relocfuzz. This tool will take a pre-existing binary
and create a new one with custom relocations. The code for this tool can be
found in the other code associated with this paper.
The tests included in this appendix were performed against three different
applications: the dynamic loader (ntdll.dll), IDA, and dumpbin. If the same
tests are run against other applications, the author would be interested in
knowing the results.
A.1) Non-page-aligned Block VirtualAddress
In all normal cases, relocation blocks will be created with a page-aligned
VirtualAddress. However, it's unclear if non-page-aligned VirtualAddress
fields will be handled correctly when relocations are processed. There are
some interesting implications of non-page-aligned VirtualAddress values. In many
applications, such as the dynamic loader, it's critical that addresses
referenced through RVAs are validated so as to prevent references being made
to external addresses. For example, if relocations were processed in
kernel-mode, it would be critical that checks be performed to ensure that RVAs
don't end up making it possible to reference kernel-mode addresses. The
reason non-page-aligned VirtualAddress values are interesting is that they
leave open the possibility of this type of attack.
Consider the scenario of a binary that is relocated to 0x7ffe0000, ignoring
for the moment that SharedUserData already exists at this location. Now,
consider that this binary has a relocation block with a virtual address of
0x1ffff. This address is not page-aligned. Now, consider that this
relocation block has a fixup descriptor that indicates that at offset 0x4 into
this page, a certain type of fixup should be performed. This would equate to
modifying memory at 0x80000003, a kernel-mode address. If relocations were
being processed in kernel-mode, like they are on Windows Vista for ASLR, then
a failure to validate the actual address being written to would result in a
dangerous condition.
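The address arithmetic in this scenario is simple enough to check directly. In the sketch below, fixup_target and write_reaches_kernel are hypothetical helpers, and the kernel boundary of 0x80000000 again assumes a system that is not running with /3GB.

```c
#include <stdint.h>

#define KERNEL_SPACE_START 0x80000000u   /* assumes no /3GB */

/* Address of the first byte a fixup descriptor will modify. */
static uint32_t fixup_target(uint32_t base, uint32_t block_rva,
                             uint16_t descriptor)
{
    return base + block_rva + (descriptor & 0x0fffu);
}

/*
 * Nonzero if a 4-byte write starting at `target` would touch kernel space
 * or wrap around the end of the 32-bit address space.
 */
static int write_reaches_kernel(uint32_t target)
{
    uint32_t last = target + 3u;
    return last < target || last >= KERNEL_SPACE_START;
}
```

With a base of 0x7ffe0000, a block VirtualAddress of 0x1ffff, and an offset of 0x4, fixup_target yields 0x80000003. A page-aligned VirtualAddress cannot produce a misaligned target like this one by itself, which is why the unaligned case deserves its own check.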
Here's an example of some code that attempts to test out this idea:
static VOID TestNonPageAlignedBlocks(
    __in PPE_IMAGE Image,
    __in PRELOC_FUZZ_CONTEXT FuzzContext)
{
    PRELOCATION_BLOCK_CONTEXT KillerBlock = AllocateRelocationBlockContext(1);
    PrependRelocationBlockContext(
        FuzzContext,
        KillerBlock);
    KillerBlock->Rva = 0x10001;
    KillerBlock->Fixups[0] = (3 << 12) | 0;
}
In this example, a custom relocation block is created with one fixup
descriptor. The VirtualAddress associated with the block is set to 0x10001
and the first fixup descriptor is set to modify offset 0 into that RVA. If
the binary that is hosting these relocations is relocated to 0x10000, a write
should occur to 0x20001 when processing the relocations. Here are the results
from a few initial tests:
ntdll.dll: The relocation fixup is processed and results in a write
to 0x20001.
IDA: Ignores the relocation fixup, but apparently only because the write
falls outside of the executable.
dumpbin.exe: Parses the relocation block without issue.
A.2) Writing to External Addresses
Due to the fact that the VirtualAddress associated with each relocation block
is a 32-bit RVA, it is possible to create relocation blocks that have RVAs
that actually reside outside of the mapped executable that is being relocated.
This is important because if steps aren't taken to detect this scenario, the
application processing the relocation fixups might be tricked into writing to
memory that is external to the mapped binary. Creating a test-case for this
example is trivial:
static VOID CreateExternalWriteRelocationBlock(
    __in PPE_IMAGE Image,
    __in PRELOC_FUZZ_CONTEXT FuzzContext)
{
    PRELOCATION_BLOCK_CONTEXT ExtBlock = AllocateRelocationBlockContext(2);
    ExtBlock->Rva = 0x10000;
    ExtBlock->Fixups[0] = (3 << 12) | 0x0;
    ExtBlock->Fixups[1] = (3 << 12) | 0x1;
    PrependRelocationBlockContext(
        FuzzContext,
        ExtBlock);
}
In this test, a relocation block is created that has a VirtualAddress of
0x10000. When the binary is relocated to 0x10000, the actual address of the
region that will be written to is 0x20000. In almost all versions of Windows
NT, this address is the location of the process parameters structure. The
block itself contains two fixup descriptors, each of which will result in a
write to the first few bytes of the process parameters structure. The results
after running this test are:
ntdll.dll: The relocation fixup is processed and results in two 32-bit writes
to 0x20000 and 0x20001.
IDA: Ignores RVAs outside of the executable.
dumpbin.exe: N/A, dumpbin doesn't actually perform relocation fixups.
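The kind of check that would reject these external writes can be sketched as follows. This is an assumption about what a careful relocation processor might do, not a description of how IDA or ntdll.dll is implemented; fixup_in_image is a hypothetical name and size_of_image stands for the SizeOfImage value from the optional header.

```c
#include <stdint.h>

/*
 * Nonzero if the four bytes addressed by block_rva plus the descriptor's
 * 12-bit offset lie entirely inside an image of size_of_image bytes.
 * The sum is computed in 64 bits so a huge RVA cannot wrap around.
 */
static int fixup_in_image(uint32_t size_of_image, uint32_t block_rva,
                          uint16_t descriptor)
{
    uint64_t start = (uint64_t)block_rva + (descriptor & 0x0fffu);
    return start + 4u <= size_of_image;
}
```

For an image smaller than 0x10000 bytes, both fixups in the test above fail this check, since their block RVA of 0x10000 already lies past the end of the mapping.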
A.3) Self-updating Relocation Blocks
One of the more interesting nuances about the way relocation fixups are
processed is that it's actually possible to create a relocation block that
will perform fixups against other relocation blocks. This has the effect of
making it such that the relocation information that appears on disk is
actually different than what is processed when relocation fixups are applied.
The basic idea behind this approach is to prepend certain relocation blocks
that apply fixups to subsequent relocation blocks. This all works because
relocation blocks are typically processed in the order that they appear. An
example of this basic concept is shown below:
static VOID PrependSelfUpdatingRelocations(
    __in PPE_IMAGE Image,
    __in PRELOC_FUZZ_CONTEXT FuzzContext)
{
    PRELOCATION_BLOCK_CONTEXT SelfBlock;
    PRELOCATION_BLOCK_CONTEXT RealBlock;
    ULONG RelocBaseRva;
    ULONG NumberOfBlocks = FuzzContext->NumberOfBlocks;
    ULONG Count;
    //
    // Grab the base address that relocations will be loaded at
    //
    RelocBaseRva = FuzzContext->BaseRelocationSection->VirtualAddress;
    //
    // Grab the first block before we start prepending
    //
    RealBlock = FuzzContext->NewRelocationBlocks;
    //
    // Prepend self-updating relocation blocks for each block that exists
    //
    for (Count = 0; Count < NumberOfBlocks; Count++)
    {
        PRELOCATION_BLOCK_CONTEXT RelocationBlock;
        RelocationBlock = AllocateRelocationBlockContext(2);
        PrependRelocationBlockContext(
            FuzzContext,
            RelocationBlock);
    }
    //
    // Walk through each self updating block, fixing up the real blocks to
    // account for the amount of displacement that will be added to their Rva
    // attributes.
    //
    for (SelfBlock = FuzzContext->NewRelocationBlocks, Count = 0;
         Count < NumberOfBlocks;
         Count++, SelfBlock = SelfBlock->Next, RealBlock = RealBlock->Next)
    {
        SelfBlock->Rva = RelocBaseRva + RealBlock->RelocOffset;
        //
        // We'll relocate the two least significant bytes of the real block's RVA
        // and SizeOfBlock.
        //
        SelfBlock->Fixups[0] = (USHORT)((IMAGE_REL_BASED_HIGHLOW << 12) |
            (((RealBlock->RelocOffset - 2) & 0xfff)));
        SelfBlock->Fixups[1] = (USHORT)((IMAGE_REL_BASED_HIGHLOW << 12) |
            (((RealBlock->RelocOffset + 2) & 0xfff)));
        SelfBlock->Rva &= ~(PAGE_SIZE-1);
        //
        // Account for the amount that will be added by the dynamic loader after
        // the first self-updating relocation blocks are processed.
        //
        *(PUSHORT)(&RealBlock->Rva) -= (USHORT)(FuzzContext->Displacement >> 16) + 2;
        *(PUSHORT)(&RealBlock->SizeOfBlock) -= (USHORT)(FuzzContext->Displacement >> 16) + 2;
    }
}
This test works by prepending a self-updating relocation block for each
relocation block that exists in the binary. In this way, if there were two
relocation blocks that already existed, two self-updating relocation blocks
would be prepended, one for each of the two existing relocation blocks.
Following that, the self-updating relocation blocks are populated. Each
self-updating relocation block is created with two fixup descriptors. These
fixup descriptors are used to apply fixups to the VirtualAddress and
SizeOfBlock attributes of its corresponding existing relocation block. Since
a HIGHLOW fixup only applies to two most significant bytes, the RVAs of the
corresponding fields are adjusted down by two. The end result of this
operation is that the first n relocation blocks are responsible for fixing up
the VirtualAddress and SizeOfBlock attributes associated with subsequent
relocation blocks. When relocations are processed in a linear fashion, the
subsequent relocation blocks are updated in a way that allows them to be
processed correctly.
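The arithmetic behind the two-byte adjustment can be sketched in isolation.
The following is a minimal model, not the packer's code: the helper names are
hypothetical, the displacement is assumed to have its low 16 bits clear, and
only the fixup arithmetic is modeled rather than the exact pre-adjustment
performed by the test code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value byte by byte (hypothetical helper). */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a little-endian 32-bit value byte by byte (hypothetical helper). */
void write_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* HIGHLOW: add the 32-bit displacement to the 32-bit value at offset. */
void apply_highlow(uint8_t *buf, size_t offset, uint32_t displacement)
{
    write_le32(buf + offset, read_le32(buf + offset) + displacement);
}

/* Subtract the displacement's high word from the field's low word ahead
 * of time. Assumption: the low 16 bits of the displacement are zero. */
uint32_t pre_adjust(uint32_t field, uint32_t displacement)
{
    uint16_t low = (uint16_t)(field - (displacement >> 16));
    assert((displacement & 0xffffu) == 0);
    return (field & 0xffff0000u) | low;
}

/* Store the field two bytes into a scratch buffer, apply a HIGHLOW fixup
 * two bytes before it (offset 0), and return the field as it reads
 * afterwards. */
uint32_t field_after_fixup(uint32_t field, uint32_t displacement)
{
    uint8_t buf[6] = { 0 };
    write_le32(buf + 2, field);
    apply_highlow(buf, 0, displacement);
    return read_le32(buf + 2);
}
```

With a displacement of 0x80010000, a field whose low word was pre-adjusted from
0x1000 to 0x8fff reads as 0x1000 again once the fixup has been applied, which
is the state a later relocation block needs to be in by the time it is
processed.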
Running this test against the set of test applications produces the following
results:
ntdll.dll: The relocation blocks are fixed up accordingly and the application
executes as expected.
IDA: Initial testing indicates that IDA is capable of handling self-updating
relocation blocks.
dumpbin.exe: Crashes as the result of apparently corrupt relocation blocks:
DUMPBIN : fatal error LNK1000:
Internal error during
DumpBaseRelocations
Version 8.00.50727.42
ExceptionCode = C0000005
ExceptionFlags = 00000000
ExceptionAddress = 00443334
NumberParameters = 00000002
ExceptionInformation[ 0] = 00000000
ExceptionInformation[ 1] = 7FFA2000
CONTEXT:
Eax = 0000000A Esp = 0012E500
Ebx = 00004F00 Ebp = 00000000
Ecx = 7FFA2000 Esi = 00000000
Edx = 781C3B68 Edi = 7FFA2000
Eip = 00443334 EFlags = 00010293
SegCs = 0000001B SegDs = 00000023
SegSs = 00000023 SegEs = 00000023
SegFs = 0000003B SegGs = 00000000
Dr0 = 00000000 Dr3 = 00000000
Dr1 = 00000000 Dr6 = 00000000
Dr2 = 00000000 Dr7 = 00000000
A.4) Integer Overflows in Size Calculations
A potential source of mistakes when processing relocations has to do with the
handling of the SizeOfBlock attribute of a relocation block. There is a
potential for an integer overflow to occur in applications that don't
properly handle situations where the SizeOfBlock attribute is less than the
size of the base relocation structure (which is 8 bytes). In order to
calculate the total number of fixups in a block, it's common to see a
calculation like (Block->SizeOfBlock - 8) / 2. However, if a check isn't made
to ensure that SizeOfBlock is at least 8, the unsigned subtraction will wrap
around and produce a very large value. If this happens, the application
processing relocations would be tricked into processing a very large number
of relocations. An example of a test for this issue is shown below:
static VOID TestIntegerOverflow(
__in PPE_IMAGE Image,
__in PRELOC_FUZZ_CONTEXT FuzzContext)
{
PRELOCATION_BLOCK_CONTEXT EvilBlock = AllocateRelocationBlockContext(0);
EvilBlock->SizeOfBlock = 0;
EvilBlock->Rva = 0x1000;
PrependRelocationBlockContext(
FuzzContext,
EvilBlock);
}
In this example, a relocation block is created that has its SizeOfBlock
attribute set to zero. This is invalid because the minimum size of a block is
8 bytes. The results of this test against different applications are shown
below:
ntdll.dll: Does not perform appropriate checks, which appears to result in an
integer overflow:
(9d4.6dc): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=00014008 ecx=00011000 edx=80010000 esi=00015000 edi=ffffffff
eip=7c91e163 esp=0013fa98 ebp=0013faac iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010206
ntdll!LdrProcessRelocationBlockLongLong+0x1a:
7c91e163 0fb706 movzx eax,word ptr [esi] ds:0023:00015000=????
IDA: Ignores the relocation block, but may not process relocations correctly
as a result (unclear at this point).
dumpbin.exe: Refuses to show relocations:
Microsoft (R) COFF/PE Dumper Version 8.00.50727.42
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file foo.exe
File Type: EXECUTABLE IMAGE
BASE RELOCATIONS #4
Summary
1000 .data
1000 .rdata
1000 .reloc
1000 .text
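The wraparound described in this section can be sketched as follows. This is
a minimal illustration under stated assumptions, not code taken from any of
the tested applications: the function names are hypothetical, and
SizeOfBlock is modeled as an unsigned 32-bit value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RELOC_HEADER_SIZE 8u /* size of the base relocation structure */

/* Unchecked calculation: wraps around when size_of_block is below 8. */
uint32_t fixup_count_unchecked(uint32_t size_of_block)
{
    return (uint32_t)(size_of_block - RELOC_HEADER_SIZE) / 2;
}

/* Checked calculation: rejects blocks smaller than the header.
 * Returns 0 and stores the count on success, -1 on an invalid block. */
int fixup_count_checked(uint32_t size_of_block, uint32_t *count)
{
    assert(count != NULL);
    if (size_of_block < RELOC_HEADER_SIZE)
        return -1;
    *count = (size_of_block - RELOC_HEADER_SIZE) / 2;
    return 0;
}
```

For a SizeOfBlock of zero, the unchecked form yields 0x7ffffffc fixups, while
the checked form rejects the block before any fixup is read.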
A.5) Consistent Handling of Fixup Types
Applications that process relocation fixups may also differ in their level of
support for different types of fixups. While most binaries today use the
HIGHLOW fixup exclusively, there are still quite a few other types of fixups
that can be applied. If differences in the way relocation fixups are
processed can be identified, it may be possible to create a binary that
relocates correctly in one application but not in another application. The
following code demonstrates an example of this type of test:
static VOID TestConsistentRelocations(
__in PPE_IMAGE Image,
__in PRELOC_FUZZ_CONTEXT FuzzContext)
{
PRELOCATION_BLOCK_CONTEXT Block = AllocateRelocationBlockContext(16);
ULONG Rva = FuzzContext->BaseRelocationSection->VirtualAddress;
INT Index;
PrependRelocationBlockContext(
FuzzContext,
Block);
Block->Rva = 0x1000;
for (Index = 0; Index < 16; Index++)
{
//
// Skip invalid fixup types
//
if ((Index >= 6 && Index <= 8) ||
(Index >= 0xb && Index <= 0x10))
continue;
Block->Fixups[Index] = (Index << 12) | Index;
}
}
This test works by prepending a relocation block that contains a relocation
fixup for each different valid fixup type. This results in a relocation block
that looks something like this:
BASE RELOCATIONS #4
1000 RVA, 28 SizeOfBlock
0 ABS
1 HIGH EC8B
2 LOW 8BEC
3 HIGHLOW 5008458B
4 HIGHADJ 0845 (5005)
0 ABS
0 ABS
0 ABS
9 IMM64
A DIR64 8000209C15FF8000
0 ABS
0 ABS
0 ABS
0 ABS
0 ABS
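Each fixup in the listing above is a 16-bit descriptor that holds the fixup
type in its upper 4 bits and the offset from the block's RVA in its lower 12
bits. The following sketch shows that packing; the helper names are
hypothetical and are not part of the test harness.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a fixup type (upper 4 bits) and an offset (lower 12 bits). */
uint16_t encode_fixup(unsigned type, unsigned offset)
{
    assert(type <= 0xfu && offset <= 0xfffu);
    return (uint16_t)((type << 12) | offset);
}

/* Extract the fixup type from a descriptor. */
unsigned fixup_type(uint16_t entry)
{
    return (unsigned)entry >> 12;
}

/* Extract the offset, relative to the block's RVA, from a descriptor. */
unsigned fixup_offset(uint16_t entry)
{
    return entry & 0xfffu;
}
```

The test above stores (Index << 12) | Index, so the descriptor for HIGHADJ
(type 4) is 0x4004. The slot that follows it holds 0x5005, which is the value
shown in parentheses on the HIGHADJ line of the listing.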
The results for this test are shown below:
ntdll.dll: While not confirmed, it is assumed that the dynamic loader performs
all fixup types correctly. This results in the following code being produced
in the test binary:
foo+0x1000:
00011000 55 push ebp
00011001 8c6c8b46 mov word ptr [ebx+ecx*4+46h],gs
00011005 895068 mov dword ptr [eax+68h],edx
00011008 1830 sbb byte ptr [eax],dh
0001100a 0100 add dword ptr [eax],eax
0001100c 00b69b200100 add byte ptr foo+0x209b (0001209b)[esi],dh
00011012 83c408 add esp,8
IDA: Appears to handle some relocation fixup types differently than the
dynamic loader. Relocating the same binary in IDA produces the following:
.text:00011000 push ebp
.text:00011001 mov ebp, esp
.text:00011003 mov eax, [ebp+9]
.text:00011006 shr byte ptr [eax+18h], 1 ; "Called TestFunction()\n"
.text:00011009 xor [ecx], al
.text:00011009
.text:0001100B db 0
.text:0001100C
.text:0001100C add byte ptr ds:printf[esi], dl
.text:00011012 add esp, 8
This equates to the following bytes:
.text:00011000 55 8B EC 8B 45 09 D0 68 18 30 01 00 00 96 9C 20
.text:00011010 01 00 83 C4 08 C7 05 50
dumpbin.exe: N/A, dumpbin doesn't actually perform relocation fixups.
A.6) Hijacking the Dynamic Loader
Since the dynamic loader in previous tests proved to be capable of writing to
areas of memory external to the executable binary, it makes sense to test
whether it's possible to hijack control of execution. One method of
approaching this would be to have the dynamic loader apply a relocation to
the return address of the function used to process relocations. When the
function returns, it'll transfer control to whatever address the relocation
has caused the return address to point to. An example of the code for this
test is shown below:
static VOID TestHijackLoader(
__in PPE_IMAGE Image,
__in PRELOC_FUZZ_CONTEXT FuzzContext)
{
PRELOCATION_BLOCK_CONTEXT Block = AllocateRelocationBlockContext(1);
PrependRelocationBlockContext(
FuzzContext,
Block);
//
// Set the RVA to the address of the return address on the stack taking into
// account the displacement.
//
Block->Rva = 0x0012fab0;
Block->Fixups[0] = (3 << 12) | 0;
}
When a binary is executed that contains this relocation block, the dynamic
loader ends up applying a relocation to the return address located at
0x13fab0, which is the block's RVA of 0x12fab0 added to the 0x10000 address
at which the image is loaded. Obviously, this address may be subject to
change quite frequently, but as a means of illustrating a proof of concept it
should be sufficient. And, just as one would expect, the dynamic loader does
indeed overwrite the return address and make it possible to gain control of
execution:
(c88.184): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=0001400a ebx=00014008 ecx=0013fab0 edx=80010000 esi=00000001 edi=ffffffff
eip=fc92e10b esp=0013fac8 ebp=0013fae4 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
fc92e10b ?? ???
0:000> kv
ChildEBP RetAddr Args to Child
WARNING: Frame IP not in any known module. Following frames may be wrong.
0013fac4 00010000 00261f18 7ffdc000 80010000 0xfc92e10b
0013fae4 7c91e08c 00010000 00000000 00000000 image00010000
0013fb08 7c93ecd3 00010000 7c93f584 00000000 ntdll!LdrRelocateImage+0x1d (FPO: [Non-Fpo])
0013fc94 7c921639 0013fd30 7c900000 0013fce0 ntdll!LdrpInitializeProcess+0xea0 (FPO: [Non-Fpo])
0013fd1c 7c90eac7 0013fd30 7c900000 00000000 ntdll!_LdrpInitialize+0x183 (FPO: [Non-Fpo])
00000000 00000000 00000000 00000000 00000000 ntdll!KiUserApcDispatcher+0x7
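The addresses in the output above can be reproduced with a short sketch. It
assumes that the image is loaded at 0x10000 and that the displacement is
0x80010000, the value that appears in edx. The original return address of
0x7c91e10b used in the usage note is an inference obtained by subtracting
that displacement from the faulting eip; it is not stated by the debugger
output. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Address that a HIGHLOW fixup descriptor ends up pointing at. */
uint32_t fixup_target_va(uint32_t load_base, uint32_t block_rva,
                         uint16_t entry)
{
    /* This sketch only models HIGHLOW descriptors (type 3). */
    assert((entry >> 12) == 3);
    return (uint32_t)(load_base + block_rva + (entry & 0xfffu));
}

/* Value left behind after the loader adds the displacement. */
uint32_t relocated_value(uint32_t original, uint32_t displacement)
{
    return (uint32_t)(original + displacement);
}
```

Calling fixup_target_va(0x00010000, 0x0012fab0, 0x3000) gives 0x0013fab0,
the stack address held in ecx, and relocated_value(0x7c91e10b, 0x80010000)
gives 0xfc92e10b, the address at which execution faulted.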
Bibliography
[1] Carrera, Ero. Packer Tracing.
http://nzight.blogspot.com/2006/06/packer-tracing.html;
accessed Dec 15, 2006.
[2] Szor, Peter. Advanced Code Evolution Techniques and Computer Virus Generator Kits.
http://www.informit.com/articles/article.asp?p=366890&seqNum=3&rl=1;
accessed Jan 8, 2007.
[3] Szor, Peter. Tricky Relocations.
http://peterszor.com/resurrel.pdf;
accessed Jan 8, 2007.