The PhishLabs Blog

Technical Dive into a Hardened Phish Kit

Many of the cybercriminals behind some of the most devastating cyber-attacks used phishing as the initial attack vector. At PhishLabs, we maintain a massive repository of phish kits that we continually analyze for intelligence about phishing tactics and techniques. The complexity and sophistication of these kits vary greatly.

One particular kit caught our eye because it uses several interesting countermeasures to prevent reverse engineering. The first layer of obfuscation was applied with SpinObf, a tool created by a miscreant who goes by the name ‘Spinner’.

Spinner creates various GitHub accounts that contain no source code, only a shortened URL that redirects to the utility.

https://github.com/Spinere/SpinObf-php-obfuscator

Known domains that have previously hosted the SpinObf tool:

http://mohssen.org/SpinObf.php

http://www.semart.net/Obfuscator/

SpinObf-obfuscated code is easy to recognize. The author leaves a mark in the output, and the initialization section usually follows the same pattern: a purposely confusing mix of O’s (the letter O) and 0’s (zeroes).

A series of variables named with this pattern makes the assignments in the initialization section confusing for the reader:

[Image: Initialization Section of the Hardened Kit]
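The naming pattern lends itself to a simple triage check. The following is a minimal sketch, assuming identifiers consist only of the letter O and the digit 0; the six-character minimum and the function name are our own illustrative choices, not part of SpinObf.

```python
import re

# Assumption: an obfuscated identifier is at least six characters long and
# uses only 'O' and '0', written either as $name or as $GLOBALS['name'].
CONFUSABLE_NAME = re.compile(r"""\$(?:GLOBALS\[['"])?([O0]{6,})\b""")

def confusable_identifiers(php_source):
    """Return the distinct O/0-only identifiers found in the source text."""
    return {m.group(1) for m in CONFUSABLE_NAME.finditer(php_source)}

sample = "$OOO000000=urldecode('%66'); $GLOBALS['OOO0000O0']=$OOO000000{4};"
print(sorted(confusable_identifiers(sample)))
# ['OOO000000', 'OOO0000O0']
```

A hit only suggests that a file deserves a closer look. Other obfuscators use similar naming, so this is not proof that SpinObf was used.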

During initialization, a call to urldecode creates one string that acts as a character reference bank from which other strings are built. These strings are the names of built-in PHP functions and serve as a layer of indirection: each function is called through a variable instead of by its name. The following code is reformatted with comments from the image above:

The string contained in parentheses is text in a form known as percent-encoding (more commonly known as URL encoding). Each character is represented by a percent sign (%) followed by a two-digit hexadecimal number, which is the character’s ASCII code. For example, the first character appears as %66 (102 in decimal). Looking up this value in the ASCII table shows that it translates to the character ‘f’.
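The %66 example can be checked in a few lines. This sketch uses Python's urllib.parse.unquote as a stand-in for PHP's urldecode. The two agree for input like this, although urldecode also turns '+' into a space, which unquote does not.

```python
from urllib.parse import unquote

encoded = "%66"

# Library decoding of the percent-encoded text.
decoded = unquote(encoded)

# The same step by hand: drop the '%', parse two hex digits, map to a character.
code_point = int(encoded[1:], 16)
manual = chr(code_point)

print(decoded, manual, code_point)  # f f 102
```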

The string being built with the concatenation operator (‘.’) references specific characters in the string that was decoded from its percent-encoded form. The first value in the assignment is $OOO000000{4}, which references the fifth character in the URL-decoded string (string offsets are 0-based, so counting starts at 0 instead of 1). Thus, $OOO000000{4} translates to the character ‘b’, $OOO000000{9} to ‘a’, and so forth.
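The indexing scheme can be sketched as follows. The character bank below is an assumption on our part: it was chosen only because it matches the three facts stated above (it starts with %66, offset 4 is ‘b’, and offset 9 is ‘a’), and the list of offsets is likewise illustrative rather than copied from the kit.

```python
from urllib.parse import unquote

# HYPOTHETICAL character bank, consistent with the description above but
# not recovered from the kit itself.
bank = unquote("%66%67%36%73%62%65%68%70%72%61%34%63%6f%5f%74%6e%64")

# HYPOTHETICAL offsets; each one selects a single character from the bank,
# mirroring PHP expressions such as $OOO000000{4} . $OOO000000{9} . ...
offsets = [4, 9, 3, 5, 2, 10, 13, 16, 5, 11, 12, 16, 5]
function_name = "".join(bank[i] for i in offsets)

print(bank)            # fg6sbehpra4co_tnd
print(bank[4], bank[9])  # b a
print(function_name)   # base64_decode
```

A short bank is enough to spell every function name the loader needs, which is why the same string is indexed over and over in the initialization section.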

During analysis of a kit, analysts look for calls to functions such as base64_decode. In this instance, the code contains the variable $GLOBALS['OOO0000O0'] instead of an explicit call to base64_decode. The result is the same, but it is not immediately recognizable to the person reading the code. The code references $GLOBALS['OOO0000O0'] instead of base64_decode for every block of data that needs to be decoded this way, and the same indirect call is used for every other PHP function defined in the initialization section.
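Once the aliases are known, a working copy of the source can be made readable again by substituting the real function names. This is a minimal sketch: the alias table holds only the one mapping named above, and the helper name and sample line are hypothetical.

```python
import re

# Alias table built by hand from the initialization section. Only the
# base64_decode mapping is taken from the text; extend it as aliases are found.
aliases = {"OOO0000O0": "base64_decode"}

# Matches an indirect call such as $GLOBALS['OOO0000O0'](
INDIRECT_CALL = re.compile(r"""\$GLOBALS\[['"]([O0]+)['"]\]\s*\(""")

def resolve_indirect_calls(php_source, alias_table):
    """Rewrite known indirect calls to use the real function name."""
    def swap(match):
        name = alias_table.get(match.group(1))
        # Leave unknown aliases untouched so nothing is silently lost.
        return f"{name}(" if name else match.group(0)
    return INDIRECT_CALL.sub(swap, php_source)

line = "eval($GLOBALS['OOO0000O0']('ZWNobyAxOw=='));"
print(resolve_indirect_calls(line, aliases))
# eval(base64_decode('ZWNobyAxOw=='));
```

The rewritten text is for reading only. Saving it back over the kit would change byte positions, which matters for the countermeasure described in the next section.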

Reverse Engineering Countermeasures

Offset-Based Countermeasure

To make things even more difficult, the encoded pages rely on specific offsets within the file to render the resulting code. Any attempt to modify the file in its encoded form will likely shift byte positions, causing an unintended ripple effect throughout the file. The result is unexpected data at the point of change and an error that would typically be suppressed. This greatly reduces the number of avenues an analyst can use to debug and reverse engineer the kit.

For example, the analyst cannot simply replace calls to eval() with print(): ‘print’ is five characters long while ‘eval’ is only four, so the substitution shifts every subsequent byte by one position.
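The effect of that one extra byte is easy to reproduce. The sketch below uses a synthetic stand-in for an encoded file; the loader text, payload, and helper name are all made up for illustration.

```python
# Synthetic stand-in for an encoded kit file: a loader followed by payload bytes.
loader = b"<?php eval($x);"
payload = b"PAYLOAD-BYTES"
original = loader + payload

# The loader hard-codes where the payload starts and how long it is.
offset = len(loader)
size = len(payload)

def read_block(data, start, length):
    """Return the bytes a fixed-offset loader would read."""
    return data[start:start + length]

# The naive analyst edit: swap the four-byte 'eval' for the five-byte 'print'.
patched = original.replace(b"eval", b"print", 1)

print(read_block(original, offset, size))  # b'PAYLOAD-BYTES'
print(read_block(patched, offset, size))   # b';PAYLOAD-BYTE'
```

The patched read starts one byte early and ends one byte short, so whatever decoding follows operates on corrupted input.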

The following image shows the second round of deobfuscation employed by the kit:

[Image: Second Round in Encoded Form]

We can obtain this block of code via manual deobfuscation following a base64_decode (comments added):

[Image: Second Round Manually Deobfuscated]

Reviewing the suspect code shows a number of things happening. The code opens its own file and positions the file pointer at offset 0x525 (1317 in decimal). It then reads a block of 0x3d8 (984 in decimal) bytes and decodes it with the PHP function strtr. This block is then base64-decoded via base64_decode, and the final result is passed to eval() for execution.

The code executed by the eval() call is another countermeasure, a domain-based pattern check, which we explain in the next section.
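The same sequence can be replayed outside PHP so that the result is inspected rather than executed. The sketch below makes one large assumption: the strtr substitution is modelled as a reversed base64 alphabet, which is a placeholder and not the mapping used by the kit. The offset and size are the values given above, and the sample file is synthetic.

```python
import base64
import io
import string

OFFSET = 0x525  # 1317: where the encoded block starts
SIZE = 0x3D8    # 984: length of the encoded block in bytes

# HYPOTHETICAL strtr alphabets; the real 'from' and 'to' strings must be
# copied from the kit's loader.
STANDARD = (string.ascii_uppercase + string.ascii_lowercase
            + string.digits + "+/").encode()
SCRAMBLED = STANDARD[::-1]
TO_STANDARD = bytes.maketrans(SCRAMBLED, STANDARD)

def decode_second_round(stream):
    """Seek, read, undo the substitution, then base64-decode. No eval."""
    stream.seek(OFFSET)
    block = stream.read(SIZE)
    # bytes.translate plays the role of the three-argument form of strtr.
    return base64.b64decode(block.translate(TO_STANDARD))

# Build a synthetic file with the same layout to exercise the function.
code = b"echo 'second round';".ljust(SIZE // 4 * 3)  # 738 bytes -> 984 base64 chars
block = base64.b64encode(code).translate(bytes.maketrans(STANDARD, SCRAMBLED))
sample_file = b"#" * OFFSET + block + b"trailing data"

print(decode_second_round(io.BytesIO(sample_file)).rstrip())
# b"echo 'second round';"
```

Because the file is only read and never modified, this approach sidesteps the offset problem entirely.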

Domain Pattern-Based Countermeasure

After the kit performs the offset-based deobfuscation for the second round, we end up with a block of code containing the following:

This kit uses a domain check to further thwart analysis. If the code does not reside on a server with a domain name matching the pattern shown above, it halts execution. Even more frustrating for some analysts, any attempt to use a sandbox or automated PHP environment will return only the domain check code with no other accompanying data.

Perhaps you are thinking you can merely go back and change the pattern. This is where things get really tricky. First, you would have to change the pattern itself. Next, you would have to re-encode the section in question by reversing the same process the kit uses for deobfuscation. Finally, you would have to make sure the resulting block is exactly the same size as the original, because any difference would push the data that follows away from its expected offset and the PHP file would fail to execute. Only after the domain check passes does the resulting PHP phishing page load in the browser.
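The size constraint can be met by padding the patched code before re-encoding it. The sketch below is illustrative only: the substitution alphabets, the PHP snippets, and the helper name are assumptions, and it presumes that trailing spaces are harmless in the code handed to eval().

```python
import base64
import string

# HYPOTHETICAL strtr alphabets; the real ones come from the kit's loader.
STANDARD = (string.ascii_uppercase + string.ascii_lowercase
            + string.digits + "+/").encode()
SCRAMBLED = STANDARD[::-1]
TO_SCRAMBLED = bytes.maketrans(STANDARD, SCRAMBLED)
TO_STANDARD = bytes.maketrans(SCRAMBLED, STANDARD)

def repack(patched_code, decoded_len):
    """Pad patched code to the original decoded length, then re-encode it."""
    if len(patched_code) > decoded_len:
        raise ValueError("patched code is longer than the original block")
    padded = patched_code.ljust(decoded_len, b" ")
    return base64.b64encode(padded).translate(TO_SCRAMBLED)

# HYPOTHETICAL domain checks; the kit's real pattern is not reproduced here.
original = b"if(!preg_match('/example\\.com$/',$_SERVER['HTTP_HOST']))exit;"
patched = b"if(!preg_match('/lab\\.test$/',$_SERVER['HTTP_HOST']))exit;"

old_block = base64.b64encode(original).translate(TO_SCRAMBLED)
new_block = repack(patched, len(original))

print(len(old_block) == len(new_block))  # True
```

A replacement pattern that is longer than the original cannot be fixed by padding, so the patched code has to be trimmed elsewhere to fit.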

Let’s Talk Dominoes

The good news for us is that all of the code references an API that resides at a single domain. Thus, all data collected by the distributed copies of the phish ends up in one central location. This presents a single point of failure: if this domain is taken offline, every deployed copy of the kit goes down with it. The architecture was most likely built this way to make all of the collected data available to the original author. This is probably why the author went to such painstaking lengths to hide the internal implementation of the phish itself.