InfoSec Handlers Diary Blog

When responding to a malware incident, important questions to be answered are "How was the machine infected?" and "When was the machine infected?".

I encountered a sample that made the work of analysts a bit lighter in this regard.

While browsing through the code of an H-worm variant, I noticed that this worm creates a registry entry with the method and date of infection, and communicates this to the C2 server.

Here is the code:

The string strIndicatorUSBSpreadAndDate (a name I chose) will be set to "true - DATE" when the machine is infected via a USB stick, and to "false - DATE" when it is not.

This string is written to the registry:

The name of the registry key varies: it's the name of the .vbs file (hworm-meoit is a name I chose). It will be under HKEY_LOCAL_MACHINE\Software if the script was executed (elevated) by an administrator, and under the registry virtualization keys when executed by a normal user:

This value is also communicated to the C2 server with every HTTP POST request (inside the User-Agent header):
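When the indicator is recovered from the registry or from captured traffic, it can be split into its two parts for a timeline. The following is a minimal Python sketch that assumes the value has exactly the "true - DATE" or "false - DATE" form described above. The function name and the sample date are hypothetical, and the date is kept as text because its exact format is not shown here.

```python
def parse_infection_indicator(value):
    """Split an indicator such as 'true - DATE' into (via_usb, date_text)."""
    flag, separator, date_text = value.partition(" - ")
    flag = flag.strip().lower()
    if not separator or flag not in ("true", "false"):
        raise ValueError("unexpected indicator format: %r" % value)
    # The date is returned as text: its format depends on the infected machine.
    return flag == "true", date_text.strip()


# Hypothetical value, for illustration only.
via_usb, infection_date = parse_infection_indicator("true - 01/02/2016")
print(via_usb, infection_date)
```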

Of course, one would still look at other evidence when establishing a timeline.

The solution is easy: the script contains another script, encoded with numbers using a simple substitution cipher (shift 12).

The problem is identifying that numeric obfuscation is used, and figuring out which one exactly.

One way to try to identify it, of course, is just to look at the script:

Scrolling down, you will find this:

That's clearly a string with a large number of numbers separated by exclamation marks (!). A long list of numbers like this is a clear indicator of numeric obfuscation in malicious scripts.

Now you need to convert these numbers to ASCII characters. And maybe first apply a mathematical transformation on each number (depending on the type of obfuscation).
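For illustration, the conversion itself can be sketched in a few lines of Python. This is not the code of any particular tool, just a minimal sketch: the function name is hypothetical, and the short sample string was built by hand so that subtracting 12 from each number yields readable text.

```python
import re


def decode_numbers(text, transform=lambda n: n):
    """Extract decimal numbers from text, transform each one, map it to a character."""
    numbers = [int(token) for token in re.findall(r"\d+", text)]
    return "".join(chr(transform(n)) for n in numbers)


# Hand-made sample (not taken from the malicious script).
sample = "89!127!115!78!123!132"
print(repr(decode_numbers(sample)))                   # no transformation
print(repr(decode_numbers(sample, lambda n: n - 12)))  # subtract a constant first
```

Passing the transformation as a function keeps the extraction step separate from the guess about which mathematical operation the obfuscator used.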

I have a tool to help with this: numbers-to-string.py. This tool reads a text file, extracts numbers per line, performs an optional calculation on each number, and then converts them to ASCII.

There is a new option in this tool, to perform a simple statistical analysis. This is done with option -S:

With this information, you know that:

on line 206, 25830 numbers were found, ranging between 22 and 137, and with an average value of 90.

on line 517, a single number was found: 12
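A per-line summary of this kind can be approximated with a short Python sketch. It is an assumption about what such an analysis computes (count, minimum, maximum, and average per line), not the implementation of the -S option, and the function name is hypothetical.

```python
import re


def line_statistics(text):
    """Return (line_number, count, minimum, maximum, average) for each line with numbers."""
    results = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        numbers = [int(token) for token in re.findall(r"\d+", line)]
        if numbers:  # skip lines that contain no numbers at all
            average = sum(numbers) / len(numbers)
            results.append((line_number, len(numbers), min(numbers), max(numbers), average))
    return results


# Hand-made sample: one short assignment, one line without numbers, one number list.
sample = "shift = 12\nno numbers here\n89!127!115!78!123!132"
for entry in line_statistics(sample):
    print(entry)
```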

All numbers between 22 and 126 can be converted to an ASCII character, and numbers between 127 and 137 to an extended ASCII character (depending on the code page).

Because we will end up with extended ASCII characters if we just convert the numbers to characters, it's probable that we need to perform a mathematical operation on each number, to end up with just ASCII characters.

You can first try without a mathematical operation (""), like this:

It's clear that this is not what you are looking for: this does not look like a script. Nevertheless, this output provides a bit of information: all numbers can be converted to a character. If this were not possible, then numbers-to-string would not output anything for that line.

The next step is to figure out what mathematical operation has to be performed. This too can be done with just trial and error, by starting with the simplest operation: adding or subtracting a constant number.

You can start with:

n + c

n - c

Where n is the number found in the obfuscated script, and c is the constant to add or subtract. But what constant should you use?

Remember the output of the statistical analysis of the script: there was a second line, with a single number: 12.

So first try with 12:

n + 12

n - 12

Trying "n + 12" first brings no improvement. Try with "n - 12" now:

That's clearly a script: it's a variant of the H-worm. This one connects to C2 shkis[.]publicvm[.]com on port 83 via HTTP. It polls the C2 about every 5 seconds with a POST command using path /is-ready to indicate that it is ready to execute commands.

In my experience, simple numeric obfuscation like the one in this sample appears frequently. A closer look at the statistical results also suggests that the operation is a subtraction: some numbers are larger than 126, and 126 (~) is the largest printable ASCII character.
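That deduction can be turned into a small search: keep only the constants for which every number, after subtraction, lands on a character expected in a script. The Python sketch below assumes that the decoded script consists solely of printable ASCII (32 to 126) plus tab, line feed, and carriage return. The function name and the hand-made sample are hypothetical.

```python
def candidate_shifts(numbers, search_range=range(-128, 129)):
    """Return every constant c such that n - c is plausible script text for all n."""
    # Assumption: a script contains only printable ASCII plus tab, LF and CR.
    allowed = set(range(32, 127)) | {9, 10, 13}
    return [c for c in search_range if all((n - c) in allowed for n in numbers)]


# Hand-made sample: the text "x = 1" followed by CR LF, with 12 added to each code.
sample = [132, 44, 73, 44, 61, 25, 22]
print(candidate_shifts(sample))
```

A negative constant in the result would correspond to an addition rather than a subtraction. With only a handful of numbers, several constants may survive. The more numbers there are, the more likely it is that a single candidate remains.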

XORSearch can also help to identify the mathematical operation to be performed. I'll probably cover this in another diary entry.