Goal

Analyze code re-use in binaries to attribute unknown samples to families / threat actors. To do this we can obtain the list of functions in an executable and hash each function's opcodes using a fuzzy hash algorithm (kind of what Diaphora does). For the fuzzy hashing I will be using ssdeep, and for the opcode extraction r2 (and r2pipe).

Data preparation

First, we need some data to work on (hashes + samples). I will be using a QuantLoader sample set, as it is a rather simple malware.

With that in mind, we need to identify some landmark functions in a Quant

On the most recent phishing attacks, PowerShell is usually employed to load and execute position-independant shellcode via a macro-enabled Office document.

So, in order to know what actions are being carried away the truly interesting part here is the shellcode being executed. However, to slow down analysis or lower detection, shellcode is usually encoded, being shikata ga nai the most used encoder (for the samples I have observed at least).

Shikata ga nai

Shikata ga nai is a polymorphic encoder based on a decoder stub. The decoder stub XORs the encoded bytes with an incremental key. Having an incremental key

I've recently come across an APT named Dimnie by Palo Alto's Unit42 (full report here). The purpose of Dimnie is to exfiltrate sensitive data to the attacker. The peculiarity about Dimnie is that it does so with a peculiar trick. Dimnie changes the HTTP Host header to point to a well-known trusted domain while it actualy sends the request to an attacker-controlled server. This can be done as the HTTP Host header is not actually used for routing, but rather for virtual hosting purposes, as the docs say. This makes the HTTP request look like it's headed to a trusted