Jul 05, 2017

In a past article, we introduced “impfuzzy for Neo4j”, a tool developed by JPCERT/CC to visualise the results of malware clustering. In this article, we show the result of clustering Emdivi with the tool. Emdivi was seen until around 2015 in targeted attacks against Japanese organisations. For more information about Emdivi, please refer to JPCERT/CC’s report.

Clustering Emdivi with impfuzzy for Neo4j

Emdivi has two major variants - t17 and t20, and we chose the former for this analysis. Figure 1 shows the output of running impfuzzy for Neo4j.

Figure 1: Emdivi t17’s clustering result using impfuzzy for Neo4j

As a result of the analysis, 90 samples were clustered into 4 types. Figure 2 visualises the clustering results. Detailed results are documented in Appendix A. (For detailed instructions on the tool, please see our past blog article.)

Notably, each cluster (Type 1 through Type 4) corresponds closely to the compile date of the malware samples it contains (see Appendix A).

impfuzzy generates a hash value from each sample’s Import API information, and the similarity between samples is calculated from these hash values. The clustering therefore shows that the Import APIs differ between types, but it does not show why. Manual analysis is required to examine what makes the Import APIs different in each type.
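To illustrate why a change in imports moves a sample into a different cluster, here is a minimal sketch. It does not reproduce impfuzzy, which computes a fuzzy hash over the import list; as a plainly named stand-in it compares import sets with Jaccard similarity. The function names and the import lists are hypothetical and are not taken from real Emdivi samples.

```python
def normalise(imports):
    """Lower-case 'dll.function' names so the comparison ignores case."""
    return {name.lower() for name in imports}


def import_similarity(imports_a, imports_b):
    """Jaccard similarity (0-100) of two import lists.

    This is a stand-in for the fuzzy-hash comparison that impfuzzy
    performs; it is not the impfuzzy algorithm.
    """
    a, b = normalise(imports_a), normalise(imports_b)
    if not a and not b:
        return 100
    return round(100 * len(a & b) / len(a | b))


# Hypothetical import lists, for illustration only.
sample_a = ["kernel32.CreateFileA", "kernel32.ReadFile", "wininet.InternetOpenA"]
sample_b = ["kernel32.CreateFileA", "kernel32.ReadFile", "wininet.InternetOpenA"]
sample_c = ["kernel32.CreateFileA", "kernel32.LoadLibraryA", "kernel32.GetProcAddress"]

print(import_similarity(sample_a, sample_b))  # identical imports
print(import_similarity(sample_a, sample_c))  # mostly different imports
```

Samples with identical imports score highest, and the score drops as the import lists diverge. The score alone says nothing about what caused the divergence, which is why the manual analysis in the following sections is needed.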

The following sections describe why the Emdivi t17 samples were clustered into 4 types and how the transition from one type to the next occurred.

From Type 1 to 2

The clustering results in Appendix A indicate the transition from Type 1 to 2 occurred around September 2014. We noticed a change in linker versions.

PE files have header information called IMAGE_OPTIONAL_HEADER[1]. This contains MajorLinkerVersion and MinorLinkerVersion, which indicate the version of the linker used to build the file. Looking into the linker versions used to create Emdivi t17, Type 1 mainly uses 10.0 (Visual Studio 2010) while Type 2 uses 9.0 (Visual Studio 2008). We consider that the change in linker version changed the set of Windows APIs that the malware loads, and that this is what separated the samples into two types.
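The linker version can be read directly from the header. The following is a minimal sketch using only the Python standard library; the function names `linker_version` and `read_linker_version` are hypothetical, and the offsets follow the published PE layout (4-byte signature, 20-byte IMAGE_FILE_HEADER, then IMAGE_OPTIONAL_HEADER).

```python
import struct
from typing import Tuple


def linker_version(data: bytes) -> Tuple[int, int]:
    """Return (MajorLinkerVersion, MinorLinkerVersion) from raw PE bytes."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C in the DOS header) points at the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # IMAGE_OPTIONAL_HEADER follows the 4-byte signature and the 20-byte
    # IMAGE_FILE_HEADER. It starts with Magic (2 bytes), followed by
    # MajorLinkerVersion and MinorLinkerVersion (1 byte each).
    optional_header = e_lfanew + 4 + 20
    _magic, major, minor = struct.unpack_from("<HBB", data, optional_header)
    return major, minor


def read_linker_version(path: str) -> Tuple[int, int]:
    """Convenience wrapper that reads the file at `path`."""
    with open(path, "rb") as handle:
        return linker_version(handle.read())
```

With the values reported above, a Type 1 sample would be expected to return (10, 0) and a Type 2 sample (9, 0).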

From Type 2 to 3

Type 2 changed to Type 3 around November 2014, and this transition reflects a change in the method of loading Windows APIs. Usually, a PE file loads Windows APIs upon execution by specifying the API names in the Import Name Table (INT) inside the PE header. (Please refer to a past blog article for more information.)

However, Type 3 samples hold some Windows API names in obfuscated form and load those APIs at the point where they are used. Figure 3 shows the results of decoding the obfuscated strings in Emdivi t17, which indicate that Type 3 contains some obfuscated Windows API names (marked in red).

The Windows APIs obfuscated in Type 3 are removed from its INT. This means that the Windows APIs that the malware aims to execute cannot be identified by just looking at the INT.

This change in Windows API load method is thought to be the reason for the difference between Type 2 and 3.
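The effect on the INT can be sketched as follows. This is an illustration under stated assumptions: single-byte XOR is a stand-in, since the obfuscation Emdivi actually uses is not described here, and the key, the API names and the helper functions are all hypothetical.

```python
def xor_bytes(blob: bytes, key: int) -> bytes:
    """Single-byte XOR; a stand-in for whatever scheme the malware uses."""
    return bytes(b ^ key for b in blob)


def decode_name(blob: bytes, key: int) -> str:
    """Recover an API name that is stored obfuscated inside the binary."""
    return xor_bytes(blob, key).decode("ascii")


def visible_imports(all_apis, hidden_apis):
    """APIs that still appear in the INT once some names are hidden."""
    hidden = set(hidden_apis)
    return [api for api in all_apis if api not in hidden]


KEY = 0x5A  # hypothetical key
apis_used = ["CreateFileA", "InternetOpenA", "InternetReadFile"]
hidden = ["InternetOpenA", "InternetReadFile"]

# What a static look at the INT would show for such a sample.
print(visible_imports(apis_used, hidden))

# The hidden names exist only as obfuscated data until they are decoded.
stored = [xor_bytes(name.encode("ascii"), KEY) for name in hidden]
print([decode_name(blob, KEY) for blob in stored])
```

Because an import-based hash is built only from what the INT exposes, a sample that hides part of its API names yields a different hash from an otherwise similar sample that imports them normally.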

From Type 3 to 4

The transition from Type 3 to 4 occurred around May 2015. This is due to the addition of a new bot (remote control) function. The following is the list of bot functions that Type 4 has. “GOTO” is the function newly added in Type 4.

GOTO

DOABORT

DOWNBG

GETFILE

LOADDLL

SETCMD

SUSPEND

UPLOAD

VERSION

The added bot function resulted in new Windows APIs being used, which distinguishes Type 4 from 3.

Summary

It is not practical to manually analyse a large number of malware samples. Rather, it is important to automate the malware clustering process in order to find new types of malware and changes in malware features. Through this analysis of Emdivi, we demonstrated how impfuzzy for Neo4j supports effective malware analysis by drawing attention to samples with different features. The tool is available on GitHub, and we hope it helps your malware analysis.