Analysis of an Interesting Malicious HTA File

Posted on 2019-04-30 by Amirreza Niaka…

In this article, we dissect an HTA file that we found in the wild. We found this instance on VirusTotal a few days back, on April 12. The sample uses a handful of techniques to evade detection mechanisms, notably dynamically loading a serialized .NET library and DLL sideloading.

The sample we'll dive into originally popped up on our radar on April 12. The initial sample and some relevant reports:

At the time of writing, 4 out of 58 AV engines on VirusTotal detect this sample. Next, we'll reveal the techniques that help the malware stay under the radar.

Analysis

This HTA file contains three script blocks: the first and third blocks contain JavaScript, and the second block contains VBScript. Both JavaScript blocks instantiate various COM-visible .NET classes and call their methods to perform operations such as decoding a base-64 encoded string. As an example, the first block defines the following function, which uses the System.Text.ASCIIEncoding, System.Security.Cryptography.FromBase64Transform, and System.IO.MemoryStream .NET classes to decode a base-64 encoded string.
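For analysts who want to reproduce this step outside the HTA, the following is a minimal Python sketch of the same idea: decode a base-64 string and expose the bytes as a seekable in-memory stream. The function name base64_to_stream is our own and only mirrors the role of the script's helper. It is not a transcription of the malware's code.

```python
import base64
import io


def base64_to_stream(encoded: str) -> io.BytesIO:
    """Decode a base-64 string and return the bytes as an in-memory stream.

    Hypothetical analyst-side helper: it plays the same role as the HTA's
    decoding function, but it is not taken from the sample.
    """
    raw = base64.b64decode(encoded.encode("ascii"))
    stream = io.BytesIO(raw)
    stream.seek(0)  # make sure readers start at the beginning
    return stream
```

For example, base64_to_stream("TVqQAA==").read() returns the four bytes that begin a typical PE file, starting with the MZ marker.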

In the third script block, several .NET classes are used to dynamically load a serialized .NET dll and consume its class. This code was apparently generated by @tiraniddo's DotNetToJScript tool, as pointed out by @bartblaze.

Fig 2. Dynamically loading a serialized .NET dll and consuming its class

In the above code block, the so string is first decoded by calling the base64ToStream function, which is defined in the first block. The decoded data is then deserialized by calling the Deserialize_2 method of a System.Runtime.Serialization.Formatters.Binary.BinaryFormatter instance. Next, the HTA class is instantiated on line 51, and its pink method is called on line 56 with several arguments, including the da variable.

So far, we have learned that the so string contains a .NET dll. To extract this binary, we first need to decode the so string with a base-64 decoder and save the result to a file. A .NET dll is a PE file, so it starts with the MZ marker and most probably ends with a long sequence of null bytes. Knowing this, we can easily carve the .NET binary out of the serialized object using a hex editor such as Hexinator.

Fig 3. The start of the embedded .NET library (0x04C7)

Fig 4. The end of the embedded .NET library (0x1CE3)

As shown in Fig. 3 and 4, the PE binary starts at 0x04C7 and ends at 0x1CE3. By dumping this range, we obtain the embedded .NET library.
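The manual carving above can also be scripted. The sketch below is one possible approach, not the method used in this analysis: instead of eyeballing the trailing null bytes, it locates an MZ marker whose e_lfanew field points at a PE signature and then derives the file size from the section table. The function names are our own, and the size calculation ignores any overlay data appended after the last section.

```python
import struct
from typing import Optional


def find_pe_start(blob: bytes) -> Optional[int]:
    """Return the offset of the first MZ header that leads to a PE signature."""
    pos = blob.find(b"MZ")
    while pos != -1:
        if pos + 0x40 <= len(blob):
            # e_lfanew (offset 0x3C in the DOS header) points to "PE\0\0".
            (e_lfanew,) = struct.unpack_from("<I", blob, pos + 0x3C)
            sig = pos + e_lfanew
            if blob[sig:sig + 4] == b"PE\x00\x00":
                return pos
        pos = blob.find(b"MZ", pos + 1)
    return None


def pe_file_size(blob: bytes, start: int) -> int:
    """Estimate the on-disk size of the PE file from its section table."""
    (e_lfanew,) = struct.unpack_from("<I", blob, start + 0x3C)
    pe = start + e_lfanew
    (num_sections,) = struct.unpack_from("<H", blob, pe + 6)
    (opt_size,) = struct.unpack_from("<H", blob, pe + 20)
    table = pe + 24 + opt_size  # section headers follow the optional header
    size = table + 40 * num_sections - start  # at least the headers
    for i in range(num_sections):
        # SizeOfRawData and PointerToRawData sit at offsets 16 and 20.
        raw_size, raw_ptr = struct.unpack_from("<II", blob, table + 40 * i + 16)
        size = max(size, raw_ptr + raw_size)
    return size


def carve_pe(blob: bytes) -> Optional[bytes]:
    """Carve the first embedded PE file out of an arbitrary byte blob."""
    start = find_pe_start(blob)
    if start is None:
        return None
    return blob[start:start + pe_file_size(blob, start)]
```

Applied to the base-64 decoded so string, find_pe_start would be expected to return the offset reported in Fig. 3. Comparing the carved length against the manually chosen end offset is a useful sanity check.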

Next, we use ILSpy to decompile the carved .NET binary and take a look at its code. As mentioned earlier, the JavaScript code creates an instance of the HTA class and then calls its pink method, so let's start with this method to see what it does.

Fig. 5 depicts the pink method. As shown, this function first decodes the base-64 encoded doc parameter and then decompresses it. The resulting data is saved as a file on the local filesystem, and the code then opens this dumped file. Next, the code downloads the second-stage HTA file from a remote server (hxxps://www.cdn-aws.net/cgi/5ed0655734/1252/1397/ec470000/file.hta).

Fig 5. Definition of pink method in HTA class

To obtain the dumped file, we can simply use any base-64 decoder to decode the da variable in the first HTA file (this variable is passed to the pink method as doc) and save the result to a file. Next, we can use 7zip to decompress this file. Fig. 6 shows the content of the resulting file, which is a decoy PDF file.
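The same decode-then-decompress step can be sketched in Python. The article does not state which compression format the sample uses, so the helper below is written under the assumption that it is one of the common deflate-based containers, and it simply tries gzip, zlib, and raw deflate in turn. The function name is hypothetical.

```python
import base64
import gzip
import zlib


def decode_and_decompress(encoded: str) -> bytes:
    """Base-64 decode a string and decompress the result.

    Assumption: the payload is gzip, zlib-wrapped, or raw deflate data.
    The actual format used by the sample is not confirmed here.
    """
    raw = base64.b64decode(encoded)
    if raw[:2] == b"\x1f\x8b":  # gzip magic bytes
        return gzip.decompress(raw)
    try:
        return zlib.decompress(raw)  # zlib-wrapped stream
    except zlib.error:
        # Fall back to a raw deflate stream without any header.
        return zlib.decompress(raw, -zlib.MAX_WBITS)
```

If the function succeeds on the da string, the returned bytes should begin with %PDF for a PDF decoy like the one described here.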

Fig 6. Decoy PDF file

The decoy PDF shows an employee form, as seen in Fig. 7.

Fig 7. Viewing the decoy PDF file in a PDF viewer

The second-stage HTA also contains an encoded, serialized .NET dll. It dynamically loads this dll and then instantiates the preBotHta class. Next, the Work method is called with several arguments, including the ad variable as the first argument (the dllBase64 parameter). dllBase64 is first base-64 decoded, and the resulting data is then decompressed. The decompressed string is a template for a dll file and contains several placeholders. The first placeholder, {yyyyyyyy}, is replaced by the size of the url parameter (left-padded with 0). The second placeholder, {rox}, is replaced by the protocol scheme of the url parameter, in this instance https. The third placeholder, a sequence of 1000 # characters, is replaced by url (right-padded with #). The result is then dumped as Duser.dll in %appdata%\$instfolder, where $instfolder is "dsk\dat2.1 ". Finally, a copy of credwiz that was placed in this folder from %windir%\system32\credwiz is executed, and it sideloads Duser.dll.
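To make the placeholder substitution concrete, here is a minimal Python sketch of the logic as described above. It rests on several assumptions of ours that the decompiled code may not share: the template is treated as a text string, the size field is padded to the same width as the {yyyyyyyy} placeholder so that the overall length is preserved, and the URL field is padded to exactly 1000 characters. The function name and the example URL are hypothetical.

```python
from urllib.parse import urlsplit

SIZE_PLACEHOLDER = "{yyyyyyyy}"
SCHEME_PLACEHOLDER = "{rox}"
URL_PLACEHOLDER = "#" * 1000


def fill_template(template: str, url: str) -> str:
    """Fill the three placeholders of the dll template with URL details."""
    if len(url) > len(URL_PLACEHOLDER):
        raise ValueError("url does not fit into the 1000-character field")
    scheme = urlsplit(url).scheme
    # Assumption: the zero-padded width equals the placeholder width.
    size_field = str(len(url)).rjust(len(SIZE_PLACEHOLDER), "0")
    url_field = url.ljust(len(URL_PLACEHOLDER), "#")
    out = template.replace(SIZE_PLACEHOLDER, size_field, 1)
    out = out.replace(SCHEME_PLACEHOLDER, scheme, 1)
    # The URL field is replaced last so earlier steps cannot alter it.
    return out.replace(URL_PLACEHOLDER, url_field, 1)
```

Running the sketch in reverse is handy for triage: in a dumped Duser.dll, the text before the first run of # padding in the 1000-character field is the configured URL.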