Thursday, March 2, 2017

Covert Channels and Poor Decisions: The Tale of DNSMessenger

Executive Summary

The Domain Name System (DNS) is one of the most commonly used Internet application protocols on corporate networks. It is responsible for providing name resolution so that network resources can be accessed by name, rather than requiring users to memorize IP addresses. While many organizations implement strict egress filtering as it pertains to web traffic, firewall rules, etc. many have less stringent controls in place to protect against DNS based threats. Attackers have recognized this and commonly encapsulate different network protocols within DNS to evade security devices.

Typically this use of DNS is related to the exfiltration of information. Talos recently analyzed an interesting malware sample that made use of DNS TXT record queries and responses to create a bidirectional Command and Control (C2) channel. This allows the attacker to use DNS communications to submit new commands to be run on infected machines and return the results of the command execution to the attacker. This is an extremely uncommon and evasive way of administering a RAT. The use of multiple stages of Powershell with various stages being completely fileless indicates an attacker who has taken significant measures to avoid detection.

Ironically, the author of the malware called SourceFire out in the malware code itself shortly after we released Cisco Umbrella, a security product specifically designed to protect organizations from DNS and web based threats as described here.

Details

What initially drew our interest to this particular malware sample was a tweet published by security researcher on Twitter (thanks simpo!) regarding a Powershell script that he was analyzing that contained the base64 encoded string 'SourceFireSux'. Interestingly enough, Sourcefire was the only security vendor directly referenced in the Powershell script. We searched for the base64 encoded value 'UwBvAHUAcgBjAGUARgBpAHIAZQBTAHUAeAA=' which was referenced in the tweet, and were able to identify a sample that had been uploaded to the public malware analysis sandbox, Hybrid Analysis. Additionally, when we searched for the decoded string value we found a single search engine result that pointed to a Pastebin page. The hash listed in the Pastebin led us to a malicious Word document that had also been uploaded to a public sandbox. The Word document initiated the same multiple-stage infection process as the file from the Hybrid Analysis report we previously discovered and allowed us to reconstruct a more complete infection process. Analyzing our telemetry data, we were ultimately able to identify additional samples, which are listed in the Indicators of Compromise section of this post.

As a security vendor, we know that we are doing something right when malware authors begin to specifically reference us within their malware. Naturally we decided to take a closer look at this particular sample.

In this particular case, we began by analyzing the Powershell file that had been incorrectly submitted to the public sandbox as a VBScript file, which we are now referring to as 'Stage 3'. It turns out the string referenced earlier is used as a mutex, as you can see in the deobfuscated Powershell below in Figure 1.

Figure 1: Mutex Creation

Stage 1 Malicious Word Document

As previously mentioned, we identified the source of this infection chain, which was a malicious Microsoft Word document that was delivered to the victim via a phishing email message. Interestingly, the Word document was made to appear as if it were associated with a secure email service that is secured by McAfee. This is likely an effective way to increase the odds of the victim opening the file and enabling macros as McAfee is a well known security vendor and likely immediately trusted by the victim. The document informs the user that it is secured and instructs the user to enable content.

Figure 2: Malicious Word Document

The document uses the Document_Open() function to call another VBA function. The called function sets a long string that defines a Powershell command and includes the code to be executed. The command is then executed using the Windows Management Interface (WMI) Win32_Process object using the Create method.

The code that is passed to Powershell via the command line is mostly Base64 encoded and compressed using gzip, with a small portion at the end that is not encoded which is then used to unpack the code and pass it to the Invoke-Expression Powershell cmdlet (IEX) for execution. This allows the code to be executed without ever requiring it to be written to the filesystem of the infected system. Overall, this is pretty typical for malicious Word documents that we see being distributed in the wild. We noted that while there is a VBA stream that references a download from Pastebin, the samples we analyzed did not appear to make use of this functionality.

We also observed that the AV detection on this particular sample was fairly low (6/54) and that ClamAV was able to successfully detect this particular sample.

Figure 3: VirusTotal Results

Stage 2 Powershell

The execution of the Powershell that is passed to IEX by the Stage 1 Word document is where we begin to observe several interesting activities occurring on an infected system. A function at the end of the Powershell script described in Stage 1 defines the actions for Stage 2 as well as characteristics related to Stage 3.The code in Stage 2 has been obfuscated, and we will refer to the main function used by this stage as 'pre_logic' as the main function used by Stage 3 is referenced as 'logic'.

The 'pre_logic' function present in this stage supports two switches. One is used to determine whether or not to achieve persistence for the next stage of the infection process on the target system. If persistence is selected the other switch defines whether or not the Stage 3 code should be executed once it is staged.

Figure 4: Deobfuscated 'pre-logic' Function

In addition to these two switches, the 'pre_logic' function also supports four parameters which are subsequently passed to the 'logic' function in the next stage of the infection process. These parameters are used to determine what subdomains to use when sending DNS TXT record queries in the next stage of the infection process.

The function then unpacks the Powershell that will be used during the next (Stage 3) stage from a base64 encoded blob located within the Powershell script itself. It also defines some of the code which will be used later, including the function call and parameters to use when executing the next stage of the infection.

If the option to achieve persistence was selected when the 'pre_logic' function was called, the function will then query the infected system to determine how to best achieve persistence. Depending on the access rights of the user account within which the malware is operating, the malware will then query registry paths that are commonly used by malware to achieve persistence.

If operating under an account with Administrator access to the system the script will query and set:

$reg_win_path: "HKLM:Software\Microsoft\Windows\CurrentVersion"

$reg_run_path: "HKLM:Software\Microsoft\Windows\CurrentVersion\Run\"

If operating under a normal user account, the script will query and set:

$reg_win_path: "HKCU:Software\Microsoft\Windows"

$reg_run_path: "HKCU:Software\Microsoft\Windows\CurrentVersion\Run\"

Figure 5: Registry Activity

The script then determines the version of Powershell that is being used on the infected system. If the infected system is using Powershell 3.0 or later, the decoded Stage 3 payload is written to an Alternate Data Stream (ADS) located at '%PROGRAMDATA%\Windows\' and named 'kernel32.dll'.

If the system is running an earlier version of Powershell, the Stage 3 payload is encoded and written to the registry location dictated by the assignment of $reg_win_path earlier with the key name of 'kernel32'. The code to unpack and execute the Stage 3 payload is also later written to the registry location of $reg_win_path with the key name of 'Part'.

Figure 6: PS Check & Persistence

Once this has completed, the script will again check to determine the access level of the user running the malware. If the malware has been executed with Administrator permissions, the WMI event subscriptions for '_eventFilter', 'CommandLineEventConsumer', and '_filtertoconsumerbinding' will be removed from the infected system. The malware then establishes its own permanent WMI event subscription, filtered for 'Win32_LogonSession' events and tied to 'CommandLineEventConsumer'. This is what is used to read and execute the Stage 3 payload that was previously stored in the ADS whenever a new logon session is created on the infected system. This is essentially the WMI equivalent of a registry-based run key from a persistence perspective. The Stage 3 malware is by default set to run 'onidle' after 30 minutes. If the switch associated with the execution of Stage 3 was passed to the 'pre_logic' function at the beginning of this stage, the Stage 3 payload will then be executed immediately.

Figure 7: Persistence Mechanism

As seen above, the malware also creates a Scheduled Task on the infected system named "kernel32" which is associated with the Stage 3 payload that was stored in the ADS or registry depending on the version of powershell running on the infected system. In analyzing other samples associated with this campaign, we observed that the scheduled task may change across samples.

Stage 3 Powershell

The Stage 3 powershell that is executed by Stage 2 of this infection process was obfuscated primarily through the use of obtuse function and variable names (e.g. ${script:/==\/\/\/==\__/==} instead of $domains). Base64 string encoding was also present throughout the script. Once we deobfuscated it, we found that the script contained a large array of hard coded domain names, with one of them being randomly selected and used for subsequent DNS queries. It is important to note that while the Powershell scripts for stages 3 and 4 contain two arrays of domains, the first array is only used if a failure condition is reached while the sample is using the second array.

Figure 8: Stage 3 Domain List

The 'logic' function present within this Powershell script randomly selects a C2 domain from the second array in the script and uses this domain to perform an initial lookup. If the result of the initial DNS TXT record request is empty or in the case the lookup fails, the 'do_lookup' function is then called and randomly selects a domain from the first array in the script. Interestingly, the domains used by the 'do_lookup' function did not appear to have active 'www' or 'mail' TXT records.

The script also uses specific subdomains which are combined with the domains and used for the initial DNS TXT record queries performed by the malware. The malware uses the contents of the TXT record in the response to these queries to determine what action to take next. For instance, the first subdomain is 'www' and a query response with a TXT record containing 'www' will instruct the script to proceed. Other actions that may be taken are 'idle' and 'stop'.

Figure 9: Stage 3 Command Processing

Once the initial DNS response is received by the malware, it then iterates to the next subdomain which is 'mail'. The malware uses this domain in another DNS TXT record query to attempt to retrieve the Stage 4 payload associated with this infection process. The response to this DNS request results in the transmission of the fourth stage malware, stored within the TXT record as displayed in Figures 10 and 11. Due to the size of the Stage 4 payload, DNS makes use of TCP for this transaction.

Figure 10: Response Containing Stage 4 Payload

Another view showing the Wireshark interpretation of the DNS protocol and packet payload is below.

Figure 11: Alternate View of Stage 4 Payload

The code associated with this fourth stage is then cleaned and passed into the Invoke-Expression Powershell cmdlet (IEX) and executed within the context of the third stage process. The fourth stage payload is not autonomous and simply attempting to execute the fourth payload itself will fail, as it relies upon a decode function present within the third stage Powershell script.

Figure 12: Stage 3 Decode Function

This function is responsible for a couple of different operations. It takes the code received in the DNS query response and defines a string variable which contains the code. It then calls the decode function from the third stage and passes the decoded string into IEX to further extend the Powershell environment. Once this is complete, it then calls a function in the newly extended environment to execute the fourth stage code along with specific parameters. These parameters include the fourth stage C2 domain to use as well as the program to execute which in this case is the Windows Command Line Processor (cmd.exe). This is interesting because it results in the fourth stage payload never actually being written to the filesystem of the infected system.

Stage 4 Powershell

As described above, the Stage 4 Powershell payload is decoded by the 'dec' function present within Stage 3. At the end of the Stage 4 payload is a call to the 'cotte' function, present in the decoded Stage 4 code, which provides additional parameters including the C2 domain to use as well as the program to execute (cmd.exe). When the function executes cmd.exe it redirects STDIN, STDOUT, and STDERR to allow the payload to read from and write to the command line processor.

The domain provided to the function call is then used to generate the DNS queries used for the main C2 operations. Just like in the Stage 3 Powershell script, the Stage 4 payload also contains two arrays of hard coded domains, but this stage only appears to make use of the second array.

Figure 13: Stage 4 Domain List

Every 301st DNS response from main C2 server, the sample sends a separate DNS TXT resolution request to a domain taken from the array described above using the Get-Random cmdlet. This secondary C2 request is to determine whether the malware should continue to run on the infected system. Similar to what we saw with the Stage 3 Powershell script, this request is made to the 'web' subdomain of the secondary C2 domain.

Figure 14: Stage 4 Secondary C2 Domain Generation

If the secondary C2 server returns a TXT record that contains the string 'stop', the malware will cease operations.

Figure 15: Stage 4 Stop Command

The main C2 channel itself is established through the transmission of a "SYN" message from the infected system to the main C2 server.

Figure 16: Example Stage 4 'SYN' Message Response

Once this is completed, the STDOUT and STDERR output that was captured from the Windows Command Line processor earlier in Stage 4 is transmitted using a "MSG" message. This allows the attacker to send commands to be executed directly by the Command Processor and receive the output of those commands all using DNS TXT requests and responses. This communication is described in greater detail in the following section. Below is the DNS analysis and contents of the query request send from an infected system to the C2 server.

Figure 17: Example 'MSG' Message

The query domain structure is obfuscated. If we take the DNS request query and run it through a decoding function, we can clearly see that it is the output of the Windows Command Line Processor being sent to the C2 server.

Figure 18: Decoded TXT Request

This clearly illustrates the establishment of an interactive C2 channel that can be used to execute system commands as well as receive the output of those commands.

Command and Control (C2) Communications

The C2 domains associated with the infection chain from the malicious Word document were initially registered on 2017-02-08. The domains associated with the Powershell sample that we analyzed from Hybrid Analysis were initially registered on 2017-02-18. Several of the domains were registered by a registrant account using the following email address:

valeriy[.]pagosyan[@]yandex[.]com

The remaining domains were registered using the NameCheap proxy registration service.

According to data available within Umbrella, the majority of DNS activity related to the domains used by the powershell sample appears to have occurred between 2017-02-22 and 2017-02-25. There was less activity associated with the other identified sample, with most occurring on 2017-02-11.

Figure 19: Sample DNS Traffic Graph

All C2 communications associated with this malware are performed using DNS TXT queries and responses. The interactive 'MSG' queries require the successful establishment of a C2 communications channel via the use of the prerequisite 'SYN' query. The messages consist of the following elements:

$session_id - A four digit number that is initially generated by infected machines. It never changes and is included in all subsequent DNS queries and responses.

$sequence_num - A four digit number that is initially generated by infected machines. It changes periodically during C2 communications and the new value must be included in the next query.

$acknowledgement_num - A four digit number that is set by the response to the 'SYN' message. This value does not appear to change and must be included in all subsequent 'MSG' queries.

Bytes 5 and 6 of the DNS queries and responses determine the message type and can be any of the following values:

00 - 'SYN' message

01 - 'MSG' message

02 - 'FIN' message

The 'MSG' queries which are used to send commands to execute and return the output of the executed commands are hex-encoded and use a dot separator after every 30 bytes.

The following diagram illustrates the overall flow of the C2 communications. Note that during C2, there may be several 'MSG' queries and responses depending on what the attacker is attempting to execute on an infected host.

Figure 20: C2 Traffic Flow

Below is a diagram illustrating how the different messages and associated responses are formed.

Figure 21: C2 Message Structure

Conclusion

This malware sample is a great example of the length attackers are willing to go to stay undetected while operating within the environments that they are targeting. It also illustrates the importance that in addition to inspecting and filtering network protocols such as HTTP/HTTPS, SMTP/POP3, etc. DNS traffic within corporate networks should also be considered a channel that an attacker can use to implement a fully functional, bidirectional C2 infrastructure. Cisco Umbrella is a product that can be used specifically for this purpose. In addition to stopping this particular attack, DNS monitoring and filtering can also disrupt a large portion of overall malware infections, as the over 90% of malware makes use of the DNS network protocol at some stage of the infection or post-infection process.

Coverage

Additional ways our customers can detect and block this threat are listed below.

Advanced Malware Protection (AMP) is ideally suited to prevent the execution of the malware used by these threat actors.

CWS orWSA web scanning prevents access to malicious websites and detects malware used in these attacks.

Email Security can block malicious emails sent by threat actors as part of their campaign.

6 comments:

given that all of the domains mentioned in this article are obsolete and useless by the time of this posting, wouldn't it make more sense to publish a snort sid that looks for extraordinarily long TXT records that don't have spaces in them?So which domains is this malware using today?