Mtracker is going to be one of the sources of this intel (among honeypots, sandboxes and similar systems). During the last few years we’ve reverse engineered a lot of malware families and we often have a deep understanding of their inner workings and communication protocols. Because of that, we can mimic them when communicating with a C&C server and download new samples or webinjects automatically, without any delay or human intervention.

Motivation

The typical approach to analyzing malware network traffic and communication is executing the sample in a controlled environment (like a long-term dedicated sandbox) and observing its behavior through a large set of filters, analyzers and monitors. This approach has several drawbacks:

a lot of computing power is required – we want to track every known config simultaneously and that would require dozens of virtual machines for every major family,

unless additional precautions are taken, sandboxed malware can still do harm to others – for example by being a proxy or sending spam. This is very complicated from a legal point of view,

not every change is visible immediately. Spam and C&C addresses can easily be tracked, but new malware samples, injects and configuration changes are hard to track behaviorally.

Some of these problems can be resolved (for example by throttling network connection or blocking outgoing SMTP connections), but some are inherent to the approach.

We solved this problem differently:

We have deep knowledge of malware communication protocols (thanks to our proactive research in this regard), so we decided to reimplement the networking stack of a few chosen malware families and communicate with their C&C servers directly.

This approach has a lot of benefits because it addresses the problems mentioned above:

it’s relatively lightweight – even a low-grade virtual server can track more than 300 bots without a problem,

no malicious traffic is sent – commands from C&C are received, analyzed (for example new samples are downloaded) and ultimately ignored,

usually, all commands are received without delay, so we know about botnet updates very soon.

A huge disadvantage of this method is the large amount of work required for reverse engineering, initial implementation and maintenance of the scripts for every family. Stability is also a problem (routine malware updates can break network protocols).

Architecture

Everything began as a set of loosely-related scripts, designed to download webinjects from banker trojan C&Cs. Back then everything was simple:

We started with a static malware configuration extractor called ripper – one of our older projects. It’s able to extract hardcoded configuration from malware samples (for example: C&C server URLs, encryption keys etc). This was usually enough to start communication with malware and we used this data to semi-automatically download webinjects from known campaigns.

Everything was working great for a while, but our hunger for new knowledge grew and we noticed that we could easily adapt our system to download new malware samples at the same time (as malware usually receives updates through the same channel as other commands). This resulted in the following “entangled” architecture:

This experiment turned out very well, so we went further in that direction. Our focus shifted to P2P botnets and spam at that time so we started to store more and more information:

spam email templates,

malicious samples downloaded from spam,

peers’ IP addresses.

At this point, we were generating quite a lot of traffic and we started being banned/blacklisted from a lot of botnets. This was partly because of the large number of requests we made, but probably also because of occasional sloppiness on our part (like using the same bot_id all the time or hardcoding various fingerprints to constant values).

Being blacklisted is nothing new for us (anecdotally, a long time ago we managed to accidentally put a whole NASK ISP network on a Zeus blacklist (!)). But in this case, due to the relatively big scale of the operation, being “unbanned” wasn’t easy, even after fixing our code. Because of this, we had to start using proxies, which complicated our architecture a lot:

All configs are now being tracked from a few different proxies independently. This also allowed us to solve the problem of geolocalised campaigns – it’s very common that a malware sample checks its location and refuses to infect computers outside of its target zone (the most notable example is Dridex) or has a different set of injects/modules for every country (for example Emotet). Additionally, sometimes C&C servers are kept in the Tor network, so they can be reached only through a Tor proxy.

The final change (so far) that we introduced to our system is augmenting DNS. Using .bit (namecoin) TLDs is getting more popular with malware creators recently and if we want to support them we need to provide our own DNS resolver (.bit domains are not present in root TLD zone). After implementing this feature, we noticed another opportunity – sometimes C&C domains are taken down soon after campaign start, but the server is still working and responding to its original IP address. So when a domain fails to resolve (or resolves to a sinkholed domain) we’re using data from our passive DNS instead:
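The fallback logic described above can be sketched roughly as follows. This is only an illustration: the function names, the passive DNS dictionary and the sinkhole list are hypothetical stand-ins (using documentation address ranges), and handling of .bit domains through a dedicated resolver is left out.

```python
import socket

# Hypothetical data, for illustration only (documentation address ranges).
SINKHOLE_IPS = {"192.0.2.1"}
PASSIVE_DNS = {"cnc.example": ["198.51.100.7"]}


def resolve_live(domain):
    """Ask the system resolver; return None when the name does not resolve."""
    try:
        return socket.gethostbyname(domain)
    except OSError:
        return None


def resolve_with_fallback(domain, resolver=resolve_live,
                          passive_dns=PASSIVE_DNS, sinkholes=SINKHOLE_IPS):
    """Return candidate IPs: the live answer, or historical ones as a fallback."""
    ip = resolver(domain)
    if ip is not None and ip not in sinkholes:
        return [ip]
    # The domain is dead or sinkholed: fall back to the last known addresses.
    return list(passive_dns.get(domain, []))
```

Passing the resolver as a parameter keeps the decision logic testable without touching the network.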

Last, but not least, an important part of the project is a web UI used to orchestrate, monitor and analyze results of the engine.

Results

In some sense, this is a conclusion of a lot of other projects. We have a few systems which collect raw data, and they are combined as inputs to mtracker to produce actual actionable information (like webinjects, spam templates, malicious IPs etc).

The most important source of information for us is the ripper and the static configurations that can be used to track C&Cs. We have analyzed and can extract configurations from quite a lot of malware families. The percentage of distinct configurations by malware family that we have received:

Of course, not all of these families were active all the time. A better overview of tracked malware history is given by the next image (time “flows upwards”, configs grouped by family):

In theory, we could track all these families, but due to a lot of reasons (like limited funding and time) we only focused on a few of them. History of successful config downloads (grouped by family) is shown below:

Or in extended form (successful config downloads grouped by content, grouped by family):

Analysis of a Polish BankBot

Recently we have observed campaigns of banking malware for the Android system which targets Polish users. The malware is a variant of the popular BankBot family, but differs from the main BankBot samples. Its victims were infected by installing a malicious application from Google Play Store. We are aware of at least 3 applications that were smuggled into Google Play Store and bypassed its antivirus protection:

Crypto Monitor

StorySaver

Cryptocurrencies Market Prices

The last one is an older version which was uploaded to VirusTotal on 13.10.2017.

According to ESET’s analysis, “Crypto Monitor” and “StorySaver” reached between 1000 and 5000 downloads. In each case, the malware pretended to be a benign, useful application.

The primary function of the trojan is the theft of credentials to online banking systems. It contains a hardcoded list of Polish banking applications that the malware impersonates, for example PKO Bank Polski, mBank, ING Bank Śląski and Bank Pekao SA (the full set of targeted applications is presented further in this post). Samples that we have analyzed do not target banking applications in languages other than Polish.

Additionally, BankBot is also capable of reading SMSes and showing fake notifications.

Basic information

So far we have not observed any code obfuscation in this variant of BankBot. The trojan uses Firebase, which is a platform that allows developers to create applications using a cloud service. It provides multiple services, including databases, without the need to host a backend server. Interestingly, BankBot uses the platform only for generating tokens and receiving messages from Firebase Cloud Messaging, while stolen login credentials are sent to an external C&C server via HTTP. Firebase is not suitable for storing credentials because, with a link to the database, it would be possible for third parties to read and write any data.

The URL that identifies the database hosted by Firebase can be found in file res/values/strings.xml. For example, for the application “Crypto Monitor” it contains the following entry:

The method of stealing money

First, the malware generates a Firebase ID token, which is used to identify users. Then, it obtains a list of installed applications and compares it with the list of names of attacked banking applications. The corresponding decompiled code is presented below:

If any of these applications is installed, a WebView object is created. Such objects are used to embed web pages in an application. In this case, the trojan embeds a phishing page corresponding to that bank.

For example:

If a user submits login details, they are not stored on the server-side immediately, as happens in the case of an ordinary phishing webpage. Instead, the phishing website shows a JavaScript prompt window that contains data that was just provided (if the user has written my_password in a password field, the window will contain pl.pkobp.iko user: my_password). These windows are invisible if a victim tries to log-in through the malicious application, because they are “hooked” through overwriting of the onJsPrompt function inside a class that extends WebChromeClient. The trojan reads credentials from invisible windows and sends them to a C&C server via HTTP.

It is shown in the following code:

To withdraw money from the bank, the botmaster still needs to obtain an authorization code that is sent via SMS. This is done by capturing all received SMSes and sending them to the C&C server, in the same way as other data. The code that is responsible for this malicious activity is a part of a class that extends BroadcastReceiver, presented below:

The malware can also display fake login windows after receiving a message from Firebase Cloud Messaging, which contains the name of the application that will be impersonated. The relevant code is inside a class that extends FirebaseMessagingService. In this scenario, a false notification from the banking service is displayed. If a victim opens the notification, a window with a fake banking login page will open.

The trojan also implements the “lock” command, which sets itself as a default application for receiving SMS messages:

Communication with C&C

Before sending stolen data to the server, registration of the infected phone takes place. It is done by sending an HTTP request which contains the following information:

IMEI number

name of the network operator

phone number

Android version

current country

list of installed applications

phone model

1.0 constant (probably a version of the malware)

token that was generated by Firebase

server_id: a constant that has different value in every sample.

After a bot is registered, all further requests contain the IMEI number as the bot identifier.

The following script can be used to communicate with a C&C server:

How to avoid infection?

We advise against installing applications from untrusted sources. However, recent experience shows that Google Play Store can contain harmful applications as well. To protect against malicious software, one can take advantage of the built-in security mechanisms of Android itself: before installing an application, it shows the list of requested permissions. For example, if an application is supposed to monitor currency exchange rates, but asks for the permission to read SMSes, it is likely that it steals them. On the other hand, an excess of requested permissions does not necessarily mean that an application is malicious. Sometimes developers request more permissions than the application actually needs, which might be a simple mistake or due to an insufficient understanding of the Android security model on their part, although this is rare.

Other analyses

A deeper look at Tofsee modules

https://www.cert.pl/en/news/single/a-deeper-look-at-tofsee-modules/

Thu, 19 Oct 2017 11:36:19 +0000

Tofsee is a multi-purpose malware with a wide array of capabilities – it can mine bitcoins, send emails, steal credentials, perform DDoS attacks, and more. All of this is possible because of its modular nature.

We already wrote about Tofsee/Gheg a few months ago – https://www.cert.pl/en/news/single/tofsee-en. Reading or at least skimming it is probably required to fully understand this post. Note that it is meant as an extension of that research, focusing on plugin functionality that we previously ignored. We will briefly summarize each plugin and highlight its most important features.

The post is rather long – for the impatient, a list of hashes and a table of contents in one:

Md5 of decrypted backdoor = 49642f1d1b1673a40f5fa6263a66d056. This file is protected by a packer, and it’s the only packed binary that we observed during our analysis of Tofsee – it suggests that the binary could’ve been created by another actor and reused in Tofsee.

7. locsR.dll

Original filename: z:\cmf5\cmf5\small2\plugins\plg_locs\plg.cpp

This plugin steals network credentials for Microsoft Outlook:

After extracting them from the registry, they are decrypted and used to send more emails. Additionally, the plugin generates an email address of the form [computer name]@mail.ru and attempts to send emails from it (using the raw SMTP protocol).

Strings from binary:

10. hostR.dll

This is an HTTP server plugin. It masquerades as Apache/2.2.15 (Win32). It can serve files, probably for other bots.

It is able to blacklist some IPs – probably security analysts (for example Forcepoint and Google are banned).

Configuration for this module, fetched from the C&C:

11. text.dll

Original filename: p:\cmf5\small2\plugins\plg_text\plg_text.cpp

A very short plugin that processes email templates downloaded from the C&C.

12. smtp.dll

A very important module – it generates and sends emails. It’s probably the biggest module and its code is rather complicated in places.

The most interesting thing about it is the fact that it uses its own dedicated scripting language for generating messages. Script example, received from the C&C:

If someone recognizes this as a real scripting language, we’d be grateful for the information. We have never seen anything like this, so we analyzed the interpreter of this language.

The syntax is rather simple, but very assembly-like and primitive. We hope that malware authors are generating these scripts from a higher level language, because writing something like this must really hurt one’s sanity ;].

A lot of opcodes are supported – take a look at this (simplified) parsing function for example:

We didn’t reverse all of them, but a few of the most important ones are:

C ip:port – Connect

L lbl – Create Label lbl.

J lbl – Jump to label lbl.

v name value – Create variable name and assign value value.

W text – Write something to output – in this case to final email.

I lbl condition – If condition is satisfied then jump to lbl

Additionally, wrapping text in """ allows newlines and escape sequences in it, and __v(XX)__ is variable interpolation.

Again, a few of the most interesting strings from that binary:
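To make the opcode semantics above more concrete, here is a toy interpreter for a small subset (L, J, v, W and __v(XX)__ interpolation). It is a sketch under assumptions, not a reconstruction of the real interpreter: the one-instruction-per-line layout, the space-separated arguments and the variable name pattern are guesses, and the C and I opcodes as well as triple-quoted text are not handled.

```python
import re


def run_script(source, max_steps=1000):
    """Interpret a tiny, assumed subset of the message scripting language."""
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
    # First pass: remember which line each label points to.
    labels = {}
    for index, line in enumerate(lines):
        parts = line.split()
        if parts[0] == "L" and len(parts) > 1:
            labels[parts[1]] = index

    variables, output = {}, []

    def interpolate(text):
        # Replace __v(NAME)__ with the variable value (empty if undefined).
        return re.sub(r"__v\((\w+)\)__",
                      lambda m: variables.get(m.group(1), ""), text)

    pc = steps = 0
    while pc < len(lines) and steps < max_steps:
        steps += 1
        op, _, rest = lines[pc].partition(" ")
        if op == "v":
            name, _, value = rest.partition(" ")
            variables[name] = interpolate(value)
        elif op == "W":
            output.append(interpolate(rest))
        elif op == "J":
            pc = labels[rest.strip()]
            continue
        # "L" only marks a position; other opcodes are skipped in this sketch.
        pc += 1
    return "".join(output)
```

For example, a script that sets `v name World`, jumps over one `W` line and then runs `W Hello, __v(name)__!` yields `Hello, World!`. The step limit guards against scripts that jump backwards forever.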

We thought that IfYouAreReadingThisYouHaveTooMuchFreeTime is an easter egg for us, malware analysts, but it turns out that it’s just a strange quirk related to hotmail authentication.

Configuration for this module, fetched from C&C:

13. blist.dll

This plugin checks if a bot is listed as a spambot and blacklisted. In the config we observed, the following DNSBLs (DNS-based Blackhole Lists) were supplied:

A DNSBL is a DNS-based service used for publishing IP addresses of spam senders. If a mail server uses DNSBL filtering, it sends a DNS query to the DNSBL domain for each incoming SMTP connection. Technical details are outside of the scope of this post, but any interested reader can take a look at http://www.us.sorbs.net/using.shtml or https://en.wikipedia.org/wiki/DNSBL.

Checking DNSBL is implemented with gethostbyname:
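For readers unfamiliar with DNSBL lookups, the general technique can be sketched in a few lines of Python. This is an illustration of the standard mechanism (reversed IPv4 octets prepended to the list's zone), not the plugin's code; the zone name in the usage note is a placeholder and the function names are our own.

```python
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    octets = ip.split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True when the DNSBL zone has an A record for this address."""
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        # NXDOMAIN (or any other resolution failure) is treated as "not listed".
        return False
```

A call such as `is_listed("192.0.2.99", "dnsbl.example")` would query `99.2.0.192.dnsbl.example`; any successful answer means the address is on the list.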

Configuration for this module, fetched from C&C:

14. miner.dll

This is (as the name suggests) a cryptocurrency miner. This plugin only coordinates the work, but it has a few accompanying binaries that perform the dirty work.

One binary, called grabb, is distributed straight from the C&C. Other binaries are downloadable through URLs specified in configs – in theory. In practice, servers distributing miners seem to be dead, so we were not able to download miners.

The miner “verifies” that it has really downloaded the right binary, but hashing was probably too difficult for the malware creators to implement, so they settled on size verification – for example, they check that the cores_gt_1 binary has exactly 223744 bytes.
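To illustrate why a size check is a weak integrity test, the sketch below compares it with a hash check. The function names are hypothetical; only the 223744-byte size comes from the analysis above.

```python
import hashlib

EXPECTED_SIZE = 223744  # size of the cores_gt_1 binary mentioned above


def verify_size(data, expected_size=EXPECTED_SIZE):
    """Size-only check: passes for any content of the right length."""
    return len(data) == expected_size


def verify_sha256(data, expected_hex):
    """Hash check: passes only if the content itself matches."""
    return hashlib.sha256(data).hexdigest() == expected_hex
```

Any file padded or truncated to the expected length passes the first check, while the second one rejects every modification.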

We didn’t analyze it in-depth because crypto miners are boring enough, and strings from binary give enough information about inner workings anyway:

And the rest can be read from the configuration, fetched from C&C:

15. img.dll

This short plugin processes malicious attachments – it encodes them with base64 and appends them to emails.

Nothing interesting here, as can be seen in hardcoded strings:

Configuration for this module, fetched from the C&C:

16. spread.dll

This plugin is used to spread Tofsee through social media: Facebook, Twitter and Skype communicator.

First, it extracts xs, datr, c_user (and more) cookies.

The exact method depends on the browser, but generally the plugin reads cookies stored on disk by the browser – for example cookies.sqlite from \Mozilla\Firefox\Profiles, for Firefox. Supported browsers are Chrome, IE, Firefox, Safari, and Opera.

After that, the plugin uses those cookies to impersonate the user in the Facebook API:

The list of friends is downloaded through the API and a message is sent to them. The format of the message is stored in the configuration, for example:

'fb.message1': '%SPRD_TEXT1|%LANG_ID| %SPRD_URL1'

Twitter is handled very similarly: cookies are stolen, followers are downloaded by API call to https://twitter.com/followers, and messages are sent.

VKontakte also seems to be supported, but that functionality is optional and held in another plugin. This module only checks if VK is enabled in config and calls handler (that can be initialized from another plugin), if it’s defined. Malware creators usually don’t like to attack Russia, so this function is disabled and VKontakte plugin is not distributed.

The plugin can also spread itself through Skype, but reverse engineering the Skype protocol was clearly too hard for the malware authors, so the plugin waits until Skype is started, and then sends Windows messages to the Skype window:

The plugin has dozens of hardcoded strings, so analyzing it in a disassembler is a breeze. A few of the more interesting groups:

References to the OCR plugin – to avoid captchas:

Facebook cookies:

Strings related to Facebook spread:

Strings related to cookie stealing:

Strings related to Skype hijacking:

Twitter cookies:

And Twitter spread:

Finally, things needed to send stolen cookies somewhere:

Rich functionality means rich configuration from the C&C:

17. spread2.dll

This plugin uses methods more than 15 years old, and tries to spread Tofsee through… infected USB drives! This doesn’t sound like an effective idea for A.D. 2017, but despite that, the plugin is still enabled.

First, it copies the malicious binary into a RECYCLER\<random_gibberish>.exe file on the USB drive, then sets the READONLY and SYSTEM attributes on that file, and finally writes a malicious autorun.inf file:

The malicious binary that will be spread is downloaded from the internet (see also sys.dll plugin and %FIREURL variable).

Nothing too interesting in hardcoded strings, except operation logs:

Configuration for this module, fetched from the C&C:

18. sys.dll

This plugin seems to be a downloader or rather an updater. It sends requests, depending on a value of the %FIREURL configuration variable.

Example values of the %FIREURL variable (one per line):

Variables are expanded recursively, and %SYS_RN means \r\n of course, so the first possible value can be read as:

If we send this request to that IP address on port 80, we will get yet another malicious binary. Different requests lead to different binaries.

If a request is invalid, or not supported, the following image is sent instead:
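The recursive expansion of %VARIABLES mentioned above can be sketched as follows. This is a minimal illustration, assuming that variables are referenced as a percent sign followed by a known name; apart from SYS_RN, the variable names and values in the usage note are made up.

```python
import re


def expand(text, variables, max_depth=16):
    """Repeatedly substitute %NAME references until nothing changes."""
    if not variables:
        return text
    # Try longer names first so that %SYS_RN wins over a shorter prefix.
    names = sorted(variables, key=len, reverse=True)
    pattern = re.compile("%(" + "|".join(re.escape(n) for n in names) + ")")
    for _ in range(max_depth):
        expanded = pattern.sub(lambda m: variables[m.group(1)], text)
        if expanded == text:
            break
        text = expanded
    return text
```

With `{"SYS_RN": "\r\n", "HOST": "example.invalid", "REQ": "GET / HTTP/1.1%SYS_RNHost: %HOST%SYS_RN%SYS_RN"}`, expanding `%REQ` produces a complete HTTP request with CRLF line endings. The depth limit prevents endless loops when variables reference each other.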

We appreciate the humor.

Nothing surprising in hardcoded strings:

Configuration for this module, fetched from the C&C:

Additionally the %FIREURL variable from config is used.

19. webb.dll

This plugin tries to locate the iexplore.exe process. If this succeeds, it injects a DLL file called IEStub.dll into this process.

IEStub.dll hooks a lot of functions in Internet Explorer. List of hooked functions:

Hooks intercept called functions and can change their parameters. We haven’t analyzed the hooks in depth, but most of them seem to be loggers intercepting “interesting” data from parameters – we haven’t observed any web injects served by Tofsee.

For completeness, interesting hardcoded strings:

Configuration for this module, fetched from the C&C:

20. P2P.dll

Original filename: p:\cmf5\small2\plugins\plg_p2p\plg_p2p.cpp

This plugin is rather short. Despite its promising name, it’s rather boring – opening a port on a router and listening for connections is the most important thing it does. It doesn’t implement any commands; this is left for the main module to handle.

Like almost every module, it logs to %TMP%\log_%s.txt, and when this fails, it falls back to C:\log.txt.

It also adds a port mapping using UPnP, in the same way as plugin 4 (proxyR.dll).

Ramnit – in-depth analysis

https://www.cert.pl/en/news/single/ramnit-in-depth-analysis/

Fri, 29 Sep 2017 10:12:15 +0000

If we look at Ramnit’s history, it’s hard to pin down exactly which malware family it actually belongs to. One thing is certain: it’s not a new threat. It emerged in 2010, transferred by removable drives within infected executables and HTML files.

A year later, a more dangerous version was released. It contained a part of recently leaked Zeus source code, which allowed Ramnit to become a banking trojan.

These days, it has become much more sophisticated by utilizing a number of malicious activities including:

Performing Man-in-the-Browser attacks

Stealing FTP credentials and browser cookies

Using DGA (Domain Generation Algorithm) to find the C&C (Command and Control) server

Executable’s analysis

The main binary is packed like a matryoshka – a custom packing method first and then UPX.

Despite being encrypted, extracting the binary from the packer is pretty straightforward – all one needs to do is to set a breakpoint right after the binary decrypts the code and before it jumps into it.

And if we now navigate to the newly unpacked code section we’ll find the binary right after the loader assembly:

ApplyExploit

If the current user is not already an admin and the process is not running with admin privileges it tries to perform privilege escalation.

Malware contains exploits for CVE-2013-3660 (patched in MS13-053) and CVE-2014-4113 (patched in MS14-058) vulnerabilities, however before it actually tries to run the payload, registry checks are performed to make sure that the host system is indeed vulnerable to said CVEs:

If the exploits succeed or the program is already running with high privileges, a “TRUE” value is stored in a hardcoded random-looking registry key: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\jfghdug_ooetvtgk, which is later used in the CheckBypassed function.

start routine

The routine coordinates ApplyExploit and CheckBypassed – if they both run successfully it creates two svchost.exe processes and writes rmnsoft.dll and modules.dll into them respectively.

Important detail: the binary executes CheckBypassed before ApplyExploit, so the binary has to be executed again in order to make any further progress. This trick outsmarts many single-run malware analysis systems, such as Cuckoo.

Static config

Ramnit encrypts its network communication using the RC4 algorithm. The key for RC4 and the botnet name are encrypted using XOR with a hardcoded password.

The XOR encryption is pretty standard; the only catch is that it skips the key’s first character and then reverses the key.

XOR function calls:

The stored ciphertext lengths are almost always too large, so we have to rely on null termination:

The DGA config always seems to be declared at the beginning of the data section:
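A minimal Python sketch of the decryption described above, under our reading of the text: the first character of the password is dropped, the remainder is reversed and then used as a repeating XOR key, and the result is cut at the first null byte. The cyclic key reuse and the function names are assumptions.

```python
from itertools import cycle


def derive_key(password: bytes) -> bytes:
    """Drop the first character of the password, then reverse the rest."""
    return password[1:][::-1]


def xor_decrypt(ciphertext: bytes, password: bytes) -> bytes:
    """XOR with the derived key and keep everything before the first null."""
    key = derive_key(password)
    plain = bytes(c ^ k for c, k in zip(ciphertext, cycle(key)))
    # Stored lengths tend to overshoot, so rely on null termination.
    return plain.split(b"\x00", 1)[0]
```

Since XOR is symmetric, the same key derivation can be used to produce test ciphertexts for a known password.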

Persistence

Program copies itself into C:\Users\User\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\.

DGA

Ramnit generates a list of domains by using an LCG algorithm with a hardcoded seed:

Generating a domain:

DGA recreated in Python:
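As a rough illustration of the general idea only, an LCG-driven domain generator might look like the sketch below. It is not Ramnit's algorithm: the multiplier and modulus are the textbook MINSTD constants, and the length range, alphabet and TLD are arbitrary choices made for this example.

```python
import string


def lcg(state):
    """One step of the textbook MINSTD generator (not Ramnit's constants)."""
    return (state * 16807) % 2147483647


def generate_domains(seed, count, tld=".com"):
    """Derive `count` pseudo-random domain names from a non-zero seed."""
    state, domains = seed, []
    for _ in range(count):
        state = lcg(state)
        length = 8 + state % 12  # arbitrary length range for this sketch
        letters = []
        for _ in range(length):
            state = lcg(state)
            letters.append(string.ascii_lowercase[state % 26])
        domains.append("".join(letters) + tld)
    return domains
```

The important property is determinism: the bot and the botmaster start from the same seed, so both derive the same list of domains without exchanging it.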

Communication

Ramnit connects to C&C servers through port 443, but don’t let that fool you – it doesn’t use HTTPS, but its own protocol instead:

Packet’s structure:

Chunks’ structures:

So if we’d like to send a packet containing some data, we would:

encrypt large (>4 bytes) chunk data using RC4 with a key recovered from the XOR decryption

create packed chunks from data parts

concatenate all chunks together

wrap the output in packet layer
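The steps above can be sketched in Python as follows. RC4 is the standard algorithm; the outer packet layout (00ff magic, 4-byte little-endian length, command byte) is inferred from the success packet 00ff0100000001 quoted in the bot registration section; the chunk layout used here (type byte, length, data) is purely a hypothetical placeholder.

```python
import struct


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def pack_chunk(data: bytes, key: bytes) -> bytes:
    """Hypothetical chunk layout: type byte 0x00, 4-byte LE length, payload."""
    if len(data) > 4:
        # Only chunks larger than 4 bytes are encrypted.
        data = rc4(key, data)
    return b"\x00" + struct.pack("<I", len(data)) + data


def build_packet(command: int, key: bytes, chunks=()) -> bytes:
    """Concatenate chunks behind the command byte and wrap them in a packet."""
    body = bytes([command]) + b"".join(pack_chunk(c, key) for c in chunks)
    # Outer layer inferred from the success packet 00ff0100000001.
    return b"\x00\xff" + struct.pack("<I", len(body)) + body
```

With no chunks, `build_packet(0x01, key)` reproduces the 7-byte success packet, which is a useful sanity check for the assumed outer layer.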

Traffic example:

Some of the available commands:

Command                    Byte value   Short description
COMMAND_OK                 0x01         Server’s response that the command executed successfully
GET_DNSCHANGER             0x11         Get DNS-changer payload
GET_INJECTS                0x13         Get webinjects
UPLOAD_COOKIES             0x15         Upload stolen cookies (zip format)
GET_MODULE                 0x21         Get a specific module
GET_MODULE_LIST            0x23         Get a list of downloadable modules
VERIFY_HOST                0x51         Check if the host is able to send a signed message
REGISTER_BOT               0xe2         Register bot (send two MD5s)
UPLOAD_INFO_GET_COMMANDS   0xe8         Upload detailed machine info

Bot registration

When a bot wants to register itself, it sends two encrypted MD5 hashes with the following data structure:

Python code:

If the C&C responds with a success packet (00ff0100000001), the malware follows up with an empty 0x51 command. The signature from the response is verified using a hardcoded public RSA key. If there is a mismatch, the execution stops.

Modules

The program can request a list of modules and then download each one individually:

DGA domains for analyzed configs:

Mole ransomware: analysis and decryptor

https://www.cert.pl/en/news/single/mole-ransomware-analysis-and-decryptor/

Tue, 30 May 2017 10:14:45 +0000

Mole is an almost month-old ransomware (so it’s quite old from our point of view) that was distributed mainly through fake online Word docs. It’s a member of the growing CryptoMix family, but the encryption algorithm was completely changed (…again).

We became interested in this variant after victims contacted us asking for a decryptor. Remembering that all members of this family so far were plagued with serious crypto flaws, we decided to give it a try and reverse-engineered it thoroughly. It turned out to be a good idea – we were successful and managed to create a working decryptor that you can download from: https://nomoreransom.cert.pl/static/mole_decryptor.exe.

In the rest of this article we will share detailed results of our research.

Campaign and Behaviour

Mole ransomware was distributed through malspam linking to fake Microsoft Word documents. Said documents prompted users to download and install a malicious plugin.

Because this variant is not new, it was analyzed by quite a lot of researchers before us. We don’t intend to copy their good work, so for anyone interested in the dynamic analysis we recommend looking at the following links:

Instead, we’ll focus on a static analysis of the code and the encryption method.

Static analysis

As in many malware families, Mole won’t run in most Russian-speaking countries. Literally the first thing the binary does after being run is checking keyboard layout and charset – detecting Russian ones leads to immediate process termination. Otherwise, malware achieves persistence (by adding itself to the Autorun in the system’s registry), removes shadow copies (after Windows’ version check), and proceeds to the actual encryption:

After being started, the ransomware tries to bypass the UAC and displays a fake dialog message:

Of course, the ransomware doesn’t encrypt every file type. Interestingly, the encrypted extensions are obfuscated – they were not hardcoded directly, but compared inside a giant function, after transformation with the following algorithm:

List of encrypted extensions:

And as usual, the most interesting thing in any ransomware is the actual file encryption algorithm. In this case it can be summarized as follows (half-decompiled, half-handwritten pseudo-C++ code with non-essential parts omitted):

Or in terse pseudocode:

This method is not perfect for a lot of reasons, but we’ll skip detailed cryptanalysis here.

The general structure of an encrypted file looks like this:

It’s very similar to Revenge ransomware, which is why we believe that Mole is the next version of Revenge. On the other hand, RC4 is used here instead of the more sophisticated (and stronger) AES. It doesn’t change much, as RC4 is still strong enough for most ransomware purposes, but we’re not sure why the ransomware creators decided to take this step back.

Hashes/patterns

Analysis of Emotet v4

https://www.cert.pl/en/news/single/analysis-of-emotet-v4/

Wed, 24 May 2017 11:28:38 +0000

Introduction

Emotet is a modular Trojan horse, which was first noticed in June 2014 by Trend Micro. This malware is related to other types like Geodo, Bugat or Dridex, which are attributed by researchers to the same family.

Emotet was discovered as an advanced banker – its first campaign targeted clients of German and Austrian banks. Victims’ bank accounts were infiltrated by a web browser infection that intercepted communication between the webpage and bank servers. In such a scenario, malware hooks specific routines to sniff network activity and steal information. This technique is typical for modern banking malware and is widely known as a Man-in-the-Browser attack.

The next, modified release of the Emotet banker (v2) took advantage of another technique – automation of stealing money from hijacked bank accounts using ATSs (Automated Transfer Systems, more information on page 20 of CERT Polska Report 2013). This technology is also used in other bankers. Good examples are ISFB (Gozi) or Tinba.

At the beginning of April 2017, we observed a wide malspam campaign in Poland, distributing fraudulent mails. The e-mails imitated delivery notifications from the DHL logistics company and contained a malicious link, which pointed to a brand-new, unknown variant of Emotet.

Malware distributed in this campaign differed from previously known versions. Behavior and communication methods were similar, but malware used different encryption and we noticed significant changes in its code. Thus we called this modification version 4.

Dropper

Links from the phishing campaign pointed to a dropper, which downloaded and executed the malware. The dropper was written in JavaScript and wasn’t heavily obfuscated. It was fairly easy to notice that the strings with distribution site URLs were simply reversed.
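Undoing this kind of obfuscation takes a single line. A minimal sketch in Python, assuming the URL strings are stored fully reversed; the function name and the sample string are made-up placeholders, not values taken from the analysed dropper:

```python
def unreverse(obfuscated: str) -> str:
    """Return the string read back to front."""
    return obfuscated[::-1]


# Hypothetical placeholder, not a real distribution URL.
sample = "exe.daolyap/moc.elpmaxe//:ptth"
print(unreverse(sample))
```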

Main module

An interesting thing about Emotet is its modular structure. The main module dropped by the script doesn’t contain anything harmful and is used only to download other modules from the C&C, which perform specific tasks. The sample dropped by the script is protected with a generic packer to avoid recognition by AV software.

After unpacking, the malware loads libraries and resolves the WinAPI routines used in encryption and communication with the C&C. Names of specific functions are obfuscated and stored as an array of hashes. Emotet uses the simple sdbm hash function for this purpose. To make the hashes more varied, values are additionally XORed with a constant specified in the binary.

Strings that are distinctive for Emotet are also encoded using a 4-byte XOR key, different for each string.

The main executable file also contains a list of IP addresses of C&C servers. As in previous versions, the sample communicates with the Command & Control server using plain HTTP.
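The hash-based import lookup can be sketched as follows. The sdbm function itself is the well-known public algorithm; the XOR constant, the export list and the helper names are illustrative assumptions, and the exact way Emotet feeds names into the hash (case, encoding) is not confirmed here:

```python
def sdbm(name: bytes) -> int:
    """Classic sdbm hash, truncated to 32 bits."""
    h = 0
    for c in name:
        h = (c + (h << 6) + (h << 16) - h) & 0xFFFFFFFF
    return h


def obfuscated_hash(name: bytes, xor_const: int) -> int:
    """sdbm hash XORed with a per-binary constant (value is an assumption)."""
    return sdbm(name) ^ xor_const


def resolve(target: int, export_names, xor_const: int):
    """Return the export name whose obfuscated hash equals target, or None."""
    for name in export_names:
        if obfuscated_hash(name, xor_const) == target:
            return name
    return None
```

An analyst would typically run `resolve` over the export names of each loaded DLL to turn the hash array back into readable function names.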
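Decoding such strings is straightforward once the key is known. A minimal sketch, assuming the 4-byte key is simply repeated over the string bytes; how the key and the encoded data are laid out in the binary is not described above, so both are passed in explicitly:

```python
from itertools import cycle


def xor_decode(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key (XOR is its own inverse)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Hypothetical key and string, used only to show the round trip.
key = bytes.fromhex("deadbeef")
encoded = xor_decode(b"example string", key)
print(xor_decode(encoded, key))
```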

Encryption

The most significant change in the new version is the use of a different encryption algorithm. In previous releases, communication was encrypted using RC4. In the fourth version, Emotet switched to 128-bit AES in CBC mode.

Before sending a request, the malware performs key generation. In the first stage, Emotet loads a 768-bit RSA public key stored in the executable. Then, an AES symmetric key is generated using a cryptographically secure PRNG (CryptGenKey). Finally, the generated key is encrypted with the previously loaded public key, using PKCS#1 v2.0 (OAEP) padding, and attached to the request.

Cryptography is based on Microsoft CryptoAPI mechanisms.

Key generation and public key import:

Request encryption:

Communication with C&C

Received response is presented below:

The communication protocol is based on Google Protocol Buffers. Protocol Buffers is a mechanism which allows developers to build their own protocols from a set of message structure declarations, written in a dedicated protobuf language. Protocol Buffers generates parsing and serializing modules, which can be used directly in the developed solution. Protobuf supports a wide set of languages, including Python, Java, PHP and C++. Using this kind of mechanism isn’t new in malware; protobuf-based protocols can be found, for example, in Gootkit malware.

Unfortunately, Emotet’s case is a bit different. The protobuf code inside the malware is slightly modified and provides an additional type of encoding, which is not specified in the original Protocol Buffers documentation. Because of this small difference, the response can’t be properly decoded using generic protobuf parsers, e.g. protoc with the --decode_raw argument fails.

Anyway, original protocol definitions were successfully reversed:

Registration request contains command id (16) and some information about host operating system. Each field of RegistrationRequestBody structure has been described below:

botId field

This field provides information about values specific to victim’s machine and probably is meant to be unique between bot instances.

[host_name]_[locale]_[host_id]
e.g. CERTCERT_PL_32122958

host_name – contains only characters from the 0..9a..zA..Z- charset, other characters are replaced by ‘?’

locale – contains information about locale settings. In this case, dash ‘-‘ is also forbidden
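The botId construction can be sketched like this. The sketch assumes that forbidden characters in the locale part are replaced by ‘?’ in the same way as in the host name, and that host_id is already formatted as a string; neither detail is confirmed above, and the function names are ours:

```python
import re


def sanitize_host_name(name: str) -> str:
    """Keep 0-9, a-z, A-Z and '-'; replace everything else with '?'."""
    return re.sub(r"[^0-9a-zA-Z-]", "?", name)


def sanitize_locale(locale: str) -> str:
    """Like the host name, but the dash is forbidden as well (assumption:
    forbidden characters are also replaced with '?')."""
    return re.sub(r"[^0-9a-zA-Z]", "?", locale)


def build_bot_id(host_name: str, locale: str, host_id: str) -> str:
    """Compose [host_name]_[locale]_[host_id]."""
    return f"{sanitize_host_name(host_name)}_{sanitize_locale(locale)}_{host_id}"


print(build_bot_id("CERTCERT", "PL", "32122958"))
```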

32-bit field, which describes the version of Windows running on the infected host. It’s a bit field, where each group of bits contains a specific value from the OSVERSIONINFOEX structure.

Bits      Description
0..3      dwMajorVersion
4..7      dwMinorVersion
8..11     wServicePackMajor
12..15    wServicePackMinor
16..19    wProductType
20..23    SYSTEM_INFO.wProcessorArchitecture
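The bit layout from the table can be packed and unpacked with a few shifts. This is a sketch based only on the table above; the function names and the example values (a Windows 7 SP1 workstation on AMD64) are ours:

```python
# Field order follows the table: each field occupies 4 bits, lowest first.
FIELDS = [
    "dwMajorVersion",
    "dwMinorVersion",
    "wServicePackMajor",
    "wServicePackMinor",
    "wProductType",
    "wProcessorArchitecture",
]


def pack_os_version(values: dict) -> int:
    """Pack the six 4-bit fields into a single integer."""
    result = 0
    for i, name in enumerate(FIELDS):
        result |= (values[name] & 0xF) << (4 * i)
    return result


def unpack_os_version(value: int) -> dict:
    """Split the integer back into named 4-bit fields."""
    return {name: (value >> (4 * i)) & 0xF for i, name in enumerate(FIELDS)}


example = {
    "dwMajorVersion": 6,
    "dwMinorVersion": 1,
    "wServicePackMajor": 1,
    "wServicePackMinor": 0,
    "wProductType": 1,            # VER_NT_WORKSTATION
    "wProcessorArchitecture": 9,  # PROCESSOR_ARCHITECTURE_AMD64
}
print(hex(pack_os_version(example)))
```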

procList field

Contains comma-separated list of currently running process names.

mailClient field

Provides information about the mail client in use (read from the “HKLM\Software\Clients\Mail” registry key value). If it’s Microsoft Outlook and its MAPI DLL is 64-bit, the name is followed by a “ x64” suffix.

Response

If a registration request was received, the C&C server returns a list of Emotet modules. The HTTP response status is always 404 Not Found, regardless of whether the request was built properly or not. In this case, the response body contains the encrypted response.

The structure of the encrypted response is quite similar to the request structure. The encrypted payload starts at byte 116 of the received message. The response is encrypted using the same AES key that was passed in the request. After successful decryption, we obtain a protobuf-like message with a list of MZ binaries or URLs.

In this case, the malware uses a non-standard encoding. The field repeated Module modules = 1 [packed=true]; is illegal in the protobuf language, because the packed attribute can be used only for repeated fields of primitive numeric types. Surprisingly, the list of modules is encoded like a packed list of Message objects.

It should be noted that the elements of Modules are repeated without the Module message tag, which is specific to packed encoding.
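One way to read such a field is sketched below. It assumes the outer field is a regular length-delimited field number 1, and that each element inside it is a varint length followed by the message bytes, with no per-element tag. This layout is our interpretation of the description above, not a confirmed specification, and all function names are ours:

```python
def read_varint(buf: bytes, pos: int):
    """Decode a protobuf base-128 varint; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def split_packed_messages(packed: bytes) -> list:
    """Split a blob of [varint length][message bytes] items without tags."""
    items = []
    pos = 0
    while pos < len(packed):
        length, pos = read_varint(packed, pos)
        items.append(packed[pos:pos + length])
        pos += length
    return items


def parse_modules_field(message: bytes) -> list:
    """Read field 1 (wire type 2) and split its payload into raw messages."""
    key, pos = read_varint(message, 0)
    field_number, wire_type = key >> 3, key & 7
    if (field_number, wire_type) != (1, 2):
        raise ValueError("expected length-delimited field 1")
    length, pos = read_varint(message, pos)
    return split_packed_messages(message[pos:pos + length])
```

Each returned item would then be parsed as an ordinary Module message.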

type field

This field defines the type of blob content and specifies the method of module execution. The type field can have one of the following values:

Value     Description
1         Store in %TEMP% and execute with -U argument
2         Like ‘1’, but without arguments
3         Download and execute file from URL specified in ‘blob’
4         Use own internal loader – load and execute PE file from ‘blob’
5         Uninstall – delete related ‘.lnk’ from Startup folder
default   Do nothing

Modules

In previous versions, Emotet modules were providing the following set of functionalities:

Stealing money from bank accounts (Man-in-the-Browser attack)

Spreading by sending spam e-mails

Stealing mails and credentials to mail accounts

DDoS module

Stealing browsing history and passwords from web browser

In version 4, distributed in the last campaign, we didn’t observe a banking module, which is somewhat unusual for this type of malware. The behavior of the other modules was quite similar to previous versions. During analysis, we successfully dropped two types of modules, described below:

Credentials stealer

In the server response, we found two similar modules, whose purpose was to steal credentials from the web browser and mail client. Both modules have NirSoft password recovery software embedded inside:

The recovery software was embedded as an XOR-encoded static blob, using a 32-bit key (similar to strings). On module startup, the software was decoded and stored in %TEMP%, and then executed with the /scomma [temp file name] parameter, which dumps all passwords into a file in the %TEMP% folder (name generated using GetTempFileNameW). Stolen data were sent to the C&C server for malware spreading purposes.

Spam module

The second type of module was the spam module, used for spreading the malware. First, the module asks the C&C for a message template, a list of recipients and a list of hijacked accounts, which will be used for spam distribution.

The request structure is presented below:

The flags and additionalData fields specify which data has already been received from the server and which we’re expecting in the C&C answer.

The server response looks as follows:

E-mails are not sent using a local account. Distribution is performed using previously scraped mail accounts, which are sent to each spambot.


Summary

The basic functionality of Emotet in the last campaign was just stealing credentials and spreading. Even so, the malware is still active and actively developed. Because a few important modules are missing, Emotet will probably be extended in the future. In case of infection, we recommend changing passwords to all accounts whose credentials were stored in a mail client or web browser.

SECURE 2017 – Call for Speakers
https://www.cert.pl/en/news/single/secure-2017-call-for-speakers/
Tue, 23 May 2017 15:25:06 +0000

Call for Speakers for SECURE 2017 is now open. If you have an interesting topic and would like to share your ideas with a crowd of Polish and international IT security specialists, please consider submitting your proposal. You will find all applicable information below.

SECURE 2017 will be held on October 24-25 in Warsaw, Poland. This annual conference is dedicated entirely to IT security and addressed to administrators, security team members and practitioners in this field. SECURE’s unique feature is the organisers’ commitment to providing participants with reliable information about everything that is current and meaningful in IT security. A high professional level of the talks is ensured by CERT Polska during the paper selection process. Particular emphasis is on practical solutions, analysis of the current threats, latest trends in countering threats as well as important legal issues. Participants have an opportunity to gain the latest knowledge, improve their qualifications and exchange experience with experts.

Recent months proved that exploit kits are still among current threats, not only to individual users as part of opportunistic campaigns, but also targeting large enterprises and national infrastructure when used in water hole attacks. IoT devices have also become attackers’ tool of choice as they are often insufficiently protected. New ransomware families and the popularization of ransomware-as-a-service put individuals as well as businesses, industrial systems and embedded devices at risk. How to fight these threats, as well as all those that are still difficult to detect due to their well thought out and targeted nature? We will be looking for answers to these and many more questions during SECURE 2017.

If you want to share your experience in these topics, or if you are an expert in one of the areas below, this Call for Speakers is for you.

SECURE 2017 will be held on October 24-25 at Airport Hotel Okęcie in Warsaw, Poland.
The conference topics will be roughly grouped in the following tracks:

Presentation topics

We are looking for speakers willing to deliver a talk covering one or more of the following subjects:

malware evolution and analysis, including viruses, worms and botnets

intrusion detection

innovatory honeypot and sandbox applications

Advanced Persistent Threat attacks

monitoring of network threats

security of smartphones and other mobile systems

security events visualisation

security of SCADA/ICS

security of IPv6

cloud security

early warning against network threats

incident handling

standards for security incident data exchange

DDoS attacks and their mitigation

efficiency of methods for mitigation of new attack vectors

open source security tools

protection of online identity

privacy, confidentiality and anonymity

steganography

Polish and European law in regards to computer and information security

law enforcement actions in regards to cybercrime mitigation

research projects in the area of computer and information security

securing the human

Important facts

proposals for presentations must be submitted only via EasyChair: https://easychair.org/conferences/?conf=secure2017

proposals should include at least a title, short abstract, name and bio of the speaker

any questions regarding the submission and selection process should be directed to info@secure.edu.pl

time for presentation: 45 minutes, including Q&A

commercial presentations will not be accepted

all materials should be submitted in one of the following formats: OpenOffice, Microsoft Office, PDF

slides of presentations will be made available to all participants in an electronic version unless strictly prohibited by the speaker

authors of accepted proposals will receive full conference package (workshops not inclusive), but they are responsible for their travel and accommodation

Important dates

Proposals submission until: July 17, 2017 (extended from July 10, 2017)

Acceptance notice by: August 7, 2017

Presentation submission by: October 9, 2017

Lightning talks

We encourage participants of SECURE to share their thoughts. One of the conference blocks will include lightning talks, allowing everyone to talk briefly about their projects, works, ideas or problems. Everything goes, as long as it touches IT security.

Important facts about lightning talks

maximum time for a talk is 5 minutes and total time for all talks will be limited

application for a lightning talk can be submitted at any time after you have registered for the conference or during the conference

the organisers reserve the right to accept or refuse any lightning talk application

We are joining the No More Ransom Project
https://www.cert.pl/en/news/single/we-are-joining-the-more-ransom-project/
Tue, 11 Apr 2017 09:22:32 +0000

From the beginning of April we are officially an Associate Partner of the No More Ransom Project. Its main goal is to fight ransomware by helping victims with free decryption of their files. It is coordinated, among others, by Europol, and it connects law enforcement agencies and private sector companies from around the world. Our main contribution is providing a decryption tool for Cryptomix, Cryptfile2 and Cryptoshield ransomware families, which we described some time ago.

The project already helped more than 10000 victims and now we can also contribute to this effort. We are proud to take part in this initiative.

Sage 2.0 analysis
https://www.cert.pl/en/news/single/sage-2-0-analysis/
Tue, 14 Feb 2017 12:57:26 +0000

Introduction

Sage is a new ransomware family, a variant of CryLocker. Currently it’s distributed by the same actors that are usually distributing Cerber, Locky and Spora.

In this case malspam is the infection vector. Emails from the campaign contain only a malicious zip file without any text. Inside the zip attachment there is a malicious Word document with a macro that downloads and installs the ransomware.

After the ransomware is started, the Windows UAC window is shown repeatedly until the user clicks yes.

At the end the encryption process is started and all files are encrypted:

The ransom message directs us to a panel in the Tor network, but before we can log in we have to solve a captcha:

And finally we are greeted with a “user-friendly” panel:

We can even chat with malware creators:

Interestingly, this ransomware doesn’t remove itself after encryption, but copies itself to %APPDATA%\Roaming directory and re-encrypts all files after every reboot (until the ransom is paid).

Technical analysis

After this short introduction, we’ll focus on the technical side (Sage 2.0 is not a completely generic ransomware, and a few things are rather novel).

The main function of the binary looks like this:

As we see, there is a lot of fingerprinting and checks, though most of them are quite standard. More interesting features include:

Debug switch

Probably something didn’t work on the first try, so there is a debug command line parameter to test that configuration data is set correctly:

And surely enough, this debug parameter does what it should:

Someone probably forgot to remove this from the final version, because this is clearly a debugging feature.

Locale Check

Sage 2.0 creators like some nations more than others:

This checks user keyboard layouts:

next == 0x23 -> Belarusian

next == 0x3F -> Kazakh

next == 0x19 -> Russian

next == 0x22 -> Ukrainian

next == 0x43 -> Uzbek

next == 0x85 -> Sakha
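The check can be modelled as below. On Windows, the low word of a keyboard layout handle holds a language identifier, and its low 10 bits are the primary language ID. Whether Sage extracts the value in exactly this way is an assumption, and the function names are ours:

```python
# Primary language IDs from the list above.
EXEMPT_LANGUAGES = {
    0x23: "Belarusian",
    0x3F: "Kazakh",
    0x19: "Russian",
    0x22: "Ukrainian",
    0x43: "Uzbek",
    0x85: "Sakha",
}


def primary_lang_id(layout_handle: int) -> int:
    """Low word = language identifier; its low 10 bits = primary language."""
    return (layout_handle & 0xFFFF) & 0x3FF


def is_exempt(layout_handles) -> bool:
    """True if any installed keyboard layout belongs to an exempt language."""
    return any(primary_lang_id(h) in EXEMPT_LANGUAGES for h in layout_handles)


# 0x0419 is ru-RU, 0x0415 is pl-PL, 0x0409 is en-US.
print(is_exempt([0x04190419]), is_exempt([0x04150415, 0x04090409]))
```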

We’re a bit disappointed that Polish didn’t make it on the exception list (If Sage creators are reading this: our locale is 0x15).

Location fingerprinting

Sage tries to get its host location by querying maps.googleapis.com with the current SSID and MAC:

Canary file

Before encryption, Sage checks for the existence of a special debug file:

Thanks to this, malware creators don’t have to worry about accidentally running the executable and encrypting their own files.

Finally, if the file is not found, encryption is initiated.

Extension whitelist

Of course, not every file is encrypted – only files with a whitelisted extension are touched:

Encryption

As usual, this is the most interesting thing in ransomware code. Sage 2.0 is especially unusual because it encrypts files with elliptic curve cryptography.

The curve used for encryption is y^2 = x^3 + 486662x^2 + x over the prime field defined by 2^255 – 19, with base point x=9. These values are not arbitrary – this curve is also called Curve25519 and is the state of the art in modern cryptography. Not only is it one of the fastest ECC curves, it is also less vulnerable to weak RNGs, designed with side-channel attacks in mind, avoids many potential implementation pitfalls, and is (probably) not backdoored by any three-letter agency.

Curve25519 is used with a hardcoded public key for shared secret generation. The exact code looks like this (with structures and function names by us):

This looks like properly implemented Elliptic Curve Diffie-Hellman (ECDH) protocol, but without private keys saved anywhere (they are useful only for decryption and malicious actors can create them anyway using their private key).

This may look complicated, but almost all those functions are just wrappers for ECC primitive – named CurveEncrypt by us. For example, computing matching public key is curve25519(secretKey, basePoint) – where basePoint is equal to 9 (one 9 and 31 zeroes).

Shared key computation is very similar, but instead of using constant base point we use public key:

Due to the design of Curve25519, converting between any sequence of random bytes and a secret key is very easy – it’s enough to mask a few bits:

And, also because of this, secret key generation is completely trivial (it’s enough to generate 32 random bytes and convert them to the secret key):
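The key handling described above can be sketched in pure Python. This follows the public X25519 definition from RFC 7748 (bit masking, Montgomery ladder) and is not a transcription of Sage’s code. It is not constant-time and is meant only as an illustration; the names clamp and x25519 are ours:

```python
import os

P = 2**255 - 19
A24 = 121665
BASE_POINT = bytes([9]) + bytes(31)  # one 9 and 31 zeroes


def clamp(random_bytes: bytes) -> bytes:
    """Turn 32 arbitrary bytes into a valid secret key by masking a few bits."""
    k = bytearray(random_bytes)
    k[0] &= 248
    k[31] &= 127
    k[31] |= 64
    return bytes(k)


def x25519(secret: bytes, u_bytes: bytes) -> bytes:
    """Scalar multiplication on Curve25519 (Montgomery ladder, RFC 7748)."""
    k = int.from_bytes(clamp(secret), "little")
    u = int.from_bytes(u_bytes, "little") & ((1 << 255) - 1)
    x1, x2, z2, x3, z3, swap = u, 1, 0, u, 1, 0
    for t in reversed(range(255)):
        bit = (k >> t) & 1
        swap ^= bit
        if swap:
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = bit
        a, b = (x2 + z2) % P, (x2 - z2) % P
        aa, bb = a * a % P, b * b % P
        e = (aa - bb) % P
        c, d = (x3 + z3) % P, (x3 - z3) % P
        da, cb = d * a % P, c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3, z2, z3 = x3, x2, z3, z2
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")


# Key generation: 32 random bytes, then the matching public key.
secret_key = clamp(os.urandom(32))
public_key = x25519(secret_key, BASE_POINT)
```

A shared key is then computed as `x25519(secret_key, other_public_key)`, which is the same operation with the other party’s public key in place of the base point.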

That’s all for the key generation. What about file encryption? Files are encrypted with ChaCha (an unconventional algorithm, again) and the key is appended to the output file – but after being encrypted with Curve25519:

The AppendFileKeyInfo function appends sharedKey and pubKey to the file:

ChaCha is not a very popular algorithm among ransomware creators. It’s very closely related to Salsa20, which was used in Petya ransomware. We don’t know why AES is not good enough for Sage – probably it’s only trying to be different.

In other words, there are two sets of keys + one key pair for every encrypted file:

After ransomware finishes we know only my_public, sh_public, fl_shared, but we need chachakey to actually decrypt the file.

This encryption scheme is quite solid because it makes offline encryption possible – there is no need to connect to the C&C and negotiate encryption keys. The public key is hardcoded in the binary and, thanks to asymmetric cryptography, decryption is infeasible without the attackers’ private key. Assuming that the malware creators didn’t make any drastic implementation mistakes (and we have no reason to suspect that they did), recovery of encrypted files is impossible. Of course, it’s always possible that the master encryption key will eventually be leaked or released.

Nymaim revisited
https://www.cert.pl/en/news/single/nymaim-revisited/
Mon, 30 Jan 2017 10:02:27 +0000

Introduction

Nymaim was discovered in 2013. At that time it was only a dropper used to distribute TorrentLocker. In February 2016 it became popular again after incorporating leaked ISFB code, dubbed Goznym. This incarnation of Nymaim was interesting for us because it gained banking capabilities and became a serious threat in Poland. Because of this, we researched it in depth and we were able to track Nymaim activities since then.

However a lot of things have changed during the last two months. Most notably, Avalanche fast-flux network (which was central to Nymaim operations) was taken down and that struck a serious blow to Nymaim activity. For two weeks everything went silent and even today Nymaim is a shadow of its former self. Although it’s still active in Germany (with new injects), we haven’t observed any serious recent activity in Poland.

Obfuscation

This topic is really well researched by other teams, but it’s still interesting enough to be worth mentioning. Nymaim is heavily obfuscated with a custom obfuscator – to the point that analysis is almost impossible. For example typical code after obfuscation looks like this:

But with some effort we can make sense of it. There are a lot of obfuscation techniques used, so we’ll cover them one by one:

First of all, registers are usually not pushed directly onto the stack, but helper function “push_cpu_register” is used. For example push_cpu_register(0x53) is equivalent to pushing ebx and push_cpu_register(0x50) is equivalent to pushing eax. Constants are not always the same, but registers are always in the same order (standard x86 ordering).
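Recovering the pushed register from the helper’s argument is a simple table lookup. A minimal sketch, assuming the argument is a per-binary base value plus the register index in standard x86 order; the default base of 0x50 matches the example above, and the function name is ours:

```python
# Standard x86 register ordering, as used by the push opcodes 0x50..0x57.
X86_REGISTERS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_pushed_register(constant: int, base: int = 0x50) -> str:
    """Map a push_cpu_register argument to a register name.

    base models the fact that the constants differ between samples.
    """
    index = constant - base
    if not 0 <= index < len(X86_REGISTERS):
        raise ValueError("constant does not map to a register")
    return X86_REGISTERS[index]


print(decode_pushed_register(0x53), decode_pushed_register(0x50))
```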

The constant used in the example is 8CBFB5DA, but there’s nothing special about it – it’s a random dword value, generated just for the purpose of obfuscating this constant. The only thing that matters is the result of the operation (0x25 in this case).

Additionally, other similar obfuscating functions are sometimes used – for example sub_*_from_eax and add_*_to_eax.

Last but not least, the control flow is heavily obfuscated. There are a lot of control flow obfuscation methods used, but they all boil down to a simple transformation – call X and jmp X are transformed into at least two pushes. This obfuscation is in fact very similar to the previous one – instead of jumping to 0x42424242, the malware calls the function detour with two parameters: 0x40404040 and 0x02020202. The detour adds its parameters and jumps to the result. In pseudoasm instead of:

we have:

There is also a slight variation of this method – instead of pushing two constants, sometimes only one constant is pushed and the machine code after the call opcode is used in place of the second constant (detour uses the return address as a pointer to the second constant).
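Resolving a detour statically amounts to a 32-bit addition. The sketch below covers both variants; for the second one it assumes the constant is stored as a little-endian dword right at the return address, which is our assumption rather than something stated above:

```python
MASK32 = 0xFFFFFFFF


def detour_target(first: int, second: int) -> int:
    """Real jump target: sum of both pushed constants, wrapped to 32 bits."""
    return (first + second) & MASK32


def detour_target_inline(first: int, code: bytes, return_offset: int) -> int:
    """Variant where the second constant lives in the bytes after the call.

    return_offset is the offset of the return address inside code.
    """
    second = int.from_bytes(code[return_offset:return_offset + 4], "little")
    return detour_target(first, second)


print(hex(detour_target(0x40404040, 0x02020202)))
```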

To sum up, previously pasted obfuscated code should be read like this:

With this in mind, we created our own deobfuscator. This was quite a long time ago and since then other solutions have shown up. Our deobfuscator probably isn’t the best, but it is easily modifiable for our needs and it has some unique (as far as we know) features that we need, for example recovering imports and decrypting encrypted strings stored in the binary. Other deobfuscators include mynaim and ida-patchwork. Nevertheless, with our deobfuscator we are able to untangle that messy code into something manageable:

Nymaim’s obfuscation capabilities are far from exhausted. For example, external functions are not called directly; an elaborate wrapper is used instead:

This wrapper pushes hash of function name on the stack and jumps to the next dispatcher (even though call opcode is used, this code never returns here):

A second dispatcher pushes hash of a dll name on the stack and jumps to the helper function:

And finally real dispatcher is executed:

Additionally, real return address from API is obfuscated – return address is set to call ebx somewhere in the ntdll (real return address is somewhere in ebx by then, of course). Most tools are very confused by it. Let’s just say, it’s very frustrating when debugging and/or single stepping.

But wait, there’s more! As we have seen, short constants are obfuscated with simple mathematical operations, but what about longer constants, for example strings? Fear not, malware authors have a solution for that too. Almost every constant used in the program is stored in a special data section. When Nymaim needs to use one of those constants, it uses a special encrypted_memcpy function. At heart it is not very complicated:

The inner workings of memcpy_and_decrypt are not that complicated either. Our reimplementation of the encryption algorithm in Python is only a few lines long:

We only need to extract constants used for the encryption (they differ between executables) – they are hidden in these portions of code:

(These functions are not obfuscated, so extraction can be done with simple pattern matching).

But encrypting every constant was not good enough. The malware authors decided that they could do better than that – why not encrypt the code too? This is not used very often, but a few critical functions are stored encrypted and decrypted just before being called. Quite an unusual approach, that’s for sure. OK, let’s leave obfuscation at that.

Static Configuration

After deobfuscation, the code is easier to analyze and we can get to interesting things. First of all, we’d like to extract static configuration from binaries, especially things like:

C&C addresses

DGA hashes

Encryption keys

Malware version

Other stuff needed for communication

How hard can that be? It turns out to be harder than it looks, because this information is not simply stored in the encrypted data section.

Fortunately, this time the encryption algorithm is rather simple.

We just need to point nymaim_config_crypt to the start of encrypted static config and everything will just work.

How do we know where the static config starts? Well… We tried a few clever approaches (matching code, etc.), but they weren’t reliable enough for us. Finally, we solved this problem with the simplest possible solution – we try every possible offset in the binary and attempt to decrypt from there. This may sound dumb (and it is), but with a few trivial heuristics (a static config won’t take 3 bytes of space, nor will it take 3 megabytes) this is quite fast – less than 1 s on a typical binary – and works every time.

After decrypting the static config we get a structure which is quite nice and easy to parse. It consists of multiple consecutive “chunks”, each with an assigned type, length and data (for those familiar with file formats, this is very similar to PNG, WAV, or any other chunk-based format such as RIFF).

Graphically this looks like this:

And chunks are laid out consecutively in the static config block:

So we can quickly traverse through all chunks of a static config with a simple five-liner:
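A traversal of this kind might look as follows. The field widths are not given above, so this sketch assumes a 4-byte chunk type followed by a 4-byte little-endian length; both the layout and the name iter_chunks are assumptions made for illustration:

```python
import struct


def iter_chunks(blob: bytes):
    """Yield (chunk_type, data) pairs from consecutive type/length/data chunks.

    Assumed layout: <u32 type><u32 length><length bytes of data>.
    """
    pos = 0
    while pos + 8 <= len(blob):
        chunk_type, length = struct.unpack_from("<II", blob, pos)
        pos += 8
        yield chunk_type, blob[pos:pos + length]
        pos += length


# Hypothetical two-chunk config, built only to exercise the parser.
demo = struct.pack("<II", 0x11223344, 3) + b"abc" + struct.pack("<II", 0x55667788, 0)
for chunk_type, data in iter_chunks(demo):
    print(hex(chunk_type), data)
```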

Snippet from process_chunk (hash == chunk_type):

After initial parsing the static config looks like this:

(By the way, in this article chunk types are usually represented in network byte order, i.e. big endian.)

And in a more human readable form with most interesting chunks interpreted:

Infection timeline

There is more than one “kind” of Nymaims. As of now we distinguish between three kinds:

dropper – first Nymaim that gets executed on the system. This is the only type distributed directly to victims.

payload – module responsible for most of the “real work” – web injects for example

bot_peer – module responsible for P2P communication. It tries to become supernode in the botnet.

These are all one kind of malware and all of them share the same codebase, except few specialized functions. For example our static config extractor works on all of them, just like our deobfuscator and they all use the same network protocol.

Dropper role is simple. It performs few sanity checks – for example:

Makes sure that it’s not virtualized or incubated

Compares current date to “expiration time” from static config

Checks that DNS works as it should (by trying to resolve microsoft.com and google.com)

If something isn’t right, the dropper shuts down and the infection doesn’t happen.

The second check is especially annoying, because if you want to infect yourself Nymaim has to be really “fresh” – older executables won’t work. Even if you override check in the binary, this is also validated server-side and the payload won’t be downloaded.

If we want to connect to a Nymaim instance, we need to know the IP address of peer/C&C. Static config contains (among others) two interesting pieces of information:

DNS server (virtually always it’s 8.8.8.8 and 8.8.4.4).

C&C domain name (for example ejdqzkd.com or sjzmvclevg.com).

Nymaim is resolving that domain, but returned A records are not real C&C addresses – they are used in another algorithm to get a real IP address. We won’t reproduce that code here, but there is a great article from Talos on that topic. If someone is interested only in the DGA code, it can be found here:

DGA

Payload is very different from dropper when it comes to network communication:

No hardcoded domain

But has DGA

And P2P

The payload’s DGA algorithm is really simple – characters are generated one by one with simple pseudo-random function (variation of xorshift). Initial state of DGA depends only on seed (stored in static config) and the current date, so we can easily predict it for any given binary. Additionally, researchers from Talos have bruteforced valid seeds, simplifying the task of domain prediction even more.

P2P

First of all, a few examples of why we suspected from the start that there is something else besides DGA:

We took one of our binaries that hadn’t behaved like the payload, unpacked it, deobfuscated and reverse engineered it. But even without in-depth analysis, we found a lot of hints that P2P may be happening. For example, we can find strings typical for adding an exception to Windows Firewall (and of course – that’s what the malware did when executed on a real machine).

Another suspicious behavior is opening ports on a router with the help of UPnP. Because of this, infected devices from around the world can connect to it directly.

And finally something even more outstanding. As we have seen, the malware presents itself as the Nginx in the “Server” header. Where does this header come from? Directly from the binary:

We implemented a tracker for the botnet (more about that later) and, with the data we obtained, we concluded that this is probably a single botnet, but with geolocated injects (for example Polish and US injects are very similar). The distribution of IPs we found is similar to what other researchers have determined (we have found more PL nodes and fewer US nodes than others, but that’s probably because the botnet is geolocated and we were more focused on Poland).

49.9% (~7.5k) of the supernodes we found were in Poland, 30% (~4.5k) in Germany and 15.7% (~2.2k) in the US.

Network protocol

And now for something more technical. This is an example of a typical Nymaim request (P2P and C2 communication use the same protocol internally):

Host header is taken from the static config

Randomized POST variable name and path

POST variable value = encrypted request (base64 encoded)

User-Agent and the rest of the headers are generated by WinHTTP (so the headers are not distinctive, and Nymaim network requests cannot be detected from headers alone).
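The request shape described above can be sketched roughly as follows. This is only an illustration: the rules Nymaim uses to randomize the path and the variable name are not described here, so the lengths, the lowercase alphabet and the `.php` suffix are assumptions, and `build_request` is a hypothetical helper. The encrypted request is passed in as opaque bytes.

```python
import base64
import random
import string
from urllib.parse import urlencode


def build_request(host, encrypted, rng=random):
    """Return (path, headers, body) for a Nymaim-like POST request."""

    def rand_word(lo, hi):
        # Assumption: random lowercase ASCII words of variable length.
        size = rng.randint(lo, hi)
        return "".join(rng.choice(string.ascii_lowercase) for _ in range(size))

    path = "/" + rand_word(4, 10) + ".php"  # hypothetical path layout
    variable = rand_word(3, 8)              # randomized POST variable name
    # The POST value is the encrypted request, base64 encoded.
    value = base64.b64encode(encrypted).decode("ascii")
    body = urlencode({variable: value})
    headers = {
        "Host": host,  # taken from the static config
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return path, headers, body


if __name__ == "__main__":
    print(build_request("example.com", b"\x00\x01\xfe\xff"))
```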

Typical response:

This isn’t really Nginx, just pretending.

Everything except the data section is hardcoded

Data = encrypted response

Encrypted messages have a very specific format:

The lower nibble of the first byte is equal to the length of the salt, and the lower nibble of the second byte is equal to the length of the padding. Everything between the salt and the padding is the encrypted message. To decrypt it, we need to concatenate the key with the salt and use the result as the key for the RC4 algorithm.

It can be easily decrypted using Python (but we had to reverse engineer that algorithm first):
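The sketch below follows the format described above; it is a reconstruction for illustration, not the original code. It assumes that the salt starts right after the two header bytes and that the padding sits at the very end of the message. `rc4` is a plain textbook implementation, and `decrypt_message` is a hypothetical name.

```python
def rc4(key, data):
    """Textbook RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def decrypt_message(raw, key):
    """Strip salt and padding, then decrypt with RC4(key + salt)."""
    salt_len = raw[0] & 0x0F  # lower nibble of the first byte
    pad_len = raw[1] & 0x0F   # lower nibble of the second byte
    # Assumption: the salt immediately follows the two header bytes.
    salt = raw[2:2 + salt_len]
    body = raw[2 + salt_len:len(raw) - pad_len]
    return rc4(key + salt, body)
```

Since RC4 is symmetric, the same `rc4` function can be used to build test messages: encrypt with `key + salt`, prepend the two header bytes and the salt, then append the padding.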

After decrypting a message, we get something with a format very similar to the static config (i.e. a sequence of consecutive chunks):

Each chunk has its type, length and raw data:

We can process the decrypted message with almost exactly the same code as the one used for the static config:
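A minimal sketch of such a type/length/data parser could look like this. The field widths and byte order are not stated in the text, so the 4-byte little-endian type and length fields are assumptions, and `parse_chunks` is a hypothetical name rather than the original function.

```python
import struct


def parse_chunks(blob):
    """Split a decrypted message into (chunk_type, data) pairs."""
    chunks = []
    offset = 0
    # Assumption: each chunk starts with two little-endian dwords,
    # the chunk type followed by the length of the raw data.
    while offset + 8 <= len(blob):
        chunk_type, length = struct.unpack_from("<II", blob, offset)
        offset += 8
        data = blob[offset:offset + length]
        if len(data) != length:
            raise ValueError("truncated chunk at offset %d" % offset)
        chunks.append((chunk_type, data))
        offset += length
    return chunks
```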

And this is the basic code used for parsing the message. Each chunk type needs to be processed a bit differently. Interestingly, message parsing is recursive, because some chunk types can contain other lists of chunks, which in turn can contain further lists of chunks, and so on. Unfortunately, the important chunks have another layer of encryption and compression. At the end of an encrypted chunk we can find a special RSA-encrypted (or rather, signed) header. After decrypting (unsigning) the header, we recover an MD5 hash and the length of the decrypted data and, most important of all, the Serpent cipher key used to encrypt the data.
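The recursive part of the parsing can be sketched as below. The chunk type value in `CONTAINER_TYPES` is made up for the example (the real type identifiers are not listed here), the 4-byte little-endian header fields are an assumption, and the depth limit is a defensive choice of this sketch, not something taken from the malware.

```python
import struct

# Hypothetical: chunk types whose data is itself a list of chunks.
CONTAINER_TYPES = {0x1000}


def iter_chunks(blob):
    """Yield (chunk_type, data) pairs from a flat chunk list."""
    offset = 0
    while offset + 8 <= len(blob):
        chunk_type, length = struct.unpack_from("<II", blob, offset)
        data = blob[offset + 8:offset + 8 + length]
        if len(data) != length:
            raise ValueError("truncated chunk")
        yield chunk_type, data
        offset += 8 + length


def parse_tree(blob, container_types=CONTAINER_TYPES, max_depth=8):
    """Parse chunks recursively; container chunks become nested lists."""
    if max_depth < 0:
        raise ValueError("chunk nesting too deep")
    tree = []
    for chunk_type, data in iter_chunks(blob):
        if chunk_type in container_types:
            # The payload of a container is parsed with the same routine.
            data = parse_tree(data, container_types, max_depth - 1)
        tree.append((chunk_type, data))
    return tree
```

The depth limit keeps a malformed or hostile message from driving the parser into unbounded recursion, which matters when the input comes straight from a C&C server or an unknown peer.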

After the decryption we stumble upon another packing method: the decrypted data is compressed with aPLib. The structure is very similar to the one used by ISFB. First comes the magic ‘ARCH’, then the length of the compressed data, the length of the uncompressed data and a crc32. All of these fields are dwords (4 bytes).
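The header layout described above can be read with a few lines of Python. This sketch only splits the header from the compressed body; it does not perform the aPLib decompression, and it does not verify the crc32 because the text does not say which data the checksum covers. Little-endian byte order is an assumption, and `split_arch` is a hypothetical name.

```python
import struct

# magic, compressed length, uncompressed length, crc32 (4 bytes each).
# Assumption: the three numeric fields are little-endian.
ARCH_HEADER = struct.Struct("<4sIII")


def split_arch(blob):
    """Parse the 'ARCH' header and return its fields plus the packed body."""
    magic, packed_len, unpacked_len, crc32 = ARCH_HEADER.unpack_from(blob, 0)
    if magic != b"ARCH":
        raise ValueError("missing ARCH magic")
    body = blob[ARCH_HEADER.size:ARCH_HEADER.size + packed_len]
    if len(body) != packed_len:
        raise ValueError("compressed data is truncated")
    return {
        "packed_len": packed_len,
        "unpacked_len": unpacked_len,
        "crc32": crc32,
        "body": body,  # still compressed; decompress it separately
    }
```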

With this function we finally managed to hit the jackpot. We decrypted all of the interesting artifacts passed over the wire, most importantly additional downloaded binaries, web filters and injects.

Communication

An example request, after dissection, may look like this:

As we can see, quite a lot of things are passed around here. There is a lot of fingerprinting, along with some information about the current state.

Responses are often more elaborate, but for the sake of presentation, let’s dissect a simple one:

An infected machine learns its public IP address, the IP addresses (and listening ports) of its peers and the active domain. Additionally, it is usually ordered to sleep for some time (typically 90 seconds when some files are pending transmission and 280 seconds when nothing special is happening).