How RTF malware evades static signature-based detection

History

Rich Text Format (RTF) is a document format developed by Microsoft that has been widely used on various platforms for more than 29 years. The RTF format is very flexible and therefore complicated, which makes developing safe RTF parsers challenging. Some notorious vulnerabilities, such as CVE-2010-3333 and CVE-2014-1761, were caused by errors in implementing RTF parsing logic.

In fact, RTF malware is not limited to exploiting RTF parsing vulnerabilities. Malicious RTF files can include other vulnerabilities unrelated to the RTF parser because RTF supports the embedding of objects, such as OLE objects and images. CVE-2012-0158 and CVE-2015-1641 are two typical examples of such vulnerabilities – their root cause does not reside in the RTF parser and attackers can exploit these vulnerabilities through other file formats such as DOC and DOCX.

Another type of RTF malware does not use any vulnerabilities. It simply contains embedded malicious executable files and tricks the user into launching them. This allows attackers to distribute malware via email, a channel through which executable files generally cannot be sent directly.

Plenty of malware authors prefer to use RTF as an attack vector because RTF is an obfuscation-friendly format. As such, their malware can easily evade static signature-based detection such as YARA or Snort rules. This is a big reason why, in this scriptable exploit era, we still see such large volumes of RTF-based attacks.

In this blog, we present some common evasive tricks used by malicious RTFs.

Common obfuscations

Let’s discuss a couple of different RTF obfuscation strategies.

1. CVE-2010-3333

This vulnerability, reported by Team509 in 2009, is a typical stack overflow bug. Exploitation of this vulnerability is so easy and reliable that it is still used in the wild, seven years after its discovery! Recently, attackers exploiting this vulnerability targeted an Ambassador of India.

The root cause is a stack-based buffer overflow in the procedure of the Microsoft RTF parser that parses the pFragments shape property. By crafting a malicious RTF file that exploits this vulnerability, attackers can execute arbitrary code. Microsoft has since addressed the vulnerability, but because many older versions of Microsoft Office were affected, the threat it posed was very high.

The Microsoft Office RTF parser lacks proper bounds checking when copying source data to a limited stack-based buffer. The pattern of this exploit can be simplified as follows:

Because pFragments is rarely seen in normal RTF files, many firms simply detect this keyword and the oversized number right after \sv in order to catch the exploit using YARA or Snort rules. This method works for samples that are not obfuscated, including samples generated by Metasploit. However, against in-the-wild samples, such signature-based detection is insufficient. For instance, the malicious RTF targeting the Ambassador of India is a good sample to illustrate the downside of signature-based detection. Figure 1 shows this RTF document in a hex editor. We simplified Figure 1 because of space limitations – the initial sample contained plenty of dummy symbols such as { }.

Figure 1. Obfuscated sample of CVE-2010-3333

As we can see, the pFragments keyword has been split into many pieces, which bypasses most signature-based detection. For instance, most anti-virus products failed to detect this sample on first submission to VirusTotal. In fact, not only are the split pieces of \sn combined together, the pieces of \sv are combined as well. The following example demonstrates this obfuscation:

We can come up with a variety of ideas different from the aforementioned sample to defeat static signature-based detection.

Notice the mixed ‘\x0D’ and ‘\x0A’ – they are ‘\r’ and ‘\n’ and the RTF parser would simply ignore them.
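To make the weakness of keyword matching concrete, here is a minimal Python sketch. The two sample strings are hypothetical and harmless (the \sv value is a placeholder, not a working exploit), and the exact way the real sample split its groups may differ from what is shown. The sketch only assumes what the text states: pieces of \sn and \sv are combined, and \r and \n are ignored.

```python
import re

# A naive static signature: the keyword must appear in one piece after \sn.
NAIVE_SIGNATURE = re.compile(rb"\\sn\s+pFragments")

# Hypothetical helpers: capture the text of every {\sn ...} / {\sv ...} group.
SN_PIECE = re.compile(rb"\{\\sn\s*([^{}]*)\}")
SV_PIECE = re.compile(rb"\{\\sv\s*([^{}]*)\}")


def join_pieces(pattern, rtf):
    """Concatenate all captured pieces and drop CR/LF, as the parser would."""
    joined = b"".join(pattern.findall(rtf))
    return joined.replace(b"\r", b"").replace(b"\n", b"")


# Illustrative samples only; the \sv value is a made-up placeholder.
plain_sample = rb"{\sp{\sn pFragments}{\sv 1;1;ffffffff}}"
split_sample = (
    b"{\\sp{\\sn pF}{\\sn rag\r\n}{\\sn ments}"
    b"{\\sv 1;1;}{\\sv ffff\nffff}}"
)

print(NAIVE_SIGNATURE.search(plain_sample) is not None)  # signature hits
print(NAIVE_SIGNATURE.search(split_sample) is not None)  # signature misses
print(join_pieces(SN_PIECE, split_sample))               # rebuilt property name
print(join_pieces(SV_PIECE, split_sample))               # rebuilt property value
```

A detector that first rebuilds the property name and value in this way, and only then applies its signature, is harder to bypass with simple splitting.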

2. Embedded objects

Users can embed a variety of objects into RTF, such as OLE (Object Linking and Embedding) control objects. This makes it possible for OLE-related vulnerabilities such as CVE-2012-0158 and CVE-2015-1641 to be accommodated in RTF files. In addition to exploits, it is not uncommon to see executable files such as PE, CPL, VBS and JS embedded in RTF files. These files require some form of social engineering to trick users into launching the embedded objects. We have even seen some Data Loss Prevention (DLP) solutions embedding PE files inside RTF documents. It’s a bad practice because it cultivates poor habits in users.

<objtype> specifies the type of object. \objocx is the most common type used in malicious RTFs for embedding OLE control objects; as such, let’s take it as an example. The data right after \objdata is OLE1 native data, defined as:

<data>      (\binN #BDATA) | #SDATA
#BDATA      Binary data
#SDATA      Hexadecimal data

Attackers would try to insert various elements into the <data> to evade static signature detection. Let’s take a look at some examples to understand these tricks:

a. For example, \binN can be swapped with #SDATA. The data right after \binN is raw binary data. In the following example, the numbers 123 will be treated as binary data and hence translated into hex values 313233 in memory.
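The equivalence between \binN data and hexadecimal data can be sketched in Python as follows. This is a simplified illustration, not Word's parser: the function name expand_bin is hypothetical, the input `\bin3 123` is a reconstruction based on the description above, and the count is parsed with Python's ordinary int, so the oversized-parameter trick discussed next is not modeled.

```python
import re

# \binN followed by an optional single space delimiter.
BIN = re.compile(rb"\\bin(\d+) ?")


def expand_bin(objdata):
    """Rewrite every \\binN run as the hex text of its N raw bytes."""
    out = bytearray()
    pos = 0
    while True:
        match = BIN.search(objdata, pos)
        if match is None:
            out += objdata[pos:]
            return bytes(out)
        out += objdata[pos:match.start()]       # hex text before \binN
        count = int(match.group(1))             # N: number of raw bytes
        raw = objdata[match.end():match.end() + count]
        out += raw.hex().encode("ascii")        # b"123" -> b"313233"
        pos = match.end() + count


print(expand_bin(rb"\bin3 123"))
```

Normalizing \binN runs into hex text before scanning gives a signature one canonical form to match against.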

If we try to call atoi or atol with the numeric parameter string marked in red in the table above, we will get 0x7fffffff while its true value should be 3.

This happens because \bin takes a 32-bit signed integer numeric parameter. You would think that the RTF parser calls atoi or atol to convert the numeric string to an integer; however, that’s not the case. Microsoft Word’s RTF parser does not use these standard C runtime functions. Instead, the atoi function in Microsoft Word’s RTF parser is implemented as follows:
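The difference between the two conversions can be illustrated with a short Python sketch. It rests on an assumption that is consistent with the behavior described above but is not taken from Word's code: the custom routine accumulates digits as value * 10 + digit in a 32-bit register without any overflow check, so the result wraps modulo 2^32. The saturating variant models the 0x7fffffff result reported above for atoi and atol. Both function names and the sample string are hypothetical, and sign handling is omitted.

```python
INT32_MAX = 0x7FFFFFFF


def saturating_atoi(digits):
    """Model of the runtime behavior reported above: clamp on overflow."""
    return min(int(digits), INT32_MAX)


def wrapping_atoi(digits):
    """Assumed model of the custom routine: 32-bit accumulate, no check."""
    value = 0
    for ch in digits:
        value = (value * 10 + int(ch)) & 0xFFFFFFFF  # wrap to 32 bits
    return value


# 4294967299 is 2**32 + 3, a made-up oversized parameter.
oversized = "4294967299"
print(hex(saturating_atoi(oversized)))
print(wrapping_atoi(oversized))
```

A scanner that converts the parameter with a saturating routine would look for far more binary bytes than the parser actually consumes, which is what makes this trick effective.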

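b. \ucN and \uN
Normally, \ucN sets how many fallback characters a reader skips after each \uN Unicode character. Inside objdata, both control words are ignored, and the characters right after \uN are not skipped.

c. The whitespace characters 0x0D (\r), 0x0A (\n) and 0x09 (\t) are ignored.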

d. Escaped characters
RTF has some special symbols that are reserved. For normal use, users will need to escape these symbols. Here's an incomplete list:

\}
\{
\%
\+
\-
\\
\'hh

All of those escaped characters are ignored, but there’s an interesting situation with \'hh. Let’s look into an example first:

Obfuscated

{\object\objocx\objdata 341\'112345 }

Clear

{\object\objocx\objdata 342345}

When parsing \'11, the parser treats the 11 as an encoded hex byte and discards it before it continues parsing the rest of objdata. The 1 preceding \'11 is discarded as well. That 1 is the high nibble (upper 4 bits) of an octet; when the parser encounters \'11 immediately after it, the internal state for decoding the hex string into binary bytes is reset, so the pending high nibble is lost.

The table below shows the processing procedure; the two 1s in the yellow rows are from \'11. It’s clear that the interleaved \'11 disrupts the state variable, which causes the high nibble of the second byte to be discarded:
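The state handling described above can be sketched in Python. This is a simplified model rather than Word's implementation: the function name is hypothetical, only hex digits, whitespace and \'hh are handled, and any other character is simply skipped.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def decode_objdata_hex(text):
    """Decode hex text to bytes; a \\'hh escape resets the nibble state."""
    out = bytearray()
    high = None  # pending high nibble, or None when nothing is buffered
    i = 0
    while i < len(text):
        if text.startswith("\\'", i):
            # The escaped byte is dropped and any pending high nibble is lost.
            high = None
            i += 4  # skip the backslash, the quote and the two hex digits
            continue
        ch = text[i]
        if ch in HEX_DIGITS:
            if high is None:
                high = int(ch, 16)            # first digit: high nibble
            else:
                out.append((high << 4) | int(ch, 16))
                high = None                   # byte complete
        # \r, \n, \t and anything else fall through and are ignored
        i += 1
    return bytes(out)


print(decode_objdata_hex("341\\'112345").hex())
```

Running the decoder on the obfuscated string from the example yields the same bytes as the clear form, because the 1 before \'11 never gets paired with a low nibble.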

e. Oversized control word and numeric parameter
The RTF specification says that a control word’s name cannot be longer than 32 letters and the numeric parameter associated with the control word must be a signed 16-bit integer or signed 32-bit integer, but the RTF parser of Microsoft Office doesn’t strictly obey the specification. Its implementation only reserves a buffer of size 0xFF for storing the control word string and the numeric parameter string, both of which are null-terminated. All characters after the maximum buffer length (0xFF) will not remain as part of the control word or parameter string. Instead, the control word or parameter will be terminated.

In the first obfuscated example, the length of the oversized control word is 0xFE. With its null terminator added, the control word string reaches the maximum length of 0xFF, so the remaining data belongs to objdata.

For the second obfuscated example, the total length of the “bin” control word and its parameter is 0xFD. With their two null terminators added (one for the control word, one for the parameter), the length equals 0xFF.
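The buffer arithmetic above can be checked with a small Python model. It is only a sketch under stated assumptions: the control word and its numeric parameter share one 0xFF-byte buffer, each string costs one extra byte for its null terminator, and whatever does not fit is left over as data. The function name split_control and the sample inputs are hypothetical, and negative parameters are not modeled.

```python
BUFFER_SIZE = 0xFF  # shared buffer size stated in the text


def split_control(text):
    """Split the characters after a backslash into (name, param, rest)."""
    name_end = 0
    # The name may grow while name + terminator still fits in the buffer.
    while (name_end < len(text)
           and text[name_end].isascii() and text[name_end].isalpha()
           and name_end + 1 < BUFFER_SIZE):
        name_end += 1
    used = name_end + 1  # name plus its null terminator
    param_end = name_end
    # The parameter may grow while digits + its terminator still fit.
    while (param_end < len(text)
           and text[param_end].isdigit()
           and used + (param_end - name_end) + 1 < BUFFER_SIZE):
        param_end += 1
    return text[:name_end], text[name_end:param_end], text[param_end:]


# A 0xFE-letter control word leaves no room for a parameter.
name, param, rest = split_control("a" * 0xFE + "5555")
print(len(name), repr(param), rest)

# "bin" plus a 0xFA-digit parameter totals 0xFD characters.
name, param, rest = split_control("bin" + "0" * 0xF9 + "3" + "4142")
print(name, len(name) + len(param), rest)
```

In both cases the characters that no longer fit are returned as the remainder, which matches the observation that they end up being treated as objdata.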

f. Additional techniques

The program uses the last \objdata control word in a list, as shown here:

Obfuscated

{\object\objocx\objdata 554564{\*\objdata 4444}54545} OR

{\object\objocx\objdata 554445\objdata 444454545} OR

{\object\objocx{{\objdata 554445}{\objdata 444454545}}}

Clear

{\object\objocx\objdata 444454545}

As we can see here, except for \binN, other control words are ignored:

Obfuscated

{\object\objocx\objdata 44444444{\par2211 5555}6666} OR

{\object\objocx\objdata 44444444{\datastore2211 5555}6666} OR

{\object\objocx\objdata 44444444\datastore2211 55556666} OR

{\object\objocx\objdata 44444444{\unknown2211 5555}6666} OR

{\object\objocx\objdata 44444444\unknown2211 55556666}

Clear

{\object\objocx\objdata 4444444455556666}

There is another special case that makes the situation a bit more complicated: the control symbol \*. The RTF specification describes this control symbol as follows:

Destinations added after the 1987 RTF Specification may be preceded by the control symbol \* (backslash asterisk). This control symbol identifies destinations whose related text should be ignored if the RTF reader does not recognize the destination control word.

Let’s take a look at how it can be used in obfuscations:

1.

Obfuscated

{\object\objocx\objdata 44444444{\*\par314 5555}6666}

Clear

{\object\objocx\objdata 4444444455556666}

\par is a known control word that does not accept any data. The RTF parser skips the control word, and only the data that follows remains.

2.

Obfuscated

{\object\objocx\objdata 44444444{\*\datastore314 5555}6666}

Clear

{\object\objocx\objdata 444444446666}

The RTF parser also recognizes \datastore and knows that it accepts data, so the data that follows is consumed by \datastore.

3.

Obfuscated

{\object\objocx\objdata 44444444{\*\unknown314 5555}6666}

Clear

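{\object\objocx\objdata 444444446666}

\unknown is not a recognized control word, so, as the specification describes, the data that follows it inside the \* group is ignored.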
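The rules in this section can be combined into a small normalizer. The Python sketch below is an illustration built only from the examples above, not a full RTF parser: the function name and the TAKES_NO_DATA set are hypothetical, only \par is listed as a recognized word that takes no data, and \binN, \'hh and oversized control words are not handled.

```python
import re

# Recognized control words that take no data (illustrative, incomplete).
TAKES_NO_DATA = {"par"}

TOKEN = re.compile(r"""
    (?P<star>\\\*)                                # control symbol \*
  | (?P<word>\\[A-Za-z]+)(?P<param>-?\d+)?[ ]?    # control word, param, delimiter
  | (?P<open>\{)
  | (?P<close>\})
  | (?P<hex>[0-9A-Fa-f])                          # one hex digit of data
""", re.VERBOSE)


def normalize_objdata(rtf):
    """Return the hex text the parser would keep for the embedded object."""
    out = []
    skip = [False]   # one flag per open group: is its data being dropped?
    star = False     # True right after \* until the next control word
    for m in TOKEN.finditer(rtf):
        if m.group("open"):
            skip.append(skip[-1])
            star = False
        elif m.group("close"):
            if len(skip) > 1:
                skip.pop()
            star = False
        elif m.group("star"):
            star = True
        elif m.group("word"):
            name = m.group("word")[1:]
            if name == "objdata":
                out.clear()          # the last \objdata wins
                skip[-1] = False
            elif star and name not in TAKES_NO_DATA:
                skip[-1] = True      # \* destination or unknown word: drop data
            star = False             # without \*, the control word is ignored
        elif m.group("hex") and not skip[-1]:
            out.append(m.group("hex"))
    return "".join(out)


print(normalize_objdata(r"{\object\objocx\objdata 44444444{\*\datastore314 5555}6666}"))
```

Each obfuscated example in this section reduces to its clear form under this model, which suggests that a scanner should normalize objdata along these lines before applying signatures.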

For an analyst, it’s difficult to manually extract embedded objects from an obfuscated RTF, and no public tool can handle obfuscated RTF. However, winword.exe uses the OleConvertOLESTREAMToIStorage function to convert OLE1 native data to an OLE2 structured storage object. Here’s the prototype of OleConvertOLESTREAMToIStorage:

HRESULT OleConvertOLESTREAMToIStorage(
    LPOLESTREAM lpolestream,
    LPSTORAGE pstg,
    const DVTARGETDEVICE *ptd
);

The object pointed to by lpolestream contains a pointer to the OLE1 native binary data. We can set a breakpoint at OleConvertOLESTREAMToIStorage and dump the object data, which has already been de-obfuscated by the RTF parser:

The last command, .writemem, writes a range of memory to d:\evil_objdata.bin. You can specify any other path; 0e170020 is the start address of the memory range, and 831b6 is its size.

Most of the obfuscation techniques for \objdata also apply to embedded images, but for images there seems to be no equally convenient hook such as OleConvertOLESTREAMToIStorage. To extract an obfuscated picture, use a data breakpoint to locate the RTF parsing code quickly; that will reveal the best point at which to dump the whole data.

Conclusion

Our adversaries are sophisticated and familiar with the RTF format and the inner workings of Microsoft Word. They have managed to devise these obfuscation tricks to evade traditional signature-based detection. Understanding how our adversaries perform obfuscation can in turn help us improve our detection of such malware.

Acknowledgements

Thanks to Yinhong Chang, Jonell Baltazar and Daniel Regalado for their contributions to this blog.