Malware Analysis: The Final Frontier

Wednesday, 7 March 2018

Quick Summary

Build Version: 0.0.1(alpha)

Change Type: new feature

Affected Components: API, UI

Short Description: API-side parser logic has been added to process RTF files. The new parser currently provides basic data extraction capabilities. The UI 'Submission' and 'About' pages have been updated to reflect the changes.

Outstanding Tasks: Second development iteration.

Known Issues: Some data obfuscation types are not supported.

Detailed Summary

New code logic has been added to IRIS-H to process Rich Text Format (RTF) files. The 'Submission' page now accepts RTF file uploads and passes them on for further processing, which includes the following:

extract document metadata

identify and parse embedded objects

extract font table

detect languages used in the document

provide description for all extracted data

Currently, the parsing module provides only essential processing. It was tested against a good number of malicious RTF files and appears to be relatively stable, handling the majority of obfuscation techniques. Thanks to @James_inthe_box for providing the samples!
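To make the processing steps above more concrete, here is a minimal Python sketch of two of them: spotting a font table and pulling the hex payload out of embedded-object data. IRIS-H itself is written in JavaScript and its parser is not shown in this post, so the function names (looks_like_rtf, has_font_table, extract_objdata) are hypothetical. The only obfuscation handled here is whitespace scattered through the hex stream; real samples use many more tricks.

```python
import re


def looks_like_rtf(data: bytes) -> bool:
    # RTF documents open with a group that starts with the \rtf control word.
    return data.lstrip().startswith(b"{\\rtf")


def has_font_table(data: bytes) -> bool:
    # The font table lives in a group introduced by the \fonttbl control word.
    return b"\\fonttbl" in data


def extract_objdata(data: bytes) -> list:
    """Return the decoded payload of every objdata group (illustrative only)."""
    payloads = []
    # Capture everything after \objdata up to the next brace or backslash.
    for match in re.finditer(rb"\\objdata\b([^{}\\]*)", data):
        # Drop whitespace and any other non-hex filler inserted into the stream.
        hex_text = re.sub(rb"[^0-9a-fA-F]", b"", match.group(1))
        if len(hex_text) % 2:
            hex_text = hex_text[:-1]  # ignore a dangling half byte
        payloads.append(bytes.fromhex(hex_text.decode("ascii")))
    return payloads


sample = (
    b"{\\rtf1{\\fonttbl{\\f0 Arial;}}"
    b"{\\object\\objemb{\\*\\objdata 0105 0000\n02000000}}}"
)
print(looks_like_rtf(sample), has_font_table(sample), extract_objdata(sample))
```

The decoded bytes would then be handed to whatever parser understands the embedded object format; that step is out of scope for this sketch.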

Monday, 5 February 2018

Quick Summary

Build Version: 0.0.1(alpha)

Change Type: new feature

Affected Components: API, UI

Short Description: API-side code logic has been added to allow submitting zipped files. The industry-standard password 'infected' is supported. The UI 'Submission' and 'About' pages have been updated to reflect the changes.

Outstanding Tasks: None

Known Issues: ZIP files created with Ubuntu 'Archive Manager' throw an error.

Detailed Summary

Code logic has been added to IRIS-H to extract files from ZIP archives. The 'Submission' page now accepts ZIP file uploads and performs the following operations on them:

identify if the file is a Microsoft Office document in OOXML format

identify the number of files in the archive

identify if the password is set

identify the unpacked size of the compressed file contained in the archive

identify if the archive file is 'nested'

The following restrictions and limitations are applied:

ZIP file must contain a single file

if ZIP file password is enabled it must be set to 'infected'

unpacked size of the compressed file contained in the archive must not exceed 10MB

ZIP 'nesting' must not exceed 2 levels (ZIP-in-a-ZIP)

ZIP file size must not exceed 4MB*

* 4MB ZIP file size limit is enforced by the underlying technology employed to handle the file extraction. More on this in the following section.
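The checks and limits listed above can be sketched in Python with the standard zipfile module. This is not the IRIS-H implementation (which, as described in the next section, runs SharpZipLib inside an AWS Lambda function); inspect_zip and the constant names are made up for illustration. Python's zipfile can only decrypt the legacy ZipCrypto scheme, so AES-encrypted archives are outside the scope of this sketch.

```python
import io
import zipfile

MAX_ZIP_BYTES = 4 * 1024 * 1024         # size limit for the submitted archive
MAX_UNPACKED_BYTES = 10 * 1024 * 1024   # size limit for the file inside it
MAX_NESTING = 2                         # ZIP-in-a-ZIP is the deepest allowed
PASSWORD = b"infected"


def inspect_zip(data: bytes, level: int = 1) -> dict:
    """Apply the submission rules to a ZIP file held in memory."""
    if level == 1 and len(data) > MAX_ZIP_BYTES:
        raise ValueError("ZIP file size must not exceed 4 MB")
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        entries = [e for e in archive.infolist() if not e.is_dir()]
        # OOXML documents are ZIP packages that carry this well-known part.
        if any(e.filename == "[Content_Types].xml" for e in entries):
            return {"ooxml": True, "files": len(entries)}
        if level > MAX_NESTING:
            raise ValueError("ZIP nesting must not exceed 2 levels")
        if len(entries) != 1:
            raise ValueError("ZIP file must contain a single file")
        entry = entries[0]
        encrypted = bool(entry.flag_bits & 0x1)  # bit 0 marks an encrypted entry
        if entry.file_size > MAX_UNPACKED_BYTES:
            raise ValueError("unpacked size must not exceed 10 MB")
        payload = archive.read(entry, pwd=PASSWORD if encrypted else None)
    nested = None
    if zipfile.is_zipfile(io.BytesIO(payload)):
        nested = inspect_zip(payload, level + 1)
    return {
        "ooxml": False,
        "files": 1,
        "encrypted": encrypted,
        "unpacked_size": entry.file_size,
        "nested": nested,
    }
```

Note that the unpacked size is read from the archive headers here; a hardened service would also cap the number of bytes actually decompressed.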

What's under the hood?

Disclaimer: The choice of technology used to implement ZIP file support was driven mainly by a desire to learn it. Another contributing factor, though, is the lack of good NodeJS libraries that handle password-protected ZIP files.

IRIS-H API and UI components are written in different flavours of JavaScript. Originally, I was looking to implement ZIP file support using a JS library, but to my surprise I couldn't find one with proper support for different compression and encryption types. I realized it would have to be implemented in a different programming language, but integrating that with the rest of the service and its infrastructure seemed challenging until I decided to look into using AWS Lambda.

AWS Lambda supports a number of programming languages, including C# with .NET Core 2.0, which opens up a good number of possible solutions. I settled on SharpZipLib, a library that supports most of the compression and encryption methods. Building an AWS Lambda function turned out to be a rather easy task. The most challenging part was dealing with the 'RequestResponse' payload size limit enforced by the 'Invoke' function. The only solution I could find was to apply the ZIP file size limit at submission time. It's currently set to 4 MB because of Lambda's 6 MB limit. The 2 MB difference leaves room for the base64 encoding applied to the submitted ZIP file when it is sent to the Lambda function; base64 grows data by about a third, so a 4 MB file becomes roughly 5.3 MB.
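The size arithmetic behind the 4 MB limit can be checked with a few lines of Python. Base64 turns every 3 input bytes into 4 output characters (with padding), so the encoded size is easy to compute. The constant names below are illustrative, and the sketch assumes the limits are counted in binary megabytes.

```python
import base64

MIB = 1024 * 1024
LAMBDA_PAYLOAD_LIMIT = 6 * MIB  # synchronous invocation limit mentioned above
UPLOAD_LIMIT = 4 * MIB          # limit applied at submission time


def base64_size(raw_bytes: int) -> int:
    """Length of the padded base64 encoding of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


# Sanity check of the formula against the real encoder.
assert len(base64.b64encode(bytes(10))) == base64_size(10)

encoded = base64_size(UPLOAD_LIMIT)
headroom = LAMBDA_PAYLOAD_LIMIT - encoded
print(f"encoded: {encoded / MIB:.2f} MiB, headroom: {headroom / MIB:.2f} MiB")
```

The remaining headroom has to cover everything else in the request payload, which is why the limit cannot be pushed much closer to 4.5 MB.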

Testing with ZIP files of different sizes shows that it takes about 10 seconds on average to process a 4 MB ZIP file. Those under 1 MB are processed with almost no delay.

Like the rest of the service, this new feature is experimental and requires more thorough testing. I'd appreciate any feedback.

Sunday, 10 December 2017

Quick Summary

Build Version: 0.0.1(alpha)

Change Type: feature update

Affected Components: API

Short Description: A parser for the LNK file "Console Data Block" structure has been added. The parser will attempt to extract all relevant data stored in "Console Data Block" structures, which hold information about the console window.

Outstanding Tasks: None

Detailed Summary

The IRIS-H Shell Link (.LNK) file parser has been updated to include a data extraction routine for "Console Data Block" structures. The ConsoleDataBlock structure specifies the display settings to use when a link target specifies an application that is run in a console window. Below are some examples of the data stored in these structures:

foreground and background text colors in the console window.

foreground and background text colors in the console window popup.

console window buffer size.

console window size.

console window origin coordinates.

font information.

cursor information.

edit settings.

The screenshot below shows an example of "Console Data Block" data extracted by IRIS-H.
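For readers curious about the layout, here is a small Python sketch that decodes the first 24 bytes of a ConsoleDataBlock: the block size and signature, the two fill attribute words, and the buffer size, window size and window origin pairs. The field order and the constants 0x000000CC and 0xA0000002 follow my reading of the MS-SHLLINK specification and should be verified against it; the function name is hypothetical, and the font, cursor, edit and colour table fields that follow are left out.

```python
import struct

CONSOLE_BLOCK_SIZE = 0x000000CC
CONSOLE_BLOCK_SIGNATURE = 0xA0000002

# BlockSize, BlockSignature, FillAttributes, PopupFillAttributes,
# ScreenBufferSizeX/Y, WindowSizeX/Y, WindowOriginX/Y (all little-endian).
_PREFIX = struct.Struct("<IIHHhhhhhh")


def parse_console_block_prefix(block: bytes) -> dict:
    """Decode the leading fields of a ConsoleDataBlock (illustrative only)."""
    if len(block) < _PREFIX.size:
        raise ValueError("block is too short")
    (size, signature, fill, popup_fill,
     buf_x, buf_y, win_x, win_y, org_x, org_y) = _PREFIX.unpack_from(block)
    if size != CONSOLE_BLOCK_SIZE or signature != CONSOLE_BLOCK_SIGNATURE:
        raise ValueError("not a ConsoleDataBlock")
    return {
        # Low nibble holds the foreground colour bits, high nibble the background.
        "text_colors": {"foreground": fill & 0x0F, "background": (fill >> 4) & 0x0F},
        "popup_colors": {"foreground": popup_fill & 0x0F,
                         "background": (popup_fill >> 4) & 0x0F},
        "buffer_size": (buf_x, buf_y),
        "window_size": (win_x, win_y),
        "window_origin": (org_x, org_y),
    }


# Synthetic block: real field values followed by zero padding up to 0xCC bytes.
demo = _PREFIX.pack(0xCC, 0xA0000002, 0x07, 0xF5, 80, 300, 80, 25, 0, 0)
demo += bytes(CONSOLE_BLOCK_SIZE - len(demo))
print(parse_console_block_prefix(demo))
```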

Monday, 27 November 2017

Quick Summary

Build Version: 0.0.1(alpha)

Change Type: feature update

Affected Components: API & UI (clear browser cache for 'iris-h.service' to see the changes)

Short Description: The parser for LNK files has been updated. Command line arguments string deobfuscation and URL extraction code have been added. The UI Report page has been updated to display the new data.

Outstanding Tasks: None

Detailed Summary

The IRIS-H Shell Link (.LNK) file parser has been updated and now attempts to deobfuscate the command line arguments string. When the command line arguments string is present, the service will attempt the following:

Quick Summary

Build Version: 0.0.1(alpha)

Change Type: new feature

Affected Components: API

Short Description: Parser for OOXML "Footer Part" has been added. The parser detects and extracts text content including special field characters.

Example: https://iris-h.malwageddon.com/report/380710e90e15242de982aede9a62c66e

Outstanding Tasks: None

Detailed Summary

"Footer Part contains the information about a footer displayed for one or more sections. Each Footer part is the target of an explicit relationship in the part-relationship item for the Main Document. Each footer has a corresponding 'ftr' element in a Footer part, which contains the text of the footer." - ECMA-376 Part 1 (section 11.3.6)

A new parser for OOXML 'Footer Part' has been added to IRIS-H. The parser will detect and extract text content including special field characters. The extracted content can be found in a new panel under 'Individual Components' section on the report page. See an example below:

Example of a Footer Part panel showing extracted text content.

If the extracted content includes special field characters, they will be analysed for the presence of blacklisted field character commands. If any are detected, the findings will be shown in the 'Malicious Findings' panel on the report page. Below is the corresponding findings panel:
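As a rough illustration of the two steps described above (text extraction, then a blacklist check on field instructions), here is a Python sketch using the standard XML parser. The blacklist contents (DDE and DDEAUTO) and the function name are assumptions made for this example; the actual IRIS-H list is not published in this post. A production parser would also need to harden XML parsing against hostile input.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
BLACKLIST = {"DDE", "DDEAUTO"}  # assumed for illustration


def parse_footer(xml_bytes: bytes) -> dict:
    """Extract visible text and flag blacklisted field commands."""
    root = ET.fromstring(xml_bytes)
    text = "".join(t.text or "" for t in root.iter(f"{{{W}}}t"))
    # Field instructions can be split across several runs, so the pieces
    # are joined back together before they are tokenised.
    instructions = "".join(i.text or "" for i in root.iter(f"{{{W}}}instrText"))
    for simple in root.iter(f"{{{W}}}fldSimple"):
        instructions += " " + simple.get(f"{{{W}}}instr", "")
    findings = sorted({tok.upper() for tok in instructions.split()
                       if tok.upper() in BLACKLIST})
    return {"text": text, "instructions": instructions.strip(), "findings": findings}


footer = (
    f'<w:ftr xmlns:w="{W}"><w:p>'
    '<w:r><w:t>Page footer</w:t></w:r>'
    '<w:r><w:fldChar w:fldCharType="begin"/></w:r>'
    '<w:r><w:instrText> DDE</w:instrText></w:r>'
    '<w:r><w:instrText>AUTO c:\\\\windows\\\\system32\\\\cmd.exe "/k calc"</w:instrText></w:r>'
    '<w:r><w:fldChar w:fldCharType="end"/></w:r>'
    '</w:p></w:ftr>'
).encode("utf-8")
print(parse_footer(footer))
```

Matching on whole tokens is deliberately naive: an argument that happens to equal a blacklisted word would be flagged as well.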

Thursday, 9 November 2017

Quick Summary

Build Version: 0.0.1(alpha)

Change Type: new feature

Affected Components: API & UI (clear browser cache to see the changes)

Short Description: Parser for OOXML "Relationships" file has been added. The parser detects and extracts hyperlinks to external sources.

Outstanding Tasks: None

Detailed Summary

"Relationships are represented in XML in a Relationships part. Each part in the package that is the source of one or more relationships can have an associated Relationships part. This part holds the list of relationships for the source part." - ECMA-376 Part 2 (section 9.3.3)

Relationships file example

A new parser for OOXML Relationships file has been added to IRIS-H. The parser is configured to read every Relationship in the Relationships file and extract hyperlinks pointed at external sources. See below for an example of a Relationship that will be detected:
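For illustration, the following Python sketch reads a Relationships part and reports every Relationship whose TargetMode attribute is "External". The namespace is the standard Open Packaging Conventions one; the function name and the sample document are made up for the example. Reporting all external targets rather than only hyperlink-type relationships is an assumption of this sketch, not a statement about what IRIS-H does.

```python
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def external_targets(rels_xml: bytes) -> list:
    """List relationships that point outside the package."""
    root = ET.fromstring(rels_xml)
    found = []
    for rel in root.iter(f"{{{REL_NS}}}Relationship"):
        if rel.get("TargetMode") == "External":
            found.append({
                "id": rel.get("Id"),
                # Keep only the last path segment of the Type URI, e.g. "hyperlink".
                "type": rel.get("Type", "").rsplit("/", 1)[-1],
                "target": rel.get("Target"),
            })
    return found


rels = (
    f'<Relationships xmlns="{REL_NS}">'
    '<Relationship Id="rId1" Target="styles.xml" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles"/>'
    '<Relationship Id="rId2" Target="http://example.com/payload" TargetMode="External" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"/>'
    '</Relationships>'
).encode("utf-8")
print(external_targets(rels))
```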

Wednesday, 8 November 2017

Quick Summary

Build Version: 0.0.1(alpha)

Change Type: feature improvement

Affected Components: API

Short Description: The parser for Field Characters used in OLE and OOXML documents has been updated to improve detection. QUOTE, SET and REF field characters have been added to the reporting.

Outstanding Tasks: None

Detailed Summary

Field Character extraction and parsing code has been improved to allow decoding of QUOTE command arguments. The change was motivated by a McAfee blog post published today that references an OOXML document used in an APT-type attack. The XML code snippets from the document below show which field characters are used and how they appear in the code.

QUOTE field character usage example

DDE field character and the way its arguments are assembled

Unlike previous instances of DDE and DDEAUTO field character usage in malicious documents, this document doesn't expose the command arguments that normally contain indicators of compromise. Instead, a combination of other field characters is used to store and assemble the command arguments. The SET command stores the value produced by the QUOTE command, which is later passed to the DDE command through the REF field character. Below is an example of that:

The 'c' variable now holds the output of the QUOTE command (a character string built from the array of character codes). Later, 'c' is referenced in the DDE command call as one of the arguments.

DDE REF c

When the DDE command is called, the value of the 'c' variable will be used as its argument.

IRIS-H field character handlers have been updated to extract the character code array associated with the QUOTE command and decode it. If extraction and decoding are successful, the report page will contain output similar to the example below.

Example of QUOTE command evaluation
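The decoding idea itself is simple and can be sketched in a few lines of Python. The sketch assumes the field instructions have already been flattened into plain strings such as "SET c QUOTE 99 109 100"; in a real document they are nested fields spread across several XML runs. The helper names and the sample character codes are invented for this example and are not taken from the analysed document.

```python
import re


def decode_quote(instruction: str) -> str:
    """Turn 'QUOTE 99 109 100' into the string those character codes spell."""
    match = re.match(r"\s*QUOTE\s+((?:\d+\s*)+)$", instruction, re.IGNORECASE)
    if not match:
        raise ValueError("not a QUOTE instruction with numeric arguments")
    return "".join(chr(int(code)) for code in match.group(1).split())


def resolve_fields(instructions: list) -> list:
    """Evaluate SET assignments, then substitute REF lookups in other fields."""
    variables = {}
    resolved = []
    for instruction in instructions:
        parts = instruction.split(None, 2)
        if not parts:
            continue
        if parts[0].upper() == "SET" and len(parts) == 3:
            name, value = parts[1], parts[2]
            if value.upper().startswith("QUOTE"):
                value = decode_quote(value)
            variables[name] = value
        else:
            # Unknown variable names are left untouched.
            resolved.append(re.sub(
                r"\bREF\s+(\w+)",
                lambda m: variables.get(m.group(1), m.group(0)),
                instruction,
            ))
    return resolved


print(resolve_fields(["SET c QUOTE 99 109 100", "DDE REF c"]))
```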

This method of using field characters presents new challenges, especially around reconstructing the original text in the same sequence as it appears when the document is opened in its host application. IRIS-H will still attempt to extract all the text fields, but the original order of the text cannot be guaranteed.