YARA For Everyone: Rules Will Be Rules

Posted on 2019-08-30 by William MacArthur

In our previous article in the series, YARA For Everyone: Sharing is Caring, we did a quick installation of YARA on multiple platforms. We designed a rule template, filled out our first rule, and tested it against a manually created file to find our name via an ASCII string match. That template was deliberately basic: it let us get our feet wet and build a foundation for the next steps of the learning process.

Since YARA rule creation is a highly valuable skill set, we approach the lessons slowly; think of "baby steps" from the movie "What About Bob?" as the approach. In keeping with the spirit of the process, we feel that the next natural step is to learn about the different components that make up a rule and how they are constructed. We start with a condensed version of what can be found in the YARA documentation.

The image below breaks a rule down into its parts without getting overly complex or granular. We feel that it is nice to have as a reference point while going through this article.

Rule Naming Convention:

The naming of a rule should follow common YARA standards but be unique to your use case, so that the purpose of the rule is easy to identify. You and your team can choose any format you like, with two restrictions: the name cannot be one of YARA's reserved keywords (listed in the YARA documentation), and it cannot start with a digit.

Rule name:

Start the new rule by typing "rule", then add the name your team decided on. The name must contain NO spaces; use underscores _ to separate words instead.

Tags:

Tags are identifiers attached to a rule. When running YARA from the command line, they can be combined with options that filter which matching rules are reported. They can also drive more advanced functionality, such as having another tool start work for automation purposes after seeing a specific tag. The rule name and tags sit on the same first line, separated by a colon; we put a space on either side of the colon for visual aesthetics.

Spacing is a personal preference, but starting a new line gives an excellent looking separation between the rule name and the parts of the rule that follow. A left curly brace "{" is required, however, so that the YARA engine can read the rule.
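As a minimal sketch of the first line described above, the rule below shows a name, two tags, and the opening brace. The rule name and tag names are hypothetical, and the condition is only a placeholder so that the snippet compiles on its own.

```yara
// Hypothetical rule name and tags, chosen only for illustration.
rule Example_Rule_Name : example_tag another_tag
{
    condition:
        false // placeholder: this rule never matches; real logic comes later
}
```

Multiple tags are separated by spaces. Assuming the rule is saved in a file such as rules.yar (a made-up name), the command line option -t example_tag should limit the output to rules carrying that tag.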

Metadata:

Descriptive data that pertains to the properties of the rule. This is one of our favorite features, as there are essentially no limitations on the use of metadata. The old saying "less is more" is not true when it comes to metadata. If a rule is started and something else takes priority, the rule is put on hold until we can jump back into creating it. Descriptive metadata makes it easier to pick up where you left off on a half-completed rule.

Author:

An individual's name, company name, group name, or nickname should be added as metadata to help keep track of authors of the rule. If there is more than one author, simply add in author1 on one line and author2 on another line.

Description:

The description identifies the purpose of the rule in a human-readable format for easier understanding.

Created_Date:

The date of the rule's inception.

Updated_Date:

The date when the rule was last updated. This will help with tasks such as auditing later down the line.

Comments:

Add notes specific to the rule. This can come in handy by providing context for the last revision.

References:

This metadata section is good to have so that any references to content related to the detection can be recorded, for example a project, paper, or website.

Version:

This section is useful for keeping track of any changes that have been made to the rule. It is good practice to update the version each time the rule is revised.
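The metadata fields above could be laid out roughly as follows. This is a sketch, not a required schema: the field names are written in lowercase here as a style choice, and every value (author name, dates, reference URL) is a made-up placeholder.

```yara
rule Example_Metadata_Layout
{
    meta:
        // All values below are placeholders for illustration only.
        author       = "Jane Analyst"
        description  = "Finds our name inside files via a text string match"
        created_date = "2019-08-30"
        updated_date = "2019-08-30"
        comments     = "Paused while testing against more samples"
        references   = "https://example.com/related-report"
        version      = 1
    condition:
        false // placeholder: strings and real conditions come later
}
```

Metadata values can be strings, integers, or booleans, and they do not affect what the rule matches.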

If you are following along, then you will notice that we have now made enough progress to start seeing a proper representation of the completed rule example in the first image.

This next portion is where most of the time should be spent when creating a rule. This is the point where having multiple examples (we suggest at least 5) will come in handy for proper testing. If we go back to the first post in this series, YARA For Everyone: Sharing is Caring, we can use the same concept of getting up and running and adjust the rule manually until we match what we desire.

Strings:

Strings are used to search within the content and match. There are three main types of strings that we will be using for our detection logic: hexadecimal, regular expressions, and text strings. Every string identifier starts with a dollar sign "$", which indicates the start of that particular string. All three string types can be represented in multiple forms, which will be covered later in the "YARA For Everyone" series. Keep in mind that YARA rules don't need to be super complex. Just make sure you are using your creative mind and thinking of all the ways the content can be matched as you go through this section.

Hexadecimal (raw byte) strings:

These strings are used to find a series of bytes that match the content you are looking for. They are enclosed in curly brackets and written as two-digit hex values, one per byte.
You can use a tool such as CyberChef to construct some hex strings for testing purposes. You can also dump a file's content from the command line and pick out a hex sequence to search for, or use the command line to search for the hex content you want to match.
{ 69 6e 71 75 65 73 74 } is the example used in the rule; it represents "inquest" in hexadecimal form.
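Placed in a rule, the hex string above might look like the sketch below. The rule name and the string identifier $hexadecimal_string are illustrative choices.

```yara
rule Example_Hex_String
{
    strings:
        // Bytes 69 6e 71 75 65 73 74 spell "inquest" in ASCII.
        $hexadecimal_string = { 69 6e 71 75 65 73 74 }
    condition:
        $hexadecimal_string
}
```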

Regular Expressions (regex):

Regex is useful within rule content because it lets us write special strings that find patterns in large amounts of code easily.
In the strings section, a regex starts with a forward slash / and ends with another forward slash /.
Regex is what makes a lot of folks want to learn how to write YARA rules, as it is super powerful for carving through file content.
If you are not familiar with regex, a good site to practice on is regex101[.]com.
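A small sketch of a regex string inside a rule follows. The pattern is a made-up example (the word "inquest" followed by one to four digits), not something taken from a real detection.

```yara
rule Example_Regex_String
{
    strings:
        // Hypothetical pattern: "inquest" followed by 1 to 4 digits,
        // e.g. "inquest7" or "inquest2019".
        $regex_string = /inquest[0-9]{1,4}/
    condition:
        $regex_string
}
```

Broad patterns can slow scanning down, so it usually helps to anchor a regex on some fixed text, as this one does.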

Text Strings:

ASCII-encoded text strings are what we used in the first post to show that the YARA engine can search for normal words.

Help out both your team and yourself by adding comments to your strings. YARA comments are denoted by two forward slashes "//". Comments are ignored by the YARA engine but are useful for providing context or explanation for the purpose of a string.
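For instance, a text string with a trailing comment could be written as in this sketch. The identifier $text_string and the comment wording are illustrative.

```yara
rule Example_Text_String
{
    strings:
        $text_string = "inquest" // our name; plain text is case-sensitive by default
    condition:
        $text_string
}
```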

String Modifiers:

Modifiers can be added to text strings or regular expressions (regex) to adjust the rule, tightening it or loosening it so that it matches on less or more content. We will focus on the ones used in the rule and demonstrate more in other rules we will create.

Nocase:

Adding the "nocase" keyword to a string makes it case insensitive, meaning it will match uppercase, lowercase, or a mix of both.

Wide:

The "wide" keyword searches for strings that are encoded with two bytes per character. It is often used in malware research, since such strings are common in executable binaries.

Ascii:

The "ascii" keyword searches for ASCII strings, which is the default behavior. It can be used alongside the "wide" modifier to search for both encodings of the same string.

Fullword:

This is used when you want the actual word to match only if it is delimited by non-alphanumeric characters, that is, when it does not appear inside a longer word. In other words, it will match only if it is surrounded by characters such as the following few examples:
- . $ * -+ ~ ; ? = & ! / ( \ ) ‘
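The four modifiers above can be sketched together in one rule. The string identifiers are hypothetical names chosen to describe what each modifier does.

```yara
rule Example_String_Modifiers
{
    strings:
        $any_case       = "inquest" nocase     // matches inquest, INQUEST, InQuest, ...
        $both_encodings = "inquest" ascii wide // one byte or two bytes per character
        $whole_word     = "inquest" fullword   // matches ".inquest/" but not "reinquested"
    condition:
        any of them
}
```

Modifiers go after the closing quote of the string and can be combined on the same line, separated by spaces.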

Conditions:

Conditions are boolean expressions that we use to make the rule logic function as desired. Using strings alongside conditions allows us to unleash our full creativity while writing rules. Since we just created strings, we can get into the conditions and make them work for us by telling the YARA engine how to handle the matches it finds during execution time. There are many ways to build a condition, so we show only some of our favorites here. More will be shown in future blog posts within the "YARA For Everyone" series.

Filesize:

This is one of our favorites to use in order to keep the YARA engine running optimally.
Example showing that we only want to match files under 2MB:
filesize < 2MB
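In a full rule, the size check is typically combined with a string match, as in this sketch. The rule name and string are illustrative.

```yara
rule Example_Filesize_Condition
{
    strings:
        $text_string = "inquest"
    condition:
        // Only report files smaller than 2MB that contain the string.
        filesize < 2MB and $text_string
}
```

Note that filesize is meaningful when scanning files; it is not intended for scans of running processes.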

Sets of strings:

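This is used when you want at least a given number of the strings in a set to match. The set can be every string in the rule ("them") or a list in parentheses, where a trailing * acts as a wildcard on the string name. The count refers to how many different strings match, not how many times a single string occurs.
Example
2 of them
3 of ($hexadecimal_string*)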
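A sketch of a set-based condition is shown below. The three strings are made-up examples, and the second half of the condition uses the # operator, which counts occurrences of a single string, to illustrate the difference between "how many strings" and "how many times".

```yara
rule Example_Sets_Of_Strings
{
    strings:
        $text_string        = "inquest"
        $hexadecimal_string = { 4c 61 62 73 } // "Labs" in ASCII
        $regex_string       = /yara[0-9]/     // hypothetical pattern
    condition:
        // True if at least two different strings match,
        // or if $text_string alone occurs three or more times.
        2 of them or #text_string >= 3
}
```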

Comment Section:

This is an additional special section that people often fail to utilize. Comments are added to provide any additional information that would be useful for deciphering what the rule was designed to match on. The comment section starts with a forward slash and a star symbol "/*" and ends with the same symbols in the opposite order, a star and then a forward slash "*/".

This is ideal because a rule can become quite complex, and without notes you waste time reading the rule and then hunting for matching content just to get a visual representation of what it detects.

The placement of detection specific notes/code snippets/hashes/threat tracking is cleaner when placed at the bottom of the rule so it can be added to the "master rule file" and have proper separation between the rules.
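A sketch of a rule followed by a notes block might look like this. Everything inside the comment is a placeholder; no real hashes or tracking identifiers are implied.

```yara
rule Example_With_Notes
{
    strings:
        $text_string = "inquest"
    condition:
        $text_string
}
/*
Notes (placeholders, fill in for your own rule):
  Designed to match: files that contain our name
  Sample snippet:    <paste a short excerpt of matching content here>
  Sample hashes:     <add hashes of the files used for testing>
*/
```

Because the block sits after the closing brace, it also acts as a visual separator when rules are concatenated into a master rule file.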

Conclusion:

We sure have gone over a lot about the structure of a rule, and we have provided a reference for creating detection content, with multiple options for one's creative mind to test out. It is beneficial to block out time on your calendar to create rules. We are moving along nicely with the series and look forward to getting into some additional rules and moving deeper into YARA.

If you are a little ahead of where we are, that is great. We have put together a new resource with some YARA tools that will be useful for some folks. Check it out at Https://Labs.InQuest.net and give us any feedback. Happy hunting!