Suricata and Snort Signatures 101

The following is a set of tips to help you write good rules, avoid common mistakes, and understand the process of bringing a threat from discovery to signature. Please feel free to edit and add to this page!

General Things to Remember

Write to the Vuln, NOT the Exploit

It's not always possible, but always strive to understand the vulnerability and write the signature to detect that. Do NOT cut and paste from an exploit and expect the result to be a good signature. Exploits change and are easily modifiable. Vulnerabilities, if understood properly, are static.

However, in the event the vulnerability is not understood, writing to the exploit as a TEMPORARY signature is often acceptable.

Write to Eliminate

When writing a signature you should try to use protocol options, payload size, ports and other modifiers to ELIMINATE traffic that you are certain is not of interest. This will make your rules more efficient and accurate.

For example, if you know your target packet is never over 100 bytes, put a dsize:<100; in the rule. That lets the engine eliminate a lot of packets before content matching even begins.
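As a sketch (the rule header, msg, sid, and the BAD1 string are all hypothetical), a dsize check placed alongside the content match lets the engine discard oversized packets cheaply:

alert tcp $EXTERNAL_NET any -> $HOME_NET 21 (msg:"EXAMPLE small FTP packet"; dsize:<100; content:"BAD1"; sid:1000001; rev:1;)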

Test Test Test

When troubleshooting a signature, change or add a single rule option at a time and test. For example, if you wanted to match on BAD1 and BAD2 within FTP, write the rule to match a single item at a time. So start with something like:
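A minimal starting point (the addresses, port, sid, and the BAD1 string are placeholders) might be:

alert tcp any any -> any 21 (msg:"TEST FTP BAD1"; content:"BAD1"; sid:1000002; rev:1;)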

Once you have this, test it on a pcap to make sure it fires, then add each option in turn (flow, depth, within, direction, port numbers, and so on), testing after every change to make sure the rule still works. That way, if the rule suddenly breaks or stops matching, you know the issue lies in the last change. Adding small parts at a time and testing as you go makes troubleshooting considerably easier than debugging the complete rule at once.

PCRE

PCRE is great, but dangerous. You MUST use some other, less expensive match to prequalify packets before applying PCRE. A key practice is to identify some portion of the target payload that you can catch with one or more content matches, and only then follow with the pcre match(es). The net effect is that the expensive PCRE engine is only invoked on packets that have passed previous content inspection, significantly reducing the number of packets the engine spends time on. This technique can and should be combined with the guidance above on "writing to eliminate."
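For example, in this hypothetical C&C check-in rule (the path, parameter, and sid are invented for illustration), the fast content match on the URI prequalifies packets before the anchored pcre is ever evaluated:

alert tcp $HOME_NET any -> $EXTERNAL_NET $HTTP_PORTS (msg:"EXAMPLE CnC checkin"; flow:established,to_server; content:"/checkin.php?id="; http_uri; pcre:"/\/checkin\.php\?id=[0-9a-f]{32}$/U"; sid:1000003; rev:1;)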

A second set of considerations for reducing performance impact from PCRE matches is to craft a pcre match which itself performs efficiently. Some high-level guidelines follow:

Restrict use of the /i modifier unless it is required. The /i modifier specifies that the regular expression engine should ignore character case when matching, and this can be useful in some cases (such as when you have verified that mixed case occurs throughout target payload strings). This modifier does, however, increase the cost of the PCRE, and even slight performance penalties can add up in high-throughput environments. Alternatives to the use of /i include character class matches (for example, /[0-9A-Fa-f]+/ is equivalent to /[0-9a-f]+/i and performs better), or simply avoiding case insensitivity if it is not required.
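For instance, assuming a hypothetical hexadecimal id parameter, the following two matches cover the same strings, but the explicit character class avoids the cost of /i:

pcre:"/id=[0-9a-f]{32}/i";
pcre:"/id=[0-9A-Fa-f]{32}/";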

Use modifiers that target specific sections of payload. The pcre keyword supports several useful options that allow rule writers to limit the overhead of PCRE inspection by targeting specific inspection buffers rather than the entire packet. A commonly used example is the /U modifier, which restricts the pcre match to the normalized URI buffer that is populated by the HTTP preprocessor. Other such modifiers allow the rule writer to granularly restrict PCRE processing and overhead to other significant portions of the packet.
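For example (the path here is hypothetical), pairing an http_uri content match with a /U-restricted pcre keeps both matches confined to the normalized URI buffer:

content:"/download/"; http_uri; pcre:"/\/download\/[a-z]{8}\.exe$/U";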

Attempt to be more specific. Where possible, restrict the use of constructs in your PCRE that may cause the engine to do extra work, such as needless backtracking. An example of an expression that can cause this is /abc.*xyz/ - regex engines default to greedy match semantics when encountering specifiers that match "anything", so in this example the engine would consume through to the end of the input and then have to backtrack (giving back matched elements, gradually moving backward) until it could satisfy the need to match xyz after the other content. Other elements that make for more specific matches include string anchors (i.e. ^ and $), which restrict matches to the beginning or end of the packet or buffer passed to the PCRE engine. With a solid understanding of the threat you're attempting to detect, you will frequently find it possible to use more specific constructs (such as narrower character classes, digits, alphabetic characters, and repetition limiters such as {n,m}).
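As an illustration, rather than the greedy /abc.*xyz/, a bounded character class and an anchor (the 64-byte limit is an arbitrary assumption about the target payload) give the engine far less room to backtrack:

pcre:"/^abc[^\r\n]{0,64}xyz/";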

When attempting to craft PCREs for a given threat, it is often beneficial to gather as many resources describing the threat and its network communications as possible. A larger sample reference set reduces your chances of introducing false positives and/or false negatives caused by niche cases that did not line up with your initial information. For example, when attempting to write a rule designed to detect C&C communications for a malware sample you've encountered, try to identify other sources of data about the same threat. Public sandboxes such as ThreatExpert and Anubis often provide valuable sample data and may be searched either directly or via a search engine. Other resources include malware reports from antivirus vendors and malicious host/URL intelligence databases. Additionally, as you learn more about the threat and see more instances of how it looks on the wire from a multitude of sources, you are likely to gain the confidence to add elements to the PCRE that more precisely match attributes of the traffic and tighten the regex down, often improving the performance of the rule when the PCRE engine is invoked.

The above rule is a second version written to the exploit (not to the vulnerability); we have added the clsid and a distance modifier for the id and createstore.
This rule also fires in some cases matching the vulnerability description, but less often than the first one.

Now this rule works for both exploits.
We have to read the vulnerability description thoroughly to develop a rule that works for the vulnerability, not just for one exploit. Here we have covered only one method, but there may be other methods that are vulnerable.
If there were another vulnerable method called 'Retrivestore', what would the next version of the rule look like?

This is unfortunately a very computationally expensive rule: it includes two PCREs. This signature would be best split into two rules, since content matches are far less expensive than PCRE.

Common Mistakes

Distance

A distance modifier applies to the content match immediately preceding it, and makes that match relative to the end of the content match before that. So distance can appear only after the second or later content match in a rule, and it relates the two prior matches to each other.

Normally every content match is evaluated from the beginning of the packet, NOT from the end of the last match. If you want to specify that a string should occur only after the previous match, use:

content:"match1"; content:"match2"; distance:0;

This means the second match cannot begin until after the end of the first.
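Adding within pins the second match to a window relative to the first; in this illustrative example (the strings and the 20-byte window are placeholders), BAD1 must be found within the 20 bytes that follow "USER ":

content:"USER "; content:"BAD1"; distance:0; within:20;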

Flow

Always use flow when possible. It'll help the engine eliminate a lot of traffic.
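For example (the header, msg, and sid are hypothetical), flow:established,to_server restricts inspection to client-to-server traffic on established sessions, so the engine never evaluates the content match against server responses or unestablished traffic:

alert tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"EXAMPLE outbound checkin"; flow:established,to_server; content:"BAD1"; sid:1000004; rev:1;)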

References

Do NOT put http:// into a reference string; it is assumed when you use the url reference type.

User-agent references

Before writing a signature for a User-Agent, please consult the following link to make sure it's unique:

HTTP Rules

Proxy vs. Direct

Direct connections typically use a relative URI ("GET /open/ HTTP/1.1\r\n"), and proxy connections use an absolute URI ("GET http://rules.emergingthreats.net/open/ HTTP/1.1\r\n"). Therefore, rules written with a content match of "GET /..." will fail to match on a proxy connection. Split this out into separate http_method and http_uri matches instead.
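A proxy-safe sketch of such a rule (the URI and sid are placeholders) separates the method and URI into their own buffers, so the match works whether the client sends a relative or an absolute URI:

alert tcp $HOME_NET any -> $EXTERNAL_NET $HTTP_PORTS (msg:"EXAMPLE GET open"; flow:established,to_server; content:"GET"; http_method; content:"/open/"; http_uri; sid:1000005; rev:1;)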

Rules written like "...|0d 0a|Connection: Keep-Alive" will fail to match on traffic between client and proxy where "Proxy-Connection: Keep-Alive" will be used instead.

HTTP Library Variation

Some malware uses its own coded HTTP routines which may make for unique headers; others use the Windows API. If the Windows libraries are used, then some of the headers will vary based on OS and IE versions. It is therefore important not to assume that the headers are constant based on a single observed infection. In particular, older clients will use "HTTP/1.0" and "Pragma: no-cache" whereas newer clients will use "HTTP/1.1" and "Cache-Control: no-cache".