How to avoid ClamAV matches on bundled Snort rules

22 August, 2010

Recently a psad user notified me via email that the psad-2.1.7.tar.gz tarball
was flagged by ClamAV as being infected with Exploit.HTML.MHTRedir-8.
After checking a few things, it turns out that ClamAV is triggering on a Snort rule in the
Emerging Threats rule set which is bundled in both
psad and fwsnort. The following analysis shows
exactly what ClamAV is detecting and why, and also provides some guidance for how to avoid
this for any software projects that distribute Snort rules. Similar logic would apply to
other software engineering efforts - including commercial intrusion detection systems - that
are (by their nature) looking for malicious artifacts on the filesystem or within network
traffic.

----------- SCAN SUMMARY -----------
Known viruses: 816934
Engine version: 0.96.1
Scanned directories: 41
Scanned files: 405
Infected files: 1
Data scanned: 12.55 MB
Data read: 6.41 MB (ratio 1.96:1)
Time: 8.446 sec (0 m 8 s)

Intuitively, this makes sense. That is, given that ClamAV is out to identify nasty things
within files, and given that Snort rules are designed to identify nasty things as they
communicate over the network, it stands to reason that there might be some overlap. This
overlap is not an indication of something wrong in either the Snort rules or in ClamAV.

Now, let's find out specifically which Snort rule within the emerging-all.rules
file is triggering the ClamAV match. We first take a look at the Exploit.HTML.MHTRedir-8
signature:
$ cp /var/lib/clamav/main.cvd .
$ sigtool --unpack main.cvd
$ grep Exploit.HTML.MHTRedir-8 main.ndb
Exploit.HTML.MHTRedir-8:3:*:6d68746d6c3a66696c653a2f2f{1-20}2168

The last line above is the entire ClamAV signature, and the pattern 6d68746d6c3a66696c653a2f2f
is the key. The ":3:" field is the target type, which here is "normalized HTML", so ClamAV
matches the pattern 6d68746d6c3a66696c653a2f2f against the "normalized HTML" representation
of each processed file. We can decode the pattern as follows:
$ echo 6d68746d6c3a66696c653a2f2f | xxd -r -p
mhtml:file://

So, within the emerging-all.rules file, we are interested in any Snort rule that
contains the string mhtml:file://. There is also the "{1-20}2168" criterion, which
requires the hex bytes 2168 (the ASCII characters "!h") to appear after a gap of 1 to 20
bytes following the first pattern.
$ grep mhtml psad-2.1.7/deps/snort_rules/emerging-all.rules
alert tcp $HOME_NET any -> $EXTERNAL_NET $HTTP_PORTS (msg:"ET MALWARE Bundleware Spyware CHM Download"; flow: to_server,established; content:"Referer\: ms-its\:mhtml\:file\://C\:counter.mht!http\://"; nocase; content:"/counter/HELP3.CHM\:\:/help.htm"; nocase; classtype: trojan-activity; sid: 2001452; rev:4;)

Sure enough, sid:2001452 "ET MALWARE Bundleware Spyware CHM Download" has the
keyword content:"Referer\: ms-its\:mhtml\:file\://C\:counter.mht!http\://". Even
though the colons are escaped with backslashes, the normalized HTML processing in ClamAV
takes this into account, so the signature pattern matches anyway.
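
To make the match concrete, here is a minimal Python sketch of the logic described above. It is not how ClamAV is implemented: the helper names (parse_ndb, ndb_body_to_regex, crude_normalize) are hypothetical, only plain hex runs and {n-m} gaps are handled, and stripping backslashes and lowercasing is just a rough stand-in for ClamAV's real HTML normalization.

```python
import re

SIGNATURE = "Exploit.HTML.MHTRedir-8:3:*:6d68746d6c3a66696c653a2f2f{1-20}2168"
# The content string from sid:2001452, exactly as it appears in the rule file.
CONTENT = rb"Referer\: ms-its\:mhtml\:file\://C\:counter.mht!http\://"


def parse_ndb(line):
    """Split a .ndb line into name, target type, offset, and hex body."""
    name, target, offset, body = line.strip().split(":", 3)
    return name, int(target), offset, body


def ndb_body_to_regex(body):
    """Translate hex runs and {n-m} gaps into a bytes regex.

    Other ClamAV wildcards are deliberately not supported in this sketch.
    """
    parts = []
    for token in re.findall(r"\{\d+-\d+\}|[0-9a-fA-F]+", body):
        if token.startswith("{"):
            low, high = token[1:-1].split("-")
            parts.append(b".{%d,%d}" % (int(low), int(high)))
        else:
            parts.append(re.escape(bytes.fromhex(token)))
    return re.compile(b"".join(parts), re.DOTALL)


def crude_normalize(data):
    """ASSUMPTION: approximate normalization by dropping backslashes and lowercasing."""
    return data.replace(b"\\", b"").lower()


name, target, offset, body = parse_ndb(SIGNATURE)
pattern = ndb_body_to_regex(body)
print(name, "target type", target)
print("raw rule text matches:       ", bool(pattern.search(CONTENT)))
print("normalized rule text matches:", bool(pattern.search(crude_normalize(CONTENT))))
```

Under these assumptions the raw rule text does not match, because the backslashes split up "mhtml:file://", while the normalized text does: "C:counter.mht" fills the 1 to 20 byte gap before "!h".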

So, how can we keep the original Snort rule, but change it so that ClamAV no longer
flags it?

Fortunately, ClamAV does not interpret the Snort convention of writing bytes in hex
notation between "|" characters within content fields, so we simply need to change one of
the characters to hex notation. Snort will still offer the same detection if network traffic
matches the rule, and ClamAV won't flag it. So, let's just change the "m" in "mhtml\:file\://"
to its hex equivalent, like so: "|6d|html\:file\://". Once we make this change and save the
psad-2.1.7/deps/snort_rules/emerging-all.rules file, we rerun clamscan:
$ clamscan -r -i psad-2.1.7
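
For projects that bundle many rules, the same edit could be scripted instead of made by hand. The following Python sketch shows one way to do it; hex_encode_first_byte is a hypothetical helper, and it performs a plain string replacement without checking that the string actually sits inside a content field, so the result should still be reviewed.

```python
def hex_encode_first_byte(rule_text, needle):
    """Rewrite the first character of every `needle` occurrence in Snort |hex| notation."""
    encoded = "|%02x|" % ord(needle[0]) + needle[1:]
    return rule_text.replace(needle, encoded)


# The content keyword from sid:2001452, as it appears in emerging-all.rules.
original = r'content:"Referer\: ms-its\:mhtml\:file\://C\:counter.mht!http\://"; nocase;'
patched = hex_encode_first_byte(original, r"mhtml\:file\://")
print(patched)
```

The printed keyword contains "|6d|html\:file\://", which is the same change described above.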

In conclusion, if you are involved in any software engineering effort that distributes
or makes use of Snort rules, it is probably a good idea to run distribution packages
through ClamAV and see if there are any matches. If so, it may be possible to take
advantage of Snort rule syntax options to still achieve the same signature coverage while
not having ClamAV flag anything.
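
As a rough sketch of how such a check might be wired into a release process, the Python snippet below runs the same clamscan command used earlier and interprets its exit status. It assumes clamscan is installed and on the PATH; scan_tree and interpret_exit are hypothetical names. Exit status 0 (no virus found) and 1 (virus found) follow the clamscan man page, and anything else is treated here as an error.

```python
import subprocess


def interpret_exit(code):
    """Map a clamscan exit status to a short label."""
    if code == 0:
        return "clean"
    if code == 1:
        return "match found"
    return "scan error"


def scan_tree(path):
    """Run `clamscan -r -i` on a directory tree and return a status label.

    ASSUMPTION: clamscan is installed; its own output goes straight to the terminal.
    """
    result = subprocess.run(["clamscan", "-r", "-i", path])
    return interpret_exit(result.returncode)
```

Calling scan_tree("psad-2.1.7") before building the tarball would then flag a release whose bundled rules still trigger a match.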