Monday, July 02, 2012

toolsmith: Collective Intelligence Framework

Prerequisites

Linux for server, stable on Debian Lenny and Squeeze, and
Ubuntu v10

Perl for client (stable), Python client currently
unstable

Introduction

As is often the case when plumbing the depths of my feed
reader or the Dragon News Bytes mailing list, I found toolsmith gold. Kyle
Maxwell’s Introduction to the Collective Intelligence Framework (CIF) lit up on my radar screen. CIF parses data from sources such as ZeuS and
SpyEye Tracker, Malware Domains, Spamhaus, Shadowserver, Dragon Research Group,
and others. The disparate data is then normalized into a repository that allows
chronological threat intelligence gathering. Kyle’s article is an excellent starting point
that you should definitely read, but I wanted to hear more from Wes Young, the
CIF developer, who kindly filled me in with some background and a look forward.
Wes is a Principal Security Engineer for REN-ISAC, whose mission is to aid and
promote cyber security operational protection and response within the higher
education and research (R&E) communities. As such, the tenor of his feedback
makes all the more sense.

“The CIF project
has been an interesting experiment for us. When we first decided to transition
the core components from incubation in a private trust-based community, to a
more traditional open-source community model, it was merely to better support
our existing community. We figured, if things were open-source, our community
would have an easier time replicating our tools and processes to fit their own
needs internally. If others outside the educational space benefited from that
(private sector, government sector, etc), then that'd be the icing on the cake.

Years later, we
discovered that ratio has nearly inverted itself. Now the CIF community has
become lopsided, with the majority of users being from the international public
and private spaces. Furthermore, the contribution in terms of testing,
bug-fixes, documentation contributions and [more importantly] the word-of-mouth
endorsements has driven CIF to become its own living organism. The demonstrated
value it has created for threat analysts, who have traditionally had to
beg-borrow-and-steal their own intelligence, has become immeasurable in
relation to the minor investment of adoption.

As this project's
momentum has given it a life all its own, future roadmaps
will build off its current success. The ultimate goal of the CIF project is to
create a uniform presence of your intelligence, somewhere you control. It'll
read your blogs, your sandboxes, and yes, even your email (if you allow it), correlating
and digging out threat information that's been traditionally locked in plain,
wiki-fied or semi-formatted text. It has enabled organizations to defend their
networks with up to the second intelligence from traditional data-sources as
well as their peers. While traditional SEMs enable analysts to search their
data, CIF enables your data to adapt your network, seamlessly and on the fly.
It's your own personal Skynet. :)”

Readers may enjoy Wes’ recent interview
on the genesis of CIF, available as a FIRST 2012 podcast.

You may also wish to take a close look at Martin Holste’s
integration of CIF with his Enterprise Log Search and Archive (ELSA) solution,
a centralized syslog framework. Martin has utilized the Sphinx full-text search
engine to create accelerated query functionality and a full web front end.

Installing CIF

The documentation found on the CIF wiki
should be considered “must read” from top to bottom before proceeding. I won’t
repeat what’s already been said (Kyle’s article has some installation pointers
too), but I went through the process a couple of times to get it right so I’ll
share my experience. There are a number of elements to consider if implementing
CIF in a production capacity. While I installed a test instance on
insignificant hardware running Debian Squeeze, if you have a 64-bit system with
8GB of RAM or more and a minimum of four cores with drive space to grow into,
definitely use it for CIF. If you can also install a fresh OS, pay special
attention to your disk layout
while configuring partition mapping during the Logical Volume Manager (LVM)
setup. Also follow the PostgreSQL database configuration steps closely if working from a fresh install. You’ll be changing ident sameuser to trust in pg_hba.conf for socket connections. On weak little
systems such as my test server, Kyle’s suggestion to update work_mem to 512MB and checkpoint_segments to 32 in postgresql.conf is a good
one. The BIND setup
is quite straightforward, but again per Kyle’s feedback, make sure your
forwarder IP addresses in /etc/resolv.conf match those you configure in /etc/bind/named.conf.options.
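
Checking that match by eye is easy to get wrong, so here is a minimal shell sketch of the comparison. The helper names (resolv_nameservers, bind_forwarders, forwarders_match) are hypothetical and not part of CIF or BIND, and the parsing assumes IPv4 addresses and a conventional forwarders { ... }; block.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with CIF or BIND); IPv4 only.

# Print the nameserver addresses from a resolv.conf-style file, sorted.
resolv_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1" | sort -u
}

# Print the addresses listed in the forwarders { ... }; block, sorted.
# Lines that start with // or # are treated as comments and skipped.
bind_forwarders() {
    awk '
        /^[ \t]*(\/\/|#)/ { next }
        /forwarders[ \t]*\{/ { inblock = 1 }
        inblock {
            line = $0
            while (match(line, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)) {
                print substr(line, RSTART, RLENGTH)
                line = substr(line, RSTART + RLENGTH)
            }
            if ($0 ~ /\}/) inblock = 0
        }
    ' "$1" | sort -u
}

# Succeed only when both files list the same set of addresses.
forwarders_match() {
    [ "$(resolv_nameservers "$1")" = "$(bind_forwarders "$2")" ]
}
```

Something like forwarders_match /etc/resolv.conf /etc/bind/named.conf.options && echo "forwarders match" would then confirm the two files agree before you move on.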

From there the install steps on the wiki can be followed verbatim.
During the Load Data phase of configuration you may run into an XML parsing
issue. After executing time
/opt/cif/bin/cif_crontool -f -d && /opt/cif/bin/cif_crontool -d -p
daily && /opt/cif/bin/cif_crontool -d -p hourly you may receive an error. The
cif_crontool script is similar to cron, as I hope you’ve sagely intuited for
yourself, where it calls cif_feedparser to traverse and load CIF configuration
files, then instructs cif_feedparser based on the configs. The error, :170937: parser error : Sequence ']]>' not
allowed in content, crops up
when cif_crontool attempts to parse the cleanmx feed definition in /opt/cif/etc/misc.cfg. You can resolve this by simply commenting
out that definition. Wes is reaching out to clean-mx.de to get this fixed;
for now there is no option other than commenting out the feed.
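
If you would rather script the change than edit by hand, the following is a minimal sketch. It assumes, and you should confirm against your own copy, that misc.cfg is an INI-style file with one [section] header per feed, that the header in question reads exactly [cleanmx], and that lines starting with # are treated as comments. The function name comment_out_section is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper; assumes an INI-style file with [section] headers
# and "#" as the comment character. Verify both before relying on it.

# Usage: comment_out_section NAME FILE
# Writes FILE to stdout with every non-blank line of [NAME] prefixed by "#".
comment_out_section() {
    awk -v name="$1" '
        /^\[/ { insec = ($0 == ("[" name "]")) }
        insec && $0 != "" { print "#" $0; next }
        { print }
    ' "$2"
}
```

The function writes to stdout rather than editing in place, so you can run comment_out_section cleanmx /opt/cif/etc/misc.cfg > /tmp/misc.cfg.new, diff the result against the original, and copy it into place only once it looks right.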

To install a client you need only follow the Client Setup steps,
and in your ~/.cif file apply the apikey that you created
during the server install as described in CIF
Config. Don’t forget to
configure .cif to generate feeds, as also described in
this section.

A final installation note: if you don’t feel like spending the time to
do your own build you have the option to utilize a preconfigured Amazon EC2
instance
(limited disk space, not production-ready).

Using CIF

You should set the following
up as a cron job, per the Server Install. For manual reference, if you wish
to update your data at arbitrary intervals, run these steps as the cif user (sudo su - cif):

1) PATH=/bin:/usr/local/bin:/opt/cif/bin

2) Pull feed data:

a. cif_crontool -p daily -T low

b. cif_crontool -p hourly -T low

3) Crunch the data: cif_analytic -d -t 16 -m 2500 (you can up -t and -m on beefier systems, but it may grind your system down)

4) Update the feeds: cif_feeds
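
The steps above can be strung together in one small wrapper for those manual runs. This is only a sketch: the function name cif_manual_update and the CIF_DRY_RUN switch are hypothetical conveniences, while the commands and flags are exactly those listed above.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual update steps; run it as the cif user.
# Set CIF_DRY_RUN=1 to print the commands instead of executing them.

cif_manual_update() (
    # The body runs in a subshell so the PATH change stays local to it.
    PATH=/bin:/usr/local/bin:/opt/cif/bin
    export PATH
    run=${CIF_DRY_RUN:+echo}

    # Each step runs only if the previous one succeeded.
    $run cif_crontool -p daily -T low &&
    $run cif_crontool -p hourly -T low &&
    $run cif_analytic -d -t 16 -m 2500 &&
    $run cif_feeds
)
```

Chaining with && stops the run at the first failure, so you don't crunch or publish feeds built from a partial pull.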

You can run cif from the
command line; cif -h will give
you all the options, and cif -q <query string>, where the query string is an IP, URL, domain, etc., will get you
started. Pay special attention to the -p
parameter as it helps you define output formats such as HTML or Snort.

I immediately installed the
Firefox CIF toolbar (you’ll find details on the wiki under Client | Toolbars | Firefox),
as it makes queries via the browser, leveraging the API, a no-brainer. See WebAPI on the wiki under API. Screen shots included hereafter
will be of CIF usage via this interface (easier than manually populating query
URLs).

There are a number of client
examples
available on the wiki, but I’m always one to throw real-world scenarios at the
tool du jour. As ZeuS developers continue to “innovate” and produce modules
such as the recently discovered two-factor authentication bypass, ZeuS
continues to see increased use by cybercriminals. As may likely be the common
scenario, an end user on the network you try desperately to protect has called
you to say that they tried to update Firefox via a link “someone sent them” but
it “didn’t look right” and that they were worried “something was wrong.” You
run netstat -ano on their system and see a suspicious connection, specifically to 193.106.31.68. Ruh-roh, Rastro, that
IP lives in Ukraine. Go figure. What does Master Cifu say? Figure 1 fills
us in.

FIGURE 1: CIF says “here be dragons”

I love mazilla-update.com, bad guy squatter
genius. You need only web search ASN 49335 to learn that NCONNECT-AS Navitel
Rusconnect Ltd is not a good neighborhood for your end user to be playing in.
Better yet, run cif -q AS49335 at the command line or drop AS49335 in the Firefox
search box.

Figure 2 is a case in point: Navitel
Rusconnect Ltd is definitely the wrong side of the tracks.

FIGURE 2: Can I catch a bus out of here?

ZeuS configs and binaries,
SpyEye, stolen credit card gateway, oh my.

This is a good time for a
quick overview of taxonomy. Per the wiki, severity equates to seriousness,
confidence denotes faith in the observation, and impact is a profile for
badness (ZeuS, botnet, etc.).

Our aforementioned user does
show mazilla-update.com in their browser history, so let’s query it via CIF.

Figure 3 further validates
suspicions.

FIGURE 3: Mazilla <> Mozilla

You quickly discern that your
end user downloaded bt.exe from mazilla-update.com. You take a quick md5sum of
the binary and drop the hash in the CIF search box. 756447e177fc3cc39912797b7ecb2f92
bears instant fruit as seen in Figure 4.

FIGURE 4: CIF
hash search

Yep, looks like your end user
might have gotten himself some ZeuS action.
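
The lookups in this walkthrough can also be batched from the command line rather than pasted into the toolbar one at a time. A minimal sketch follows; cif_lookup, file_md5, and the CIF_DRY_RUN switch are hypothetical helpers, and only cif -q and md5sum come from the steps above.

```shell
#!/bin/sh
# Hypothetical helpers for batching the queries from the walkthrough.
# Set CIF_DRY_RUN=1 to print the cif commands instead of running them.

# Run one "cif -q" query per indicator given on the command line.
cif_lookup() {
    run=${CIF_DRY_RUN:+echo}
    for indicator in "$@"; do
        $run cif -q "$indicator"
    done
}

# Print only the MD5 hex digest of a file, ready to hand to cif_lookup.
file_md5() {
    md5sum "$1" | awk '{ print $1 }'
}
```

For the scenario here, that would be cif_lookup 193.106.31.68 AS49335 mazilla-update.com "$(file_md5 bt.exe)".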

With a resource such as CIF
at your fingertips you should be able to quickly envision value added when
using a DNS sinkhole (hello 127.0.0.1) or DNS-BH from malwaredomains.com where
you serve up fake replies to any request for the likes of mazilla-update.com. Bonus! Beefy server for CIF: $2499. CIF
licensing: $0. Bad guy fail? Priceless.
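
To make the sinkhole idea concrete, here is a minimal sketch that turns a list of bad domains into BIND zone stanzas, all pointing at one shared zone file. The function name sinkhole_zones and the zone file path in the usage note are hypothetical, and the shared zone file itself (the one that answers every name with 127.0.0.1) is assumed to exist already.

```shell
#!/bin/sh
# Hypothetical helper: emit one BIND zone stanza per domain read from stdin.
# $1 is the path to a shared sinkhole zone file that you maintain yourself.

sinkhole_zones() {
    zonefile=$1
    while IFS= read -r domain; do
        # Skip blank lines and "#" comments in the input list.
        case $domain in
            ''|'#'*) continue ;;
        esac
        printf 'zone "%s" { type master; file "%s"; };\n' "$domain" "$zonefile"
    done
}
```

For example, printf 'mazilla-update.com\n' | sinkhole_zones /etc/bind/sinkhole.db prints a stanza you could include from your BIND configuration after reviewing it.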

In Conclusion

Check out the Idea List in the CIF Projects Lab; there is
some excellent work to be done including a VMware appliance, further Snort
integration, a VirusTotal analytic, and others. This project, like so many
others we’ve discussed in toolsmith, grows and prospers with your feedback and
contributions. Please consider participating by joining the CIF Google Group and
jumping in. You’ll also want to check out the DFIR Journal’s CIF discussions, including
integration with ArcSight, as well as EyeIS’s CIF incorporation with Splunk. These
are the same folks who have brought us Security Onion 1.0 for Splunk, so I’m
imagining all the possibilities for integration. Get busy with CIF, folks. It’s a
work in progress but a damned good one at that.

Ping me via email if you have questions (russ at
holisticinfosec dot org).

About Me

Russ McRee works for Microsoft's Operating Systems Group (OSG). He writes toolsmith, a monthly column in ISSA Journal. Russ has spoken at infosec events such as Defcon, Black Hat, RSA, and FIRST and has published in the likes of Information Security, Linux Magazine, (IN)SECURE, and SysAdmin. Russ is an advocate of a holistic approach to information security; his website is holisticinfosec.org.
He also serves as a volunteer handler for the SANS Internet Storm Center.