MTK is a place to share data forensic tips learned throughout the course of loud keyboard banging.
PGP Key http://bit.ly/UckgPW

Wednesday, September 10, 2014

Using Curl to Retrieve VirusTotal Malware Reports in BASH

If you are in the DFIR world, there is a good chance that you often find
yourself either submitting suspicious files to VirusTotal (VT) for scanning or
searching their database for suspicious hashes. For these tasks and other neat
features, VT offers a useful web interface. If submitting one file or searching
one hash at a time is enough for you, then the web interface should suffice for
your needs. Find the web interface at www.virustotal.com.

If you are looking for a little bit more functionality or the ability to scan a
set of suspicious hashes, you may want to look into using their public API.
VirusTotal's public API, among other things, allows you to access malware scan
reports without the need to use their web interface. Access to their API gives
one the ability to build scripts that can have direct access to the information
generated and stored by them.

To have access to the API you need to join their community and get your own API
key. The key is free and getting one is as simple as creating an account with
them. After joining their community you can locate your personal API key in
your community profile.

In this article we will go through the process of communicating with their API
from the Bourne-Again Shell (BASH) using the program curl. The chosen format to
communicate with the API is HTTP POST requests. We will discuss a few curl
commands, and once we become familiar with them, we will incorporate them into
a script to automate the process. The commands and the script are courtesy of
a tip that I got from my co-worker John Brown. He gave me permission to talk
about his tip and to publish his script.

BASH is the default terminal shell in Ubuntu. For the purposes of this article
I used a VMware Player virtual machine with Ubuntu 14.04 installed on it.

Installing the tools:

All of the tools that we will use are already included in Ubuntu by default.
You will not need to download and install any other tools. If you want to
follow along, make sure that you have your VT API key available. Also, we are
going to need suspicious hashes. Feel free to use your own hashes, or copy
these two MD5 hashes that I will use for the article:
e4736f7f320f27cec9209ea02d6ac695 and 7f16d6f96912db0289f76ab1cde3420b. One of
the hashes belongs to a fake antivirus piece of malware that I use for testing,
and the other one is a hash of a text file that contains no malicious code.
One of the hashes will return hits; the other one will not. Let's get started.

The test:

Open a Terminal window. In Ubuntu you can accomplish this by pressing
Ctrl-Alt-T or by going to the Dash Home and typing in “terminal”.

In order to communicate with the VT database to retrieve a file scan report we
are going to need two things. First, we need to know the URL to send the POST
request to. That URL is https://www.virustotal.com/vtapi/v2/file/report.
Second, we need to feed the curl command two parameters: your API key and a
resource. The API key is the key that was given to you upon joining the VT
community, and the resource is the MD5 hash of the file in question.

Curl is the command that we will use to send the POST request to the specific
VT URL. The -s option tells curl to be silent and not print the progress bar.
The -X option tells curl which request method to use, which in this instance is
POST. --form apikey= is followed by your API key, and --form resource= is
followed by the MD5 hash of the aforementioned fake antivirus file.
These are my results.
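
Based on the options described above, the lookup can be sketched roughly as follows. The function name vt_report and the VT_API_KEY variable are illustrative assumptions and do not come from the original command; replace the placeholder with your own key.

```bash
#!/usr/bin/env bash
# Sketch of the lookup described above. vt_report and VT_API_KEY are
# illustrative names, not taken from the original post.
VT_URL="https://www.virustotal.com/vtapi/v2/file/report"
VT_API_KEY="${VT_API_KEY:-YOUR_API_KEY_HERE}"   # placeholder: use your own key

# Send one POST request and print the raw report for the given hash.
#   -s       silent mode, no progress bar
#   -X POST  use the POST request method
#   --form   send each name=value pair as a form field
vt_report() {
    curl -s -X POST "$VT_URL" \
        --form "apikey=$VT_API_KEY" \
        --form "resource=$1"
}

# Usage (needs network access and a valid key):
#   vt_report e4736f7f320f27cec9209ea02d6ac695
```

Wrapping the call in a function keeps the API key and URL in one place, so only the hash changes between lookups.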

It looks like our fake antivirus file’s hash was located in the database and
the file had been previously scanned by VT. Lots of data with positive hits was
returned. The scan report contains no line breaks, so it was sent to our
terminal window in a format that is difficult to read. Let's see if we can fix
that. Notice that the results from each individual antivirus solution end with
a curly brace followed by a comma, “},”. Armed with this information, let’s add
a newline character after each one of those combinations to separate the output
so that we can see it better. Run the same command as above, but this time pipe
it to sed 's|\},|\}\n|g'.
The sed command changes our standard output by replacing every }, with }\n,
which is the curly brace followed by a newline. The \ before each curly brace
escapes it so that sed treats the brace as a literal character, and the \n in
the replacement is how sed writes a newline. These are my results.
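
To try the substitution without contacting VT, here is a minimal sketch that runs it on a made-up, shortened report. The sample JSON and the function name split_report are assumptions for illustration only, and the \n in the replacement relies on GNU sed, which is what Ubuntu ships.

```bash
#!/usr/bin/env bash
# Break a single-line report after every "}," so each engine's result
# lands on its own line. Plain braces are already literal in sed's
# default syntax, so they are left unescaped here.
split_report() {
    sed 's|},|}\n|g'
}

# Hypothetical, heavily shortened stand-in for a real VT report.
sample='{"scans": {"EngineA": {"detected": true}, "EngineB": {"detected": false}}, "positives": 23}'

printf '%s\n' "$sample" | split_report
```

The sample contains two "}," combinations, so the single input line comes out as three lines.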

We can now start to see information that we can work with. From here you can
redirect this data to a file or continue using grep, sed, awk and/or any other
command line magic that you can throw at this output to continue editing it to
your needs. Personally, I am interested in the bottom area of the screen, the
part that says "positives": 23. This tells me that this hash was recognized by
23 different antivirus engines in the VT database. This is the data that I may
need to pay attention to during an investigation. That sed command was just an
example of how to manipulate the output.
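
As one example of that kind of follow-up filtering, the sketch below pulls out only the "positives" entry with grep. The sample report and the function name extract_positives are made up for illustration, and the pattern assumes a single space after the colon, as in the output quoted above.

```bash
#!/usr/bin/env bash
# Split the report into lines, then keep only the "positives" entry.
# grep -o prints just the matching part of each line.
extract_positives() {
    sed 's|},|}\n|g' | grep -o '"positives": [0-9]*'
}

# Hypothetical, shortened stand-in for a real VT report.
sample='{"scans": {"EngineA": {"detected": true}}, "positives": 23, "md5": "e4736f7f320f27cec9209ea02d6ac695"}'

printf '%s\n' "$sample" | extract_positives

# To keep a copy for later, redirect the output to a file instead:
#   printf '%s\n' "$sample" | extract_positives > positives.txt
```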

The next command incorporates a combination of awk and sed pipes to filter the
output down to a final set of data that we felt comfortable working with.

The first awk command uses the "positives" string as a field delimiter and
prints the string “VT Hits” followed by the second field, which begins with the
23 positive hits. The second awk command uses a space as a delimiter and prints
the first, second, third, sixth and seventh columns, which keeps the hit count
along with the string md5 and the md5 hash of the file. The last sed command
simply removes any quotes and curly braces from the resulting output. These are
my results.

VTHits23,md5:e4736f7f320f27cec9209ea02d6ac695

The end result is that we get a string of data that tells us the number of
antivirus solutions that recognize the file as being malicious plus the md5
hash of the file, so that we know which file is the suspicious file.
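
The pipeline described above can be approximated as follows. This is a reconstruction from the description, not the exact command that was used: it assumes the delimiter includes the trailing quote and colon ("positives":) and that the report ends with the positives, sha256 and md5 fields in that order. The sample report, its sha256 placeholder and the function name summarize_report are made up for illustration.

```bash
#!/usr/bin/env bash
# Reduce a raw report to "VTHits<count>,md5:<hash>".
summarize_report() {
    # 1) Split on "positives": and print "VT Hits" plus everything after it.
    # 2) Split on whitespace and keep columns 1, 2, 3, 6 and 7.
    # 3) Strip quotes and curly braces.
    awk -F'"positives":' '{print "VT Hits" $2}' \
        | awk -F' ' '{print $1 $2 $3 $6 $7}' \
        | sed 's|["{}]||g'
}

# Hypothetical tail of a report; the sha256 value is a placeholder.
sample='{"scans": {}, "positives": 23, "sha256": "0000", "md5": "e4736f7f320f27cec9209ea02d6ac695"}'

printf '%s\n' "$sample" | summarize_report
```

If the fields arrive in a different order, the column numbers in the second awk command need to change accordingly.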

If by now you are thinking that was way too long of a command to remember, or
even to type again, then you are like me. For this reason, John has made a
script available that automates this exact process and is extremely easy to
use. Find the script here.

After making the script executable, run the script and give it a hash value as
an argument. It will use the same command as above and will search the VT
database for the hash that you fed it as an argument. Run the script like this.

carlos@vm:~$ ./grabVThash.sh e4736f7f320f27cec9209ea02d6ac695

These are my results.

Same results as above. The script automates the process of sending hash values
to VT and sends the results to the screen. It even has the ability to take a
file containing multiple hashes as its input. It will send 4 hashes per minute
to VT, as this is a limit that VT sets for public access to its API. You will
need to add your API key to the script.
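
To illustrate how a batch mode with that rate limit might work, here is a minimal sketch. It is not John's grabVThash.sh; the names vt_lookup, vt_batch, VT_API_KEY and VT_DELAY are assumptions made for this example, and the 15 second pause is simply 60 seconds divided by 4 requests.

```bash
#!/usr/bin/env bash
# Sketch only: not the published grabVThash.sh. All names are illustrative.
VT_URL="https://www.virustotal.com/vtapi/v2/file/report"
VT_API_KEY="${VT_API_KEY:-YOUR_API_KEY_HERE}"   # placeholder: use your own key
VT_DELAY="${VT_DELAY:-15}"   # 60 s / 4 requests = 15 s between requests

# Query VT for one hash and print the raw report.
vt_lookup() {
    curl -s -X POST "$VT_URL" \
        --form "apikey=$VT_API_KEY" \
        --form "resource=$1"
}

# Read one hash per line from the file given as $1, skipping blank lines,
# and pause between requests to stay under the public API limit.
vt_batch() {
    local hash
    while IFS= read -r hash || [ -n "$hash" ]; do
        [ -z "$hash" ] && continue
        vt_lookup "$hash" </dev/null
        echo
        sleep "$VT_DELAY"
    done < "$1"
}

# Usage (needs network access and a valid key):
#   vt_batch hashes.txt
```

Sleeping after every request is the simplest way to respect the limit; it trades a little speed for not having to track request timestamps.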

Conclusion:

VT gives us access to its database by allowing us to build scripts that can have
direct access to the information generated and stored by them. The script that we published is just one of
many ways that we can add ease of access to the data stored by VT.

If this procedure helped your investigation, we would like to hear from you.
You can leave a comment or reach me on Twitter: @carlos_cajigas