1 Introduction
This paper outlines the relevant steps to build up a customizable automated malware analysis station by using only freely available components, with the exception of the target OS (Windows XP) itself. A special focus lies on handling a huge amount of malware samples and on the actual implementation at CERT.at. As primary goal, the reader of this paper should be able to build up her own specific installation and configuration while being free in her decision which components to use. The first part of this document covers all the theoretical, strategic and methodological aspects. The second part focuses on the practical aspects by diving into CERT.at’s automated malware analysis station, closing with an easy to follow step-by-step tutorial on how to build up CERT.at’s implementation for your own use. So feel free to skip parts.
NOTE: Do not treat any of the components of the finally built malware analysis station as trusted! Escapes can always happen, so position your malware analysis station wisely and handle it with care!
2 Motivation
Thinking about the triggers that made me build my own type of an “automated malware analysis station” (AMAS), it occurred to me that many others, especially in my business, might be in a comparable situation. So I decided to publish my best practice as a way which can be easily followed by everyone who is in the same need. My trigger for such an AMAS is a project that needs some specific analysis of a quantity of malware samples that cannot be managed manually. My solution therefore had to comply with the following requirements:
• logging of registry keys being read
• logging of files being read
• cheap
• fast
• as autonomic as possible
• keep the analysis internal
Actually, the basic concept of my solution for these requirements, which I got running after just three days of work (including research), meets many more needs. In detail, an AMAS following my concept meets the following requirements:
• cheap
• runnable on a standard desktop system
• fully customizable
• target system freely choosable
• easily evaluable
• easy to set up
• ready for use as fast as possible
• as fast as possible
• as autonomic as possible
• runtime-unpacked/-decrypted samples must not be a hurdle
• data/analysis/samples/… kept confidential
• infections must be kept isolated from the public

3 Fundamental Approach
The fundamental approach to solve a quantity problem with machine power is to analyze what you would do manually to fulfill the “exercise”, bringing the resulting activity chain to an automatable concept afterwards. That’s just the way I started, so let’s bring this manual activity chain into a big picture, assuming we do this for just one sample:
Figure 1
So the primary steps for one malware sample to be analyzed are quite easy to sum up:
1. The proband¹ has to be preconfigured to meet all our desires (e.g. monitoring).
2. The sample to be analyzed has to be transferred to the proband.
3. The monitoring tools have to be started and confirmed to work properly.
4. Once monitoring is running, the sample is executed.
5. All logs that have been obtained need to be transferred back to the researcher.
6. The monitoring logs can now be interpreted in any desired way.
¹ Proband is equivalent to guinea pig, but as I like guinea pigs I preferred calling the test machine “proband”.
Let’s add the loop for analyzing more than one malware sample into the picture:
Figure 2 (phases shown in the diagram: INIT, LOOP, EXIT)
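The INIT, LOOP and EXIT phases named in Figure 2 can be illustrated with a small driver loop. This is only a minimal sketch, not CERT.at's implementation: the function name `run_station` and the four callbacks are hypothetical placeholders for the manual steps listed above plus a cleanup step at the end of every cycle.

```python
from typing import Callable, Dict, Iterable


def run_station(
    samples: Iterable[str],
    preconfigure: Callable[[], None],
    analyze_one: Callable[[str], str],
    disinfect: Callable[[], None],
    interpret: Callable[[Dict[str, str]], None],
) -> Dict[str, str]:
    """Drive the INIT / LOOP / EXIT phases for a batch of samples.

    All callbacks are hypothetical stand-ins for the real work.
    """
    logs: Dict[str, str] = {}

    # INIT: done once, before looping starts.
    preconfigure()

    # LOOP: the core functionality, repeated for every sample.
    for sample in samples:
        try:
            # Transfer the sample, monitor, execute, fetch the logs.
            logs[sample] = analyze_one(sample)
        finally:
            # Always clean the proband before the next sample runs.
            disinfect()

    # EXIT: interpretation happens after all samples were monitored.
    interpret(logs)
    return logs
```

If instantaneous reports are needed, the `interpret` call would move to the end of each loop cycle instead.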

At a closer look we see that we had to split our working steps into three phases:
• INIT … activities that have to be done before looping starts
• LOOP … steps that are part of the core functionality of analyzing one sample
• EXIT … actions that happen after all samples ran through monitoring
As Figure 2 shows, the action of preconfiguring the proband moved up to the initial phase of our whole working process, because it does not make sense to do this over and over again. Furthermore, we have a step that pushes (saves) the monitoring logs back to the researcher. At the end of every loop cycle our proband has to be disinfected, otherwise the next samples could execute in an improper way and our logs would probably become meaningless. Therefore, a new action item got added: the disinfection of the proband. A reliable method of disinfecting our proband is shown in the next section.
At this point it should be noted that the interpretation of the monitoring logs can either happen after the whole looping, when all samples have been monitored, or as last activity of each loop cycle. The way you should do it depends on your case’s circumstances. If you have a bulk of samples that you want to have monitoring logs for, you are well advised to do the interpretation on exit. This gives you full freedom to decide which evaluation should be next, without the need to repeat monitoring over and over again every time you’re interested in something new. On the other hand, especially if you have an unpredictable quantity of samples (e.g. because they are uploaded by people using your website) or you need instantaneous reports, the right way for you is to do the interpretation as last action of each loop cycle. However, as already noted, the approach in this paper (as in CERT.at’s AMAS) focuses on studies/research and therefore statistics, which are logically done after all of the available malware samples have been monitored. So the automation of monitoring thousands of samples is our primary goal, regardless of the periodicity of samples available. But even if you need some kind of web-based business service, the related adjustments should be easy.
4 Identifying Components
Now that we have a distinct view on all the steps our AMAS will be able to perform, let’s take a close look at the components it needs to work properly. What first occurs to us is that we have two “subjects”, or in better words “locations”, where actions will take place:
• the researcher’s side and
• the proband’s side.
On the researcher’s side there are all of the steps regarding the management of the samples, the receipt of the monitoring logs and the control of the proband’s state (e.g. disinfection). On the proband’s side there are the steps of receiving, executing and handling a sample. As combining these two locations in one physical location simplifies our pursuit of building an AMAS, I would recommend using a virtual machine (guest) for the proband’s side itself and the native machine (host) for the researcher’s side, just as it’s shown in our picture. The primary characteristics of virtual machines come in handy for that.

Figure 3
Further observations of Figure 2 lead to the discovery that each of the currently declared locations has some “autonomous entity” that is able to manage, transfer, etc. things. So let’s add control processes (cpr = control process researcher, cpp = control process proband):
Figure 4
As our latest conclusion made its way into the picture, there are still components that need to be declared. The control processes need a way to communicate with each other. Therefore, to avoid a reinvention of the wheel, we’ll choose one of the easiest protocols around … FTP. As FTP (just like other protocols) always likes to have a server and at least one client in the game, let’s pick the native machine for the server and the virtual machine for the client purpose (you can also use the proband as server and the researcher as client, but as Windows XP has no FTP server included by default, the way we plan it just needs fewer things to do). Adding them to the picture will lead to the following:
Figure 5
For saving reports and providing data in terms of samples there’s a need for two collections:
• the malware samples and
• the monitoring logs archive.

Figure 6
4.1 Defining communications
Finally, the identified components need to communicate with each other:
Figure 7
As Figure 7 shows, FTP is used for both transfer goals:
• the samples and
• the monitoring logs.
5 Problems of Automation
A typical problem of automation is synchronization between the host control process and the proband control process. In other words, there’s always one process waiting for the other. And when we talk about waiting, it’s “how long” which comes to our minds. “Timing” is the magic word. Most of the problems you will encounter (or at least I did) have more or less a nexus to timing. Having said this, you can trust me when I say that you can’t define events (that trigger the next step) for every possible kind of situation. Sometimes you just have to have timeouts. And actually it’s the choice of the timeout values that has direct impact on efficiency, speed and even quality/integrity of the monitoring logs, because the smaller your timeouts are, the shorter the time per sample and the higher the risk of missing malware activity. So the timeout values are the real brains of any AMAS you’re planning to build.
Besides that, you have to realize that these two processes we want to synchronize aren’t on the same machine: remember, one runs on the native machine and the other in the virtual machine. Let’s reveal a solution for that problem. What again comes in handy here is FTP between those two machines (and processes). The existence of files on the server side is used to synchronize.
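Synchronizing on the existence of a file boils down to polling with a deadline. The following is a minimal sketch of that idea under assumptions: the helper name `wait_for_file` and the polling interval are hypothetical, and the real control processes may do this differently.

```python
import time
from pathlib import Path


def wait_for_file(path: Path, timeout_s: float, poll_s: float = 0.5) -> bool:
    """Return True once `path` exists, or False if `timeout_s` elapses first."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if path.exists():
            return True
        time.sleep(poll_s)
    # One last look, in case the file appeared during the final sleep.
    return path.exists()
```

On the host the watched path would sit below the FTP root, so a file uploaded by the proband becomes the trigger for the next step. A `False` result means the timeout fired rather than the expected event.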

Figure 8 (sequence diagram along a TIME axis. Native machine (host) with cpr, ftp-root and ftpd: copy sample, start proband, wait until ready-file exists (timeout_cpr), stop+revert proband, delete sample. Proband VM (guest) with ftp and cpp: wait until sample exists, GET sample, start monitoring, execute sample, wait for sample-exit (timeout_cpp), PUT logs, PUT ready-file.)
Looking at this diagram, and giving the first visual shock some time to disappear, we see that we have two positions where we should define timeouts (coherently named timeout_...).
The timeout of the proband’s control process (timeout_cpp) exists for letting the executed malware sample have a maximum amount of time until the monitoring stops and the researcher’s control process (cpr) on the native machine is sent its (triggering) files.
Let’s focus on the timeout of the researcher’s control process (timeout_cpr). This timeout is really some kind of safety-break. It exists only for saving our automatism from endless looping if something weird happens in the virtual machine. That could be, for example, when the executed malware sample crashes the OS of the virtual machine, shuts it down or reboots it. But actually there are a lot more things that can bring the virtual machine into an unpredictable state. In all these cases our safety-break (timeout_cpr) will save our day by acting as if the expected trigger condition had occurred.
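The role of timeout_cpp can be sketched as follows. This is an illustration only, not the code of minibis-cpp: the function name `run_sample` is hypothetical, killing the process on timeout is merely cleanup for this stand-in (in the real station the reverted VM takes care of that), and `time_after_good_exit` mirrors the preference key of the same name described later in this paper.

```python
import subprocess
import sys
import time
from typing import List, Tuple


def run_sample(
    cmd: List[str],
    timeout_cpp: float,
    time_after_good_exit: float = 0.0,
) -> Tuple[bool, float]:
    """Run `cmd` and return (exited_by_itself, seconds_monitored)."""
    started = time.monotonic()
    proc = subprocess.Popen(cmd)
    try:
        # Expected event: the sample exits on its own.
        proc.wait(timeout=timeout_cpp)
        exited = True
        # Keep monitoring a little longer in case code was injected elsewhere.
        time.sleep(time_after_good_exit)
    except subprocess.TimeoutExpired:
        # Timeout: stop waiting and carry on as if the sample had finished.
        exited = False
        proc.kill()
        proc.wait()
    return exited, time.monotonic() - started
```

Either way the function returns, so the logs and the ready-file can be uploaded afterwards. The researcher's side would guard its own wait with timeout_cpr in the same spirit.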

So, as I said before, the values of the timeouts are of heavy influence, and setting them in an unwise way could in the worst case ruin our whole automation.
6 Controlling the Proband
Since I haven’t told you anything about the magic of “controlling the proband on the whole”, or in other words, how to deal with the virtual machine(’s state) itself, we will now take a close look behind these scenes. Usually any virtual machine product has the ability to manage the virtual machines via command-line tools. In CERT.at’s implementation I used Sun’s VirtualBox, a virtual machine product that most malware defense techniques don’t care about. However, feel free to choose any other alternative of your taste (e.g. VMware, QEMU, …), but for now I’ll just focus on VirtualBox.
VirtualBox has a handful of decent command-line tools, though the mightiest one of them is “VBoxManage”. Actually VBoxManage is all we need for our AMAS, as long as we chose VirtualBox. There are so many gears and switches that can be altered via command-line tools that it can sometimes confuse you. But the good news is that the things we need are relatively elementary and should be easily figured out. Let’s take a look at what VBoxManage can do for us and how it’s told to …
VBoxManage list vms
You always start your command with a leading “VBoxManage”, getting more specific with every word that follows. So this example causes all virtual machines managed by VirtualBox to be listed. Each of the reported virtual machines has a unique identifier which is called UUID. Anytime you want to handle “your” virtual machine you do this using the UUID, so you should first identify yours.
VBoxManage startvm uuid
The command above will start up the machine uuid if it’s not running, otherwise you get an error.
VBoxManage controlvm uuid poweroff
This removes the power from the virtual machine uuid as if you pulled the power cable out of a physical machine.
VBoxManage snapshot uuid discardcurrent -state
The one above is already the last command that’s of interest to us. It trashes the actual state of the virtual machine and brings the latter back to its last saved state, which should be our clean start point. So, in other words, anything (including infections) that occurred since the last startup of the virtual machine uuid is being “forgotten”.
7 CERT.at’s Implementation
Now that you have all the basic information to build up an AMAS on your own, let’s take a look at a practical example: the AMAS of CERT.at.
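Before diving into the practical example, the VBoxManage calls from section 6 can be wrapped for use by a control process. This is a hedged sketch: the function names are hypothetical, the argument lists simply repeat the commands exactly as quoted in this paper, and later VirtualBox releases changed some option spellings (the revert subcommand in particular), so check the VBoxManage help of your version. The timeout parameters borrow the names of the preference keys described later.

```python
import subprocess
from typing import List

VBOXMANAGE = "VBoxManage"  # assumption: the tool is on PATH


def cmd_list_vms() -> List[str]:
    return [VBOXMANAGE, "list", "vms"]


def cmd_start(uuid: str) -> List[str]:
    return [VBOXMANAGE, "startvm", uuid]


def cmd_poweroff(uuid: str) -> List[str]:
    return [VBOXMANAGE, "controlvm", uuid, "poweroff"]


def cmd_revert(uuid: str) -> List[str]:
    # Spelling as quoted in this paper; newer releases differ.
    return [VBOXMANAGE, "snapshot", uuid, "discardcurrent", "-state"]


def run(cmd: List[str], timeout_s: float) -> bool:
    """Run a command; False on non-zero exit, timeout or missing binary."""
    try:
        done = subprocess.run(cmd, timeout=timeout_s, check=False)
        return done.returncode == 0
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return False


def disinfect(uuid: str, timeout_stopvm: float, timeout_revertvm: float) -> bool:
    """Power off the proband and revert it to its clean snapshot."""
    # Powering off may fail if the sample already shut the VM down,
    # so the revert is attempted regardless and decides the result.
    run(cmd_poweroff(uuid), timeout_stopvm)
    return run(cmd_revert(uuid), timeout_revertvm)
```

Keeping command construction separate from execution makes it easy to log or dry-run what would be sent to VirtualBox.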

First of all, CERT.at’s AMAS has the name Minibis, which I chose because of its lightweight characteristics in comparison to “Anubis”, which is one of the big automatic binary analyzers.
7.1 AMAS Setup
7.1.1 Hardware
• Dell OPTIPLEX 745
• Intel Core 2 Duo 6400
• 2 Gigs of RAM
Notes: What we have here is an average desktop machine.
7.1.2 Researcher’s Software
• Xubuntu 8.04 (hardy)
• SUN’s VirtualBox
• proftpd
• zip
• minibis-cpr
Notes: Of course there is a lot more software around that comes with Xubuntu, but I just focused on the parts that are directly needed for the AMAS. What is minibis-cpr? That’s the researcher’s process that manages the VM and all the communications with the proband’s process. It’s a binary executable which is written in Purebasic, my favorite programming language when I want to have results in a minimum of time. Feel free to use your own favorite language.
7.1.3 Proband’s Software
• Microsoft Windows XP SP3
• ProcessMonitor (from Sysinternals)
• minibis-cpp
Notes: As the proband’s control process just has to receive a sample and start it, the VM’s startup state (which is gone back to by reverting) has the proband’s process minibis-cpp already running, asking for a sample over FTP. So just before the proband VM is started, all files it wants to retrieve have to be in their place. minibis-cpp also manages the contact with ProcessMonitor, catches a screenshot on exit and does the communication regarding FTP using Windows’ own FTP client. ProcessMonitor from Sysinternals was the best solution for the information we wanted to gather from our samples. It makes registry, file and most relevant API activities transparent, which can be saved as a CSV file. I know that our (CERT.at’s) AMAS could do much more than it actually does, but if you followed this paper closely you know that the abilities of the proband’s process could be enhanced at any time while keeping efforts low.

7.3.1 Researcher’s Control Process (minibis-cpr)
7.3.1.1 Description
minibis-cpr is a Linux binary. It iterates through all samples, transferring them one by one to the proband and waiting after each transfer to receive the results from the proband. It also handles starting, stopping and reverting the VM.
minibis-cpr is configured by a file called “minibis.pref”, which is an INI file and is expected to be located in the home directory of the current user. If it does not exist, minibis-cpr creates one using default values. As minibis-cpr can’t predict the circumstances of its use, these default values (especially the ones regarding paths and the VMid, i.e. the uuid, remember?) won’t fit perfectly. Still, with these default values in place you should have an easier job doing the right configuration afterwards. As “minibis.pref” is also transferred to the proband itself, it carries the configuration of all components of our AMAS. Let’s take a look at the parameters in “minibis.pref” …
Group [status]
• last … the last sample that has been analyzed (position to start after on rerun)
• quit … switch to halt minibis-cpr in a controlled way
Group [cpr]
• ftp_path … ftp-user’s home
• vmid … this is the uuid of the proband virtual machine
• find … the parameters of the find-command that lists all samples
• dlls … also use dll-based samples?
• zip … which file-extensions of the result-files shall be archived
• timeout_result … emergency break for the result-files latency in seconds
• timeout_stopvm … emergency break for the vm-stopping latency in seconds
• timeout_revertvm … emergency break for the vm-reverting latency in seconds
Group [cpp]
• timeout_sample … emergency break for the sample latency in seconds
• time_after_good_exit … extra-seconds after sample has exited in case of injections
Important: As minibis-cpr writes files in the ftp-user’s home, the user that runs it needs to have the appropriate permissions there!
7.3.1.2 Download
If you are interested in using our implementation of an AMAS, feel free to download minibis-cpr from our server: http://www.cert.at/static/downloads/minibis/minibis-cpr
7.3.1.3 Source Code (minibis-cpr.pb)
Global md5.s = ""
Global sample.s = ""
Global ftp_path.s = ""
Global vmid.s = ""
Global find.s = ""
Global timeout_result.s = ""
Global dlls.l = 0
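To make the layout of “minibis.pref” from the Description more tangible, here is a sketch of such an INI file together with a small reader. The group and key names are the ones listed in this paper, but every value below is an invented placeholder, and the exact value syntax minibis-cpr expects (for instance how several extensions are written for zip) is not specified here. The reader `load_pref` is hypothetical as well.

```python
import configparser

# All values are invented placeholders, not defaults of minibis-cpr.
EXAMPLE_PREF = """
[status]
last =
quit = 0

[cpr]
ftp_path = /home/minibis
vmid = 00000000-0000-0000-0000-000000000000
find = /samples -type f
dlls = 0
zip = csv
timeout_result = 300
timeout_stopvm = 60
timeout_revertvm = 60

[cpp]
timeout_sample = 120
time_after_good_exit = 10
"""


def load_pref(text: str) -> dict:
    """Parse the preference text into a flat dictionary with typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "last": parser.get("status", "last"),
        "quit": parser.getboolean("status", "quit"),
        "ftp_path": parser.get("cpr", "ftp_path"),
        "vmid": parser.get("cpr", "vmid"),
        "find": parser.get("cpr", "find"),
        "timeout_result": parser.getint("cpr", "timeout_result"),
        "timeout_stopvm": parser.getint("cpr", "timeout_stopvm"),
        "timeout_revertvm": parser.getint("cpr", "timeout_revertvm"),
        "timeout_sample": parser.getint("cpp", "timeout_sample"),
        "time_after_good_exit": parser.getint("cpp", "time_after_good_exit"),
    }
```

Reading the timeouts as integers up front surfaces a mistyped value at startup rather than in the middle of a long run.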

7.4 Setting it up Step by Step
Ok, for those being a little bit shy in finding their way through installations and configurations, here comes a chronological step-by-step walkthrough to build up CERT.at’s AMAS. I admit that not all of the following steps are carved into stone regarding their chronological order but … hey, are you shy or not? So, let’s dive into this …
1. Select the physical machine that shall become the hull of your AMAS.
2. Install the latest version of Xubuntu on it.
3. Install SUN’s VirtualBox (via “apt-get install virtualbox”).
4. Install proftpd (via “apt-get install proftpd”).
5. Install zip (via “apt-get install zip”).
6. Create a user “minibis” (password “minibis”).
7. Give your own user, with which you will start “minibis-cpr”, full permissions to the home of “minibis” and verify that you can write to “/home/minibis”.
8. Download “minibis-cpr” to your own user’s home.
9. Execute “minibis-cpr” once in the console to get the example preferences-file and cancel it by pressing CTRL+C.
10. Create a new virtual machine (VM) in VirtualBox using Windows XP as operating system. All default settings for the machine and the OS are fine for us. Decline Autoupdate features when you get asked.
11. Start up the VM if it’s not already running.
12. Check if you can connect to the host ftp-daemon by using Windows’ ftp-client. If any problems occur, solve them.
13. Download “http://www.cert.at/static/minibis/minibis-cpp.exe“ and save it to the desktop.
14. Download “http://download.sysinternals.com/Files/ProcessMonitor.zip” and unzip it to the desktop.
15. Start procmon once and accept the EULA. Close procmon after that.
16. Disconnect from the Internet/Intranet, e.g. by unplugging the network cable!
17. Disconnect the CD-ROM via the menu bar (this is necessary to prevent unwanted popups).
18. Execute “minibis-cpp.exe” in the VM and answer the firewall question to NOT BLOCK this application.
19. Create a VM snapshot of this state.
20. Close the VM using the option to revert to the last taken snapshot.
21. Figure out the uuid of your just built VM using “VBoxManage list vms” in the console (Note: Snapshots also have a uuid, so select the one uuid that belongs to the virtual machine itself!).
22. Adjust the preference-file (“minibis.pref”), which you find under your user’s home, by overwriting the value of the key vmid with the recently found uuid.
23. Bring your samples into Xubuntu’s filesystem (e.g. by mounting a CD-ROM). Now we’re ready to go!
24. Start up “minibis-cpr” once again in the console.
25. Lay back and enjoy watching the show.
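Some of the host-side steps above can be double-checked with a small preflight script before the first run. This is an optional sketch, not part of Minibis: the function name `preflight` and the tool list are assumptions derived from the walkthrough (VirtualBox, zip, the find command, write access to the ftp-user's home).

```python
import os
import shutil
from pathlib import Path
from typing import Callable, List, Optional, Sequence

# Assumed command names; adjust to what your installation provides.
REQUIRED_TOOLS = ("VBoxManage", "zip", "find")


def preflight(
    ftp_home: Path,
    tools: Sequence[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of human-readable problems; empty means ready."""
    problems: List[str] = []
    for tool in tools:
        if which(tool) is None:
            problems.append(f"missing tool: {tool}")
    if not ftp_home.is_dir():
        problems.append(f"ftp home does not exist: {ftp_home}")
    elif not os.access(ftp_home, os.W_OK):
        problems.append(f"ftp home is not writable: {ftp_home}")
    return problems
```

A call such as `preflight(Path("/home/minibis"))` returning an empty list suggests the host is prepared for step 24.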

7.5 Problems
The only problems I encountered through analyzing 15.000 samples had something to do with the virtual machine being in some strange state or VirtualBox-daemons zombieing around. But all of them were solvable by me, myself and I … and I’m not a VirtualBox-pro.
8 Screenshots
Screenshot 1: The virtual machine (proband) is being started.

Screenshot 2: ProcessMonitor from Sysinternals has just been started.

Screenshot 8: Viewing results after a few samples.
9 Epilogue
So, if everything works fine, sooner or later you should have lots of report-files as well as screenshots (see Screenshot 8). You can now, for example, “grep” over all of the CSV-files to find out how many malware samples try to detect VMware by reading certain Registry keys, check what’s the favourite auto-run variant of malware, find out malware that injects itself into other processes, … or even find out which activities are periodically done by the OS.
My future plans are to bring more flexibility into the configuration of CERT.at’s control processes, so that they might meet more of your special needs and relieve you of having to create these two processes yourself. As the abilities of our AMAS will definitely grow, this paper will evolve as well. So keep an eye on the latest version of this paper.
Now that you made it all through this paper, I hope you enjoyed it and I wish you success building your AMAS.
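The “grep over all of the CSV-files” idea from the epilogue can also be done with a few lines of code. The sketch below rests on assumptions: one CSV log per sample in a single directory, and the column names “Operation” and “Path” as Process Monitor commonly exports them. The function name `samples_touching` and the default operation names are illustrative and should be checked against your own logs.

```python
import csv
from pathlib import Path
from typing import Iterable, Set


def samples_touching(
    log_dir: Path,
    needle: str,
    operations: Iterable[str] = ("RegOpenKey", "RegQueryValue"),
) -> Set[str]:
    """Return the names of CSV logs with a matching registry access."""
    wanted = set(operations)
    lowered = needle.lower()
    hits: Set[str] = set()
    for csv_path in sorted(log_dir.glob("*.csv")):
        # utf-8-sig also copes with a byte order mark at the file start.
        with csv_path.open(newline="", encoding="utf-8-sig") as handle:
            for row in csv.DictReader(handle):
                path = (row.get("Path") or "").lower()
                if row.get("Operation") in wanted and lowered in path:
                    hits.add(csv_path.name)
                    break  # one hit per log is enough
    return hits
```

Calling it with a needle such as "VMware" and taking the size of the returned set gives the count of samples that looked at matching Registry keys.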
10 Credits
There are some people I would like to thank for reading this paper for correction and verification purposes:
Klaus Darilion
Tarmo Randel
Lenny Zeltser
Thank you!