Thursday, 17 July 2014

Android Has Some Words With Monkey

Be advised ... Here thar be Squirrels!

The recentNIST Mobile Forensics Webcast and SANS FOR585 poster got monkey thinking about using the Android emulator for application artefact research. By using an emulator, we don't need to "root" an Android device in order to access artefacts from the protected data storage area (eg "/data/data/"). As an added bonus, the emulator comes as part of the FREE Android Software Development Kit (SDK). Hopefully, this post will help encourage further forensic research/scripts for Android based apps.
So now we just need a target app to investigate ... "Words With Friends" (WWF) is a popular scrabble type game with chat functionality. For this post, we'll be focusing on using an Android emulator to retrieve in-game chat artefacts and then create a script to parse them ("wwf-chat-parser.py"). It's a fairly long post so you might want to take that potty break now before we begin ;)

Android has an official SDK install guide here which also continues on here.
Installation was as simple as unzipping the downloaded archives and launching the relevant executable.
Lazy monkey just unzipped the archives to his home directory (ie "/home/cheeky/").
Here's a quick guide:
- Go to http://developer.android.com/sdk/index.html and download the 32 bit linux ADT bundle (includes both the eclipse IDE and Android SDK tools)
- Double click the zip file and use the Ubuntu Archive Manager to unzip the bundle to "/home/*username*" (eg unzips to "/home/cheeky/adt-bundle-linux-x86-20140702/")
- Use the Nautilus File Exporer GUI to navigate to the eclipse sub-directory (eg "/home/*username*/eclipse/")
- Double click on "eclipse" icon to launch it (or you could launch it from the command line via "/home/*username*/eclipse/eclipse")
- Go to the "Window" ... "Android SDK Manager" drop menu item and launch it. Some packages are installed by default but if you want to run an emulator with a specific/previous version of Android you need to download/install that specific SDK Platform (eg 4.2.2 SDK platform) and a corresponding Hardware System Image (eg ARM for a Nexus 7 tablet).
- Unzip the downloaded dex2jar zip file contents to "/home/*username*" (eg "/home/cheeky/dex2jar-0.0.9.15/")
- Unzip the downloaded jd-gui zip file to "/home/*username*" (eg "/home/cheeky/"). Note: we only need to extract the "jd-gui" exe.

To make things a bit easier, I also setup a soft link (ie alias) so we can just type "adb" without the preceding path info to launch the Android Debug Bridge.cheeky@ubuntu:~$ sudo ln -s /home/cheeky/adt-bundle-linux-x86-20140702/sdk/platform-tools/adb /usr/bin/adb

1. Getting the .apk app install file

Android .apk install files are zip archives. You can download them from the GooglePlay store by using a Chrome plugin or via the apk-downloader website. For this experiment however, I wanted to test the specific version from my Nexus 7 tablet (WWF 7.1.4), so I decided to use the Android Debug Bridge (adb) method.
Excellent adb instructions are available from the official Android Dev site here.

To prepare my Nexus 7 (1st gen c.2012) for the .apk file transfer, I attached it to my PC via USB cable. I then enabled the tablet's "Developer options" from the "Settings" menu by tapping the "About tablet" ... "Build number" several times. Next, I went into "Developer options" and enabled the "USB Debugging" and "Stay awake" options.

From our Ubuntu VM we can now check for connected devices/emulators ...cheeky@ubuntu:~$ adb devicesList of devices attached *serialnumber_of_device* device

cheeky@ubuntu:~$

Note: I have redacted the serial number of my Nexus 7. Just imagine a 16 digit hex value in place of *serialnumber_of_device" ...

So now we know that the adb has recognized our physical device, let's try connecting to it ...cheeky@ubuntu:~$ adb -s *serialnumber_of_device* shellshell@grouper:/

At this point, we type "exit" to logout from the physical device.
From the output of the "pm list packages -f -3" command, we know that the WWF .apk file is "/data/app/com.zynga.words-1.apk".
So we can use the "adb pull" command to copy it to our local Ubuntu VM.cheeky@ubuntu:~$ adb -s *serialnumber_of_device* pull /data/app/com.zynga.words-1.apk wwf.apk1252 KB/s (23648401 bytes in 18.442s)cheeky@ubuntu:~$
The above command copies "/data/app/com.zynga.words-1.apk" to the current directory and names it as "wwf.apk".
I didn't feel like typing the whole long .apk filename each time (lazy monkey!) so just called it "wwf.apk".
Anyhow, a copy of the WWF apk is now stored as "/home/cheeky/wwf.apk".

2. Create/Launch emulator

Now we create an Android emulator and fire it up ...
The Android website has some detailed instructions about creating/running the emulator here, here and here.
Here's the quick version ...
- Assuming you still have the eclipse IDE open, go to the "Window" ... "Android Virtual Device (AVD) Manager" menu item and create a new device similar to the following:

Test emulator device specs

Note: Be sure to tick the "Use Host GPU" checkbox to improve emulator speed. It also helps to ensure your VM has plenty of RAM.
- Start the device emulator by selecting the "testtab" device AVD and clicking "start". Alternatively, you can launch the AVD Manager GUI from the command line instead of via eclipse ...cheeky@ubuntu:~$ /home/cheeky/adt-bundle-linux-x86-20140702/sdk/tools/android avd

The emulator can take a minute to boot but eventually you should see something like this:

Emulator at startup

Now we can see if our emulator is recognized by typing "adb devices". Note: Our physical Nexus 7 device has been disconnected from the PC so it doesn't appear now.cheeky@ubuntu:~$ adb devicesList of devices attached emulator-5554 device

cheeky@ubuntu:~$

Update 2014-07-26:
To research Google product artefacts such as GoogleMaps and Hangouts, you can use an emulator with a Google APIs target set (eg "Google APIs - API Level 19") instead of an Android target as shown previously.

According to the official documentation, Google Play services can only be installed on an emulator with an AVD
that runs a Google APIs platform based on Android 4.2.2 or higher. To be able to use a Google API in the emulator, you must also first install the target Google API system image (eg "Google APIs - API Level 19") from the SDK Manager.

There should now be a WWF icon in the emulator's "App" screen which we can launch by clicking on it.

WWF is now installed!

We should now see a login screen for WWF where we can provide an email address and start playing games / chatting with others (ie create lots of juicy artefacts!).
By default, the emulator retains data between emulator launches so you shouldn't lose much/any app data if/when the emulator closes (eg after a crash).

4. Capture artefact data (via adb pull and DDMS)

For squirrels and giggles, let's connect to the emulator and see what our access privileges are (remembering that they were limited on the Nexus 7 device) ...cheeky@ubuntu:~$ adb -s emulator-5554 shellroot@android:/ #

Now we can search for WWF chat artefacts in "/data/data/" ... Note: Because the Nexus 7 doesn't have a removable SD card, we don't have to worry about checking "/mnt/sdcard/" for app artefacts. But just keep it in mind for other devices which may support app data storage on the SD card.

Update 2014-07-26:By using the "lsof" command on the emulator whilst our target app is running, we can see a list of currently open files .This can then be used to locate application artefact files (eg databases). For example, we can type:lsof | grep com.zynga.wordswhich should also lead us to various files open in "/data/data/com.zynga.words/databases/".

We type "exit" to logout for now.
Now we can "pull" these files of interest from the emulator for further analysis. For example, I'm now going to skip ahead and pull the file where I eventually found the WWF chat artefacts ...cheeky@ubuntu:~$ adb -s emulator-5554 pull /data/data/com.zynga.words/databases/WordsFramework1111 KB/s (159744 bytes in 0.140s)cheeky@ubuntu:~$

Alternatively, we can use the eclipse IDE and the Dalvik Debug Monitor Server (DDMS) tool to "pull" files. If our emulator is running (first, you better go catch it!), DDMS should connect to it automagically.
To launch the DDMS tool, use the following eclipse drop down menu - "Window" ... "Open Perspective" ... "DDMS"
The DDMS tool allows us to a bunch of cool things such as:
- pull/push files to the emulator
- dump process heap memory
- spoof phone calls (logs connections only / not capable of voice transmission/reception)
- send SMS text messages to the emulator
- set the GPS location of the phone
More information on DDMS is available here.

OK you should see the emulator on the LHS under "Devices" and the "File Explorer" tab on the RHS.

Dalvik Debug Monitor Server running in eclipse

Under the "File Explorer" Tab, browse to "/data/data/com.zynga.words/databases"
Then select the "WordsFramework" file and click the floppy disk icon to "pull" the file onto your Ubuntu box.

For squirrels and giggles, we can also dump the WWF process heap memory and later search it for interesting strings.
To do this, on the LHS, select the "com.zynga.words" process, toggle the "Update Heap" button (making DDMS continuously monitor the heap) and then click on the "Dump HPROF file" button (looks like cylinder with red arrow pointing down).

Dumping the process heap memory via DDMS

Next select the "Leak Suspects Report" and wait ... FYI we're not interested in the report as much as the accompanying HPROF dump.
After a while, the title bar changes to reflect that the .hprof is stored with a ridiculously long numeric filename in "/tmp/".

Obtaining the HPROF file

We can now run "strings" against this file and review ...
For example, I checked the validity of the word "twerk" using the emulator's WWF "Word Check" and then dumped the process heap. I then ran the "strings" command and piped the output to a separate text file ("8058l.txt") for easier analysis/searching. cheeky@ubuntu:~$ strings --encoding=l /tmp/android1924397721207258058.hprof > 8058l.txt
From the text file output, I was able to observe a bunch of little endian UTF-16 strings (possibly a dictionary update) contained in memory. Opening the .hprof file separately in a hex editor also confirms this.

Viewing the HPROF in a Hex Editor shows something squirrelly ...

Entering some selected words from this list into the WWF "Word Check" confirmed that they are acceptable words.
I think we just found us a squirrel! Who's a good monkey eh? Purr ...

DDMS also has a "Network Statistics" tab view available but this was not working with my emulator. It apparently can be hit and miss according to this Google Android Issue Tracker notice.
Anyway, there's more than one way to get network stats ...
First, we login to the emulator (without WWF running) and take a baseline list of network connections via "netstat"cheeky@ubuntu:~$ adb -s emulator-5554 shellroot@android:/ # root@android:/ # netstatProto Recv-Q Send-Q Local Address Foreign Address State tcp 0 0 127.0.0.1:5037 0.0.0.0:* LISTEN tcp 0 0 0.0.0.0:5555 0.0.0.0:* LISTEN tcp 0 0 10.0.2.15:5555 10.0.2.2:52782 ESTABLISHEDtcp6 0 1 ::ffff:10.0.2.15:39776 ::ffff:74.125.237.134:80 CLOSE_WAITroot@android:/ #

Obviously ESTABLISHED means theres a connection between our local host 10.0.2.15 and a remote host.TIME_WAIT means there WAS a connection but it is in the process of being closed. Thus you can also sometimes see previous TCP connections.
These netstat listings are snapshots at a particular time so you have to be quick and/or take several samples.

Now we can do a "whois" lookup on these IP addresses and find out a bit more info. Admittedly, these connections could be from ANY process on the emulator but as WWF is the only installed app, it's likely that it's the process responsible for these communications.
Without getting into too much detail, some of these remote IP addresses resolved back to companies such as Google, Voxel.net / Twitter, Yahoo, Amazon, Facebook, Softlayer Technologies and Akamai.

You could also fire up Wireshark or tcpdump and capture packets for analysis but depending on your jurisdiction this may be illegal (ie intercepting communications without consent). Anyhow, monkey won't be doing that - his delicate simian features wouldn't survive prison!

To use the DDMS Call / SMS / GPS functionality ...
Select "Window" ... "Show View" ... "Other" ... and then start typing "Emulator Control".
This should bring up a new tab where you can send SMS / simulate a voice call / set the emulator's GPS location.
Making voice call logs and sending SMS worked well with my emulator. However, I have not been able to verify that the GPS functionality works. Launching Googlemaps in the emulator's web browser and clicking on the crosshairs icon results in Googlemaps reporting "Your location could not be determined" ... *shrug*
FYI A couple of times, using the emulator control functionality also made the emulator laggy / caused a crash.

DDMS also allows users to grab screenshots from the emulator via the camera button.
However, we can also do screenshots from our emulator window via the Alt-PrintScreen button combo in a VMware environment.

There will now be a "words-classes-dex2jar.jar" in the current direcotry (eg "/home/cheeky/dex2jar-0.0.9.15/")
- Run JD-GUI (either from command line or by double-clicking the extracted exe) and open the "words-classes-dex2jar.jar" file.
Be aware that not all of the code may be easily understandable due to obfuscation (eg class "a" has non-descriptive method/function names such as "a", "b" etc)
Anyhoo, now you can have fun basking in all things Java ... smells wonderful huh? ;)

6. Other .apk activities

We can use the "aapt" SDK tool to determine the Android permissions for the "wwf.apk" we pulled earlier ...Note: the "aapt" tool is installed by default with the bundle and located in the latest android version sub-directory (android-4.4W in our case).cheeky@ubuntu:~$ /home/cheeky/adt-bundle-linux-x86-20140702/sdk/build-tools/android-4.4W/aapt dump permissions wwf.apkpackage: com.zynga.wordsuses-permission: android.permission.INTERNETuses-permission: android.permission.ACCESS_WIFI_STATEuses-permission: android.permission.ACCESS_NETWORK_STATEuses-permission: android.permission.READ_PHONE_STATEuses-permission: android.permission.READ_CONTACTSuses-permission: android.permission.SEND_SMSuses-permission: android.permission.WRITE_EXTERNAL_STORAGEuses-permission: android.permission.VIBRATEuses-permission: android.permission.RECEIVE_BOOT_COMPLETEDuses-permission: com.android.vending.BILLINGuses-permission: com.android.launcher.permission.INSTALL_SHORTCUTuses-permission: com.android.launcher.permission.UNINSTALL_SHORTCUTuses-permission: android.permission.GET_ACCOUNTSuses-permission: android.permission.AUTHENTICATE_ACCOUNTSuses-permission: android.permission.USE_CREDENTIALSuses-permission: android.permission.NFCuses-permission: android.permission.ACCESS_FINE_LOCATIONuses-permission: com.google.android.c2dm.permission.RECEIVEpermission: com.zynga.words.permission.C2D_MESSAGEuses-permission: com.zynga.words.permission.C2D_MESSAGEuses-permission: com.sec.android.provider.badge.permission.READuses-permission: com.sec.android.provider.badge.permission.WRITEcheeky@ubuntu:~$
For a listing of possible Android permissions, see here.
For more details on the "aapt" tool, see here.

It can also be handy to explore what other files are included in an .apk.
For example, the "res" directory holds resources such as pics, sounds.
See here and here for further details on the apk archive structure and the apk building process.

Opening the "wwf.apk" file using Ubuntu Archive Manager, we also note that under the "res/raw/" directory there exists the "dict" file - that sounds pretty squirrelly eh?
Unfortunately, opening it in a Hex editor shows that it's encoded somehow :'(
So no free WWF word list for you! Hey, it was worth a shot ...

7. Creating a Chat Extraction script

OK we're both losing the will to go on, so I'll start finishing up by mentioning the WWF chat artefacts and describing the accompanying extraction script.

Where are the chat artefacts stored?
Under "/data/data/com.zynga.words/databases/" there is a "WordsFramework" file.

I discovered this by "pulling" all the files from the "databases" directory and looking at them using the Firefox SQLite Manager or the Bless hex editor.
We can also use the Linux "file" command to figure out what type of file it is.cheeky@ubuntu:~$ file WordsFrameworkwwf/WordsFramework: SQLite 3.x database, user version 220cheeky@ubuntu:~$

Opening it up in Firefox SQLite Manager, we can see that theres 2 tables of interest - "users" and "chat_messages"
Here's a diagram showing how the 2 tables go together ... well, it's been abbreviated down to what monkey considers the important fields anyway ...

WWF chat schema

From our previous adventures in Python SQLite (eg Facebook Messenger post), we know how to query/extract this kind of stuff. Here's the query that gets us the chat artefacts ...SELECT chat.chat_message_id, chat.game_id, chat.created_at, users.name, chat.message, chat.user_id, users.email_address, users.phone_number, users.facebook_id, users.facebook_name, users.zynga_account_id FROM chat_messages as chat, users WHERE users.user_id = chat.user_id ORDER BY chat.created_at;

The script ("wwf-chat-parser.py") opens the specified "WordsFramework" file, runs the above query and prints out the chat messages in chronological order.
If there's multiple game conversations going on, the analyst can (manually) use the "game_id" to filter out conversations from the TSV output.

It has been developed/tested for Python 2.7.3 on a 32-bit Ubuntu 12.04 LTS VM.
It can be downloaded from Github here.

Here's what the command line output looks like when using the "WordsFramework" file from our emulated chat ...cheeky@ubuntu:~/wwf$ ./wwf-chat-parser.py -d /home/cheeky/WordsFramework -o wwf-output.tsvRunning wwf-chat-parser v2014-07-11

Extracted 9 chat records

cheeky@ubuntu:~/wwf$

With that many columns returned by the query, outputting to the command line just looked too crappy/confusing.
So instead, the script creates a Tab Separated (TSV) file for the output instead.
Here's what the TSV output file would look like if imported into an LibreOffice Calc spreadsheet ...

TSV output of "wwf-chat-parser.py"

Some comments/observations:
- Data is fictional and is included just to illustrate which fields are extracted. Any id numbers and names have been changed to protect the Simians.
- The "created_at" times appear to be referenced to GMT (Don't trust me ... verify it for yourself).
- The "zynga_account" and "email_address" fields are only populated for the device owner (ie "emulator-monkey"). The opponent's corresponding details aren't populated.
- The "phone_number" field does not appear to be populated at all (FYI the emulator's "Settings" ... "About phone" ... "My phone number" is 1-555-521-5554).
- The highlighted rows 5-6 are from a chat between "emulator-monkey" and "3rd-party-monkey" and show how it is possible to use the "game_id" value to determine conversation threads.
- Monkey does not use Facebook so the Facebook ID and name are included for completeness but not tested. It is apparently possible to use your Facebook login to login in to WWF.
- This script relies on allocated chats (ie chats from active games). There might be chat strings still present in the "WordsFramework" database from previous completed/expired games but I haven't had time to research this area. Running "strings" on the WordsFramework file and/or opening it in a Hex editor might help in that regard.

8. Resources

Here are some resources that I found useful while researching for this post ...
The Official Android SDK documentation (eg how to install the SDK, run the emulator, install the app etc.)http://developer.android.com/sdk/index.html

9. Final Words

We have been able to use the increased privileges of the Android emulator to uncover and harvest Android application artefacts.
This can be used by researchers to develop forensic extraction scripts without requiring actual rooted physical devices (or expensive commercial forensic tools).

Hopefully, this post will help to address the mobile device "app gap" that currently exists between commercial forensic tools and the sheer number of apps available on the GooglePlay market.
But if not, at least we got to chase some squirrels and learn some new things about WWF chats.

Some initial research shows that Microsoft's Windows Phone emulator cannot currently be used in a similar manner because the MS emulator does not allow you to load Windows marketplace apps onto it. It seems like it's mainly for testing apps that you write yourself. It has trust issues apparently.
As for iOS, monkey doesn't have access to OS X or any Apple devices (shocking!) so can't say whether the iOS emulator supports loading apps from the store and/or is "jailbroken" by default.

OK due to time/space constraints and a tired monkey ("What do you mean Red Bull doesn't come in Banana flavour?!"), we're gonna stop here ... There are probably a few squirrels that escaped but at this point, something is better than nothing eh?

G'day mate,Thank you very much for the sharing. It's fantastic job! I've got a lot of valuable information from your sharing. I reckon that using the AVD will took very long time to execute. It's been suggested to use Intel as the CPU/ABI, instead the ARM and install the Intel x86 emulator accelerator (HAXM installer) on the extras packages. It will get faster process.Once again. Thank you very much.

Search This Blog

About Me

Cheeky4n6Monkey loves learning about all things digitally forensical. He also enjoys creating new scripts that can help the forensics community.
If you're contacting via email, please use a relevant subject line and introduce yourself (name or pseudonym) in the first line - otherwise you could be spam filtered!
I can be contacted at: cheeky4n6monkey(a/t)gmail(d0t)com
or (less reliably) on Twitter: @Cheeky4n6Monkey