Internal

RoT-1 Chapter Status Report - 2011

ORGANIZATION

RoT-1 (aka the Texas Honeynet Project) is a young chapter; we were officially formed in January 2011. That said, our founding members have been active in The Honeynet Project since its inception in the late 1990s, so we’re not as young as it may seem.

Our rough center of mass and primary meeting point is Austin, TX, though we have members in Houston and a few new recruits from the Dallas-Fort Worth area as well.

Our group was formed as an invitation-only chapter; however, we encourage people near our bases of operation (Austin, Houston, DFW) to reach out to us, build a reputation and trust, and solicit their own invitation. Anyone interested in contributing to our ongoing projects is also welcome to contact us, and we’ll review these requests on a case-by-case basis, based on trust and merit.

The following is the current list of active members and their focus areas:

DEPLOYMENTS

As of this report, the RoT-1 Chapter does not have any active server-style deployments, which is a bit of a departure from previous chapter deployments. At our first all-hands meeting, in January 2011, we decided that for at least the time being our efforts would be better spent developing and deploying client-side technologies, and then focusing on the scalability and data management issues that come along with that. We toyed with the idea of looking for malicious JavaScript, PDF, Flash, Office files, etc., but eventually settled on focusing our efforts on Android applications.

Our first deployment, whose development was sponsored and funded by Praetorian, was of the Scalable Tailored Application Analysis Framework (STAAF), developed by Ryan W Smith and Adam Pridgen. The goal of this framework was to provide large-scale static analysis of Android applications in order to produce high-level analytics, statistics, and patterns. Our initial data processing was completed in May 2011, and consisted of the static analysis of over 50,000 applications from both the official Android Marketplace and third-party marketplaces. We were able to extract data such as: Manifest values, permissions, receivers, interfaces, Dex bytecode, methods implemented, methods called, objects defined, control flow graphs, URLs contacted, etc. Many of the modules that extract these values use one or more third-party tools, which were integrated into our modular framework.
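As an illustration of the kind of extraction described above, the following is a minimal Python sketch that reads the requested permissions from an already-decoded AndroidManifest.xml and tallies them across several applications. It is not STAAF code: the function names and the sample manifest are hypothetical, and it assumes the binary manifest has already been converted to plain XML by a third-party tool.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Iterable, List

# Namespace used by android:* attributes in a decoded AndroidManifest.xml.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Hypothetical sample input; a real manifest would come from a decoded APK.
SAMPLE_MANIFEST = """<manifest
    xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.app">
    <uses-permission android:name="android.permission.INTERNET"/>
    <uses-permission android:name="android.permission.READ_CONTACTS"/>
    <application/>
</manifest>"""


def requested_permissions(manifest_xml: str) -> List[str]:
    """Return the sorted, de-duplicated permissions requested by one manifest."""
    root = ET.fromstring(manifest_xml)
    names = set()
    for elem in root.iter("uses-permission"):
        name = elem.get(ANDROID_NS + "name")
        if name:
            names.add(name)
    return sorted(names)


def permission_counts(manifests: Iterable[str]) -> Counter:
    """Count how many manifests request each permission."""
    counts = Counter()
    for manifest_xml in manifests:
        counts.update(requested_permissions(manifest_xml))
    return counts
```

Calling permission_counts on a collection of decoded manifests yields a simple per-permission tally, which is one way a "permissions requested" overview could be computed at small scale.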

We were able to provide a high-level picture of certain attributes, such as permissions requested and libraries used; however, it became clear that computing the more complex aggregate information we intend to produce would require us to address certain scalability and data management issues. We are currently rewriting STAAF to be more modular, to use aggressive parallelization (including an EC2 deployment), to further reduce processing and data redundancies, and to use a much more scalable and distributed database implementation. We plan to complete and release this new framework under the Apache 2 license later this year.

RESEARCH AND DEVELOPMENT

Scalable Tailored Application Analysis Framework (STAAF):

Description:

STAAF is designed to allow large-scale distributed Android application analysis, and achieves this through aggressive parallelization of analysis tasks, de-duplication of processing effort and data storage, and efficient data storage and recall. Because applications can be processed independently of each other, we are able to distribute the processing tasks for each application, which showed a marked improvement over serialized application analysis. Additionally, rather than feeding every individual analysis tool the raw application, we extract and process the required resources once, and then feed the processed results into the analysis tools that require them. Furthermore, certain parts of an application, such as library code (e.g. advertising libraries) and certain resources, are often reused between applications. Rather than analyzing these shared resources again for each application that includes them, many tasks can ignore them, significantly cutting down on the amount of redundant data processing. Finally, we have designed the system around a distributed NoSQL database solution. This database design provides low-latency storage and recall, and also allows us to transparently include additional remote third-party analysis databases for collaborative analysis and data sharing.
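The de-duplication and parallelization ideas described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the STAAF implementation: the input layout (a dict with "name" and "resources" keys), the function names, and the placeholder analysis are all hypothetical, and a thread pool stands in for what would be separate processes or hosts in a real distributed deployment. Here each shared resource is analyzed once and its result reused, which is one simple way to avoid the redundant work.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(blob: bytes) -> str:
    """Content hash used to recognize resources shared between applications."""
    return hashlib.sha256(blob).hexdigest()


def analyze_resource(blob: bytes) -> dict:
    """Placeholder for a real analysis task (hypothetical)."""
    return {"size": len(blob)}


def analyze_corpus(apps, max_workers=4):
    """Analyze every unique resource once, in parallel, then rebuild per-app reports.

    `apps` is an iterable of dicts: {"name": str, "resources": {filename: bytes}}.
    Returns (reports, number_of_unique_resources).
    """
    unique = {}   # content hash -> blob, stored only once
    per_app = {}  # app name -> {filename: content hash}
    for app in apps:
        refs = {}
        for filename, blob in app["resources"].items():
            key = digest(blob)
            unique.setdefault(key, blob)
            refs[filename] = key
        per_app[app["name"]] = refs

    # Unique resources are independent of each other, so they can be analyzed in parallel.
    keys = list(unique)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = dict(zip(keys, pool.map(analyze_resource, (unique[k] for k in keys))))

    # Each application's report reuses the shared results instead of recomputing them.
    reports = {
        name: {filename: results[key] for filename, key in refs.items()}
        for name, refs in per_app.items()
    }
    return reports, len(unique)
```

For two applications that bundle the same advertising library, the library is hashed twice but analyzed only once, and both reports point at the same result.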

Availability:

v0.2 - Available upon request, on a case-by-case basis. Note that v0.2 is a Python implementation and does not provide the distributed or scalability enhancements.

v0.3 - Will be available later this year under the Apache 2 license, and will include all the distributed and scalability features listed above.

Integration:

STAAF is a modular framework of analysis tools, and leverages many other open source Android analysis tools such as androguard, apktool, baksmali, axmlprinter2, etc. These tools provide the fundamental Android data parsing and interpretation, which we then use to provide the higher-level data and results. Much credit goes to Anthony Desnos of the Androguard Project for his quick feedback and new features.

STAAF is designed to allow the integration of new “native” tools to extract new features or data very quickly and easily. That said, we are interested in collaborating with and integrating any tools that do data analysis or reverse engineering on Android applications.

Collaboration Request:
We also plan to extend our framework to other mobile platforms, so we would appreciate any expertise, experience, or tools for iPhone, Windows Phone 7, BlackBerry, or even apps for Chrome or browser extensions.

Resource Request:
We are also in need of a free or low-cost cloud solution, or simply resources for hosting a large number of virtual machines to test and deploy the STAAF framework and future distributed data analysis projects. We are currently using EC2, but our servers/services are generic Java services on each host, and we’re using a multi-instance, multi-database CouchDB solution for the database, so it’s generic enough to move out of EC2 with no modifications.

GSOC 2011: Android Static Analysis UI

Full details can be found at http://www.honeynet.org/gsoc/slot6. This project is being mentored by Ryan W Smith and implemented by student Cong Zheng. The intent of the project is to produce a free, context-aware GUI for Android analysis. So far Cong has made great progress and has implemented modules for APK information browsing, module/method context menus, Smali code view (contextual), and CFG view (contextual). After the midterm he plans to implement features such as notes, annotations, and contextual jumps between CFG and Smali code, as well as the ability to save and share analysis notes and views.

Lives On The Line: Defending Crisis Maps in Libya, Sudan, and Pakistan, George Chamales, Blackhat 2011

Collaboration Request:
We would like to organize and lead a KYE paper on Android Malware and Android Analysis. I know there are plenty of other members with different perspectives and expertise, so I’d like to recommend that we pool our knowledge, pick some focus areas, and release a KYE from our collective experience.

GOALS

Because this is our first year, we have no past performance to judge. That said, for our first year (2011), our goals are:

Release v0.3 of STAAF under the Apache 2 license.

Release a KYE from our collective experience in Android/mobile malware.

Host a continuously running implementation of STAAF (contingent on getting resources) to provide continuous updates from various sources.

Complete the GSOC 2011 project with our student and release a usable beta version of the Android Static Analysis UI.

Continue to improve and maintain the Android Static Analysis UI after GSOC ends.

Recruit one or two more trusted members who can become active contributors.