COW 37

The whole two-day workshop was live blogged by Dr Sue Black; you can read through the blog below.

Details of views for the workshop from all over the world are shown below by country.

Views of our CREST COW 36 workshop by country

App stores provide a rich source of information for software engineering research. It is, of course, possible to extract technical information as with other software systems. However, we can also readily obtain information relating to customer reviews, pricing and popularity. Never before in the history of software engineering has so much information been available concerning so many, and so disparate, facets of software systems. Increasingly, the users of apps and app stores are relying on the software they provide for highly nontrivial activities, making app store analysis a pressing concern. This workshop will bring together software engineers to discuss and develop the emerging research agenda in App Store Analysis.

The CREST COW Twitter account is @CRESTCOW and the hashtag is #UCLCOW36

We are a new community; some may say that this is not software engineering. There is a lot of resistance to this topic in software engineering, but looking around the room there are a lot of smart people here 😉

In the late 90s, people thought we shouldn’t analyse web apps; in 1982 people said the same about Micros…

Everyone now introduces themselves to the group. There are about 35 people attending from around the UK and the world. More details are at the bottom of the CREST COW 36 webpage.

Our first talk this morning:

Studying and Enabling Reuse in Android Apps

Denys Poshyvanyk, Computer Science Department, The College of William and Mary, USA

The College of William and Mary, founded in 1693, is the second-oldest institution of higher education in the US.

We have 1.3 million apps, real and fake markets, and 1000s of open source apps. It’s a fast-growing economy with lots of people and companies making lots of money.

This talk concentrates on one issue: apps are built using APIs, and that brings some specific problems, in particular around the maintenance of those APIs.

(Sorry our photos are so dark :()

Research Q: APIs evolve rapidly; does the instability of APIs affect the success of Android apps?

5848 apps were analysed, belonging to 30 domain categories, along with 68 third party libraries. Only those with repositories were included, because all the changes and all the bugs were needed.

Discussion: Mobile apps are a young industry, which is why apps are basically hacked together. Think back to web apps early on: market share is paramount at the beginning, and newbies are writing the code. Doesn’t that explain why there is no “design” and apps are hacked? This will change later on.

The top 200 highest grossing apps generate 60-80% of total market revenue

Monetization: 75% of apps are free to download

The mobile monetization global landscape is massive and elaborate, but it all depends on ad libraries. There can be as many as 28 ad libraries in any one app. 65% of apps have only 1 ad library, and 17% have 2 ad libraries.

Why do some apps have so many libraries? Because the fill rate is less than 18%
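A rough way to see why a low fill rate pushes developers towards more libraries: if each library fills an ad request with probability 0.18 (taking the quoted figure as an upper bound) and the libraries are assumed to be independent, which is a simplification not stated in the talk, the chance that at least one of n libraries returns an ad is 1 - (1 - 0.18)^n. The function name below is made up for illustration.

```python
def chance_ad_shown(fill_rate, libraries):
    """Probability that at least one library fills the request.

    Assumes each library fills independently with the same fill rate;
    this is a simplification for illustration, not a claim from the talk.
    """
    return 1 - (1 - fill_rate) ** libraries


# Print the estimate for a few library counts, up to the 28 mentioned above.
for n in (1, 2, 5, 10, 28):
    print(n, "libraries:", round(chance_ad_shown(0.18, n), 2))
```

Under these assumptions each extra library raises the chance of showing an ad, with diminishing returns as more are added.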

The number of ad libraries doesn’t impact the star rating but using the wrong libraries can do.

Ad maintenance: 14% of releases are just to update ad code. It is a serious software engineering challenge.

Clusters: used the k-means algorithm on the probability of features belonging to topics (apps are grouped together according to their natural language description)
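To make the clustering step concrete, here is a minimal sketch of plain k-means in Python. It assumes the topic probabilities for each app description have already been computed by some topic model; the app names, the three topics and the numbers are all invented for illustration and are not data from the talk.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=50, seed=0):
    """Plain k-means; returns one cluster index per input vector."""
    rng = random.Random(seed)
    # Start from k distinct input vectors chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignment = None
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # nothing moved, so the clustering has converged
        assignment = new_assignment
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # an emptied cluster keeps its old centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Hypothetical topic probabilities per app: [travel, games, finance].
apps = {
    "city_maps":   [0.90, 0.05, 0.05],
    "train_times": [0.80, 0.10, 0.10],
    "poker_night": [0.05, 0.90, 0.05],
    "card_duel":   [0.10, 0.85, 0.05],
}
labels = kmeans(list(apps.values()), k=2)
clusters = dict(zip(apps, labels))
print(clusters)
```

The two travel-like apps end up in one cluster and the two game-like apps in the other; which cluster gets index 0 depends on the random start.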

APIs: the analysis extracted all Android API calls, but there are too many in the framework, so it focused on a subset of the APIs: those governed by particular permissions. From this one could get the main feature of the cluster, e.g. “travel”, and see if there is any malicious intent

For 26% of the apps under study there were some anomalies. There was a large amount of covert behaviour, mainly from ad libraries, and some apps had dubious behaviour, e.g. the Yahoo mail app was sending text messages when it didn’t have permission to do this.

E.g. Soundcloud had uncommon behaviour, and there were also some benign outliers. Most poker game apps were also spyware; only one poker game was not invasive.
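The idea of spotting an app whose API usage is unusual for its cluster can be sketched with a simple frequency heuristic: flag any app that calls a permission-governed API used by only a small share of its cluster. This is a stand-in for illustration, not necessarily the outlier detection technique used in the talk; the cluster contents, the API labels and the 25% threshold are all made up.

```python
from collections import Counter


def rare_api_outliers(cluster_apis, max_share=0.25):
    """Return {app: [rare APIs]} for apps using an API that at most
    max_share of the apps in the cluster use."""
    total = len(cluster_apis)
    # Count how many apps in the cluster use each API.
    usage = Counter(api for apis in cluster_apis.values() for api in set(apis))
    outliers = {}
    for app, apis in cluster_apis.items():
        rare = sorted(api for api in set(apis) if usage[api] / total <= max_share)
        if rare:
            outliers[app] = rare
    return outliers


# Hypothetical "travel" cluster: app name -> sensitive APIs it calls.
travel_cluster = {
    "city_maps":    {"LOCATION", "INTERNET"},
    "train_times":  {"LOCATION", "INTERNET"},
    "hotel_finder": {"LOCATION", "INTERNET"},
    "trip_planner": {"LOCATION", "INTERNET", "SEND_SMS"},
}
flagged = rare_api_outliers(travel_cluster)
print(flagged)
```

Here only the invented trip_planner app is flagged, because sending SMS is unusual for its cluster. The same heuristic would also flag a benign app in a cluster where invasive behaviour is the norm, which is one way to read the poker example above.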

Apps were classified along two axes (a confusion matrix):

Actual: malicious vs benign

Predicted: predicted as malicious vs predicted as benign
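A classification like this is usually summarised by counting the four combinations of actual and predicted labels. The sketch below does that and derives precision and recall; the labels are invented for illustration and are not the results reported in the talk.

```python
def confusion_counts(actual, predicted):
    """Count the four cells of a malicious/benign confusion matrix.

    Both inputs are sequences of booleans where True means malicious.
    """
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for is_malicious, flagged in zip(actual, predicted):
        if is_malicious and flagged:
            counts["tp"] += 1  # malicious, predicted as malicious
        elif flagged:
            counts["fp"] += 1  # benign, predicted as malicious
        elif is_malicious:
            counts["fn"] += 1  # malicious, predicted as benign
        else:
            counts["tn"] += 1  # benign, predicted as benign
    return counts


# Hypothetical labels for eight apps (True = malicious).
actual    = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]

counts = confusion_counts(actual, predicted)
precision = counts["tp"] / (counts["tp"] + counts["fp"])
recall = counts["tp"] / (counts["tp"] + counts["fn"])
print(counts, precision, recall)
```

Precision says how many of the flagged apps really were malicious; recall says how many of the malicious apps were caught.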

Results: malware or malicious apps can be detected in an efficient way, but malware can’t be detected if no clustering analysis is done.

Why not use the Android store categories? They don’t give results that are as good, because the categories are not as exact.