It shouldn't take a robotic finger to point out what's wrong with an Android app, but at the CeBIT technology trade fair in Germany last week a group of researchers used exactly that to highlight the difficulties some developers encounter in testing their apps.

Students and teachers from Saarland University's computer science department used a robotic arm to attract attention to a software tool they've been working on called DroidMate. The researchers claim DroidMate will be able to assess an app by using a robot-like algorithm that has been specially optimized to mimic the way a human being would use an app. In doing so, it will perform any of the steps involved in playing a game, operating a social networking service or using other functions in a given app, so that any and all errors are caught before the app lands in Google Play.

Saarland University showed off a robotic arm that takes an approach to app testing similar to that of its new DroidMate offering.

"The Android system is more open and less controlled than the iOS or BlackBerry apps, which increases the possibilities of applications interacting, but also increases the risk of applications interfering with each other," said Andreas Zeller, a professor at Saarland University who is involved with the DroidMate project. "We don't see Android apps being specifically prone to errors."

"Android, without question, has the most diverse ecosystem of any phone OS ever," said Trent Peterson, co-founder of AppThwack in Portland, Ore. "There are multiple levels to consider: hardware, base OS version, OEM OS modifications, carrier OS modifications, and finally carrier networks and the end-users' environment. There's also a growing number of users that choose not to upgrade for one reason or another, meaning developers that want to cover the large majority of the market have to test a long-tail of various hardware versions running older OS versions."

AppThwack offers automated testing of Android apps based on the number of devices tested. Peterson said the obvious best practice is to think about your demographic and the devices they're most likely to use first, then test accordingly. However, developers could also choose to test on hundreds of devices and, even if some of them aren't in their target market, they can at least know that there's an issue and make an informed decision about how to deal with the problem.

"In our experience it's not uncommon for new companies to start on a subset of devices--the top 10, for example--and, as their audience grows and their testing becomes more sophisticated, generally through automation, they migrate to a higher tier and test more devices," he said, adding that most larger organizations, especially those with their own automated tests, immediately start testing on all devices. "At the end of the day, the market does not discriminate against those with less popular devices, meaning that users of those devices write reviews that carry just as much weight as those with top-of-the-line devices."

Zeller said DroidMate, which will be released in beta this summer as an open source offering available to all developers, starts with random interactions (clicks, strokes, inputs). The algorithms then learn which interaction is required to obtain which functionality, systematically exploring the app's functionality--not unlike a human user. He likened the machine learning aspect of DroidMate to human genetics, in the way it can improve over time.

"Let's think of a music player featuring a 'play' button and a 'pause' button, of which only one is visible. The 'pause' button is not visible initially. This makes it a test target," he said. "The genetic algorithm starts with a random sequence of interactions. Activating 'play,' however, makes the 'pause' button visible, which is a prerequisite for executing it. Hence, the algorithm will retain all inputs containing 'play,' and start producing inputs that all contain 'play' at some point, in the hope that 'pause' will eventually be produced, too."
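Zeller's music player example can be sketched in a few lines of Python. This is a toy illustration only, not DroidMate's implementation: the action names, the scoring in `fitness`, and the population settings are all assumptions made for the sake of the example. It keeps the sequences that press "play" and mutates them until one also reaches the initially hidden "pause" button.

```python
import random

# Hypothetical UI actions for a toy music player (not taken from DroidMate).
ACTIONS = ["play", "pause", "next", "volume_up"]


def simulate(sequence):
    """Replay a sequence on the toy player; return (play_pressed, pause_reached)."""
    pause_visible = False  # 'pause' is hidden at first, which makes it the test target
    play_pressed = False
    pause_reached = False
    for action in sequence:
        if action == "play" and not pause_visible:
            play_pressed = True
            pause_visible = True   # pressing 'play' reveals 'pause'
        elif action == "pause" and pause_visible:
            pause_reached = True
            pause_visible = False  # pressing 'pause' brings 'play' back
        # any other action, or a tap on a hidden button, has no effect
    return play_pressed, pause_reached


def fitness(sequence):
    """Assumed scoring: 2 if 'pause' was reached, 1 if only 'play' was, else 0."""
    play_pressed, pause_reached = simulate(sequence)
    if pause_reached:
        return 2
    return 1 if play_pressed else 0


def mutate(sequence, rng):
    """Copy a sequence and replace one randomly chosen step with a random action."""
    child = list(sequence)
    child[rng.randrange(len(child))] = rng.choice(ACTIONS)
    return child


def evolve(rng, length=4, population_size=8, max_generations=200):
    """Start from random sequences, retain the fittest half, and mutate them."""
    population = [[rng.choice(ACTIONS) for _ in range(length)]
                  for _ in range(population_size)]
    for generation in range(max_generations):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == 2:
            return population[0], generation
        survivors = population[: population_size // 2]
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    return None, max_generations


if __name__ == "__main__":
    best, generations = evolve(random.Random())
    print("sequence reaching 'pause':", best, "after", generations, "generations")
```

Because sequences containing "play" score higher than those without it, they survive each round, which mirrors the retention step Zeller describes. A real tool would drive an actual app rather than a simulated one.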

Multiple solutions to the testing problem

Some developers have taken a different route entirely. Dennis Gustafsson is the co-founder of Mediocre AB, a developer based in Malmo, Sweden, that produced Sprinkle, a popular brain teaser-style game on Android. He said the firm partnered early on with graphics chipmaker Nvidia, which provided some testing hardware and gave the firm considerable advice.

"The game was only released on (Nvidia's) Tegra at first, but a few months down the line we opened a beta program where more people could sign up and try the game. We were obviously looking for as many different phone models as possible," he said. "Testing went on for a few weeks before we released. Now all our games are based on the same foundation, so we are already quite confident they will work without massive testing on different devices, but we still have a dozen or so devices that we use for testing regularly. We only support Android 2.3-plus, and from a gaming perspective there isn't much of a difference between different Android versions after that, which makes it a lot easier."

Whether a given app will face additional challenges because of the complex Android ecosystem largely depends on the type of app and how it's developed, according to AppThwack's Peterson. Apps that depend on hardware components like Bluetooth and GPS, for example, or games that require low-level access to the GPU are much more prone to issues than, say, a simple note-taking app. That said, layout problems are especially common in just about all apps simply because of how many different screen sizes are in the market today, Peterson said. He added, though, that Google isn't to blame for the testing challenges in Android.

"There are many variables outside of their (Google's) control like hardware specs, carrier networks, customer environments, and so on," he said. "They've done a great job of creating an adaptable and open OS and getting unprecedented adoption among hardware manufacturers and carriers. Companies like AppThwack are a product of its (Android's) success, not its failure."