The Great search for Analytics services

In the spirit of The Great search for Syslog services that I posted last year, I've decided to talk about another very important area at Newstex where we've been struggling to find a good solution: Analytics. First, it's important to note that I'm not an analytics person; I don't like building reports, digging through logs to find retention or conversion rates, or analyzing A/B testing results. It's not my core business, and it's not something that I want to devote a lot of time to. Every time I have to build a custom report because our analytics engine can't answer a question someone at Newstex has, I'm wasting valuable time that could be better spent implementing new features that users care about.

I don't like writing reports. I don't want to be an analytics engineer. I have better things to do.

When we started out building Mobile Applications, we knew we would need to have good reporting behind our application, not only to see what users were doing, but to report to our clients and pay our publishers. Reporting was always important, and we needed to find a solid solution to our problem.

At first we turned to Google Analytics. We use it for newstex.com and for our internal web-based applications, so we decided to turn to it for mobile as well. The problem with Google Analytics is that it's very tied to Web applications, and it didn't really handle the different types of things that happen on a Mobile device. On a web application, you navigate through pages, and occasionally do things. On a mobile application, almost everything is an action; events are the core of analytical tracking. While Google Analytics for web did support event tracking (although it was very limited at the time), it only supported this for Web applications; the event library wasn't available for Mobile. We were building a native application because we knew the limitations of HTML5 and wanted to make something without them: an application capable of handling the thousands of stories per day that some of our applications require. We couldn't simply use the JavaScript version of Google Analytics, and without really good event tracking, it wouldn't work for us.

We began searching for replacements. We saw a lot about Flurry, but our biggest concern there was that there was very little actual support, and the product didn't seem to have any paid or premium services. We then found Localytics, which seemed to be very similar to Flurry in many respects, but they also offered premium services and support. Being new to the analytics game and not really wanting to figure everything out on our own, we decided to give Localytics a try. Although incredibly expensive, they seemed to be the best, and most supported, option we had.

Unfortunately, Localytics made us sign a rather long-term contract, and then decided to make their product more complicated and sell "consulting services" on top. That's not exactly how we wanted to go, and it took us quite a while to find something better. The problems with Localytics ran even deeper: answering the simplest of questions seemed to require downloading an export of the event log and building a custom report... exactly the kind of roll-your-own work we were trying to avoid!

Then we discovered MixPanel. As soon as I looked at the UI I knew something was different, but I couldn't quite put my finger on what. Digging through the documentation, we found the underlying difference in the core system: they expose an API for every aspect of it. It's not a service that later had APIs built on top; it's a true cloud service from the bottom up. This had the added advantage of not locking us into a specific platform. It didn't matter if it was web, mobile, desktop, or even backend applications; we could dump all of our data into MixPanel and see it all in one easy-to-use UI.

Obviously as a true proponent of cloud services I quickly dove in deeper to see if this really was a solution for us. What I found was that this was only the start. Not only did they allow us to push all these events into a unified location, but we could also do advanced segmentation and reporting, all based on whatever we saved to the system. They have ways to track unique users across events, and even tag users with human-readable names.

I quickly found the very nice import API, which allowed me to copy all of the events I had in Localytics over to MixPanel. This was huge: it meant that while we were evaluating MixPanel, we could import actual live data from our existing system. I quickly wrote up a script to import the last month's worth of data, and then added a script that runs nightly to copy our event data from Localytics into MixPanel. While this did mean we wouldn't get "live" events in MixPanel, it at least meant we could really start to evaluate the system.
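The core of a script like that is just reshaping each exported event into the format MixPanel expects. Here is a minimal sketch, not the script we actually ran. The shape of the exported row (`name`, `user_id`, `timestamp`, `attributes`) is a hypothetical stand-in for a Localytics export. The event layout (an `event` name plus `properties` carrying `token`, `distinct_id`, and `time`, sent as base64-encoded JSON) reflects my understanding of MixPanel's HTTP API and should be checked against their current documentation. The sketch only builds the payload and leaves the actual HTTP call out.

```python
import base64
import json


def to_mixpanel_event(row, token):
    """Convert one exported event row (hypothetical shape) into an event dict."""
    # Start from the custom attributes so they become segmentable properties.
    properties = dict(row.get("attributes", {}))
    properties.update({
        "token": token,                 # project token (placeholder value below)
        "distinct_id": row["user_id"],  # ties events to a unique user
        "time": int(row["timestamp"]),  # original event time, in Unix seconds
    })
    return {"event": row["name"], "properties": properties}


def encode_batch(events):
    """JSON-encode a list of events, then base64-encode the result."""
    payload = json.dumps(events).encode("utf-8")
    return base64.b64encode(payload).decode("ascii")


# Made-up sample rows standing in for one night's export.
rows = [
    {"name": "Install", "user_id": "u1", "timestamp": 1330000000,
     "attributes": {"platform": "iPad"}},
    {"name": "View Story", "user_id": "u1", "timestamp": 1330000300,
     "attributes": {"platform": "iPad"}},
]

events = [to_mixpanel_event(row, token="YOUR_PROJECT_TOKEN") for row in rows]
data = encode_batch(events)  # value to send as the request's data parameter
```

Keeping the original timestamp on each event is what makes a nightly copy useful: the events land in the reports at the time they actually happened, not at the time the script ran.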

Next I started looking for support. It wasn't hard to find; in fact, they found me, and I attended one of their weekly 101 Webinars, which help you quickly understand the power that you get out of MixPanel. What's more, their support is free, but that's because you probably won't even need it. The system is just so intuitive. After getting a bunch of data into the system, I invited my boss to take a look. He was creating funnels, viewing segmentation, and finding answers to questions he'd been asking for a long time. He figured this out all on his own, without having to contact support or ask me any questions. That's a win in any scenario.

Let's take a simple example of the Funnels. These funnels are the coolest feature of MixPanel that I've discovered so far. A funnel is a quick way of telling what percentage of your user base performs a certain sequence of events. What's more, you can drill down into almost any report, including the funnels, to find out more details. In this simple funnel, I wanted to see how many users went from installing the app to viewing a story.

Not only could we quickly see that only 57% of our user base continues on to view a story, but we could see the actual breakdown by platform (an attribute on every event which we track, which MixPanel refers to as a "Super Property"). In our case, 68% of iPad users actually went on to view a story after installing the app, but only 52% of iPhone users did the same. We could also break this down by Application version, or even filter so that it only shows a specific application version. This was huge; all of these reports previously had to be run by hand.
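To make the idea concrete, here is a rough sketch of the calculation behind a segmented funnel. This is not how MixPanel implements it; it is the kind of by-hand report the UI replaced for us. The event dictionaries, the `funnel_by_property` helper, and the sample data are all made up for illustration.

```python
from collections import defaultdict


def funnel_by_property(events, steps, prop):
    """Return {property value: [users reaching step 1, users reaching step 2, ...]}.

    A user only counts toward a step after completing every earlier step in order.
    """
    progress = {}  # user -> index of the next step the user still has to complete
    segment = {}   # user -> property value seen on the user's first funnel event
    for event in sorted(events, key=lambda e: e["time"]):
        user = event["distinct_id"]
        next_step = progress.get(user, 0)
        if next_step < len(steps) and event["event"] == steps[next_step]:
            progress[user] = next_step + 1
            segment.setdefault(user, event.get(prop, "unknown"))

    counts = defaultdict(lambda: [0] * len(steps))
    for user, reached in progress.items():
        for step_index in range(reached):
            counts[segment[user]][step_index] += 1
    return dict(counts)


# Made-up sample events; "platform" plays the role of the super property.
events = [
    {"event": "Install", "distinct_id": "a", "time": 1, "platform": "iPad"},
    {"event": "View Story", "distinct_id": "a", "time": 2, "platform": "iPad"},
    {"event": "Install", "distinct_id": "b", "time": 3, "platform": "iPhone"},
    {"event": "Install", "distinct_id": "c", "time": 4, "platform": "iPhone"},
    {"event": "View Story", "distinct_id": "c", "time": 5, "platform": "iPhone"},
]

result = funnel_by_property(events, ["Install", "View Story"], "platform")
for platform, (installed, viewed) in sorted(result.items()):
    print(platform, f"{viewed / installed:.0%}")
```

With this sample data the iPad segment converts at 100% and the iPhone segment at 50%. Swapping `"platform"` for another property, such as an application version, gives the other breakdowns mentioned above.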

So what we've found so far is that MixPanel is way more than just an Analytics system; it's a question resolver. It does far more than just track your events: it allows anyone to view them in a smart and intuitive way. What's more, the amazing support is there if you need it, and out of the way when you don't.

The home run, of course, was when we showed this to our financial and reporting guy and his exact response was, "This is how I would have built it if I had designed it myself."

You know the service is good when you can see there's lots of usage but you don't get any questions about it.

Popular Posts

Ever wonder how sites like battle.net support things like this in Google Chrome?

Well I did, so I did a little bit of digging. It turns out Google Chrome supports an open standard called Open Search. This format is relatively simple, and very easy to add to your own site. I just added it to some of our systems in under 5 minutes.

Adding OpenSearch to your site is incredibly simple: you just have to add a link tag to your index HTML page, and add a small XML file that it points to. The link tag looks like this:
<link rel="search" type="application/opensearchdescription+xml" href="http://my-site.com/opensearch.xml" title="MySite Search" />
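For the XML file that the link tag points to, here is a small sketch that generates a minimal OpenSearch description document with Python's standard library. The `ShortName`, `Description`, and `Url` elements and the `{searchTerms}` placeholder come from the OpenSearch 1.1 specification; the `build_description` helper and the `http://my-site.com/search?q=` search URL are assumptions for illustration, so substitute whatever URL your own site uses for search.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OpenSearch 1.1 specification.
NS = "http://a9.com/-/spec/opensearch/1.1/"


def build_description(short_name, description, search_url_template):
    """Build a minimal OpenSearch description document as a string."""
    ET.register_namespace("", NS)  # serialize with a default xmlns, no prefix
    root = ET.Element(f"{{{NS}}}OpenSearchDescription")
    ET.SubElement(root, f"{{{NS}}}ShortName").text = short_name
    ET.SubElement(root, f"{{{NS}}}Description").text = description
    # The browser replaces {searchTerms} with whatever the user typed.
    ET.SubElement(root, f"{{{NS}}}Url", {
        "type": "text/html",
        "template": search_url_template,
    })
    return ET.tostring(root, encoding="unicode")


xml_text = build_description(
    "MySite Search",
    "Search MySite",
    "http://my-site.com/search?q={searchTerms}",  # hypothetical search endpoint
)
print(xml_text)
```

The printed document is what you would save as `opensearch.xml` at the location named in the link tag's `href`.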

For a while, I have been creating command line tools provided right with boto which I used to manage AWS. Recently, others have become interested in these tools as well, and I've seen several other contributors adding to these tools to make them even more useful to others. One recent submission by Ales Zoulek added some nice features to my list_instances command, which I use on a regular basis to list out the instances that are currently active for my account in EC2.

Amazon now lets you add Tags to EC2 objects such as Instances and Snapshots. This allows you to actually "Name" your EC2 instance, as well as add some metadata that could be used for AMI initialization, etc. Ales added the ability to list these tags by name within the list_instances command line application:

Last week, Amazon announced the launch of a new product, DynamoDB. Within the same day, Mitch Garnaat quickly released support for DynamoDB in Boto. I quickly worked with Mitch to add on some additional features, and work out some of the more interesting quirks that DynamoDB has, such as the provisioned throughput, and what exactly it means to read and write to the database.

One very interesting and confusing part that I discovered was how Amazon actually measures this provisioned throughput. When creating a table (or at any time in the future), you set up a provisioned amount of "Read" and "Write" units individually. At a minimum, you must have at least 5 Read and 5 Write units provisioned. What isn't as clear, however, is that read and write units are measured in terms of 1KB operations. That is, if you're reading a single value that's 5KB, that counts as 5 Read units (same with Write). If you choose to operate in eventually consistent mode, you'r…