Google Analytics is a useful tool for measuring website usage -- everything from simple page views to the kind of complex ad campaign tracking marketers might need. However, I find the user interface to be, well, less than ideal. The good news is that Google Analytics provides a robust API that enables you to tap into your data programmatically, meaning you can conveniently pull and package data in ways that might not be as easy to do on the Web.

Google has tutorials that cover how to use the API with Java, Python, PHP and JavaScript, but I prefer to tap into Google Analytics with R, a language designed for statistical computing and graphics. Versions of R are available for Windows, Mac OS X and Unix, and add-on packages for R can streamline a lot of data work. (If you want to learn R basics, head to Computerworld's Beginner's Guide to R.)

You don't need to know R to follow along with the steps here. In fact, after extracting data, you can save it to a CSV file to use in Excel, if you prefer.

Step 1: Install rga

Like ganalytics, rga resides on GitHub. To easily install any of the Google Analytics packages from GitHub, first install and load the R package devtools by typing the following commands into the R console window:
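A sketch of what those commands might look like. It assumes rga lives in the skardhamar/rga repository on GitHub (an assumption; check the package's GitHub page for its current location) and that your version of devtools accepts the "user/repo" form of install_github():

```r
install.packages("devtools")       # download devtools from CRAN
library(devtools)                  # load devtools for this session
install_github("skardhamar/rga")   # assumed repository location: skardhamar/rga
library(rga)                       # load rga; repeat this in every new R session
```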

(You only have to run the first three commands once per machine, but you need to load library(rga) each time you open R.)

Step 2: Allow rga to access your Google Analytics account

On a Mac, authentication is easy: Create an instance of the Google Analytics API authentication object by typing the following in your R console window:

rga.open(instance="ga")

That will open a browser window that asks you to give rga permission to access your Google data. When you accept, you'll be given a code to copy and paste into your R console window where it says, "Please enter code here."

In Windows, I find that adding a line of code before opening an rga instance helps with any authentication errors:
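One workaround that was widely used for SSL certificate errors on Windows is to point RCurl at the certificate bundle that ships with it. This is only a sketch, assuming rga makes its requests through the RCurl package and that RCurl is installed; it may not be the exact line your setup needs:

```r
# Tell RCurl where to find a bundle of trusted SSL certificates.
# Assumption: the bundle that ships with the RCurl package is acceptable.
# system.file() returns an empty string if RCurl is not installed.
options(RCurlOptions = list(
  cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")
))
```

Run it before rga.open(instance="ga"); the option lasts only for the current R session.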

Next, you need to find the profile ID for your Google account, which is not found in the tracking code that you add to a website to allow Google Analytics to monitor your site. Instead, on your Google Analytics Admin page, go to View Settings and you'll see the ID under "View ID."

Alternatively, type ga$getProfiles() in your R console window to get a list of all available profiles in your account; the profile ID will be listed in the first column.

Whichever way you find it, save that value in a variable so you don't have to keep typing it. You can use a command like:

id <- "1234567"

(Replace the number with your actual ID, and make sure to put it between quote marks.) This stores your profile ID as the variable "id."

Step 3: Extract data

Now we're ready to start pulling some data using the ga instance we just created. The getData method will actually extract data from your Google Analytics account that you can then store in another new R variable. If you want to see all available methods for your ga object, run:

ga$getRefClass()

You can query the Google API for metrics and dimensions. Metrics are things like page views, visits and organic searches; dimensions include information like traffic sources and visitor type. (See Google's Dimensions & Metrics Reference for full details.)

In addition, you can focus your query by criteria like visits from search, visits with conversions (assuming you've set that up in Google Analytics beforehand) and even visits just from tablets, by including segments in a query. Finally, you can also create your own filters to narrow your results.

Google has created a Query Explorer for the Google Analytics API. It's a great resource to help you figure out what data is available and how to structure a query. If you're new to the Google Analytics API, play around with Query Explorer for a bit to see what data you can extract and the variables you need to pull the data you want. Further information on the terms to use for various queries is available in the API documentation.

Once you decide on what you'd like to include in your query, here's the syntax for using R to get the data:
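A sketch of the general shape of that call, assuming rga's getData method accepts the argument names shown here (check them against the package documentation). The run_query wrapper is a hypothetical name, not part of rga; it is there only so the template can sit in a script without running until you call it. The empty strings are the blanks to fill in.

```r
# Hypothetical wrapper around rga's getData; not part of the package.
# `ga` is the object created by rga.open() and `id` is your profile ID.
run_query <- function(ga, id) {
  ga$getData(
    id,
    start.date = "",    # first day to include, as yyyy-mm-dd
    end.date   = "",    # last day to include, as yyyy-mm-dd
    metrics    = "",    # e.g. "ga:visits"
    dimensions = "",    # e.g. "ga:source"
    sort       = "",    # optional: field to sort by
    filters    = "",    # optional: filter expression
    segment    = "",    # optional: segment to apply
    start      = 1,     # index of the first row to return
    max        = 1000   # maximum number of rows to return
  )
}

# After filling in the blanks:
# myresults <- run_query(ga, id)
```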

You fill in information for your specific query between the various quotation marks, of course. Note that dates are in the format yyyy-mm-dd, such as "2013-10-30."

Here's a specific example: Say I want to see the top ten referrers for visits to my site in September. My start date is September 1 and my end date is September 30. My metric is visits -- called "ga:visits" by the API -- and my dimension is their sources -- called "ga:source."
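That example might be written along the lines of the sketch below. It assumes rga's getData method takes the argument names shown, and it relies on the Google Analytics API convention that a leading minus sign in sort means descending order. The top_referrers helper and its parameters are hypothetical names used for illustration.

```r
# Hypothetical helper: the n sources that sent the most visits in a date range.
# `ga` is the object created by rga.open() and `id` is your profile ID.
top_referrers <- function(ga, id, start, end, n = 10) {
  ga$getData(
    id,
    start.date = start,
    end.date   = end,
    metrics    = "ga:visits",    # what to count
    dimensions = "ga:source",    # how to break it down
    sort       = "-ga:visits",   # most visits first
    start      = 1,
    max        = n               # keep only the top n rows
  )
}

# The year here is assumed for illustration:
# myresults <- top_referrers(ga, id, "2013-09-01", "2013-09-30")
```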

I used =~ rather than == because the latter would limit the filter to referrals whose source exactly equals news.google.com. The =~ operator instead uses regular expression matching, which in this case matches any source containing news.google.com. (Regular expressions allow much more flexible pattern searching.)
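To get a feel for the difference, you can mimic the two operators locally in base R. This sketch uses made-up source names; == is R's exact comparison and grepl() stands in for the API's regular expression match. The two filter strings at the end show how each version would be written for the API.

```r
# Made-up referral sources, for illustration only
sources <- c("news.google.com", "news.google.com.au",
             "google.com", "m.news.google.com")

exact <- sources == "news.google.com"        # behaves like the == filter
regex <- grepl("news.google.com", sources)   # behaves like the =~ filter

sources[exact]   # only the source that is exactly news.google.com
sources[regex]   # every source that contains news.google.com

# In a regular expression an unescaped dot matches any character;
# write "news\\.google\\.com" in R if you want literal dots.

# The corresponding filter strings for the API query
filter_exact <- "ga:source==news.google.com"
filter_regex <- "ga:source=~news.google.com"
```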

As before, for each of these queries, type

myresults

(or the appropriate results variable) at the prompt in your R window to see what's returned.

The query has been refined to show the visits that came from Google News each month for a year.

Step 4: Manipulate your data

Now that you've got your data, what can you do with it?

If you're not an R enthusiast, the easiest thing is to save the results to a CSV file. R's write.csv() function takes the object you want to save first and then the file name. To save the myresults variable to a file called data.csv, type:

write.csv(myresults, file="data.csv", row.names=FALSE)

The optional row.names=FALSE eliminates an extra column with the row numbers, just to keep the file uncluttered. The resulting file looks something like this (but hopefully with many more visits):
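To illustrate the shape of such a file, here is a small sketch with made-up numbers. The column names source and visits are assumptions about how the results are labeled, and the file is written to R's temporary directory so nothing in your working folder is overwritten.

```r
# Made-up results standing in for what a query might return
myresults <- data.frame(
  source = c("google", "(direct)", "news.google.com"),
  visits = c(42, 17, 5),
  stringsAsFactors = FALSE
)

# Write to a temporary location, then print the raw file contents
csv_path <- file.path(tempdir(), "data.csv")
write.csv(myresults, file = csv_path, row.names = FALSE)
cat(readLines(csv_path), sep = "\n")
# "source","visits"
# "google",42
# "(direct)",17
# "news.google.com",5
```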

You can then use that data in the spreadsheet or graphing program of your choice.

You can also analyze your data right within R, of course, without exporting to a spreadsheet. Let me first pull some real data -- visits and page views -- from a personal site I set up years ago that I no longer tend to but that still gets occasional visitors:
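A query of that kind might be sketched as follows, assuming rga's getData method takes the argument names shown, accepts several metrics as one comma-separated string, and can group results by the ga:month dimension. The monthly_traffic helper is a hypothetical name, and the dates are left as parameters for you to supply.

```r
# Hypothetical helper: visits and page views for each month in a date range.
# `ga` is the object created by rga.open() and `id` is your profile ID.
monthly_traffic <- function(ga, id, start, end) {
  ga$getData(
    id,
    start.date = start,                     # yyyy-mm-dd
    end.date   = end,                       # yyyy-mm-dd
    metrics    = "ga:visits,ga:pageviews",  # two metrics in one query
    dimensions = "ga:month"                 # one row per month
  )
}

# mydata <- monthly_traffic(ga, id, "yyyy-mm-dd", "yyyy-mm-dd")
```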

You can use R's str() function to find out how the mydata object is structured.
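Here is a minimal sketch of str() in use. The data frame is made up and stands in for real query results; the column names month, visits and pageviews are assumptions.

```r
# Made-up stand-in for the data returned by a query
mydata <- data.frame(
  month     = c("01", "02", "03"),   # month numbers stored as text
  visits    = c(120, 95, 143),
  pageviews = c(310, 240, 402),
  stringsAsFactors = FALSE
)

# Print the number of rows and columns, plus the type and
# first few values of each column
str(mydata)
```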

Like the other results above, it's an R data frame with character strings as the month number and numbers for the data. That makes it easy to run simple analyses and generate basic graphs within R, such as a bar chart showing the number of visits to a site for each month.
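A bar chart along those lines can be sketched with base R's barplot(). The numbers below are made up, and the title and axis label are placeholders; only the choice of arguments is meant to be illustrative.

```r
# Made-up monthly visit counts for nine months
mydata <- data.frame(
  month  = sprintf("%02d", 1:9),   # "01" through "09", stored as text
  visits = c(120, 95, 143, 110, 87, 132, 150, 98, 121),
  stringsAsFactors = FALSE
)

barplot(
  mydata$visits,               # bar heights (y axis)
  names.arg = mydata$month,    # labels under each bar (x axis)
  main = "Visits by month",    # graph title
  xlab = "Month",              # x-axis label
  col  = rainbow(9),           # nine colors from the rainbow palette
  las  = 1                     # draw axis labels horizontally
)
```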

The R barplot() command above uses the number of visits for the graph's y axis values (you can refer to a specific column in a data frame with the syntax dataframename$columnname), and names.arg supplies the labels on the x axis. The main argument specifies the graph title, xlab is the x-axis label and col=rainbow(9) tells R to choose nine colors from its rainbow palette to color the bars. The nonintuitive argument las=1 tells R to set both the x- and y-axis labels horizontally (0 makes them parallel to the axis, 2 perpendicular to the axis, and 3 vertical).

Google Analytics is a powerful tool, but the Web interface is not always easy to navigate. If you'd like more customizable tools to extract data -- and easier automation of data requests -- consider using a programmatic approach with the Google Analytics API. And if you don't already have a favorite language for API work, R is a good choice.

Copyright 2019 IDG Communications. ABN 14 001 592 650. All rights reserved. Reproduction in whole or in part in any form or medium without express written permission of IDG Communications is prohibited.