Friday, December 21, 2012

There are many ways to analyse your Google Analytics data. In this post I use CSV data exported from the GA Traffic Sources report to try out Incanter, Clojure and MongoDB. This post is part of the Loganis GAckup tools. All Clojure code here can be used under the same license as Clojure itself, the Eclipse Public License 1.0.

Preparing data

Log into Google Analytics (GA) and go to Traffic Sources - All Traffic. Select the date range and any other options you need, such as Show rows. Then click Export - CSV above the Explorer tab.

core.clj

Edit the src/mediana/core.clj file. We will use Incanter interactively in an Emacs swank-clojure REPL session, started with M-x clojure-jack-in. If you do not have a configured Emacs environment, try Emacs-Live, which is easy to set up. In this post the GA data lives in a database named c5. We use one namespace and three clj files: core.clj loads util.clj and desktop.clj, util.clj contains helper functions, and desktop.clj contains nopfn, a "No Operation" function whose body is a place to evaluate Clojure code from the REPL.

util.clj

Here I have prepared some helper functions. iof: index of an item in a collection; sreplace: string replace; parse-int: string to integer; avg: average of a row; p2n: percent to number; s2n: string to number; t2s: time string to seconds.
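To make the list above concrete, here is a minimal sketch of what such helpers could look like. Only the function names come from this post; the argument order, the handling of thousands separators, and the HH:MM:SS time format are my assumptions, so the real util.clj may differ.

```clojure
(require '[clojure.string :as str])

;; Sketch only: names follow the post, bodies are assumed.

(defn iof
  "Index of the first occurrence of item in coll, or nil if absent."
  [coll item]
  (first (keep-indexed (fn [i x] (when (= x item) i)) coll)))

(defn sreplace
  "Replace every literal occurrence of from in s with to."
  [s from to]
  (str/replace s from to))

(defn parse-int
  "Parse an integer string, dropping thousands separators (assumed ',')."
  [s]
  (Long/parseLong (str/replace (str/trim s) "," "")))

(defn avg
  "Arithmetic mean of a row of numbers, or nil for an empty row."
  [row]
  (when (seq row)
    (/ (reduce + row) (double (count row)))))

(defn p2n
  "Percent string to number, dropping the trailing % sign."
  [s]
  (Double/parseDouble (str/replace (str/trim s) "%" "")))

(defn t2s
  "Time string in HH:MM:SS form (assumed) to seconds."
  [s]
  (let [[h m sec] (map #(Long/parseLong %) (str/split (str/trim s) #":"))]
    (+ (* 3600 h) (* 60 m) sec)))

(defn s2n
  "String to number, picking a parser from the shape of the string."
  [s]
  (let [^String s (str/trim s)]
    (cond
      (.endsWith s "%")                 (p2n s)
      (re-matches #"\d+:\d{2}:\d{2}" s) (t2s s)
      (re-matches #"-?[\d,]+" s)        (parse-int s)
      :else (Double/parseDouble (str/replace s "," "")))))
```

In the REPL, (s2n "00:01:23") and (t2s "00:01:23") should agree, since s2n simply dispatches to the more specific helpers.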

util.clj init db

See the comments in the source for details. First we call the 00import2mongo.sh script, which cleans the GA CSV and imports it into the m1 collection of the c5 database. Then we use pmap to process the imported collection in parallel into an m2 collection that holds only integer and float values.
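The parallel conversion step can be sketched without a database. In the snippet below plain vectors of maps stand in for the m1 and m2 collections, and the field names (:source, :visits, :bounce-rate) and the ->number converter are hypothetical. I also assume that values which do not look numeric are left unchanged; the actual code may treat them differently.

```clojure
(require '[clojure.string :as str])

(defn ->number
  "Convert a numeric-looking string to a Long or Double; return anything else as is."
  [v]
  (if-not (string? v)
    v
    (let [s (str/replace (str/trim v) "," "")]
      (cond
        (re-matches #"-?\d+" s)            (Long/parseLong s)
        (re-matches #"-?\d+(\.\d+)?%?" s)  (Double/parseDouble (str/replace s "%" ""))
        :else v))))

(defn numeric-row
  "Apply ->number to every value of one document (a map)."
  [row]
  (into {} (map (fn [[k v]] [k (->number v)]) row)))

(defn numeric-collection
  "Convert all rows in parallel. doall forces the lazy seq that pmap returns."
  [rows]
  (doall (pmap numeric-row rows)))

;; Stand-in for the imported m1 collection (made-up rows, for illustration only).
(def m1-sample
  [{:source "google" :visits "1,234" :bounce-rate "45.5%"}
   {:source "direct" :visits "87"    :bounce-rate "12%"}])

(def m2-sample (numeric-collection m1-sample))
```

pmap only pays off when the per-row work is heavier than the coordination overhead, so for small exports a plain map may be just as fast.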