Why do you need BigQuery within web analytics?

If you have access to Google Analytics 360, you have a great opportunity to easily connect your account to Google BigQuery – Google’s serverless data warehouse. This is one of the biggest advantages of GA360 – you can export ALL your Analytics data in near real-time. Then you can query tremendous sets to get better insights about your data.

Benefits of integration with BigQuery

Google Analytics, even an enterprise version – Analytics 360, has some thresholds of data sampling. Theoretically, the analysis of some subset of data gives you an insight of the most important features about the whole set of the data. But sometimes you want to have a bigger picture of the whole set. Then it is possible with the use of connection Google Analytics 360 to BigQuery. That allows you to export all, unsampled data in near-real time. Moreover, after setting up the linking to BigQuery, you can retrieve your historical data. BigQuery provides the export of the smaller of 13 months or 10 billion hits from one view (so in that case that the view has less than 10 billion hits, the export include all the data from the beginning). The first data appear within 24 hours, the completement may take up to four weeks. After that you can start to query massive sets of historical data and get very fast response. Additionally, due to the flexibility of created queries, you can construct completely new metrics or even whole reports.

About BigQuery

BigQuery is fully managed, large-scale data warehouse, that is one of the Google Cloud Platform components intended to process big data sets. With the use of intuitive SQL dialects (standard or legacy SQL) it enables to query terabytes of data in just a few seconds.

Important information

If you want to combine your Google Analytics 360 account to the BigQuery, there are some important issues you should be aware of.

Pricing

Some actions are burdened with certain costs. Operations like queries, storage or inserts are paid, while loading, copying or exporting data are free of charge. However, you should not worry about them, they are quite low. Below there are monthly costs of operations:

Querying – the first 1TB is free, later $5 per 1TB of data

Storage – the first 10GB is free, later $0.02 per 1GB of data.

Moreover, for customers of Google Analytics 360, there is granted a credit of $500-per-month for the use of BigQuery. However, it is hard to predict how much operations there will be done in BigQuery within one month and if the credit will be utilized or not. Hence, it is obligatory to have a valid billing account in the Google Cloud Platform.

Notice: If the payment becomes invalid while exporting data, there is no more chance to re-export them for that time period.

Format of the data

Analytics data are transferred to BigQuery in the form of tables, that are separate for each day. There is established the format and schema of the data that is imported to BigQuery. Each row of the table refers to one session from the Google Analytics 360. While within one session may occur multiple hits, visited pages, etc., tables may have nested and repeated fields. If you want to query such table, you should familiarize yourself with the relevant functions like FLATTEN or WITHIN.

In the case of nested tables, they cannot be downloaded or exported as the CSV file, you can only do it using JSON format.

Calculation of metrics

Google Analytics data, imported to BigQuery are in the form of structured database. In BigQuery you do not see metricsch as total sessions, unique users, total events, etc., that are available in the Analytics interface. You can get them using simple SQL queries. There is a lot of websites, where you can check how to query basic metrics, e.g. here.

Maybe the presented method of calculating main statistics of the Google Analytics data is not the easiest one. However, the ability to decide what you want to calculate is a huge advantage. Due to the high flexibility of created queries, you can create various new metrics, that are not available in the GA interface.

Export data to file

If you would like to download some of your Analytics data from BigQuery, first you have to transfer it to the Google Cloud Storage. Data are stored in buckets – so if you haven’t create your bucket yet, do it by following that official guide. Once you have your data in the Cloud Storage, you may download or export it anytime.

Setting up the connection to BigQuery

Before you set up a connection between Google Analytics 360 and BigQuery, remember about important issues:

Check your Billing account: ensure that it is enabled for the project which you use. To get a mentioned credit of $500-per-month, go to the cloud.google.com/redeem and enter your code received via email from your Account Manager

Ensure the service account (analytics-processing-dev@system.gserviceaccount.com) has EDIT permissions. You can check it under IAM & Admin. Otherwise, you won’t be able to export data. Here you can also grant some permissions like view or edit to other people.

Now you can combine GA360 and BigQuery. To set up it correctly, follow the instruction of Analytics Help experts.

First data should be available within 24 hours. While waiting, I encourage you to look at the publicly available in BigQuery Analytics data from Google Merchandise Store. You will find it under `bigquery-public-data`.

There is data from the time interval of one year – each day in new table. Under Schema, Details and Preview you can find the basic information about tables and get familiar with nesting.

Think what information would you like to get from these datasets and start with querying that set.