User-level pathing is a useful tool in the analytics toolkit for understanding your customers' complete journey, from the moment they enter to the moment they leave. Unfortunately, Google Analytics offers only very limited user-level pathing out of the box. With a little bit of tweaking, however, we can get some pretty solid user journey pathing reports.

This is a two-part series where I'll cover the following topics:

Setting up Google Analytics to get every page path that a user touches along with the timestamp

Using Python to ingest the GA page pathing data and aggregate user journeys to understand which ones are the most common. Bonus: understanding which pages contribute to conversions in the journey!

End Result

Let's start with what we want out of this and work backwards. Ideally we would have a report that lets us see, at the user level, each page they visited and whether they converted or not.

user_path | occurrences
homepage > exit | 102
homepage > category_a > exit | 85
homepage > category_a > product_b | 25

In order to derive the above table we need a few things. We need the timestamp of each pageview of each user. Next we need to sequence each page by timestamp for each user. In order to do this we need to create two custom dimensions: Cookie ID and Timestamp.
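To make the sequencing idea concrete, here is a minimal Python sketch of how the table above could be derived once each pageview carries a cookie ID and a timestamp. The function name `aggregate_paths`, the tuple layout, and the sample hits are all made up for illustration, and the sketch does not append an "exit" marker to the end of each path.

```python
from collections import Counter, defaultdict


def aggregate_paths(hits):
    """Group hits by cookie ID, order them by timestamp, and count each distinct path.

    `hits` is an iterable of (cookie_id, timestamp_ms, page) tuples.
    This layout is an assumption for illustration, not a GA export format.
    """
    pages_by_user = defaultdict(list)
    for cookie_id, timestamp_ms, page in hits:
        pages_by_user[cookie_id].append((timestamp_ms, page))

    path_counts = Counter()
    for user_hits in pages_by_user.values():
        # Sequence each user's pages by timestamp, oldest first.
        user_hits.sort(key=lambda hit: hit[0])
        path = " > ".join(page for _, page in user_hits)
        path_counts[path] += 1
    return path_counts


# Made-up sample data: three users, deliberately out of order.
sample_hits = [
    ("user_1", 1600000002000, "category_a"),
    ("user_1", 1600000001000, "homepage"),
    ("user_2", 1600000005000, "homepage"),
    ("user_3", 1600000009000, "category_a"),
    ("user_3", 1600000007000, "homepage"),
]

for path, occurrences in aggregate_paths(sample_hits).most_common():
    print(path, occurrences)
```

Sorting within each user before joining is what makes the millisecond timestamp necessary: without it, two pageviews from the same user cannot be reliably ordered.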

Step 1

Go to Admin > Property > Custom Dimensions and create two new custom dimensions: one for Timestamp (MS) and one for CookieID. Remember the dimension IDs.

Step 2

Go to GTM and create three new variables. The trigger for each of these should be a pageview so they fire on each and every page the user visits.

Now you have the base setup finished. After publishing your GTM container, you should start to see CookieIDs and timestamps populate for your users. You will have to wait 24 hours before you see anything, so don't hit refresh over and over again.

Step 4

The easiest way to see this data is to create a custom report. This will give you the raw list of each user's pageviews along with a millisecond timestamp associated with each one.
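As a rough idea of what that raw list looks like once it leaves GA, the sketch below reads a CSV export into Python and converts the millisecond timestamp into a readable date. The column names (`CookieID`, `Timestamp (MS)`, `Page`) and the sample rows are assumptions for illustration. A real export will use whatever names you gave your dimensions and may include extra header lines that need to be skipped first.

```python
import csv
import io
from datetime import datetime, timezone


def parse_report(csv_text):
    """Turn exported rows into (cookie_id, timestamp_ms, page) tuples.

    The column names below are hypothetical; match them to your own report.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows.append((row["CookieID"], int(row["Timestamp (MS)"]), row["Page"]))
    return rows


def ms_to_utc(timestamp_ms):
    """Convert a millisecond Unix timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


# Made-up export with two pageviews from the same cookie.
sample_export = """CookieID,Timestamp (MS),Page
abc.123,1600000001000,/
abc.123,1600000002500,/category_a
"""

hits = parse_report(sample_export)
for cookie_id, timestamp_ms, page in hits:
    print(cookie_id, ms_to_utc(timestamp_ms).isoformat(), page)
```

Keeping the timestamp as an integer makes sorting cheap; the datetime conversion is only there for reading the data by eye.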

You now have everything you need in order to start manipulating and pathing user data across your websites. Stay tuned for part 2, where we'll be going over how to import this data into Python in order to derive the table at the top of the article and start to analyze the most popular aggregate page paths.