Tag Management: data layer/DOM-scraping pros & cons

Is the data layer the be-all and end-all of tag management? In this article, we explore the benefits and caveats of using a data layer versus DOM-scraping techniques.

As Justin Cutroni puts it, “…a data layer is a JavaScript variable or object that holds all the information you want to collect in some other tool, like a web analytics tool. You add the data layer to every page on your site and then have the container pull data from the data layer… This insulates the data collection tools by separating the data from the page structure. No matter what happens to the structure of the HTML page the data layer will always be the same.”
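To make that definition concrete, here is a minimal sketch of what such an object could look like in TypeScript. Every name in it (`PageDataLayer`, `pageCategory`, `userType`, `readDataLayer`) is hypothetical and chosen for illustration; it does not reflect any particular vendor's format.

```typescript
// Hypothetical data layer shape; field names are illustrative only.
interface PageDataLayer {
  pageCategory: string;
  userType: "anonymous" | "registered";
  transaction?: { id: string; total: number };
}

// The page (typically the server-side template) populates this object.
const dataLayer: PageDataLayer = {
  pageCategory: "product-detail",
  userType: "registered",
};

// A tag container reads values by key and never inspects the HTML structure.
function readDataLayer<K extends keyof PageDataLayer>(
  layer: PageDataLayer,
  key: K
): PageDataLayer[K] {
  return layer[key];
}

const category = readDataLayer(dataLayer, "pageCategory");
```

Because the container only depends on the keys, the markup around it can change freely as long as the object keeps the same shape.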

The concept is brilliant and simple enough! But is there something looming in the darkest corner of your perfect implementation? You bet there is!

“The formulation of the problem is often more essential than its solution, which may be merely a matter of mathematical or experimental skill.” – Albert Einstein

Data Layer aficionados

Pros

Your data layer specification is an integral part of the implementation strategy and, as such, it formally describes the key-value pairs and the acceptable domain of values for each.

Formalizing the data structure makes for a cleaner, more consistent data collection process regardless of the presentation layer.

The resulting implementation is simpler since there is no need to do DOM traversal to get the element you want.
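One way to picture a specification that describes keys and their acceptable domain of values is as a small set of rules checked against the data layer. The sketch below is an assumption about how that could be done, not a feature of any TMS; the keys, allowed values, and the `validateDataLayer` helper are all hypothetical.

```typescript
// Hypothetical specification: each key lists its acceptable values,
// or supplies a predicate when the domain is open-ended.
type Rule = readonly string[] | ((value: unknown) => boolean);

const spec: Record<string, Rule> = {
  pageCategory: ["home", "product-detail", "checkout"],
  userType: ["anonymous", "registered"],
  orderTotal: (v: unknown) => typeof v === "number" && v >= 0,
};

// Returns human-readable violations; an empty list means the layer conforms.
// In this sketch every key in the specification is treated as required.
function validateDataLayer(
  layer: Record<string, unknown>,
  rules: Record<string, Rule>
): string[] {
  const errors: string[] = [];
  for (const [key, rule] of Object.entries(rules)) {
    if (!(key in layer)) {
      errors.push(`missing key: ${key}`);
      continue;
    }
    const value = layer[key];
    const ok =
      typeof rule === "function"
        ? rule(value)
        : typeof value === "string" && rule.includes(value);
    if (!ok) errors.push(`invalid value for ${key}: ${String(value)}`);
  }
  return errors;
}
```

A check like this could run in a test suite or in a debug build of the container, so a broken data layer is noticed before the reports are.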

Cons

“Standardization” and “independence” are fallacies often raised by paid TMS vendors. Presumably, once you have defined the data layer you could reuse it with any TMS. The reality is that each vendor implements the data layer concept in a slightly different way. So until the W3C Customer Experience Digital Data proposed standard is adopted, your data layer will be specific to you and the TMS you use.

What makes a good implementation is the rigour in the process, not the supporting format of its output – be it a data layer or HTML5 data attributes. Since you have to define a taxonomy anyway, you might as well define HTML5-compliant data attributes. Most likely, when your presentation layer changes, the data points you want to collect will change too.

Extensive use of the data layer requires you to go back to your web development team and ask them to populate exactly what is required. After those updates, if you find anything wrong you have to go back and ask for another round of changes. There’s also the likelihood that something will eventually change and someone will “forget” about the data layer and break it, which essentially brings you back to square one – being dependent on IT.

DOM-scraping master

Pros

If we consider that a web page is a structured document, we can “scrape” any of its elements by using standard JavaScript and knowledge of the DOM (Document Object Model). In fact, all tag management solutions include features to retrieve specific elements based on their tag, class, id or attributes.

You gain a lot of flexibility and agility – if it’s on the page, you can track it – without waiting on your web development team to expose it through the data layer.
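As a rough illustration of scraping with nothing more than standard DOM calls, the sketch below reads an element's text or attribute through `querySelector`. The helper names (`scrapeText`, `scrapeAttribute`) and the small `ElementLike`/`RootLike` types are assumptions made so the example also runs outside a browser; the browser's own `document` is expected to satisfy them.

```typescript
// Minimal structural types; the browser's `document` and `Element`
// provide these members, so either can be passed in directly.
interface ElementLike {
  textContent: string | null;
  getAttribute(name: string): string | null;
}
interface RootLike {
  querySelector(selector: string): ElementLike | null;
}

// Scrape the trimmed text of the first element matching a CSS selector.
// Returns undefined when nothing matches or the text is empty.
function scrapeText(root: RootLike, selector: string): string | undefined {
  const text = root.querySelector(selector)?.textContent?.trim();
  return text ? text : undefined;
}

// Scrape an attribute value, such as `href` or a `data-*` attribute.
// Returns undefined when the element or the attribute is absent.
function scrapeAttribute(
  root: RootLike,
  selector: string,
  attribute: string
): string | undefined {
  return root.querySelector(selector)?.getAttribute(attribute) ?? undefined;
}
```

In a browser, a call might look like `scrapeText(document, "h1.product-title")`, where the selector is hypothetical and tied to the page's current markup. That coupling is exactly the trade-off: the selector keeps working only as long as the markup does.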

Cons

This is a somewhat dirtier approach that puts more responsibility and control on the tag management side, but it alleviates most dependencies on IT. I say “most” because if a data element isn’t on the page, you still have to find a way to get it there (i.e. talk to your friend in IT).

This approach requires better knowledge of JavaScript and DOM concepts – which are front-end skills, rather than the back-end web development skills typically required to populate the data layer. Marketing usually has more control over the front end and far less over the back end.

An optimal and realistic approach

A data layer has the benefit of decoupling the presentation layer from the data collection needs, but it also makes those data elements more “disconnected” from their context. For aspects of the website that deal with IT systems (such as a purchase confirmation), a rigorous data layer approach should be preferred. For general front-end tracking such as content category, a data attribute approach might be easier and more flexible.
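A minimal sketch of that split, assuming a hypothetical `data-content-category` attribute on the page and a hypothetical `transaction` entry in the data layer, could look like this. The `buildPayload` helper and all type names are illustrative only.

```typescript
// Hypothetical names throughout: `TransactionLayer`, `data-content-category`.
interface TransactionLayer {
  transaction?: { id: string; total: number };
}
// Anything exposing getAttribute works here, e.g. `document.body` in a browser.
interface AttributeSource {
  getAttribute(name: string): string | null;
}
interface TrackingPayload {
  contentCategory?: string;
  orderId?: string;
  orderTotal?: number;
}

// Back-end facts come from the data layer; front-end context comes
// from a data attribute on the page.
function buildPayload(
  layer: TransactionLayer,
  body: AttributeSource
): TrackingPayload {
  const payload: TrackingPayload = {};
  const contentCategory = body.getAttribute("data-content-category");
  if (contentCategory) payload.contentCategory = contentCategory;
  if (layer.transaction) {
    payload.orderId = layer.transaction.id;
    payload.orderTotal = layer.transaction.total;
  }
  return payload;
}
```

The purchase data stays under the control of the system that knows it is correct, while the content category can be adjusted by whoever maintains the templates.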

In the end, I don’t think there’s a clear-cut case for one approach over the other. My philosophy, akin to “user experience comes first”, is “instrumentation adapts to the web, not the other way around”, so I often use a DOM-scraping approach. Once the TMS bootstrap/container is deployed (and through some techniques, even when it’s not), it allows me to speed up the implementation process and have complete control and agility to make adjustments without bugging our friends in web development.

What’s your experience? Have you thought about the differences in those approaches?