Why are the counts different from other tools?

Question

Why is the number of visitors/impressions and/or the number of events different when I compare Split to other tools?

Answer

Most companies use a variety of tools to count and track user visits and interactions on their web and mobile applications. While it can be helpful to compare the numbers between, say, Split and Google Analytics, differences between them are neither unusual nor unexpected. There are several reasons for this:

Sampling and Configuration Settings

Sampling: Tools such as Google Analytics sample data in common configurations, reporting on a representative subset of the data instead of the entire data set. This can produce a dramatic difference in the numbers you see when comparing tools. Google Analytics will warn you if the data is being sampled; you can adjust the precision/speed trade-off or reduce the time frame you are querying to avoid sampling altogether.

Filtering: Many tools let you set filtering criteria to include or exclude specific traffic, such as blocking internal traffic, spam, and bots, or excluding certain time ranges or IP addresses. Make sure to use the same filtering logic across all tools, or at least account for the differences.
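For example, a shared inclusion predicate applied before counting keeps the filtering logic identical everywhere. This is a minimal sketch; the event shape, IP prefixes, and bot signatures below are illustrative, not taken from any particular tool:

```typescript
// Hypothetical event shape; adjust to match your own tracking schema.
interface TrackedEvent {
  userAgent: string;
  ipAddress: string;
}

// Example internal IP prefixes and bot signatures; these are placeholders.
const INTERNAL_IP_PREFIXES = ["10.", "192.168."];
const BOT_SIGNATURES = [/bot/i, /spider/i, /crawler/i];

// Use the same predicate in every pipeline that counts traffic so all
// tools include and exclude exactly the same events.
function includeInCounts(event: TrackedEvent): boolean {
  if (INTERNAL_IP_PREFIXES.some((p) => event.ipAddress.startsWith(p))) {
    return false; // exclude internal traffic
  }
  if (BOT_SIGNATURES.some((re) => re.test(event.userAgent))) {
    return false; // exclude known bots and crawlers
  }
  return true;
}
```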

Time Zones and Time Windows: Some analytics tools use a time zone based on the user's location, while others default to UTC or another configured time zone. This affects the day boundary for reports. Also, the start time of an experiment may not line up neatly with the reporting windows of another tool. Make sure you are looking at the same window of time when comparing data.
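To see why the day boundary matters, here is a small sketch (plain TypeScript, no SDK involved) showing the same instant falling on different report days in different time zones:

```typescript
// One instant, two report days, depending on the tool's time zone.
const instant = new Date("2019-04-04T02:30:00Z");

// The "en-CA" locale formats dates as YYYY-MM-DD.
function reportDay(d: Date, timeZone: string): string {
  return new Intl.DateTimeFormat("en-CA", { timeZone }).format(d);
}

console.log(reportDay(instant, "UTC"));                 // "2019-04-04"
console.log(reportDay(instant, "America/Los_Angeles")); // "2019-04-03"
```

A tool reporting in UTC counts this event on April 4, while a tool using the user's Pacific time zone counts it on April 3.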

Attribution and Exclusion

Because tools use different attribution logic, it's not uncommon for conversion rates to differ by 10-15%. It's important to understand how each tool handles things like omni-channel conversions. For example, a user may get an impression/treatment on one device, perhaps an ad on a phone, and then convert (or perform some other tracked action) from a desktop browser.

When experimenting, Split excludes users that might pollute the results of a test, for example, a user who flips between multiple treatments within the same version of an experiment. The incidence of users changing treatments is usually insignificant, if it happens at all, but there are cases where a test design could cause many users to flip.

Details on Split's attribution logic, and on how users who change treatments and/or rules within a version are handled, can be found in the Attribution and exclusion article in the Help Center.

Client-Side Data Collection

How tracking is implemented on the browser or mobile device can impact data collection, a problem exacerbated by the relative lack of control over user interaction. Abruptly closing a browser window or a mobile app can prevent queued data from ever being sent.
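One partial mitigation is to flush queued data when the page is going away. Below is a minimal sketch, assuming the Split JavaScript SDK's client.destroy() method, which flushes queued impressions and events as it shuts the client down; the keys and event wiring here are illustrative:

```typescript
import { SplitFactory } from "@splitsoftware/splitio";

// Illustrative setup; use your own SDK key and user key.
const factory = SplitFactory({
  core: { authorizationKey: "YOUR_SDK_KEY", key: "user-123" },
});
const client = factory.client();

// "pagehide" fires more reliably than "unload" on modern and mobile
// browsers, but no client-side hook is guaranteed if the app is
// killed abruptly; some data loss is always possible.
window.addEventListener("pagehide", () => {
  client.destroy(); // sends queued impressions/events, then shuts down
});
```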

Also, content blockers are becoming more common as users seek to avoid ads and as attention to privacy grows. These blockers can impact a wide range of client-side trackers, not just ads; depending on what's blocked, they can cause differences between the results returned by various analytics tools.

Server-side splits: One way to mitigate the issues posed by the lack of control over the end user's environment, and to create greater consistency in the user experience, is to move splits to a back-end service for evaluation.

Be aware that moving splits to the back end can actually widen the count difference when content blockers come into play: client-side content blocking doesn't affect server-side splits, so Split may record traffic that a blocked client-side tool never sees.
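For illustration, here is a minimal server-side evaluation sketch, assuming Split's Node.js SDK; the flag name new-checkout is hypothetical:

```typescript
import { SplitFactory } from "@splitsoftware/splitio";

// Server-side SDKs evaluate splits out of reach of content blockers
// and abrupt page closes. Use a server-side SDK key here.
const factory = SplitFactory({
  core: { authorizationKey: "YOUR_SERVER_SIDE_SDK_KEY" },
});
const client = factory.client();

async function checkoutTreatment(userId: string): Promise<string> {
  await client.ready(); // wait until split definitions are loaded
  return client.getTreatment(userId, "new-checkout");
}
```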

When using the JavaScript or mobile SDKs, configuration options (such as these for JavaScript) can be tuned to capture the greatest possible number of impressions and/or events. In particular, lowering the refresh rates, which control how often queued impressions and events are pushed to Split, reduces the amount of data lost when a session ends abruptly.
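As a sketch, tightening the JavaScript SDK's push cadence might look like the following; the option names come from the SDK's scheduler settings, but the values are illustrative rather than recommendations (lower rates mean more frequent network requests):

```typescript
import { SplitFactory } from "@splitsoftware/splitio";

const factory = SplitFactory({
  core: { authorizationKey: "YOUR_SDK_KEY", key: "user-123" },
  scheduler: {
    impressionsRefreshRate: 30, // push queued impressions every 30s
    eventsPushRate: 30,         // push queued events every 30s
    eventsQueueSize: 100,       // or push as soon as 100 events queue up
  },
});
```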

A number of articles in the Help Center describe why impression counts in Split may be missing or incorrect, and how to avoid some of these issues.

Split has robust data pipelines and attribution logic. If you do find a mismatch in numbers that is greater than the expected variance between tools, we're happy to work with you to surface the reasons.