At last, the fourth and final installment of the Analytics test series: the regular expression portion. While the Google Analytics test might seem daunting, with a little studying it really isn’t bad at all. Yesterday, Jessica covered event tracking. Previously, Erin covered e-commerce tracking, and Steve covered the different types of Google Analytics cookies and how each is used to track activity. In this post, we’ll cover the function of each regular expression character, along with examples.

Regex can be a lot to take in, but when it’s broken out in simplified form, it’s not half as bad as its reputation would lead you to believe. (I hope you’ll agree!) Regex was a fairly small portion of the Analytics test, with maybe 3-4 questions, but it is a pretty handy skill to maintain even if you aren’t worried about taking the test.

Regex can be used for several things within Google Analytics, such as:

Setting up goal funnels

Tracking equivalent pages

Filtering data within reports

Profile filters

Finally, without further ado:

Regex Character Guide

Regex Wildcards

. is a wildcard for any single character.

Act . matches Act 1, Act 2, Act 3, etc. but does not fully match Act 10, because the dot stands for exactly one character

Act .. matches Act 10, Act 11, Act 12, etc.

Note: If you want to use a period as an actual period, not as a wildcard, you’ll need to put a backslash before it as an escape character. You would want to do this when excluding an IP address in Analytics. The same rule applies to question marks, which also have a special meaning in regular expressions, as we’ll discuss later.

U\.S\. matches U.S.

163\.212\.171\.123 matches the IP address 163.212.171.123
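To see why the backslash matters, here is a minimal sketch that uses Python's re module as a stand-in for the Analytics regex engine (an assumption; the two are not identical, but they agree on these basics). The helper name matches_whole is made up for this illustration, and re.fullmatch is used so that the pattern has to cover the entire string.

```python
import re

def matches_whole(pattern: str, text: str) -> bool:
    """Return True only if the pattern covers the entire string."""
    return re.fullmatch(pattern, text) is not None

# One dot stands for exactly one character.
dot_results = {
    "Act . vs Act 1": matches_whole(r"Act .", "Act 1"),
    "Act . vs Act 10": matches_whole(r"Act .", "Act 10"),
    "Act .. vs Act 10": matches_whole(r"Act ..", "Act 10"),
}

# Unescaped dots are wildcards, so this pattern also accepts a string
# that is not an IP address at all.
loose_ip = matches_whole(r"163.212.171.123", "163x212y171z123")

# Escaped dots only accept literal periods.
strict_ip = matches_whole(r"163\.212\.171\.123", "163x212y171z123")
exact_ip = matches_whole(r"163\.212\.171\.123", "163.212.171.123")

for label, ok in dot_results.items():
    print(label, ok)
print("loose:", loose_ip, "strict:", strict_ip, "exact:", exact_ip)
```

The unescaped version is the one that can quietly skew a filter, since it lets through more than the single address you meant to exclude.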

? matches 0 or 1 of the previous item (use \ as discussed above if you need a question mark in the literal sense).

51? matches 5 or 51

AB? matches A or AB

+ matches 1 or more of a previous character.

51+ matches 51, 511, 5111, 51111, etc.

AB+ matches AB, ABB, ABBB, ABBBB, etc.

* matches 0 or more of previous item.

51* matches 5, 51, 511, 5111, 51111, etc.

AB* matches A, AB, ABB, ABBB, ABBBB, etc.

{} specifies how many times the previous item repeats.

51{2} matches only 511 (the 2 means that there are two of the previous item, which was a 1)

51{1,3} matches 51, 511, 5111 but does not match 5 or 51111
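The four quantifiers are easiest to compare side by side. The sketch below, again using Python's re module as an assumed stand-in for the Analytics engine, runs the same five candidate strings through each pattern and keeps the ones that match in full. The names full, candidates, and quantifier_table are invented for this example.

```python
import re

def full(pattern: str, text: str) -> bool:
    """True when the pattern matches the whole string, not just part of it."""
    return re.fullmatch(pattern, text) is not None

candidates = ["5", "51", "511", "5111", "51111"]
quantifier_patterns = ["51?", "51+", "51*", "51{2}", "51{1,3}"]

# For each pattern, list the candidates it matches completely.
quantifier_table = {
    pattern: [c for c in candidates if full(pattern, c)]
    for pattern in quantifier_patterns
}

for pattern, hits in quantifier_table.items():
    print(f"{pattern:8} -> {hits}")
```

Note that every quantifier here applies only to the 1, the item directly before it; the leading 5 is always required exactly once.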

Match Set

[] matches one item in a character set.

[uU]\.[sS]\. matches u.s. and U.S.

[1-9] matches any single digit from 1 to 9

^ negates the set when it is the first character inside the brackets.

[^uU] will not match u or U

[^1-9] will not match any digit from 1 to 9
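The position of the caret is easy to get wrong, so here is a small sketch of character sets in Python's re module (assumed here as a stand-in for the Analytics engine; the variable names are made up). A caret inside the brackets, as in [^uU], negates the set, while a caret in front of the brackets, as in ^[uU], is a start-of-string anchor instead.

```python
import re

def full(pattern: str, text: str) -> bool:
    """True when the pattern matches the whole string."""
    return re.fullmatch(pattern, text) is not None

# A set matches exactly one character out of the listed options.
set_results = {
    "u.s.": full(r"[uU]\.[sS]\.", "u.s."),
    "U.S.": full(r"[uU]\.[sS]\.", "U.S."),
    "a.s.": full(r"[uU]\.[sS]\.", "a.s."),
    "digit 5": full(r"[1-9]", "5"),
    "digit 0": full(r"[1-9]", "0"),
}

# Caret INSIDE the brackets: any single character except u or U.
negated_rejects_u = full(r"[^uU]", "u")
negated_accepts_x = full(r"[^uU]", "x")

# Caret OUTSIDE the brackets: the string must START with u or U.
anchored_us = re.search(r"^[uU]", "us") is not None
anchored_bus = re.search(r"^[uU]", "bus") is not None

print(set_results)
print(negated_rejects_u, negated_accepts_x, anchored_us, anchored_bus)
```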

() allows you to group contents as an item, using | to separate grouped items.

(U\.S\.|US|u\.s\.|us) matches U.S., US, u.s., or us
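Grouping with | behaves like a list of accepted spellings. The sketch below checks a handful of strings against that group in Python's re module (an assumed stand-in for the Analytics engine; us_pattern and alternation_results are names invented for this example). The group is written without spaces around the | characters, because a space inside the group would become part of the alternative.

```python
import re

# Four accepted spellings, separated by | inside one group.
us_pattern = r"(U\.S\.|US|u\.s\.|us)"

samples = ["U.S.", "US", "u.s.", "us", "Us", "U.S"]

# fullmatch: the whole string has to be one of the alternatives.
alternation_results = {
    s: re.fullmatch(us_pattern, s) is not None for s in samples
}

for sample, ok in alternation_results.items():
    print(f"{sample!r}: {ok}")
```

Mixed-case Us and the half-punctuated U.S are rejected because neither is spelled out as one of the four alternatives.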

Regex Anchor String

Anchor a match with ^ to mark the start of a string and $ to mark the end.

^US matches ‘US Holiday’ but does not match ‘Monday is a US Holiday’ because it does not start with ‘US’.

Holiday$ matches ‘US Holiday’ but does not match ‘US Holiday Dates’ because it does not end in ‘Holiday’.

^US Holiday$ only matches US Holiday
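Anchors only make a difference when the pattern is otherwise allowed to match anywhere in the string, so this sketch uses re.search, which looks for the pattern at any position. As before, Python's re module is an assumed stand-in for the Analytics engine, and found and anchor_results are names made up for the illustration.

```python
import re

def found(pattern: str, text: str) -> bool:
    """True when the pattern matches somewhere in the string."""
    return re.search(pattern, text) is not None

anchor_results = [
    found(r"^US", "US Holiday"),                 # starts with US
    found(r"^US", "Monday is a US Holiday"),     # US is not at the start
    found(r"Holiday$", "US Holiday"),            # ends with Holiday
    found(r"Holiday$", "US Holiday Dates"),      # Holiday is not at the end
    found(r"^US Holiday$", "US Holiday"),        # exact string
    found(r"^US Holiday$", "US Holiday Dates"),  # extra text after Holiday
    found(r"US", "Monday is a US Holiday"),      # no anchor: matches anywhere
]

print(anchor_results)
```

The last line is the one to remember when writing filters: without an anchor, a short pattern can match in the middle of strings you never meant to include.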

Regex Shorthand:

\d matches any digit, just like [0-9]

\s matches any whitespace character (space, tab, etc.)

\w matches any letter, digit, or underscore, like [A-Za-z0-9_]
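Here is a quick sketch of the three shorthands in Python's re module, used as an assumed stand-in for the Analytics engine. One Python-specific caveat: by default Python's \d and \w also accept non-English digits and letters, so the example passes re.ASCII to make them line up with [0-9] and [A-Za-z0-9_]. The names full_ascii and shorthand_results are invented for this example.

```python
import re

def full_ascii(pattern: str, text: str) -> bool:
    """Whole-string match with ASCII-only shorthand classes."""
    return re.fullmatch(pattern, text, flags=re.ASCII) is not None

checks = [
    (r"\d", "7"),   # a digit
    (r"\d", "x"),   # a letter is not a digit
    (r"\s", " "),   # a space
    (r"\s", "\t"),  # a tab also counts as whitespace
    (r"\w", "_"),   # underscore is a word character
    (r"\w", "-"),   # hyphen is not
]
shorthand_results = [full_ascii(p, t) for p, t in checks]

# Compare each shorthand with its spelled-out set over all ASCII characters.
ascii_chars = [chr(i) for i in range(128)]
d_same = all(full_ascii(r"\d", c) == full_ascii(r"[0-9]", c) for c in ascii_chars)
w_same = all(
    full_ascii(r"\w", c) == full_ascii(r"[A-Za-z0-9_]", c) for c in ascii_chars
)

print(shorthand_results, d_same, w_same)
```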

Now let’s try something a little more comprehensive.

What does \d{1,5}\s\w* match?

a. 1234 Johnson

b. Johnson

c. 132344 Johnson

d. 123

If you guessed ‘a’, then you are correct. Here’s why:

\d{1,5} means the entry starts with a run of one to five digits, which knocks out choice ‘b’ (no digits at all) and choice ‘c’ (six digits), assuming the pattern has to account for the whole entry. \s requires exactly one whitespace character after the digits, and \w* covers the word ‘Johnson’. Because * means zero or more, the word itself is optional, so ‘123 ’ with a trailing space would still match; choice ‘d’ has no space at all, so it fails on \s. These are the kinds of questions that you will need to prepare for.
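One way to check your reasoning on a question like this is to run each choice through the pattern yourself. The sketch below does that in Python's re module, an assumed stand-in for the Analytics engine, with the range written in the comma form {1,5}, which is the syntax regex engines read as "one to five". It reports both a whole-string check and an unanchored search, since the two can disagree; quiz_full and quiz_partial are names made up for this example.

```python
import re

quiz_pattern = r"\d{1,5}\s\w*"

choices = {
    "a": "1234 Johnson",
    "b": "Johnson",
    "c": "132344 Johnson",
    "d": "123",
}

# Whole-string check: the pattern must account for every character.
quiz_full = {
    label: re.fullmatch(quiz_pattern, text) is not None
    for label, text in choices.items()
}

# Unanchored check: the pattern may match any part of the string.
quiz_partial = {
    label: re.search(quiz_pattern, text) is not None
    for label, text in choices.items()
}

for label in choices:
    print(label, "whole:", quiz_full[label], "partial:", quiz_partial[label])
```

If the two columns differ for a choice, that is a sign the pattern needs ^ and $ anchors before you trust it in a filter.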

Although it isn’t an extensive part of the test, be prepared to answer a couple of questions on regex when you’re taking the Analytics test. Much like this last question, you will need to be able to identify what a sequence of characters and numbers will match. As I said earlier, it is helpful to have a guide so that when you are setting up profiles, filters, goals, etc., you can be sure to do it correctly, without accidentally skewing your data.