by Doug

26/07/2019

BigQuery, Cloud, Google Tag Manager

GTM Monitor v2.0

Hands up whose browser history looks like this:

Yep, lots of Simo-treasure!  The most recent post was a fine collaboration between Simo Ahava and another community pillar, Mark Edmondson, describing how to build a GTM monitor using GCP components and a great new GTM feature.

I like this a lot – I’ve followed the article to the letter and watched the solution come together very nicely.

As Simo and Mark suggested in closing in their post:

Let us know in the comments what you think of this, and how you would improve it if you had the chance!

I’ve done just that and I’ll share my customisation with you here.

Start Here

1: Go here – https://www.simoahava.com/analytics/google-tag-manager-monitor/#step-1-build-the-custom-template

2: Follow the instructions and get the solution built

Now modify

I wanted MOAR DATA.  More metadata about each tag (its category and type), and, as we use zones (non-360 zones too!), I wanted to capture the zone in which the tag fired.

In GTM, within the Additional Tag Metadata section, I’ve added additional key value pairs:
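For illustration, the extra key-value pairs on a tag might look like this (the key names are assumptions on my part; they simply have to match the fields the template reads, and the values are whatever makes sense for each tag):

```
category: analytics
type: ua
zone: main
```

Whatever values you set per tag will flow through to BigQuery unchanged.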

That’s the easy bit. We need to modify the tag template to send this data to the Cloud Function.  This is also quite simple. In the tag template code, the lines below are added:

// Build the per-tag payload; cat, type, and zone are the new metadata fields
tags.forEach((tag, idx) => {
  const tagPrefix = '&tag' + (idx + 1);
  payload +=
    tagPrefix + 'id=' + tag.id +
    tagPrefix + 'nm=' + tag.name +
    tagPrefix + 'st=' + tag.status +
    tagPrefix + 'et=' + tag.executionTime +
    tagPrefix + 'cat=' + tag.category +
    tagPrefix + 'type=' + tag.type +
    tagPrefix + 'zone=' + tag.zone;
});

Make sure the field names referenced here match the metadata keys added to each tag: category, type, and zone.

You can save the template now – that’s all we need to change here.  The Cloud Function doesn’t change at all.  Similarly, the raw_data table in BQ doesn’t change either.

The BQ view does need to change.  Here’s the new schema:

Here’s the view SQL:

SELECT
  timestamp,
  eventName,
  eventTimestamp,
  (SELECT REGEXP_EXTRACT(tagString, "&tag\\d+id=([^&]+)")) as tagId,
  (SELECT REGEXP_EXTRACT(tagString, "&tag\\d+nm=([^&]+)")) as tagName,
  (SELECT REGEXP_EXTRACT(tagString, "&tag\\d+st=([^&]+)")) as tagStatus,
  (SELECT REGEXP_EXTRACT(tagString, "&tag\\d+et=([^&]+)")) as tagExecutionTime,
  (SELECT REGEXP_EXTRACT(tagString, "&tag\\d+cat=([^&]+)")) as tagCategory,
  (SELECT REGEXP_EXTRACT(tagString, "&tag\\d+type=([^&]+)")) as tagType,
  (SELECT REGEXP_EXTRACT(tagString, "&tag\\d+zone=([^&]+)")) as tagZone
FROM (
  SELECT
    timestamp,
    eventName,
    eventTimestamp,
    tagStringUnnest as tagString,
    pt
  FROM (
    SELECT 
      timestamp,
      (SELECT REGEXP_EXTRACT(URI, "eventName=([^&]+)")) as eventName,
      (SELECT REGEXP_EXTRACT(URI, "eventTimestamp=([^&]+)")) as eventTimestamp,
      (SELECT REGEXP_EXTRACT_ALL(URI, "&tag\\d+id=[^&]+&tag\\d+nm=[^&]+&tag\\d+st=[^&]+&tag\\d+et=[^&]+&tag\\d+cat=[^&]+&tag\\d+type=[^&]+&tag\\d+zone=[^&]+")) as tagStringRegex,
      DATE(_PARTITIONTIME) as pt
    FROM
      `YOUR_GCP_PROJECT.CWGTMMonitor.raw_data`
  )
LEFT JOIN UNNEST(tagStringRegex) as tagStringUnnest
)
ORDER BY timestamp DESC

You probably don’t need the order by – I popped it in to help debug the SQL.  There are a few changes here. Let’s explore the details.

By now, you’ll start to see the pattern.  For each metadata field, you need to add the field into the payload that GTM sends to the Cloud Function. You also need a line per extra field in the outer SQL:

(SELECT REGEXP_EXTRACT(tagString, "&tag\\d+cat=([^&]+)")) as tagCategory,
(SELECT REGEXP_EXTRACT(tagString, "&tag\\d+type=([^&]+)")) as tagType,
(SELECT REGEXP_EXTRACT(tagString, "&tag\\d+zone=([^&]+)")) as tagZone

The querystring parameter names used in the payload need to match here.  Extraction of the tag string in the inner SQL has extra regex clauses per extra field too:

SELECT REGEXP_EXTRACT_ALL(URI, "&tag\\d+id=[^&]+&tag\\d+nm=[^&]+&tag\\d+st=[^&]+&tag\\d+et=[^&]+&tag\\d+cat=[^&]+&tag\\d+type=[^&]+&tag\\d+zone=[^&]+")
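To sanity-check those patterns outside BigQuery, here’s a quick Python sketch applying the same per-field regex the view uses (the sample hit and the extract helper are mine, made up for illustration):

```python
import re

# A sample monitor hit, mirroring the payload the tag template builds.
# All field values here are made up for illustration.
uri = ("?eventName=gtm.js&eventTimestamp=1564141200000"
       "&tag1id=12&tag1nm=GA%20Pageview&tag1st=success&tag1et=13"
       "&tag1cat=analytics&tag1type=ua&tag1zone=main")

def extract(field, s):
    # Same shape as the view's REGEXP_EXTRACT: &tag<digits><field>=<value>
    m = re.search(r"&tag\d+" + re.escape(field) + r"=([^&]+)", s)
    return m.group(1) if m else None
```

Running `extract("cat", uri)` pulls out the category just as the tagCategory line in the view does.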

That’ll solve the data collection and view creation.  I also modified the DS report.  I went for two pages.  First, the tag execution timeseries chart uses this config:

And on page 2 I placed the tag execution time chart:

I’ve added in filters for zone, category, and type in addition to the date and tag filters.  I made the header report level:

Wrap up

That’s it really.  The pattern to add fields in GTM, the tag template, BQ, and DS is near enough copy and paste as required.

Worth noting, if you use zones, you’ll need to include a GTM monitor template in each zone.

There’s more to come. I’d like to parse a container export and flag non-firing tags too – that’s WIP though.  See Update below.

Hopefully this post is useful in extending and scaling Simo’s and Mark’s original technique to fit exactly what you need for your business.

Update

It’s 16:00 on a Friday.  I will be brief.  We’re going to import JSON into BigQuery.

Start by exporting your GTM container.

Edit the export so it is only an array of tag JSON objects (I wish I had more time to script this….). Delete everything above the tag array, then remove everything from the triggers section to the end of the file.

You then have a JSON file containing only your tags, which needs to be converted to ndjson (newline-delimited JSON).

Install the jq JSON parser if you don’t have it already:

https://stackoverflow.com/questions/33184780/install-jq-json-processor-on-ubuntu-10-04

Then execute this command:

cat EDITED_GTM_EXPORT_TAGS_ONLY.json | jq -c '.[]' > ndTags.json
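If you’d rather script the edit-plus-convert step instead of hand-editing and piping through jq, a small Python sketch can do it. It assumes the standard container export layout, where tags sit in an array under containerVersion.tag (the function name is mine):

```python
import json

def export_to_ndjson(export_path, out_path, key="tag"):
    """Write one JSON object per line for each item in one of the
    export's containerVersion arrays (tag, trigger, or variable)."""
    with open(export_path) as f:
        container = json.load(f)
    items = container.get("containerVersion", {}).get(key, [])
    with open(out_path, "w") as out:
        for item in items:
            out.write(json.dumps(item) + "\n")
    return len(items)
```

Calling export_to_ndjson('GTM_EXPORT.json', 'ndTags.json') gives the same ndjson file as the jq one-liner, with no manual editing needed.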

Now you can import that into BQ. In the console (or via the bq CLI), create a table from the ndTags.json file using the newline-delimited JSON source format with schema auto-detect. Change dataset and table names as required.

Now you want to create a view containing tags from your recently imported export(s) where their names are not in the GTM Monitor view:

SELECT 
  name 
FROM 
  `YOUR_PROJECT.YOUR_GTM_MONITOR_DATA_SET.gtm_*` 
WHERE 
  NOT REGEXP_CONTAINS(name, r'GTM Monitor') AND 
  name NOT IN (
    SELECT 
DISTINCT tagName 
    FROM 
      `YOUR_PROJECT.YOUR_GTM_MONITOR_DATA_SET.YOUR_GTMMonitorView`
    WHERE 
      tagName IS NOT NULL
)

Having created that view, now you can show a table of tags that have not fired.

That completes my Friday afternoon hacking.  It could be cleaner…it will be one day but you get the idea.

Update 2.0

Rather than just uploading the tags JSON in ndjson format to BigQuery, we’ve uploaded tags, triggers, and variables.  Finding unused triggers is easy; unused variables are trickier.  With the following query (a bit ugly maybe, but I’m no SQL guru), we can identify them:

WITH
  used_variables AS (
  WITH
    fpval AS (
      SELECT
        fp.value AS fpval
      FROM
        `YOUR_PROJECT.YOUR_DATASET.YOURTABLE`,
      UNNEST(filter) AS f,
      UNNEST(f.parameter) AS fp
    ),
    pval AS (
      SELECT
        p.value AS pval
      FROM
        `YOUR_PROJECT.YOUR_DATASET.YOURTABLE`,
      UNNEST(parameter) AS p
    ),
    plmval AS (
      SELECT
        plm.value AS plmval
      FROM
        `YOUR_PROJECT.YOUR_DATASET.YOURTABLE`,
      UNNEST(parameter) AS p,
      UNNEST(p.list) AS pl,
      UNNEST(pl.map) AS plm
    )
    SELECT
      STRING_AGG(fpval," , ") AS used_variables
    FROM (
      SELECT
        *
      FROM
        fpval
      UNION ALL
      SELECT
        *
      FROM
        pval
      UNION ALL
      SELECT
        *
      FROM
        plmval
    )
    WHERE
      REGEXP_CONTAINS(fpval,r"{{(.+?)}}") 
  ),
  available_variables AS (
    SELECT
      name AS available_variable_name
    FROM
      `YOUR_PROJECT.YOUR_DATASET.YOURTABLE`
    WHERE
      variableId IS NOT NULL
  )
  SELECT
    *
  FROM
    available_variables
  WHERE
    NOT REGEXP_CONTAINS((SELECT * FROM used_variables),available_variable_name)

To explain how this works at a high level: having uploaded the bits of the container we’re interested in, we find mentions of variables in parameter.value, parameter.list.map.value, and filter.parameter.value….there may be more, but you get the idea.  The mentions are munged together so we can regex-match against the list of known, available variables; those that aren’t mentioned aren’t used.

We’d also like to find tags that have an expired schedule.  Looking at the scheduleStartMs and scheduleEndMs values would do it – that’d be handy.
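As a sketch of that check (in Python rather than SQL, run over the ndjson tags file uploaded earlier; the helper name is mine, while scheduleEndMs is the field from the container export):

```python
import json
import time

def expired_tags(ndjson_path, now_ms=None):
    """Return names of tags whose scheduleEndMs is already in the past."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # schedule fields are epoch millis
    expired = []
    with open(ndjson_path) as f:
        for line in f:
            tag = json.loads(line)
            end = tag.get("scheduleEndMs")
            if end is not None and int(end) < now_ms:
                expired.append(tag.get("name"))
    return expired
```

Tags with no schedule at all are simply skipped; only those with an end time in the past are flagged.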

TBH, the table upload is a bit of a faff with the editing, stripping and handling failures.  I’d like to see an “Export to BigQuery” option in GTM to facilitate this kind of work.

That’s it for now folks.
