Blog

New Release: GTM/GA Debugger 0.4.0
It’s been a long time since the last post, and even longer since the last extension update. To be exact, it took me around a year to have this new version ready.

The main reason for this delay is that I switched how the extension is built at least 5 times. I don’t consider myself a developer, which means that I often end up not choosing the best stack. Anyway, this has been a real opportunity for me to learn a lot of technologies/frameworks I didn’t know about or was never able to understand, just to mention some: React / Svelte, Webpack, Rollup, Git, Gulp, Travis. So at this point I’m really “happy” with all the time “wasted” on refactoring the extension so many times.

In case you’re interested, after all these changes I ended up building the extension with Vue.js 2 and Bulma as the CSS framework. This has allowed me to build an extension that is faster and is built on top of some good technologies ( instead of having thousands of inefficient lines of JS code ).

I know that for most people most of the changes won’t be noticeable, mostly because I tried to keep the UI as it was in the previous version, but internally everything is different, and some new features were added as well.

In the following video, I’m showing an overview of what the new extension has to offer:

GTM/GA Debugger Features
- GTM/GTAG Debug Support
- Multiple dataLayer Support ( View all the dataLayer pushes and current state )
- View all Universal Analytics Hits being sent
- View all GA4 (App+Web) Hits being sent
- Filter out the hits by the type or property/stream ids
- Filter out the dataLayer pushes by their type ( core, ga4, custom, etc )
- Parse the hits payload to see a human-friendly translation of the keys
- Enhanced Ecommerce Report ( based on GA/GA4 hits )
- All Reports are in Real Time
- Copy any Hit/dataLayer push info to the clipboard in a friendly format with a single mouse click
- Trace any Hit/dataLayer push
- Real Time GA hits Payload debugging
- More …
I really lost track of everything that was added in this specific release, so I’m providing a quick changelog.

Changelog
- [NEW] – Now it’s based on Vue.js 2 + Bulma
- [NEW] – GA4 Hits Full Support
- [NEW] – GA4 Ecommerce Support
- [NEW] – Multiple dataLayer Support
- [NEW] – Multiple GTAG/GTM Containers support
- [NEW] – Copy hits as string
- [NEW] – Hits Stack Trace Reporting
- [NEW] – Hits Debug ( run the hits against the official GA debug endpoint )
- [NEW] – DataLayer Pushes Stack Trace Reporting
- [NEW] – GTM Preview Enhancer
- [ENHANCEMENT] – Debugging can be started by clicking on a button rather than needing to press F5
- [ENHANCEMENT] – Improved GTM/GA/GA4 detection – Faster detection delay
- [ENHANCEMENT] – Improved GTM/GA/GA4 detection – Better accuracy
- [ENHANCEMENT] – Improved SSR/SPA/PWA pages debugging.
- [ENHANCEMENT] – Pushes/Hits timings are now correct and are shown in the real order they are triggered
- [ENHANCEMENT] – UI is now more responsive, showing a better interface when using it in the sidebar
- [FIX] – All bugs reported ( sites where the tool was not working properly ) have been addressed. Thanks to everyone who helped by reporting them
- [FIX] – Incognito Mode Support
- [FIX] – GA detection for hits not sent to GA endpoints
- [FIX] – GTM detection for locally served containers
- [FIX] – +40 bug tickets.
As you may have noticed, some tools are gone: the Data Attributes Inspector and the Profiler Tab Report. I removed these features for this release in order to focus on the tool’s reliability; they will be added back in upcoming releases.

More news about the extension: it will be available for Firefox, Opera and Edge ( as soon as I can have it approved on their marketplaces ).

Now I’m looking for some beta testers who will help me identify issues in new releases. Yay!

The last big news is that the extension hit 40,000 users this past week. Yeah, according to Chrome Store data, the extension is being used by more than 40K users weekly. I never thought the tool would end up having this many users, but this has also created some “responsibility” on my side that I’m currently not sure how to handle.

In the last year I declined all the offers to purchase the extension, and I didn’t accept any offer to add ads within the tool either. I really want to keep this tool free of ads, but it takes a lot of time. Because of this I decided to start accepting donations via Ko-Fi. Getting some help this way will allow me to publish updates more regularly. This is totally optional, I’ll keep working on the extension anyway, but some people in the past asked to be able to help.

Click on the button below if the extension has been helpful for your work:

Now, if you are not using the extension yet, you can get it for free at the following link: INSTALL EXTENSION
December 20, 2020

Tracking user’s IP Autonomous System Number and Organization details to prevent the spam

Around the end of 2019, Google Analytics dropped support for the Network Domain and Service Provider dimensions from their reports, making an official announcement about it in February.

These 2 dimensions were widely used to fight spam in Google Analytics, and there have been a lot of posts about this topic in the last months. Simo Ahava wrote about how to collect the ISP data using a third-party service, in case you want to check it.

In this post we’ll learn what an Autonomous System is and how we could use this info to try to fight spam. And the coolest part is that we’ll be able to use a free database for this. Continue reading 🙂

There are some other services and commercial databases that provide these details, but let’s be honest, there are some big handicaps:

If you use a free service, you will hit the quota limit quickly
If you have a high-traffic website, this is not going to be cheap

There are basically 3 different types of subscriptions: SaaS ( they host the app and the database ), DB ( you host the database and the query system ) and WebService.

I’m attaching a list of some of the providers available, in case you want to check them.

Provider	SaaS	DB	WebService	Updates
MaxMind	✅	✅	✅	Weekly/Monthly
IP2Location	✅	✅	✅	Monthly
IPStack	❎	❎	✅	Hourly
ip-api	❎	❎	✅	?
ipgeolocation.io	❎	✅	✅	Weekly
db-ip	❎	❎	✅	Monthly/Daily
Abstract IP Geolocation	✅	❎	✅	Daily

In any case there are a lot of posts around this topic on the web, and I’m trying to give this issue a new solution.

MaxMind provides their GeoLite2 databases for free. These databases are updated weekly ( on Tuesdays to be exact ) and they provide info about:

Countries
Cities
ASN

The main difference between these databases and the paid ones is how accurate they are and how often they get updated. This accuracy may be a problem when we need to target users based on their city, but this is not what we’re after this time: we’ll be looking at their ASN database.

If you are wondering, ASN stands for Autonomous System Number. According to Wikipedia:

An autonomous system (AS) is a collection of connected Internet Protocol (IP) routing prefixes under the control of one or more network operators on behalf of a single administrative entity or domain that presents a common, clearly defined routing policy to the internet.
https://en.wikipedia.org/wiki/Autonomous_system_(Internet)

Autonomous Systems are like “big” routers at the ISPs and datacenters that are in charge of announcing the IP addresses they hold, in order to let other Autonomous Systems know how to reach those IP addresses ( sorry for this inaccurate description, I’m trying to keep this simple ).

Each ISP usually has its own ASN ( they can have more than one ). For example, one of Google’s main ASNs is AS15169, registered to Google LLC, and this Autonomous System manages 9.5 million Google IPs.

This means that we could query any IP address and the ASN database will return the ASN it currently belongs to.

For example, we may query Google DNS’s IP address, 8.8.8.8, and the database will return the AS number and the organization name:

Array
(
    [autonomous_system_number] => 15169
    [autonomous_system_organization] => GOOGLE
)

As another example, let’s query this Fastly CDN IP address, 151.101.134.133:

Array 
(
   [autonomous_system_number] => 54113 
   [autonomous_system_organization] => FASTLY
)

Or let’s query an IP from a dedicated server provider like LiquidWeb:

Array
(
    [autonomous_system_number] => 32244
    [autonomous_system_organization] => LIQUIDWEB
)

We could use the AS number and the organization name as a way to try to catch spam, since most spam traffic is likely to come from co-location / VPN providers that we could identify this way.
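To make the idea more concrete, here is a minimal sketch of that check. The `FLAGGED_ASNS` list and the `classifyAsn` helper are hypothetical names that are not part of any library, and the two AS numbers in the list are only the hosting/CDN examples queried above, not a vetted block list.

```javascript
// Hypothetical list of AS numbers we decided to flag (assumption: you maintain your own list).
// The two entries are just the hosting/CDN examples queried above.
var FLAGGED_ASNS = {
  54113: 'FASTLY',
  32244: 'LIQUIDWEB'
};

// Takes a record shaped like the lookups above and returns a label
// that could later be reported as a custom dimension.
function classifyAsn(record) {
  if (!record || !record.autonomous_system_number) {
    return 'unknown';
  }
  return Object.prototype.hasOwnProperty.call(FLAGGED_ASNS, record.autonomous_system_number)
    ? 'flagged'
    : 'not-flagged';
}

console.log(classifyAsn({
  autonomous_system_number: 32244,
  autonomous_system_organization: 'LIQUIDWEB'
}));
```

A plain lookup table keeps the check cheap; how the list is built and maintained is left open on purpose.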

Since it’s a database, we’ll need to set up a small endpoint on our domain in order to be able to query it. This implies some IT development, but on the other hand it has some big wins:

There will be NO query limits.

The cost of having this solution running is just the cost of developing the endpoint.

We could have our website developer query this info server-side and push the data to the dataLayer, instead of needing an extra XHR request and having to delay the hits, YAY!

Now, on the other side of the road, there are some handicaps:

The data is not as accurate as the network domain data from other databases
Data freshness won’t be premium, but as we all know GA’s wasn’t either.

Getting the ASN DB

As I’ve mentioned above, the GeoLite ASN database is free and you’ll be able to get it after signing up for a free account at: https://dev.maxmind.com/geoip/geoip2/geolite2/

PHP Example

Another good point is that MaxMind already provides libraries for PHP/NodeJS/Perl and other languages to help with reading and querying their GeoLite databases, which helps with setting up our endpoint.

As usual I’m providing an example for PHP, since it’s one of the most widely used languages and it’s available on almost any hosting around the world.

If we don’t have Composer installed yet, that’s gonna be our first step:

curl -sS https://getcomposer.org/installer | php

Next, we’ll install the needed dependencies:

php composer.phar require geoip2/geoip2:~2.0

<?php
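require_once 'vendor/autoload.php';
use GeoIp2\Database\Reader;

// Open the GeoLite2 ASN database
$ip_as_details = new Reader('geo/GeoLite2-ASN.mmdb');
// The GeoIp2 Reader exposes the ASN lookup through asn()
$record = $ip_as_details->asn('8.8.8.8');
$asn_details = [
    'autonomous_system_number' => $record->autonomousSystemNumber,
    'autonomous_system_organization' => $record->autonomousSystemOrganization,
];
// At this point we could build a JSON and send it back to the browser.
print_r($asn_details);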

Last step will be passing back this info to Google Analytics using a custom dimension, so we can use it in our filters or segments.
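On the browser side, that last step could look roughly like the sketch below. It assumes the endpoint answers with the same two keys shown in the lookups above; the `asn-details` event name and the `asn_number` / `asn_organization` keys are hypothetical and should be mapped to your own custom dimensions in GTM.

```javascript
// Builds the dataLayer push from the endpoint response.
// Assumption: the response carries the two keys shown in the lookups above.
function buildAsnPush(asnDetails) {
  return {
    event: 'asn-details', // hypothetical event name
    asn_number: String(asnDetails.autonomous_system_number),
    asn_organization: asnDetails.autonomous_system_organization
  };
}

// In the browser this reuses the existing window.dataLayer;
// outside of it a plain array is used so the sketch runs anywhere.
var dataLayer = (typeof window !== 'undefined' && window.dataLayer) || [];

dataLayer.push(buildAsnPush({
  autonomous_system_number: 15169,
  autonomous_system_organization: 'GOOGLE'
}));
console.log(dataLayer[dataLayer.length - 1]);
```

The AS number is sent as a string because custom dimensions are text values anyway.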

Extra – Grabbing the network domain

I was about to publish the post and I decided to add a little extra: let’s also learn how to track the “network domain”.

Google Analytics was using the IP’s PTR for the “network domain”. Again, you may wonder what “PTR” is: it stands for “Pointer record”, and it basically resolves an IP to an FQDN ( fully-qualified domain name ). It’s the inverse of an A DNS record.

For example, we can do a reverse IP lookup of Google DNS’s IP and it will return “dns.google”.

root@sd1:/# nslookup
> set q=ptr
> 8.8.8.8
Server:         8.8.8.8
Address:        8.8.8.8#53
Non-authoritative answer:
8.8.8.8.in-addr.arpa    name = dns.google.

Or we may try with a Googlebot IP address, which most SEOs must be familiar with:

> set q=ptr
> 66.249.66.1
Server:  dns.google
Address:  8.8.8.8
Non-authoritative answer:
1.66.249.66.in-addr.arpa        name = crawl-66-249-66-1.googlebot.com

As a last example, let’s query the google.com IP address:

> set q=a
> google.com
Server:  dns.google
Address:  8.8.8.8
Non-authoritative answer:
Name:    google.com
Address:  172.217.17.14
> set q=ptr
> 172.217.17.14
Server:  dns.google
Address:  8.8.8.8
Non-authoritative answer:
14.17.217.172.in-addr.arpa      name = mad07s09-in-f14.1e100.net

If we want to have the network domain info back in our GA reports, we’ll just need to parse the PTR hostname to grab only the root domain; in this last case it would be: 1e100.net .

I wouldn’t advise tracking the full PTR hostname, for 2 reasons: first, most hostnames are a mix of the IP address and the ISP domain, which would be against the GDPR ( we cannot record the user’s IP address ), and second, it would create high cardinality, which won’t help when analyzing the data.

Now, remember that we were building an endpoint in PHP to get the ASN details; just a few more lines of code would allow us to have the network domain pushed into our dataLayer! 🙂

$ip_ptr = gethostbyaddr('8.8.8.8');

Getting the root domain can be a painful task, due to all the new TLDs and the need to keep third-level TLDs in mind. In case you want to have this done easily, you can use the following PHP library, https://github.com/utopia-php/domains , which will let you grab the “registrable” domain name from a hostname:

require_once '../vendor/autoload.php';
use Utopia\Domains\Domain;
// demo.example.co.uk
$domain = new Domain('demo.example.co.uk');
$domain->get(); // demo.example.co.uk
$domain->getTLD(); // uk
$domain->getSuffix(); // co.uk
$domain->getRegisterable(); // example.co.uk
$domain->getName(); // example
$domain->getSub(); // demo
$domain->isKnown(); // true
$domain->isICANN(); // true
$domain->isPrivate(); // false
$domain->isTest(); // false

I’m providing the example in PHP, but that doesn’t mean you have to use it at all; this code/idea can be developed in almost any server-side language you may be using. As a last resort, you can run a small VM or VPS to have a PHP environment where you can host your endpoint :).

June 1, 2020

Tracking your visitors effective connection speed details

I know this is currently just a draft, but being available on Chrome, Edge and Opera ( or any Chromium-based browser ) makes this really useful in my opinion.

In those browsers, there’s an API that allows us to get details about the current user’s connection. We can query some info like the current “estimated” connection link and the round-trip time ( latency ), based on the requests recently observed by the browser.

All these details can be queried via the Network Information API on the supported browsers. I know it’s not widely adopted yet, but according to Can I use it’s supported by around 70% of browsers globally. It’s not perfect, but I think it’s enough, and with time more browsers should end up adding support for it.

We can query (at this moment) for the following details:

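Property	Value
downlink	effective bandwidth estimate in megabits per second
downlinkMax (available in workers)	maximum downlink speed of the underlying connection technology, in megabits per second
rtt	round-trip time in milliseconds
effectiveType	`slow-2g` , `2g` , `3g` , `4g`
type (available in workers)	`bluetooth`, `cellular`, `ethernet`, `none`, `wifi`, `wimax`, `other`, `unknown`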

In this post we’ll be focusing on effectiveType, since it’s the attribute that is most widely available across browsers. We need to keep in mind that it is NOT the real connection type of the user, but the current “effective” connection type, meaning that it’s an estimation based on the measured network performance of the previous/current requests. This value is actually calculated based on the maximum download speeds and the minimum RTT values recently observed.

This means that a user may really be on a fiber connection, connected via Wi-Fi with a very bad link, and effectiveType may report 2g. But since we are talking about the “effective” connection, we should be fine.

This reported value is calculated based on the following table:

effectiveType (ECT)	Min. RTT	Max. Down
slow-2g	2000ms	50kbps
2g	1400ms	70kbps
3g	270ms	700kbps
4g	0ms	inf.

https://developer.mozilla.org/en-US/docs/Glossary/Effective_connection_type
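To get a feeling for how the table maps to a value, here is a rough sketch of the bucketing. It is only an illustration: the browser computes effectiveType on its own, and the way both thresholds are combined here ( a bucket matches when EITHER limit is hit ) is an assumption of this sketch. `estimateEffectiveType` is a hypothetical helper name. Note that navigator.connection.downlink is reported in megabits per second, so it has to be multiplied by 1000 before being compared with the kbps values in the table.

```javascript
// Thresholds taken from the table above (RTT in ms, downlink in kbps).
var ECT_THRESHOLDS = [
  { type: 'slow-2g', minRtt: 2000, maxDownlink: 50 },
  { type: '2g', minRtt: 1400, maxDownlink: 70 },
  { type: '3g', minRtt: 270, maxDownlink: 700 }
];

// Assumption: a connection falls into a bucket when EITHER its RTT is at least
// the bucket's minimum RTT OR its downlink is at most the bucket's maximum.
// Buckets are checked from slowest to fastest; anything else is reported as 4g.
function estimateEffectiveType(rttMs, downlinkKbps) {
  for (var i = 0; i < ECT_THRESHOLDS.length; i++) {
    var bucket = ECT_THRESHOLDS[i];
    if (rttMs >= bucket.minRtt || downlinkKbps <= bucket.maxDownlink) {
      return bucket.type;
    }
  }
  return '4g';
}

// A fast link with a high latency still ends up in the 3g bucket
console.log(estimateEffectiveType(300, 10000));
```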

Code Snippet

(function() {
    var connection = navigator.connection || navigator.mozConnection || navigator.webkitConnection;
    // The Network Information API is not available on every browser
    if (!connection) {
        return undefined;
    }
    return {
        effectiveType: connection.effectiveType,
        rtt: connection.rtt,
        downlink: connection.downlink
    };
}
)();

onChange Event

We can also listen for connection info changes, using the following listener:

navigator.connection.addEventListener('change', ()=>{
  dataLayer.push({
     'event': 'connection-changed'
  });
});

April 20, 2020

Tracking the anchor text for the incoming links in Google Tag Manager

Introduction

It’s been a long time since I took care of this blog’s “Analytics” ( in the blacksmith’s house, a wooden knife ). And I noticed that it would be cool to have info about the anchor text that the sites referring to my sites are using to link to me.

So I’m sharing the solution I built today to capture which anchor text was used on the referring URLs and send the info back to Google Tag Manager; from there we’ll be able to send an event to APP+WEB or to any other place we want 🙂

How it works

Execution Flow Chart

The flow chart on the right side shows how the execution flow works. We’ll have 2 main pieces:

– One GTM CUSTOM HTML Tag
– One PHP File

The first one will be responsible for the main logic and for making a request ( via fetch ) to the second one, which will take care of reading the current visitor’s referrer page and scraping it to try to find the anchor text that the user clicked.

We’re using extensive logic to avoid any kind of false positives/duplicate hits, for example when a user goes back on a mobile phone or swipes. We don’t want to consider these “page reloads” as landings, even though they may still hold valid referrer info.

SERVER SIDE CODE

PHP Snippet Code

First we need to upload the following PHP snippet to any server supporting PHP 7.x ( because of the use of array literals ).

This code can be greatly improved, for example by adding a timeout in case the page is not reachable. If someone asks, I may add more sanity checks to the script.

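<?php
// David Vallejo (@thyngster)
// 2020-04-14
// Needs PHP7.X

if(!isset($_GET["url"])){
        die("missing url parameter");
}

$links = [];
$url = $_GET["url"];
// Only fetch http(s) URLs, and only when the request comes with a referrer header
if(isset($_SERVER["HTTP_REFERER"]) && preg_match('/^https?:\/\//i', $url)){
        // @ silences the warning raised when the referring page is not reachable
        $referrer_link_html_content = @file_get_contents($url);
        $current_domain = str_replace("www.","", (string) parse_url($_SERVER["HTTP_REFERER"], PHP_URL_HOST));
        $doc = new DOMDocument();
        // Real-world HTML is rarely valid, so keep the parser warnings out of the JSON output
        libxml_use_internal_errors(true);
        if($referrer_link_html_content !== false && $referrer_link_html_content !== ""){
                $doc->loadHTML($referrer_link_html_content);
        }

        $rows = $doc->getElementsByTagName('a');
        foreach ($rows as $row)
        {
                if($row instanceof DOMElement){
                        // preg_quote() keeps the dots in the domain from acting as wildcards
                        if($current_domain !== "" && preg_match('/'.preg_quote($current_domain, '/').'/i', $row->getAttribute('href'))){
                                $links[] = [
                                        "url" => $row->getAttribute('href'),
                                        "anchor_text" => $row->textContent
                                ];
                        }
                }
        }
}
header('Content-type: application/json; charset=UTF-8');
header("Access-Control-Allow-Origin: *");
echo json_encode($links, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES);
exit;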

Python Snippet code

I know this code is not the best, since I’m not a Python coder, but it can give an overall idea of how to run this with Python.

It should be used like:

python anchor.py REFERRER_LINK LINKTOSEARCH

#!/usr/bin/env python3
# use: python anchor.py REFERRER_LINK LINKTOSEARCH
import json
import sys
from urllib.parse import urlparse

import requests
from bs4 import BeautifulSoup

links = []

# First argument: the referring page we are going to scrape
if len(sys.argv) > 1:
    url = sys.argv[1]
else:
    print("REFERRER_LINK argument is missing")
    sys.exit(1)

# Second argument: the URL on our site that we expect to be linked
if len(sys.argv) > 2:
    link_to_search = sys.argv[2]
else:
    print("LINKTOSEARCH argument is missing")
    sys.exit(1)

headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(url, headers=headers, timeout=10)
soup = BeautifulSoup(response.text, "html.parser")

# Domain we are looking for, without the www. prefix
domain = urlparse(link_to_search).netloc.replace("www.", "")

for ahref in soup.select('a[href*="' + domain + '"]'):
    links.append({
        "url": ahref.attrs["href"],
        "anchor_text": ahref.text
    })

print(json.dumps(links, sort_keys=True, indent=4, separators=(',', ': ')))

GTM Custom HTML Code

NOTE Remember that the following code needs to be added to GTM wrapped between <script></script> tags!

Also remember that we need to update the endPointUrl value to the domain where we’ve uploaded the PHP script

  (function(){
    try{
      var endPointUrl = 'https://domain.com/getLinkInfo.php';
      // We don't want this to run on page reloads or navigations. Just on Real Landings
      if (window.performance && window.performance.navigation && window.performance.navigation.type === 0) {
          var referrer = document.referrer;
          var current_url = document.location.href;

          var grab_hostname_from_url = function(url) {
              var a = document.createElement("a");
              a.href = url;
              return a.hostname.replace('www.', '');
          };
          // Only continue if the current referrer is set to a valid URL
          if (referrer.match(/^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$/)) {
              // current referrer domain != current_domain
              if (grab_hostname_from_url(referrer).indexOf(grab_hostname_from_url(current_url)) === -1) {
                  // The referrer is a full URL, so it needs to be encoded to travel as a parameter
                  fetch(endPointUrl + '?url=' + encodeURIComponent(referrer)).then(function(response) {
                      return response.json();
                  }).then(function(json) {
                      window.dataLayer = window.dataLayer || [];
                      json.forEach(function(link) {
                          if (current_url.indexOf(link.url) > -1) {
                              window.dataLayer.push({
                                  event: 'incoming-link',
                                  linked_url: link.url,
                                  landing_url: current_url,
                                  referring_url: referrer,
                                  // The endpoint returns the text under the anchor_text key
                                  anchor_text: link.anchor_text
                              });
                          }
                      });
                  });
              }
          }
      }
    }catch(e){}
  })();

Now we’re only one step away from having this working: we’ll need to set up a firing trigger for our tag. Ideally this should be the All Pages trigger, to get it fired ASAP.

Reported Data Info

dataLayer Key	dataLayer Value
`event`	incoming-link
`linked_url`	Current Link in the Referral Page
`landing_url`	Current URL
`referring_url`	Full Referrer Info
`anchor_text`	The Anchor Text on the referrer page linking to your site

Caveats

Please note that this solution relies on the current document.referrer, so don’t expect it to work for all referrals, since some of them may strip the full referrer info, like Google SERPs do, and some browsers may even strip the referrer details down to the origin for privacy reasons.

It may also happen that the referring URL links to us in more than one place; in this case the scraping endpoint will return all the matching links and anchor texts. From that point on, it’s up to you how you report it in Google Analytics or any other tool 😀

In any case, this should work for most common referral traffic.

Working Demo Video

April 14, 2020

APP + WEB: Google Analytics Measurement Protocol version 2

The Google Analytics Measurement Protocol allows users and developers to make HTTP requests directly to the Google Analytics endpoint in order to measure how users interact from any environment/platform.

Since Google announced the new APP+WEB properties back in summer, we noticed that the &v parameter, which used to hold a fixed value of 1, now holds a value of 2 in our hit requests. This implicitly means that at some point a new version of the Measurement Protocol is going to be released.

I tried to reverse-engineer all the details I could about the parameters used on this new upcoming protocol.

Please have in mind that the , and I’m publishing all the info I was able to gather.

Introduction

The new Measurement Protocol comes with some great improvements over the version 1 that we’re used to seeing in our Universal Analytics hits.

I’d try to think about this new protocol as an enhanced version of the previous one. They even share some parameters.

What’s new on the version 2 protocol

This new Measurement Protocol seems to have been designed with some performance optimizations in mind.

The first thing we need to keep in mind is that APP+WEB no longer has “hit types”: everything we may end up sending to APP+WEB is an “event” that may (or may not) be accompanied by parameters.

There are 2 groups of parameters in the APP+WEB Measurement Protocol.
Let’s think about them as the event parameters’ “scope”.

Event Related Parameters ( ep.* , epn.* )
User Related Parameters ( up.* , upn.* )

The parameters also accept 2 different value types: strings ( ep.* , up.* ) and numbers ( epn.* , upn.* ).
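As a quick illustration of both value types, the prefix can be picked from the JavaScript type of each value. This is a minimal sketch based on the ep.* / epn.* prefixes documented in this post, and `toEventParams` is a hypothetical helper name, not something exposed by gtag.

```javascript
// Prefixes each event parameter depending on its value type:
// numbers go under epn.*, everything else is sent as a string under ep.*
function toEventParams(params) {
  var out = {};
  Object.keys(params).forEach(function (key) {
    var value = params[key];
    var prefix = (typeof value === 'number' && isFinite(value)) ? 'epn.' : 'ep.';
    out[prefix + key] = String(value);
  });
  return out;
}

console.log(toEventParams({ percent_scrolled: 90, no_bounce_time: '5sec' }));
```

The same approach would apply to the user-scoped parameters, swapping the prefixes for up.* and upn.* .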

Batched Events

By default the APP+WEB protocol now allows sending batched events, meaning that with a single hit request we’ll be able to send multiple events. I know this is not new at all: if we ever needed to debug an app implementation, we’d have noticed that the version 1 protocol allowed us to send batched hits ( via the /batch endpoint ).

In any case v2 comes with some extra enhancements compared with the legacy version:

Events within a single hit request share parameters, so the hit payload will be smaller. For example, it wouldn’t make much sense to send the &dl ( document location ) parameter for every event if that value is shared across all the events within the current hit.
POST is now the only accepted method. This will bypass the old GET 1082 bytes limit.

Debugging

Debugging the new Measurement Protocol v2 has become even easier, since the new properties offer a DebugView.

In order to have our hits showing up here, we’ll need to add a _dbg=1 parameter to our hits.

&_dbg=1

Then our hits will show up in the DebugView report in real time, making our debugging efforts much easier than they currently are.

Turning on the debug on the web based library

If you’re working on a website-based implementation, you can turn on the “official” debugging logs just by loading the GTAG container with the &dbg={{randomNumber}} parameter:

https://www.googletagmanager.com/gtag/js?id=G-XXXXXXX&l=dataLayer&cx=c&dbg=918

This will turn on the debug output into our browser, giving us a log of detailed info about what’s happening.

Building a request

APP+WEB hits need to go to a new endpoint that is located on the following URL:

https://www.google-analytics.com/g/collect

As we mentioned in our technical overview of the new APP+WEB properties, the hits are now built in 2 separate parts:

URL QueryString will hold the common parameters
Request Payload ( POST ), this will hold the events related data.

The Request Payload will only be present when there is more than 1 event in the current hit request. If the hit only contains one event, its parameters will be attached to the query string, like the rest of the common shared parameters.

The following code will help us understand how a hit should be built, and also how to send it to the APP+WEB endpoint using the navigator.sendBeacon function.

// APP+WEB Endpoint
var endPoint = 'https://www.google-analytics.com/g/collect';

// Base Event Model for Web Hit
var eventModel = {
    v: 2,
    tid: 'G-XXXXXXXX-0',
    _p: Math.round(2147483647 * Math.random()),
    sr: screen.width + 'x' + screen.height,
    _dbg: 1,
    ul: (navigator.language || "").toLowerCase(),
    cid: '1908161148.1586721292',
    dl: 'https://appweb.thyngster.com/',
    dr: '',
    dt: 'APP + WEB Measurement Protocol version2 DEMO',
    sid: new Date() * 1,
    _s: 1
}

// A queue to batch our events
var events = [];

var requestQueryString;
var requestBody;

// Let's push some events 
events.push({
    'en': 'pageview'
});
// Second Event
events.push({
    'en': 'scroll',
    '_et': '5000',
    'epn.percent_scrolled': '90'
});
// Another more event
events.push({
    'en': 'useless_no_bounce_event',
    '_et': '5000',
    'ep.no_bounce_time': '5sec'
});

// Is there any event in our queue?
if (events.length > 0) {
    // If there's only one event, we'll not pushing a body within our request
    if (events.length === 1) {
        Object.assign(eventModel, events[0]);
    } else {
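        requestBody = events.map(function(e) {
            // Values are URL-encoded here too, so '&' or '=' inside a value can't break the payload
            return (Object.keys(e).map(key=>key + '=' + encodeURIComponent(e[key])).join('&'));
        }).join("\n");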
    }
    requestQueryString = Object.keys(eventModel).map(key=>key + '=' + encodeURIComponent(eventModel[key])).join('&');
    navigator.sendBeacon(endPoint + '?' + requestQueryString, requestBody);
}

Measurement Protocol Version 2 hit example . Multiple Events.

APP + Web Measurement Protocol v2 Hit Example . Just 1 Event

Parameters Reference

Request Parameters

These parameters are available across all hits. They are related to the current hit.

Parameter	Value Type	Value
v	int	Protocol Version
*tid*	string	Stream ID ( G-XXXXXXXXX )
*cid*	string	Client ID Value
sid	string	Session ID . ( current session start TimeStamp )
sr	string	Screen Resolution
*_dbg*	bool	Debug Switch
ul	string	User Language
*_fid*
*_uci*	bool
_p
*gtm*	string	Container Hash
_s	integer	Session Hits Count

Shared Parameters

Parameter	Value Type	Value
dl	string (url)	Document Location
dr	string (url)	Document Referrer
dt	string	Document Title
*sid*	string	Session ID
*sct*	integer	Session Count
*seg*	boolean	Session Engagement
*_fv*	bool	First Visit
*_nsi*	bool	New Session Id
*_ss*	bool	Session Start
cu	string	Currency Code
_c

Event Parameters

Parameter	Value Type	Value
en	string	Event Name
*_et*	integer	Event Time
up.*	string	User Parameter String
upn.*	number	User Parameter Number
ep.*	string	Event Parameter String
epn.*	number	Event Parameter Number

Ecommerce

NOTE: I want to add that this was live in the latest gtag version one week ago, and it seems it has since been removed. In any case, I wouldn’t expect changes in the final release.

We’re splitting the parameters related to Ecommerce into 3 categories. We need to keep in mind that APP+WEB has 2 main groups of models for Enhanced Ecommerce: the Products Model and the Promotions Model.

Products Model: it’s used in every single ecommerce event that is sent to Google Analytics, which includes product listings, product clicks, product detail views, adds to cart, removes from cart, checkouts, purchases and refunds.

Promotions Model: this is the second model, used for promotion tracking in Enhanced Ecommerce. Since promotions are not directly related to a product, this is a totally separate model in APP+WEB.

Product Items ( Shared Product Related data )
Product List Details ( Product Lists Related data , this goes along with Product Items )
Promotions

Product Items

Product items are sent under their own incremental keys, &pr1, &pr2 … &prN. Each of these parameters will hold all the product model info.

Example:

&pr1=idP12345~nmAndroid Warhol T-Shirt~lnSearch Results~brGoogle~caApparel/T-Shirts~vaBlack~lp1~qt2~pr2.0

As you can see, we can split the data within this parameter by the tilde character ( ~ ) to get a proper Product Model:

id: P12345
nm: Android Warhol T-Shirt
ln: Search Results
br: Google
ca: Apparel/T-Shirts
va: Black
lp: 1
qt: 2
pr: 2.0
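The split described above can be sketched in a few lines. This is a minimal sketch under one big assumption: every key is exactly two characters long, which holds for the example string but not for the ca2 … ca5 category levels, so those would need extra handling. `parseProductItem` is a hypothetical helper name.

```javascript
// Splits a &prN value into a key/value object.
// Assumption: every key is exactly two characters long (true for the example above,
// but the ca2..ca5 category levels would need extra handling).
function parseProductItem(raw) {
  var item = {};
  raw.split('~').forEach(function (chunk) {
    if (chunk.length < 2) {
      return; // nothing usable in this chunk
    }
    item[chunk.slice(0, 2)] = chunk.slice(2);
  });
  return item;
}

var product = parseProductItem(
  'idP12345~nmAndroid Warhol T-Shirt~lnSearch Results~brGoogle~caApparel/T-Shirts~vaBlack~lp1~qt2~pr2.0'
);
console.log(product);
```

All the values come back as strings, so the quantity and the price would still need to be cast to numbers before doing any math with them.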

Parameter		Value Type	Value
pr[0-9]	id	string	Product ID/Sku
	nm	string	Product Name
	br	string	Product Brand
	ca	string	Product Category Hierarchy Level 1
	*ca2*	string	Product Category Hierarchy Level 2
	*ca3*	string	Product Category Hierarchy Level 3
	*ca4*	string	Product Category Hierarchy Level 4
	*ca5*	string	Product Category Hierarchy Level 5
	va	string	Product Variant
	pr	number	Product Unit Price
	qt	integer	Product Quantity
	cp	string	Product Coupon
	ds	number	Product Discount

Product Impressions

These are the Measurement Protocol parameters related to product impressions. They are complementary to the product items. Expect these on the product impression and product click events.

Parameter	Value Type	Value
ln	string	List Name
li	string	List ID
lp	string	List Position

Transaction Related Data

The next table shows the parameters related to the transaction info.

Parameter	Value Type	Value
ep.transaction_id	string	Transaction ID
ep.affiliation	string	Transaction Affiliation
epn.value	number	Transaction Revenue
epn.tax	number	Transaction Tax
epn.shipping	number	Transaction Shipping
ep.coupon	string	Transaction Coupon
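
Reading the table, string values seem to travel under an ep. prefix and numeric values under an epn. prefix. The following is a rough sketch of that rule only: buildEventParams is a hypothetical name, the prefix rule is inferred from the table rather than taken from any official documentation, and the values are sample data.

```javascript
// Hypothetical helper: prefixes event parameters following the table above,
// "epn." for numbers and "ep." for everything else (inferred, not official).
function buildEventParams(params) {
  return Object.keys(params).map(function (name) {
    var value = params[name];
    var prefix = typeof value === 'number' ? 'epn.' : 'ep.';
    return encodeURIComponent(prefix + name) + '=' + encodeURIComponent(String(value));
  }).join('&');
}

// Sample transaction data, made up for the example
console.log(buildEventParams({
  transaction_id: 'T-1001',
  affiliation: 'Online Store',
  value: 25.5,
  tax: 4.5
}));
```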

Promotions

And finally, the next table shows the parameters related to promotion tracking. We should expect these parameters to show up in the promotion view and promotion click events.

Parameter	Value Type	Value
pi	string	Promotion ID
pn	string	Promotion Name
cn	string	Creative Name
cs	string	Creative Slot (Position)
lo	string	Location ID

April 13, 2020

Tracking the Protocol version in Google Analytics via Google Tag Manager
Whether you work in SEO or not, I’m sure you’re aware of how important WPO ( Web Performance Optimization ) is, and that of course includes how fast your site loads. The faster it loads the better for your users ( and better for the conversion rates, they say … ).

At this point you may have heard about HTTP/2 ( 2015 ), which is the replacement for the old HTTP/1.1 ( 1997 ). You may even have heard about HTTP/3 ( last draft Feb 2020 ), an even more modern version of the Hypertext Transfer Protocol, which runs over the QUIC transport layer protocol and therefore over UDP instead of TCP.

Ok, I know all this may be too much technical detail, but I have some clients that run several different websites/servers, and they need to track how their sites perform.

So, this time we’re going to learn how to track the protocol version used for loading the current page and push it to Google Analytics as a Custom Dimension.

We’ll need to create the following Custom JavaScript Variable in Google Tag Manager. We’ll be using it later in our Google Analytics Tags.
```
// getProtocolVersion()
function(){
    // Search the Performance API for the "navigation" entry and grab the protocol if available
    var perf = window.performance;
    var navEntries = perf && typeof perf.getEntriesByType === "function" ? perf.getEntriesByType("navigation") : [];
    if(navEntries.length > 0 && navEntries[0].nextHopProtocol){
        return navEntries[0].nextHopProtocol;
    // chrome.loadTimes() is deprecated, but it's worth checking in case the Performance API fails :)
    // We check that the function exists before calling it, so we never throw here
    }else if(window.chrome && typeof window.chrome.loadTimes === "function" && window.chrome.loadTimes().connectionInfo){
        return window.chrome.loadTimes().connectionInfo;
    }else{
        // If nothing is available let's record the scheme
        return document.location.protocol ? document.location.protocol.match(/[^:]*/)[0] : "(not set)";
    }
}
```
This piece of code mainly relies on the browser’s window.performance API. If it’s not available for any reason ( old browsers ), the code falls back to the page scheme ( http / https ), and to “(not set)” if even that is missing. ( NOTE: There’s a deprecated API in Chrome browsers, chrome.loadTimes(), that we check first in case performance is not available. )

What we do is look for the “navigation” entry type in the Performance API, since we only need the protocol details of the main HTML request ( the request that contains our HTML source ).

After that we should be able to see the info in the preview mode, check the following screenshot:

Now we just need to create a new custom dimension ( hit scope ) and map its index to this newly created variable, or pass the value as a parameter of the page_view event if you’re already using APP+WEB properties.
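
For sites that send hits with analytics.js directly instead of through a GTM tag, the mapping could look roughly like the sketch below. setProtocolDimension is a hypothetical name, gaFn stands for the analytics.js command queue ( window.ga in a browser ), and the dimension index 5 is just an assumption; use the index created in your own property.

```javascript
// Sketch: attaches the protocol value to a custom dimension via analytics.js.
// "gaFn" is the command queue (window.ga in a browser). The dimension index
// is an assumption and must match the one created in the GA property.
function setProtocolDimension(gaFn, protocol, dimensionIndex) {
  var fieldName = 'dimension' + dimensionIndex;
  gaFn('set', fieldName, protocol || '(not set)');
  return fieldName;
}

// Usage with a stub, so the sketch can run outside the browser
var recordedCalls = [];
var gaStub = function () { recordedCalls.push([].slice.call(arguments)); };
setProtocolDimension(gaStub, 'h2', 5);
console.log(recordedCalls);
```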
March 23, 2020
Tracking Android In-App visits in Google Analytics
This is going to be a quick post about how to track in-app visits from Android devices.

When an Android app opens a website in a webview ( in-app visit ), the visit usually comes with a special referrer that starts with the “android-app” string. Here you can see a log line showing what such a referrer looks like.
```
77.XXX.XXX.XXX - - [20/Mar/2020:11:20:10 +0000] "GET /in-app-test HTTP/1.0" 200 1580 "android-app://org.telegram.messenger" "Mozilla/5.0 (Linux; Android 10; GM1913) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Mobile Safari/537.36"
```
Since this is a non-standard referrer format ( it doesn’t start with http ), we could easily detect these visits with these rules:
- It doesn’t start with “http”
- It isn’t an empty value
Refer to the following trigger for Google Tag Manager:

This will only fire on the landing page ( subsequent pageviews will have a referrer starting with http ), so we’ll just have one event per session. We could, for example, fire a non-interaction event to Google Analytics.

And then the events will start showing up in the real-time reports.

We could also calculate this within a Custom JavaScript Variable in Google Tag Manager and use it to force the visit attribution if needed, since Google Analytics ignores non-standard referrers and reports them as Direct Traffic.
```
function(){
  if (!document.referrer.match(/^http.*/) && !document.referrer.match(/^$/)){
      return document.referrer;
  }  
}
```
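If the value is going to be used for attribution, the app package name is usually more useful than the whole referrer. A minimal sketch, assuming the referrer always follows the android-app://package shape seen in the log line above; getAndroidAppPackage is a hypothetical name.

```javascript
// Hypothetical helper: extracts the app package from an "android-app://" referrer.
// Returns undefined for any other referrer (including empty ones).
function getAndroidAppPackage(referrer) {
  var match = /^android-app:\/\/([^\/?#]+)/.exec(referrer || '');
  return match ? match[1] : undefined;
}

console.log(getAndroidAppPackage('android-app://org.telegram.messenger'));
```

In the browser you would call it with document.referrer.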
Before someone asks: it’s not possible to detect in-app visits from iOS afaik, and this won’t help with tagging apps that open links in a standalone browser ( like WhatsApp does ).

Also, I’m not 100% sure that all Android versions/apps report things the same way, but it seems to be common on the latest Android versions and major apps.
March 20, 2020
#Discussion :: GDPR Compliance – Google Analytics Setup Proposal
NOTE: I want to start this post with a big disclaimer: I’m not publishing it to tell anyone how they should set up their Google Analytics tracking to comply with the GDPR / CCPA.

The goal of this post is to start an open discussion about the reliability of the method described here. Any final decision should be taken by the site owners under their own responsibility.

One of the biggest issues I ( and my clients ) hit when implementing a hard “cookie-consent wall” is that they are likely to lose all the attribution info for, at least, all the people that bounce. This can be a disaster if you use Google Analytics to report on how your marketing investments are working ( not to mention losing the pageview and session info for that large amount of traffic ).

Let me show you my proposal for setting up Google Analytics while the user hasn’t yet selected an option for their cookie preferences:

So, what are we doing here?
- If the current user hasn’t yet selected a preference, we’ll send a pageview hit to Google Analytics
- This is not a standard hit/tracker initialization. It’s a stateless tracker with all cookie writing disabled, IP Anonymization enabled and Ads Features forcibly set to false.
```
if (!userConsent) {
  ga('create', 'UA-123123123-123', 'auto', {
    'storage': 'none',
    'storeGac': false,
    'anonymizeIp': true,
    'allowAdFeatures': false
  });
  // We'll save the current clientId into a variable.
  // If the user gives consent later on, we'll use it
  // to write the cookie
  ga('set', 'customTask', function(tracker) {
    window._gacid = tracker.get('clientId');
  });
  ga('send', 'pageview');
}
```
At this point, when the user lands we’ll send a pageview in order to track that session start, but no cookie will be used ( if the user reloads the page, a new clientId will be generated ). If at some point the user accepts the cookies, we’ll write the user’s randomly generated clientId into the cookie and we’ll be able to properly track the user journey.

All the tracking happens ( imo ) in a first-party context, and we’re respecting the user’s privacy while they make a decision. It’s just an extra “anonymized” session-start hit that lets us keep a view of where our traffic is coming from.

Of course, this should stop once the user has chosen not to be tracked: it should only be used while our “consent-cookie” is not present. From that point on, we should obey whatever the cookie states.

I really feel this respects the GDPR since there won’t be any cookies if the user doesn’t explicitly allow them, and we’re taking extra steps to protect the user’s privacy in every other way we can when sending the hit.

In any case, I’m not a lawyer nor an expert on user privacy, so I’d love to have feedback from other people on this.

DISCLAIMER: This post is NOT meant to show a law-approved way to use Google Analytics. Please get proper advice from a user-privacy expert or from your lawyer before implementing your tracking the way it is shown in this post.
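
One way the “write the clientId into the cookie” step could look once consent is given is sketched below. It is only a sketch under assumptions: enableTrackingAfterConsent is a hypothetical name, gaFn stands for window.ga, savedClientId for the window._gacid value stored earlier, and removing the stateless tracker before creating the new one is my own choice for the example.

```javascript
// Sketch: swap the stateless tracker for a regular one after consent,
// reusing the clientId saved by the stateless tracker.
// "gaFn" stands for window.ga and "savedClientId" for window._gacid.
function enableTrackingAfterConsent(gaFn, savedClientId) {
  var fields = {};
  if (savedClientId) {
    // clientId can only be set when the tracker is created
    fields.clientId = savedClientId;
  }
  // Remove the default ( stateless ) tracker so the new settings apply
  gaFn('remove');
  // Default cookie storage is used here, so the _ga cookie gets written
  gaFn('create', 'UA-123123123-123', 'auto', fields);
  return fields;
}

// Usage with a stub, so the sketch can run outside the browser
var consentCalls = [];
enableTrackingAfterConsent(function () {
  consentCalls.push([].slice.call(arguments));
}, '1234567890.1584697210');
console.log(consentCalls);
```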
March 18, 2020

sameSite Automated Fix and status reporting tool

It has been a hard week with all these vendors announcing the arrival of the Four Horsemen of the Cookies Apocalypse.

There are a lot of changes coming when we talk about cookies ( 1st party, 3rd party ), ITP, GDPR, CCPA, etc. I understand it may be a terrible headache for anyone, but we need to keep calm.

The latest update has come from Google Chrome, which was expected to start blocking cookies not containing the sameSite attribute on 4th February. Luckily they have postponed it by around 2 weeks ( for now ).

One of the main concerns about this latest Chrome update ( version 80 ) is that it’s not up to us to fix things ( or is it? ). If a cookie is being set via JS, the owner of that JS is the one in charge of setting the cookie properly. So we may expect some old libraries not being updated on time, or some vendors not even caring about handling this properly.

In order to deal with the sameSite switch for cookies, I’m releasing today a JS snippet that hooks into the document.cookie execution flow and takes care of two main tasks:

Reporting all the cookies that are not setting the sameSite value properly ( remember they need to have both the sameSite and the Secure switches )
If we decide to turn it on, the script will add the missing sameSite and Secure parameters automatically!

On the window.load event, the script I’m sharing will report a list of the cookies that didn’t set either the sameSite or the Secure switch. For each one it reports the cookie name, whether the cookie setting has been autofixed, and the original and fixed cookie setting strings.

{
'event': 'samesite-cookies-report',
'cookiesList': [{
                            'cookieName': {{cookie.name}},
                            'cookieSameSite': {{cookie.sameSite}},
                            'cookieSecure': {{cookie.secure}},
                            'autofixed': {{autoFix}},
                            'originalCookieString': {{cookie.string}},
                            'fixedCookieString': {{cookie.fixedString}}
}]
}

Then, based on the samesite-cookies-report event in Google Tag Manager, you could push these details as an event to Google Analytics or report them anywhere else.

The main point of this script is being able to monitor any cookie being set anywhere on our website, so we can contact its stakeholder to have it fixed as soon as possible.
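
As an illustration of the Google Analytics option, the report could be flattened into event fields along these lines. This is a sketch: buildCookieReportEvent is a hypothetical name and the category/action values are made up for the example; only the field names ( hitType, eventCategory, eventAction, eventLabel, nonInteraction ) are the analytics.js ones.

```javascript
// Hypothetical helper: turns the cookiesList from the report into
// analytics.js event fields. Category/action values are made up for the example.
function buildCookieReportEvent(cookiesList) {
  var names = (cookiesList || []).map(function (cookie) {
    return cookie.cookieName;
  });
  return {
    hitType: 'event',
    eventCategory: 'samesite-cookies-report',
    eventAction: names.length ? 'cookies-missing-attributes' : 'all-cookies-ok',
    eventLabel: names.join('|'),
    nonInteraction: true
  };
}

console.log(buildCookieReportEvent([
  { cookieName: '_sample_a', cookieSameSite: false, cookieSecure: true },
  { cookieName: '_sample_b', cookieSameSite: false, cookieSecure: false }
]));
```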

(function() {
  try {
    // Set this to true if you want to automatically add the sameSite attribute        
    var autoFix = false;
    var cookiesReportList = [];

    // Detecting if the current browser is a Chrome >=80
    var browserDetails = navigator.userAgent.match(/(MSIE|(?!Gecko.+)Firefox|(?!AppleWebKit.+Chrome.+)Safari|(?!AppleWebKit.+)Chrome|AppleWebKit(?!.+Chrome|.+Safari)|Gecko(?!.+Firefox))(?: |\/)([\d\.apre]+)/);
    var browserName = browserDetails[1];
    var browserVersion = browserDetails[2];
    var browserMajorVersion = parseInt(browserDetails[2].split('.')[0]);

    // We only want to hook the cookie behavior if it's Chrome +80 
    if (browserName === 'Chrome' && browserMajorVersion >= 80) {
      var cookie_setter = document.__lookupSetter__('cookie');
      var cookie_getter = document.__lookupGetter__('cookie');

      Object.defineProperty(document, "cookie", {
        get: function() {
          return cookie_getter.apply(this, arguments);
        },
        set: function(val) {
          var cookie = {
            name: '',
            sameSite: false,
            secure: false,
            parts: val.split(';'),
            string: val
          }
          cookie.parts.forEach(function(e, i) {
            var key = e.trim();
            cookie.parts[i] = e.trim();
            if (i === 0) {
              cookie.name = key.split('=')[0];
            }
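            // Attribute names are case-insensitive ( SameSite, Secure ... )
            // and we skip the first part, which is the cookie name=value pair
            if (i > 0 && key.match(/^samesite\s*=/i)) {
              cookie.sameSite = true;
            }
            if (i > 0 && key.match(/^secure$/i)) {
              cookie.secure = true;
            }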
          });
          if (cookie.sameSite === false || cookie.secure === false) {
            if (autoFix === true && document.location.protocol==="https:") {
              // Only strip the last character when it's a trailing semicolon
              if (arguments[0][arguments[0].length - 1] === ';') {
                arguments[0] = arguments[0].substring(0, arguments[0].length - 1);
              }
              if (cookie.sameSite === false) {
                arguments[0] = arguments[0] + '; sameSite=None';
              }
              if (cookie.secure === false) {
                arguments[0] = arguments[0] + '; secure';
              }
            }
            cookiesReportList.push({
              'cookieName': cookie.name,
              'cookieSameSite': cookie.sameSite,
              'cookieSecure': cookie.secure,
              'autofixed': autoFix,
              'originalCookieString': cookie.string,
              'fixedCookieString': arguments[0]
            });
          }
          return cookie_setter.apply(this, arguments);
        }
      });
    }
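    window.addEventListener('load', function(event) {
      // Make sure the dataLayer exists before pushing the report
      window.dataLayer = window.dataLayer || [];
      window.dataLayer.push({
        'event': 'samesite-cookies-report',
        'cookiesList': cookiesReportList
      });
    });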
  } catch (err) {}
})();

At the top of the code you may see the following line:

var autoFix = false;

Ok, if we change this to true, the script will automatically take care of adding the missing parts accordingly 🙂

One thing to keep in mind is that we need this code to run as early as possible in the page execution flow. If we’re setting this up via GTM, we’ll need to fire this tag on the first event on the page ( most of the time that will be “All Pages” ), and give it some extra priority:

WARNING: If you only plan to use this script as a reporting tool, you can stay safe. If you plan to use the autofixing feature, please keep in mind that I only tested it on some sites of mine, so it’s your responsibility to properly set it up and test it on your site before going live.

If you’re really interested in knowing more about how cookies are updating their behaviour to protect users’ privacy, the best place is https://www.cookiestatus.com/ , a site where my friend Simo is collecting all the info about almost all the mainstream browsers out there.

February 9, 2020

The Definitive Approach for preventing duplicate transactions on Google Analytics – Using a Universal CustomTask

It’s been a long time since I wrote my post about how to prevent duplicate transactions on Google Analytics. At that point, the customTask wasn’t a thing on the Google Analytics JS library, and the approach consisted of writing a cookie on each transaction and then working with some blocking triggers.

It’s a working solution for sure, but based on all the feedback I had over the years, it was not easy for people to understand. Things got even worse with Enhanced Ecommerce, since there’s no specific hit type to block ( remember that on EEC, any hit can be used as a transport for the Ecommerce data ).

That’s why I’m releasing a completely new approach to prevent duplicate transactions on Google Analytics. It’s based on the customTask functionality and it will work out of the box regardless of how you have set up your Enhanced Ecommerce Tracking. Sounds good, right?

If you wonder how we are going to achieve this, take a look at the following flow chart:

Basically, we’ll check the current hit payload to find out if it carries any transaction-related data. Only in that case, and only if that transaction has already been tracked on the current browser, we’ll remove the e-commerce related data from the hit ( we’ll be using a cookie to keep track of recorded transactions, just as we used to do in our old solution ).

To have this working, the only thing we need to do is to create a new Variable in Google Tag Manager with the following code for our “duplicate transactions blocking customTask”.

*Note that I tried to add as many comments as I could in the customTask code, so please take some time to understand how it works! 🙂

function() {
  return function(customTaskModel) {
    var originalSendHitTask = customTaskModel.get('sendHitTask');
    // Helper Function to grab the rootDomain
    // Helps with setting the cookie on the highest possible domain level
    var getRootDomain = function() {
      var domain = document.location.host;
      var rootDomain = null;
      if (domain.substring(0, 4) == "www.") {
        domain = domain.substring(4, domain.length);
      }
      var domParts = domain.split('.');
      for (var i = 1; i <= domParts.length; i++) {
        document.cookie = "testcookie=1; path=/; domain=" + domParts.slice(i * -1).join('.');
        if (document.cookie.indexOf("testcookie") != -1) {
          var rootDomain = domParts.slice(i * -1).join('.');
          document.cookie = "testcookie=1; expires=Thu, 01 Jan 1970 00:00:01 GMT; path=/; domain=" + domParts.slice(i * -1).join('.');
          break;
        }
      }
      return rootDomain;
    };
    // The custom Task
    customTaskModel.set('sendHitTask', function(model) {
      try {
        // Let's grab the hit payload
        var rawHitPayload = model.get('hitPayload');
        // We're converting the payload string into a key=>value object
        var hitPayload = (rawHitPayload).replace(/(^\?)/, '').split("&").map(function(n) {
          return n = n.split("="),
            this[n[0]] = n[1],
            this
        }.bind({}))[0];

        // Let's check if this hit contains a transaction info
        // If the hit contains a &pa parameter and its value equals "purchase", this hit contains transaction info
        if ((hitPayload && hitPayload.pa && hitPayload.pa === "purchase")) {
          // Let's grab the previous transactions saved in our cookie ( if any )
          var transactionIds = document.cookie.replace(/(?:(?:^|.*;\s*)__transaction_ids\s*\=\s*([^;]*).*$)|^.*$/, "$1") ? document.cookie.replace(/(?:(?:^|.*;\s*)__transaction_ids\s*\=\s*([^;]*).*$)|^.*$/, "$1").split('|') : [];
          // if the current transaction ID is already logged into our cookie, let's perform the magic
          if (transactionIds.length > 0 && transactionIds.indexOf(hitPayload.ti) > -1) {            
            // EEC hit keys magic regex. The following regex will match all the payload keys that are related to the ecommerce
            var eecRelatedKeys = /^(pa|ti|ta|tr|ts|tt|tcc|pr(\d+)[a-z]{2}((\d+)|))$/;
            // Now we'll loop through all the payload keys and we'll remove the ones that are related to the ecommerce
            for (var key in hitPayload) {
              if (key.match(eecRelatedKeys)) {
                delete(hitPayload[key]);
              }
            }
            // Now let's update the payload into the hit model! :)
            model.set('hitPayload', Object.keys(hitPayload).map(function(key) {
                return key + '=' + hitPayload[key];
            }).join('&'), true);            
          } else {
            // IF the execution arrived to this point. It means that this is a NEW transaction
            // Then, we'll do nothing to the payload but instead we'll be adding the current transaction ID to our cookie
            transactionIds = [hitPayload.ti].concat(transactionIds);
            var _expireDate = new Date();
            // This cookie will expire in 2 years
            _expireDate.setMonth(_expireDate.getMonth() + 24);
            // The expires attribute expects a GMT/UTC date string
            document.cookie = "__transaction_ids=" + transactionIds.join('|') + ";expires=" + _expireDate.toUTCString() + ";domain=" + getRootDomain() + ";path=/";
          }

        }
        // Send the hit
        originalSendHitTask(model);
      } catch (err) {
        // In case the above fails, we want to send the hit in any case!!!
        originalSendHitTask(model);
      }
    });
  };
}

We’re done. From now on this customTask will take care of detecting transactions traveling on the hits, writing them to a cookie and removing the transaction data from the hit if needed!

You don’t need a blocking trigger
You don’t need an extra condition on your firing trigger
You don’t need a variable for checking for the value of the cookie
It doesn’t matter how you’ve set up your e-commerce tracking; the customTask will work regardless of your current approach ( sending it with the default pageview, or an event, or using the dataLayer data, or based on a variable that builds up the e-commerce data for GTM ).
You won’t need to block your default pageview on the confirmation page to have the ecommerce working without duplicates.

It will just simply work!

Of course, you may want to block some other tags from firing. Since the customTask writes all the data into a cookie, it is available for you to use as needed. Just grab the “__transaction_ids” cookie value and search it for your already recorded transactions.
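
That lookup could be sketched like this. isTransactionRecorded is a hypothetical name; the only things taken from the customTask are the cookie name and the “|” separator.

```javascript
// Hypothetical helper: checks whether a transaction ID is already stored in
// the "__transaction_ids" cookie written by the customTask ( IDs joined by "|" ).
function isTransactionRecorded(cookieString, transactionId) {
  var match = /(?:^|;\s*)__transaction_ids=([^;]*)/.exec(cookieString || '');
  var ids = match && match[1] ? match[1].split('|') : [];
  return ids.indexOf(String(transactionId)) > -1;
}

// Sample cookie string, made up for the example
console.log(isTransactionRecorded('_ga=GA1.2.1.2; __transaction_ids=T1001|T1000', 'T1000'));
```

In the browser you would pass document.cookie as the first argument. Keep in mind that the customTask stores the IDs exactly as they appear in the hit payload ( URL-encoded ), so an ID with special characters needs to be encoded before comparing.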

November 21, 2019
