Blog

  • Cross-Domain Tracking on Google Analytics 4 (GA4)

    Now that the Universal Analytics deprecation has been officially announced, it’s time to start writing some technical posts about how things work on Google Analytics 4.

    The first one is about how Cross-Domain Tracking works (not only for GA4 but for any GTAG.JS-based tool).

    Some time ago (about three years back) I noticed a new “_gl” parameter being attached to some links by Google Analytics. I’d say it stands for “Google Linker”, judging by how it supports cross-domain tracking not only for Google Analytics but also for the other pixels based on gtag containers (AdWords, DoubleClick). At that point I started to reverse engineer the code to see how it works. If you’re interested, you can take a quick look at the draft notes I wrote at that time in this Gist:

    To date, Google has officially published only one way to set up cross-domain tracking: adding our domains in the Data Stream configuration section (Admin > Data Streams > Select the stream > More Tagging Settings > Configure your domains).

    We may think that something is happening in the backend, but what really happens when we add this configuration in the Admin section is that the GTAG container being served gets some extra settings.

    In this case, when we add some domains to our configuration, the GA4 GTAG container adds a click/submit listener with conditions based on the domains we’ve added. If you’re used to working with Google Tag Manager, think of it as adding a Click Listener and a Form Submit trigger with a triggering condition based on this domain list.

    Please keep in mind that this will also affect the built-in outbound link tracking in Google Analytics 4: the events will not be triggered if the link’s domain name is in the list defined on the admin side of our Data Stream.

    If you read between the lines of the previous paragraphs, you may have guessed that there must be some code in GTAG.js taking care of the cross-domain linking, and this means that we can reverse engineer it.

    How Cross-Domain Tracking works on GA4/GTAG

    The first thing is to look at what the new linker parameter looks like in GA4:

    _gl=1*tp0qzs*_ga*OTYxNDI4MjA4LjE2NDg1NzM2OTM.*_ga_RNYCK86MYK*MTY0OTE3NjMxMy41LjEuMTY0OTE4MDM1OC4w

    We can easily see that this can be split into several values by the “*” character:

    Key              Value
    1                tp0qzs
    _ga              OTYxNDI4MjA4LjE2NDg1NzM2OTM
    _ga_RNYCK86MYK   MTY0OTE3NjMxMy41LjEuMTY0OTE4MDM1OC4w
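
    As a quick illustration of that split, here is a minimal sketch. The function name and the field labels (version, fingerprint, values) are mine, chosen just for this example; they are not names used by gtag.js.

```javascript
// Split a "_gl" value into its parts.
// Assumption: the first item is a fixed version marker, the second one is the
// fingerprint, and everything after that comes as key, value, key, value...
function splitLinkerValue(gl) {
  const parts = String(gl).split('*');
  const result = { version: parts[0], fingerprint: parts[1], values: {} };
  for (let i = 2; i + 1 < parts.length; i += 2) {
    result.values[parts[i]] = parts[i + 1];
  }
  return result;
}

const sampleLinker = '1*tp0qzs*_ga*OTYxNDI4MjA4LjE2NDg1NzM2OTM.*_ga_RNYCK86MYK*MTY0OTE3NjMxMy41LjEuMTY0OTE4MDM1OC4w';
const parsedLinker = splitLinkerValue(sampleLinker);
console.log(parsedLinker.fingerprint);         // "tp0qzs"
console.log(Object.keys(parsedLinker.values)); // [ '_ga', '_ga_RNYCK86MYK' ]
```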

    The first value (if I’m not wrong) is a fingerprint based on the current user agent and browser plugins, plus a time-checking hash. It is used to check that the browser receiving the linker is the same one that generated it, and that it was not generated long ago (it used to have a 2-minute expiration time in the Universal Analytics linker). This prevents cookies from being overridden by mistake when someone shares a link with a linker value on it.
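
    To make the idea more concrete, here is a rough sketch of such a fingerprint, modelled on the reversed analytics.js snippet shown later in this post: a CRC-32 over the user agent, timezone offset, language, a minute counter and the linker payload. The env object, the function names and the exact freshness window are assumptions of mine, so don’t expect the output to match Google’s values.

```javascript
// Standard CRC-32 lookup table (polynomial 0xEDB88320), built once.
const CRC_TABLE = (() => {
  const table = new Array(256);
  for (let n = 0; n < 256; n++) {
    let c = n;
    for (let k = 0; k < 8; k++) {
      c = c & 1 ? (c >>> 1) ^ 0xedb88320 : c >>> 1;
    }
    table[n] = c;
  }
  return table;
})();

function crc32(str) {
  let crc = 0xffffffff;
  for (let i = 0; i < str.length; i++) {
    crc = (crc >>> 8) ^ CRC_TABLE[(crc ^ str.charCodeAt(i)) & 0xff];
  }
  return (crc ^ -1) >>> 0;
}

// env is a hypothetical container for the browser signals. In a page it could be
// { userAgent: navigator.userAgent, timezoneOffset: new Date().getTimezoneOffset(),
//   language: navigator.userLanguage || navigator.language, nowMs: Date.now() }
function linkerFingerprint(env, payload, minutesAgo = 0) {
  const input = [
    env.userAgent,
    env.timezoneOffset,
    env.language,
    Math.floor(env.nowMs / 60000) - minutesAgo, // minute counter
    payload
  ].join('*');
  return crc32(input).toString(36);
}

// Assumption: a linker is accepted if it matches the current minute
// or one of the two previous ones (roughly the 2-minute window mentioned above).
function isFingerprintFresh(fingerprint, env, payload) {
  return [0, 1, 2].some((m) => linkerFingerprint(env, payload, m) === fingerprint);
}
```

    Because the minute counter is part of the hashed string, the same browser produces a different fingerprint every minute, which is what makes an old shared link stop working.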

    Please note that we will have as many _ga_[A-Z]{6,9} keys as GA4 Streams being used on our website, and that the linker will also hold other cookie values, like the AdWords or DoubleClick ones. This will vary depending on your current setup. For now we’ll focus on the Google Analytics 4 ones.

    GA4 Cookies Info

    If we look back into Google Analytics history, we used to have the __utma, __utmb, __utmc and __utmz cookies for Urchin (urchin.js) and the first Google Analytics version (ga.js). At that time, all client/session tracking (the a, b and c cookies) and the attribution info (the z cookie) were calculated client-side. When Universal Analytics was released, all this logic was moved server-side and Google switched to using just one cookie (_ga) to hold the clientId (&cid).

    Now it seems we have switched to a hybrid method, where the session calculation is done client-side again and the attribution is calculated server-side. That’s why we have a new cookie (in addition to the _ga one) that holds the current session_id (&sid), session_count (&sct) and session_time, and, based on this report, also drives the session_start (&_ss) and first_visit (&_fv) internal events.

    Let’s take a look at a typical Google Analytics 4 cookie set:

    _ga:             GA1.1.961428208.1648573693
    _ga_THYNGSTER:  GS1.1.1649176313.5.1.1649181571.0

    There haven’t been any changes to our well-known “_ga” cookie. It still holds the current hostname segments count, a random ID and, lastly, the time when the cookie was first created. Here is a table showing the values of the new “_ga_” stream cookie (I need to double-check 2 of the values to be sure what they are for; that’s why I’ve set them as TBD):

    Value        Description
    GS1          Fixed String: Google Stream 1
    1            Current hostname segments count
    1649176313   Current Session Start Time
    5            Sessions Count
    1            TBD
    1649181571   Last hit Time
    0            TBD
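
    The same breakdown can be sketched in a few lines of code. The property names below are my own labels for the table rows, and the two TBD positions are kept as raw strings since their meaning is not confirmed.

```javascript
// Parse the "_ga" cookie value, e.g. "GA1.1.961428208.1648573693"
function parseGaCookie(value) {
  const p = String(value).split('.');
  return {
    version: p[0],
    hostnameSegments: Number(p[1]),
    clientId: p[2] + '.' + p[3],  // random id + first-seen timestamp
    firstSeen: Number(p[3])
  };
}

// Parse the "_ga_<STREAM>" cookie value following the table above
function parseGa4SessionCookie(value) {
  const p = String(value).split('.');
  return {
    version: p[0],
    hostnameSegments: Number(p[1]),
    sessionStart: Number(p[2]),
    sessionCount: Number(p[3]),
    unknown1: p[4],               // TBD in the table
    lastHit: Number(p[5]),
    unknown2: p[6]                // TBD in the table
  };
}

const gaSession = parseGa4SessionCookie('GS1.1.1649176313.5.1.1649181571.0');
console.log(gaSession.sessionCount);                     // 5
console.log(gaSession.lastHit - gaSession.sessionStart); // 5258 (seconds since the session started)
console.log(parseGaCookie('GA1.1.961428208.1648573693').clientId); // "961428208.1648573693"
```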

    This is how GA4 is able to determine when a session starts, the session duration and the current session count. By also checking the _ga cookie, it is able to determine the first visit time.

    Looking Inside the Google Analytics 4 Cross-Domain Tracking

    If we take a look at the official docs, there’s not much info about how to customize the cross-domain tracking, other than telling us to add our domains in the admin.

    This can be an issue/limitation if our setup is not based only on clicks or form submits. Some examples: wanting to do cross-domain linking to an iFrame, or a site that redirects to a dynamically generated destination page (for example, forms that run a validation and then redirect the user to a search listing page, without doing any form submit at all).

    These situations won’t be handled by GA4, and in these cases we’d want to build the linkerParam for GA4 ourselves so we can attach it wherever we want.

    The good news is that, even if it’s not documented, we can use some global variables to grab the data, and even make use of some available helpers to generate the linker, decorate links or anything else we want 🙂
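
    For the iFrame or redirect scenarios, once we have a linker value in hand, attaching it is just URL manipulation. The helper below is a hypothetical sketch (decorateUrl is my own name, not a gtag.js function), and the linker value in the example is a shortened placeholder.

```javascript
// Append (or replace) the _gl parameter on any URL.
// useFragment=true puts it after the "#" instead of in the query string.
function decorateUrl(url, linkerValue, useFragment = false) {
  const target = new URL(url);
  if (useFragment) {
    const hash = target.hash.replace(/^#/, '');
    target.hash = (hash ? hash + '&' : '') + '_gl=' + linkerValue;
  } else {
    target.searchParams.set('_gl', linkerValue);
  }
  return target.toString();
}

console.log(decorateUrl('https://outgoing.com/landing?step=2', '1*abc123*_ga*OTYx'));
// "https://outgoing.com/landing?step=2&_gl=1*abc123*_ga*OTYx"

// In a page this could be used as, for example:
//   iframe.src = decorateUrl(iframe.src, linkerValue);
//   window.location.href = decorateUrl(destinationUrl, linkerValue);
```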

    The first thing we’ll learn about is the window.google_tag_data.gl object. This global variable holds ALL the cross-domain config and information; that is, the current data model for the Google Linker config.

    It looks like this:

    {
        "decorators": [
            {
                "domains": [
                    /outgoing\.com/
                ],
                "fragment": false,
                "placement": 1,
                "forms": true,
                "sameHost": false
            }
        ],
        "data": {
            "query": {},
            "fragment": {}
        },
        "init": true
    }

    Key                      Description
    decorators[:decorator]   Holds the current decorators for the cross-domain tracking. We may have more than one decorator, since an AdWords gtag container may add one and GA4 another.
    :decorator.domains       An array of regexes to be matched against the currently clicked link
    :decorator.fragment      Should we attach the _gl parameter to the fragment (#)
    :decorator.forms         Is this decorator for “submit” events
    :decorator.placement     The current order preference for applying the decorator
    :decorator.sameHost      N/A
    data{}                   Current Linker Info
    data.query               Cross-linking parameters read from the QueryString (decoded)
    data.fragment            Cross-linking parameters read from the Fragment (decoded)

    It’s simpler than it looks at first sight. This variable lets us know whether there are any “decorators” configured (a decorator is meant to “add” the linkerParam (_gl) to any link or form matching the domains on its list).
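
    As a sketch of how this config can be inspected, the helper below returns the decorators whose domain regexes match a given hostname. It is only my reading of the data model above: the function name is made up, and how gtag.js itself evaluates these fields internally is an assumption.

```javascript
// Return the decorators that would apply to a hostname.
// isForm=true keeps only the decorators flagged for form submits.
function findMatchingDecorators(gl, hostname, isForm = false) {
  const decorators = (gl && gl.decorators) || [];
  return decorators.filter((decorator) => {
    if (isForm && !decorator.forms) return false;
    return (decorator.domains || []).some((re) => re.test(hostname));
  });
}

// Same shape as the object shown above
const glConfig = {
  decorators: [
    { domains: [/outgoing\.com/], fragment: false, placement: 1, forms: true, sameHost: false }
  ],
  data: { query: {}, fragment: {} },
  init: true
};

console.log(findMatchingDecorators(glConfig, 'shop.outgoing.com').length); // 1
console.log(findMatchingDecorators(glConfig, 'example.org').length);       // 0

// In a page: findMatchingDecorators(window.google_tag_data && window.google_tag_data.gl, link.hostname)
```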

    Also, if there was a valid linker parameter on the current page load, the data object will show us which clientIds (cookies) were overridden. Nice!


    Note: In the event of an invalid linker (if it was generated on a different browser or some minutes ago), the linker won’t work and the data here will be empty.

    Trick #1 – Adding a new domain name for the auto-decoration

    There’s a small trick you can use to add new domains dynamically to the GTAG decorator. Remember we said that the settings on the Admin were reflected in some code in our GTAG containers? After checking that code, it turns out we can push new domains programmatically from JS, just like this:

    if(window.google_tag_data && window.google_tag_data.gl && window.google_tag_data.gl.decorators && window.google_tag_data.gl.decorators.length > 0){
         // Add the new domain to the first decorator's domain list
         window.google_tag_data.gl.decorators[0].domains.push(/analytics-ninja\.com/);
    }
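
    If you need to add domains that come from a config rather than a hand-written regex, a small wrapper like the following may help. It is a hypothetical helper of mine: it escapes the domain, skips duplicates and, unlike the snippet above, pushes to every decorator instead of only the first one.

```javascript
// Escape a plain string so it can be used inside a RegExp
function escapeForRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Push a domain to all existing decorators; returns how many were updated.
function addLinkerDomain(gl, domain) {
  if (!gl || !Array.isArray(gl.decorators) || gl.decorators.length === 0) return 0;
  const pattern = new RegExp(escapeForRegex(domain));
  let updated = 0;
  gl.decorators.forEach((decorator) => {
    const domains = decorator.domains || (decorator.domains = []);
    const exists = domains.some((re) => re.source === pattern.source);
    if (!exists) {
      domains.push(pattern);
      updated++;
    }
  });
  return updated;
}

// In a page:
//   addLinkerDomain(window.google_tag_data && window.google_tag_data.gl, 'analytics-ninja.com');
```

    Pushing to every decorator is a design choice for the sketch; if you only want GA4 to decorate the new domain, stick to the specific decorator as in the original snippet.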

    glBridge Utilities

    In case we want to grab the current linker so we can add it ourselves to any link, there is also some good news: analytics.js exposes some utilities for performing this task.

    The utils are available on the window.google_tag_data.glBridge object

    As you can see, these are the same ones we used to have in Universal Analytics for setting the autoLinking, decorating links and generating the linkerParam. We’ll focus on the generate and get ones: the first is equivalent to getLinkerParam, and the second allows us to “unhash” the linker values.

    google_tag_data.glBridge.generate({})

    This function takes an object of clientIds values as an argument and returns a valid “_gl” linker value that we could attach to our links.

    window.google_tag_data.glBridge.generate({
        _ga: '121321321321.2315648466',      
        _ga_THYNGSTER: '1649176313.5.1.1649183273.0'
    });

    As you can see, we need to grab our current cookie values and pass them to the function, and it will return our precious linkerParam 🙂
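
    Grabbing those cookie values could look like the sketch below. It assumes the values passed to generate are the cookie contents without the “GA1.1.” / “GS1.1.” prefix, as in the example above; collectLinkerCookies is a name I made up for the illustration.

```javascript
// Build the { cookieName: value } object from a document.cookie-style string.
// Only _ga and _ga_<STREAM> cookies are collected in this sketch.
function collectLinkerCookies(cookieString) {
  const ids = {};
  String(cookieString).split(/;\s*/).forEach((pair) => {
    const eq = pair.indexOf('=');
    if (eq < 0) return;
    const name = pair.slice(0, eq);
    const value = pair.slice(eq + 1);
    // Drop the "GA1.<n>." or "GS1.<n>." prefix and keep the rest
    const match = /^(?:GA|GS)\d+\.\d+\.(.+)$/.exec(value);
    if (/^_ga(_[A-Z0-9]+)?$/.test(name) && match) {
      ids[name] = match[1];
    }
  });
  return ids;
}

const linkerCookies = collectLinkerCookies(
  '_ga=GA1.1.961428208.1648573693; _ga_THYNGSTER=GS1.1.1649176313.5.1.1649181571.0; other=1'
);
console.log(linkerCookies);
// { _ga: '961428208.1648573693', _ga_THYNGSTER: '1649176313.5.1.1649181571.0' }

// In a page where the bridge is available:
if (typeof window !== 'undefined' && window.google_tag_data && window.google_tag_data.glBridge) {
  console.log(window.google_tag_data.glBridge.generate(collectLinkerCookies(document.cookie)));
}
```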


    google_tag_data.glBridge.get()

    This one is also pretty self-explanatory: it grabs the linker param from the current URL (if present) and returns the decoded client cookie/ID values held in the linker.

    Advice

    Please note that universal.js is likely to be gone in about a year and a half. I don’t expect Google to remove the analytics.js library itself; more likely it will just stop processing hits at some point (or modify the library so it doesn’t fire hits at all). At the moment the gtag.js container doesn’t expose these bridge functions, but it may do so in the near future.

    If you’re looking for some guidance on how to implement this functionality without relying on that library, I’m providing some examples (reversed from the universal.js code).

    If the current linkerParam value is “invalid” (i.e. it was not generated from the same browser, or it was generated long ago), this function will just return an empty object {}

    NOTE: I’m working on a library that fully replicates this analytics.js Google Linker Bridge functionality, to have a future-proof solution for when Universal Analytics is sunset. It will be published in the coming weeks.

    If you’re interested in all this, I’m publishing some proof-of-concept functions that you can use as a base for your own code. This code would need to be adapted to support AdWords, DoubleClick, multiple GA4 cookies, the Google Signals IDs (_gid cookie) and the Google Remarketing cookies (_gac) before it could be called a good replacement. But at this point I’m offering these snippets (all of them reversed/copied from the analytics.js source code):

    var decrypt_cookies_ids = function(a, b) {
            var P = function(a) {
                if (encodeURIComponent instanceof Function) return encodeURIComponent(a);
                F(28);
                return a
            };   
            var m = function(a, b) {
                for (var c in b) b.hasOwnProperty(c) && (a[c] = b[c])
            };
            
            var H = function() {
                var a = {};
                var b = window.google_tag_data;
                window.google_tag_data = void 0 === b ? a : b;
                a = window.google_tag_data;
                b = a.gl;
                b && b.decorators || (b = {
                    decorators: []
                }, a.gl = b);
                return b
            };
                 
            var c = P(!!b);
            b = H();
            b.data || (b.data = {
                query: {},
                fragment: {}
            }, c(b.data));
            c = {};
            if (b = b.data) m(c, b.query), a && m(c, b.fragment);
            return c
        }
    var generateLinkerParam = function(a) {
        // Function to properly grab ID's from Cookies
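        var getCookiebyName = function(name) {
            // Match the exact cookie name at the start of a cookie pair
            var pair = document.cookie.match(new RegExp('(?:^|;\\s*)' + name + '=([^;]+)'));
            // Strip the "GA1.<n>." prefix; undefined if the cookie is missing or has another format
            var id = pair && pair[1].match(/GA1\.[0-9]\.(.+)/);
            return id ? id[1] : undefined;
        };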
    
        // These are the 3 values used by the new linker
        var cookies = {
            _ga: getCookiebyName("_ga"),
            // Google Analytics GA ID
            _gac: undefined,
            // Google Remarketing
            _gid: getCookiebyName("_gid")// Google ID
        };
    
        // Calculate current browser_fingerprint based on UA, time, timezone and language
        // 
        var browser_fingerprint = (function(a, b) {
            // Cache for the CRC-32 lookup table. It has to start out undefined,
            // otherwise the table below is never built.
            var F;
            a = [window.navigator.userAgent, (new Date).getTimezoneOffset(), window.navigator.userLanguage || window.navigator.language, Math.floor((new Date).getTime() / 60 / 1E3) - (void 0 === b ? 0 : b), a].join("*");
            if (!(b = F)) {
                b = Array(256);
                for (var c = 0; 256 > c; c++) {
                    for (var d = c, e = 0; 8 > e; e++)
                        d = d & 1 ? d >>> 1 ^ 3988292384 : d >>> 1;
                    b[c] = d
                }
            }
    
            F = b;
            b = 4294967295;
            for (c = 0; c < a.length; c++)
                b = b >>> 8 ^ F[(b ^ a.charCodeAt(c)) & 255];
            return ((b ^ -1) >>> 0).toString(36);
        }
        )();
    
        // Function to hash the cookie value
        // The following functions takes a string and returns a hash value.
        var hash_cookie_value = function(val) {
            var A, C, D = function(a) {
                A = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.";
                C = {
                    "0": 52,
                    "1": 53,
                    "2": 54,
                    "3": 55,
                    "4": 56,
                    "5": 57,
                    "6": 58,
                    "7": 59,
                    "8": 60,
                    "9": 61,
                    "A": 0,
                    "B": 1,
                    "C": 2,
                    "D": 3,
                    "E": 4,
                    "F": 5,
                    "G": 6,
                    "H": 7,
                    "I": 8,
                    "J": 9,
                    "K": 10,
                    "L": 11,
                    "M": 12,
                    "N": 13,
                    "O": 14,
                    "P": 15,
                    "Q": 16,
                    "R": 17,
                    "S": 18,
                    "T": 19,
                    "U": 20,
                    "V": 21,
                    "W": 22,
                    "X": 23,
                    "Y": 24,
                    "Z": 25,
                    "a": 26,
                    "b": 27,
                    "c": 28,
                    "d": 29,
                    "e": 30,
                    "f": 31,
                    "g": 32,
                    "h": 33,
                    "i": 34,
                    "j": 35,
                    "k": 36,
                    "l": 37,
                    "m": 38,
                    "n": 39,
                    "o": 40,
                    "p": 41,
                    "q": 42,
                    "r": 43,
                    "s": 44,
                    "t": 45,
                    "u": 46,
                    "v": 47,
                    "w": 48,
                    "x": 49,
                    "y": 50,
                    "z": 51,
                    "-": 62,
                    "_": 63,
                    ".": 64
                };
                for (var b = [], c = 0; c < a.length; c += 3) {
                    var d = c + 1 < a.length
                      , e = c + 2 < a.length
                      , g = a.charCodeAt(c)
                      , f = d ? a.charCodeAt(c + 1) : 0
                      , h = e ? a.charCodeAt(c + 2) : 0
                      , p = g >> 2;
                    g = (g & 3) << 4 | f >> 4;
                    f = (f & 15) << 2 | h >> 6;
                    h &= 63;
                    e || (h = 64,
                    d || (f = 64));
                    b.push(A[p], A[g], A[f], A[h])
                }
                return b.join("")
            };
            return D(String(val));
        };
    
        // Now we have all the data Let's build the linker String! =)
        // First value is a fixed "1" value, the current GA code does the same. May change in a future
        return ["1", browser_fingerprint, "_ga", hash_cookie_value(cookies._ga), "_gid", hash_cookie_value(cookies._gid)].join('*');
    };
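
    Going the other way is also handy for debugging. The alphabet used by hash_cookie_value above is base64 with “-”, “_” and “.” in place of “+”, “/” and “=”, so a linker value can be decoded with a sketch like this one (decodeLinkerValue is my own helper name, and the Buffer branch is only there so it also runs outside a browser):

```javascript
// Decode a value produced by the web-safe base64 variant used in the linker
function decodeLinkerValue(encoded) {
  const base64 = String(encoded)
    .replace(/-/g, '+')
    .replace(/_/g, '/')
    .replace(/\./g, '=');
  if (typeof atob === 'function') return atob(base64);
  return Buffer.from(base64, 'base64').toString('binary');
}

console.log(decodeLinkerValue('OTYxNDI4MjA4LjE2NDg1NzM2OTM.'));
// "961428208.1648573693"
console.log(decodeLinkerValue('MTY0OTE3NjMxMy41LjEuMTY0OTE4MDM1OC4w'));
// "1649176313.5.1.1649180358.0"
```

    These are the two values from the sample _gl parameter at the top of the post, and the first one matches the _ga cookie shown in the cookies section.
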
  • Android Debugger 0.0.1 Release . Debugging Native Apps on Android

    This past February I attended the SuperWeek Conference in Egerszalók / HU, where I decided to participate in the Punchcard Prize with a new tool that I had been using internally over the past years. The only difference is that I used to use it through a command-line interface, and for releasing it I built my first Windows application ever (besides an NFO viewer for a European demogroup back in the day).

    After being in a closed beta for a few weeks, and fighting for 2 weeks to get my company certified for a code signing certificate, I’m releasing my Android Debugger to everyone.


    This new tool will allow anyone to easily debug any Firebase (GA4), GTM (limited info) and Universal Analytics (GA3) implementation on any Android app.

    I created some detailed videos, some even with voice-over, showing all the features of the current version and how to use them. These are the main features:

    Firebase ( GA4 )

    View the event batches (including the autogenerated events: session_start, app_background, etc.) in real time, with all their info: event parameters, user properties, audiences and any other internal payload data.

    Google Tag Manager

    This report is a work in progress. Right now it lets you see which containers are being loaded, which events are being triggered and the list of variables being evaluated. (Note that this is not app-based and it will report all the hits coming from the currently connected device.)

    Google Analytics ( GA3 )

    Most of you should be used to the data in this report: it basically reports the Universal Analytics (GA3) hit payloads. (Note that this is not app-based and it will report all the hits coming from the currently connected device.)

    More Features

    But these are not the only features it includes. Two of the most painful points when debugging an Android app are:

    1. Installing the USB Drivers
    2. The need of using libraries or modifying the manifest file

    I have some good news: this is no longer an issue with Android Debugger. You will be able to debug any app installed on your physical device or emulator (yep, I said ANY), without needing to modify anything or ask the developers to make any updates or deploy a debug version.

    And if you’re using Android 11 or later, you’ll be able to pair your device with the debugger just by scanning a QR code. Yep, that’s right: no more risky searches for USB drivers on forums, no more issues trying to get your computer to properly recognize your device!

    Of course if you still prefer it you can install the drivers and debug your device using the USB connection, whatever you prefer!

    And last but not least, you will be able to cast/control your device screen from the debugger, which will allow you to record your debugging sessions (remember the debugger shows the data in real time) or share your debugging with clients/co-workers. How cool is that? I have to mention that this feature is provided by scrcpy (I wish I was remotely capable of writing something like that!).

    How to use it

    Installation

    You can DOWNLOAD the installer from the following link:
    https://www.analytics-debugger.com/tools/android/ . The tool will open as soon as it’s installed.

    Note: The current installer/tool is signed, but Windows needs a certain number of installs before it can verify the application, so some of you may get this message. Just click on the Run Anyway button.

    The first time you run the tool, you’ll be asked to input a license key, don’t worry since this is FREE. Just click on the link provided to create a new account and get a free license.

    More notes

    Some people may get a Windows notification about the app wanting to use the network. This happens when you do the Wireless Pairing, which is basically what Windows is alerting you about. It will only be shown once.

    Support

    I’m offering this tool “as is”, and I’ll try to make improvements and updates as long as my daily work allows me to work on it. If you have any bug report/comment, please reach me on Twitter.

    Links

    Tool Page / Download: https://www.analytics-debugger.com/tools/android/

    Extra (not directly) Related

    Punchcard Prize. It reads: “THE VERY FIRST POST-COVID19 BRONZE PUNCHCARD PRIZE #SUPERWEEK ANALYTICS SUMMIT”
    Me, trying not to collapse during my presentation at SuperWeek 2022. I ended up in 3rd position.

  • GTM/GA Debugger 1.0.0

    It took me almost a year (once again) to have a new version, but this time I did things properly: I restarted the tool totally from scratch, I re-did the dev environment, I re-coded the detection code, I changed the CSS framework and I updated the JS backend.

    There have been hundreds of commits, hundreds of hours, and so much fun and learning in the process. If you ask me, the main features, besides the more accurate reporting, are the new Preview Enhancer and the new full support for Google Analytics 4.

    GTM/GA Debug 1.0.0 Splash Screen



    On the other hand, I’ve restructured the whole tool, which now allows me to publish standalone functions/fixes individually (this was not possible before, and it was the main reason for not having regular updates).

    There is a lot of cool new stuff in the upcoming features queue, so stay tuned :).

    Now, before listing some of the new stuff, I would like to mention that you can support this tool in several ways:

    Buying me a coffee: Yep, I can’t stop eating or even drinking, but I won’t go far without my daily coffee doses.

    Leaving a Review at CWS: I want to leave a review.

    OVERALL

    • Much more curated layout (now most things look fine in responsive mode)
    • Less memory usage
    • Improved reporting quality
    • Special work on “SPA” pages reporting (a new “virtual” page block is created)
    • Upgraded from Vue2 to Vue3
    • Moved from Vuex to Pinia
    • Moved from Bulma.io to TailwindCSS
    • Hundreds of fixes
    • New GA / GA4 Blocking tool
    • Better error management
    • Better UX

    GTM

    1. Google Tag Manager detection is now much more accurate and faster.
    2. It now supports any Google Tag Manager type container (including GTAG/AMP ones)
    3. It now even supports multiple dataLayers.
    4. All dataLayer push types are now supported (Functions, [], etc.)
    5. Improved GTAG pushes report: they now show the commands and parameters
    6. New Preview Mode Enhancer: go into preview with a single click, avoiding the race condition created by the native GTM preview
    7. New active previews report: shows all containers on preview, including the current workspace name and preview timestamp
    8. Now you can preview AMP Containers!

    GA

    1. Added support for server side hits
    2. Added support for AMP hits
    3. Consent Mode reporting ( shows if the hits contain the Google Consent mode data )

    GA4

    • New Reporting
    • Support for Server Side Hits
    • Better Items Reporting
    • Report on GA4 Server Side Response hits
    • Report on GA4 Server Side Set Cookies
    • Better batching waiting report ( it shows a spinner )

    EEC

    • Better Ecommerce detection
    • Improved data tables
    • Impressions are now shown grouped
    • Promotions and impressions are updated in real time

    Check some demonstration videos below

  • Storage-Less Session Tracking with Google Analytics

    This weekend I was doing a long-overdue room cleanup and, buried at the bottom of a drawer, I found an old hard drive I was using some years ago. Among its contents I found a folder named “WIP” (Work In Progress), with some experiments, tools and proofs of concept I was working on 4-5 years ago and that got lost in the desk drawer when I upgraded my computer to SSD drives.

    It seems that at some point I was playing around with some way of doing a “storage-less” session tracking for Universal Analytics.

    We’ll be relying on window.name to keep our clientId across the user’s navigation journey. There will be some handicaps, of course, but some may find these handicaps a reasonable price for keeping their users’ privacy in place.

    The window.name property is used for setting targets for hyperlinks (if you ever wondered how some sites can open links in a specific window/tab).


    Another good point is that it is widely supported. While in JavaScript 1.0 it was a read-only property, since JavaScript 1.1 it has been read/write.

    window.name browser compatibility

    With all this said, let’s start setting up everything for our tracking in Google Tag Manager. We’ll only need 2 variables: a simple Custom JavaScript Variable for reading the clientId from the window.name property back into the tracker (Universal Analytics Tag), and a customTask Variable for writing the clientId.

    The first variable tries to read the clientId from the window.name property and returns undefined if it’s not set. For sanity reasons, we encode the clientId using BASE64 and add a prefix so we can properly detect whether the currently stored value is valid. We’ll use this variable as the clientId field in our tags.

    The second one is a pretty simple customTask variable that grabs the clientId from the tracker model and writes it to our window.name property.

    cjs – customTask – set window.name

    function(){  
      return function(model) {
          window.name = "CLT:" + btoa(String(model.get('clientId')));
      } 
    }


    cjs – clientId

    function(){
      // Only trust window.name values carrying our "CLT:" prefix
      if(/^CLT:/.test(window.name)){
        return atob(window.name.slice(4));
      }else{
         return undefined;
      }
    }
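
    If you want to test the encode/decode logic outside GTM, the two variables boil down to the pair of helpers sketched below. The helper names are mine, and the Buffer fallback is only there so the snippet also runs outside a browser.

```javascript
const CLIENT_ID_PREFIX = 'CLT:';

function toBase64(text) {
  return typeof btoa === 'function' ? btoa(text) : Buffer.from(text, 'binary').toString('base64');
}

function fromBase64(text) {
  return typeof atob === 'function' ? atob(text) : Buffer.from(text, 'base64').toString('binary');
}

// What the customTask writes into window.name
function encodeClientId(clientId) {
  return CLIENT_ID_PREFIX + toBase64(String(clientId));
}

// What the clientId variable reads back; undefined when the value is not ours
function decodeClientId(windowName) {
  if (typeof windowName !== 'string' || windowName.indexOf(CLIENT_ID_PREFIX) !== 0) return undefined;
  try {
    const decoded = fromBase64(windowName.slice(CLIENT_ID_PREFIX.length));
    return decoded || undefined;
  } catch (e) {
    return undefined; // window.name held something that is not valid base64
  }
}

console.log(encodeClientId('961428208.1648573693')); // "CLT:OTYxNDI4MjA4LjE2NDg1NzM2OTM="
console.log(decodeClientId('CLT:OTYxNDI4MjA4LjE2NDg1NzM2OTM=')); // "961428208.1648573693"
console.log(decodeClientId('some-link-target')); // undefined
```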

    Now that we have everything, let’s configure our tags. We’ll need to set the “storage” field to “none” and the “cookieUpdate” field to false to prevent our tracker from setting any cookie.

    Since we want to make this 100% GDPR/privacy compliant, we’re setting storeGac to false and switching on IP anonymization.
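
    Put together, the “Fields to Set” for the tag end up looking roughly like the object below. The field names are the analytics.js ones mentioned above; the builder function itself and the UA-XXXXX-Y property ID in the comment are just placeholders for the illustration.

```javascript
// Build the tracker fields for a cookie-less setup.
// clientId is whatever the window.name variable returned (may be undefined).
function buildStoragelessFields(clientId) {
  const fields = {
    storage: 'none',      // do not use cookies for the clientId
    cookieUpdate: false,  // do not refresh any cookie
    storeGac: false,      // do not write the _gac cookie
    anonymizeIp: true     // IP anonymization on
  };
  if (clientId) {
    fields.clientId = clientId; // reuse the id kept in window.name
  }
  return fields;
}

console.log(buildStoragelessFields('961428208.1648573693'));

// Outside GTM, with plain analytics.js, this would be passed on tracker creation:
//   ga('create', 'UA-XXXXX-Y', buildStoragelessFields(storedClientId));
```

    When no clientId is passed, the field is left out so the library generates a fresh one, which the customTask then writes to window.name.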

    We’re all set now. Our clientId will be kept as long as the user stays on the current tab. This means that the clientId won’t be kept if the user:

    • Opens a link in a new tab/window (target: _blank or right-clicking on a link)

    I know this tracking approach won’t be enough for most users, who need to keep the user’s state across sessions, but it will be enough if we only want to track sessions.

    Also, keep in mind that the main purpose of this post is showing a way

    Video Demo

  • How to redact PII Data from Google Analytics 4 hits

    If I were asked about some missing feature on Google Analytics 4 ( a.k.a. APP+WEB, New Web Analytics ), I would say it would be the lack of the customTask functionality that my friend Simo has leveraged in the last years.

    Sadly, at the moment there’s nothing similar available (I really hope we get something in the future). In the past I collaborated on Brian Clifton’s post/code about How to Remove PII from Google Analytics, so I decided to base the redacting logic on it, because a lot of people may already have a custom regex list and setup that can be reused here.

    How it works

    Google Analytics 4 bases its tracking on navigator.sendBeacon for sending the hits, falling back to the old-fashioned new Image() functionality if, for any reason, the current browser doesn’t support the former.

    What we are doing is Monkey Patching the browser’s sendBeacon functionality using a Proxy Pattern, in order to remove any PII (Personally Identifiable Information) from the hit payloads before they reach the Google Analytics 4 endpoint.

    Monkey patching is a technique to add, modify, or suppress the default behavior of a piece of code at runtime without changing its original source code. It has been extensively used in the past by libraries, such as MooTools, and developers to add methods that were missing in JavaScript.

    https://www.audero.it/blog/2016/12/05/monkey-patching-javascript/
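
    Stripped down to its core, the proxy pattern applied to sendBeacon looks like the sketch below. This is not the full code from this post: patchSendBeacon and the __beaconPatched flag are names I made up for the illustration, and matching GA4 hits by the “/g/collect” path is an assumption about the endpoint.

```javascript
// Wrap nav.sendBeacon so GA4 hits go through redact() first.
// nav is normally window.navigator; redact takes and returns a query-string-like payload.
function patchSendBeacon(nav, redact) {
  if (!nav || typeof nav.sendBeacon !== 'function' || nav.__beaconPatched) return false;
  const original = nav.sendBeacon;
  nav.sendBeacon = function (url, data) {
    let cleanUrl = url;
    let cleanData = data;
    try {
      // Assumption: GA4 hits are the ones sent to ".../g/collect"
      if (/google-analytics\.com\/g\/collect/.test(String(url))) {
        const parts = String(url).split('?');
        if (parts.length > 1) {
          cleanUrl = parts[0] + '?' + redact(parts.slice(1).join('?'));
        }
        if (typeof data === 'string') {
          cleanData = redact(data);
        }
      }
    } catch (e) {
      // If anything goes wrong, send the hit untouched rather than losing it
      cleanUrl = url;
      cleanData = data;
    }
    return original.call(nav, cleanUrl, cleanData);
  };
  nav.__beaconPatched = true;
  return true;
}

// In a page: patchSendBeacon(window.navigator, piiRedact);
```

    Any request that doesn’t look like a GA4 hit is passed to the original function exactly as it came in, which is the “transparent” behavior you want from a patch like this.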

    I don’t expect GA4 to fall back to new Image hits very often, but I’m currently working on adding support for also redacting the hits sent using this method.

    Before going forward

    Monkey Patching is “never” the right way to go, but neither Google Analytics 4 nor sendBeacon offers anything to achieve this functionality, so it’s the last resort.

    The current code only tries to override the hits going to the Google Analytics 4 endpoint, and lets any other hits go through untouched. I’ve also tried to check everything I was able to think of in order to prevent any issues.

    Setting Up Everything

    The only thing you need to do is run the attached code on your site “before” GA4 fires any hit.

    If you are using Google Tag Manager, you should use Tag Sequencing to fire the code before the Config tag is fired; refer to the next screenshot for more details:

    If you’re using Tealium, you should run this as a “Pre Loader” extension for example.

    Example of Redacted GA4 Payload Hit

    The Code

    (function() {
    
        /*
        *  
        * Analytics Debugger S.L.U. 2021 ( David Vallejo @thyng )
        *  MIT  License
        * All redact Logic is ran within this function
        * 
        */
        window.__piiRedact = window.__piiRedact || false;
        var piiRedact = function piiRedact(payload) {
            // Regex List
            var piiRegex = [{
                name: 'EMAIL',
                regex: /[^\/]{4}(@|%40)(?!example\.com)[^\/]{4}/gi,
                group: ''
            }, {
                name: 'SELF-EMAIL',
                regex: /[^\/]{4}(@|%40)(?=example\.com)[^\/]{4}/gi,
                group: ''
            }, {
                name: 'TEL',
                regex: /((tel=)|(telephone=)|(phone=)|(mobile=)|(mob=))[\d\+\s][^&\/\?]+/gi,
                group: '$1'
            }, {
                name: 'NAME',
                regex: /((firstname=)|(lastname=)|(surname=))[^&\/\?]+/gi,
                group: '$1'
            }, {
                name: 'PASSWORD',
                regex: /((password=)|(passwd=)|(pass=))[^&\/\?]+/gi,
                group: '$1'
            }, {
                name: 'ZIP',
                regex: /((postcode=)|(zipcode=)|(zip=))[^&\/\?]+/gi,
                group: '$1'
            }];
    
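            // Helper: convert a query string into an object (values are URL-decoded)
            var queryString2Object = function queryString2Object(str) {
                var obj = {};
                (str || '').replace(/^\?/, '').split('&').forEach(function(pair) {
                    if (!pair) return;
                    // Split on the first "=" only, so values containing "=" are kept whole
                    var idx = pair.indexOf('=');
                    var key = idx === -1 ? pair : pair.slice(0, idx);
                    obj[key] = idx === -1 ? '' : decodeURIComponent(pair.slice(idx + 1));
                });
                return obj;
            };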
            // Helper Convert an Object to a QueryString
            var Object2QueryString = function Object2QueryString(obj) {
                return Object.keys(obj).map(function(key) {
                    return key + '=' + encodeURIComponent(obj[key]);
                }).join('&');
            };
            // Convert the current payload into an object
            var parsedPayload = queryString2Object(payload);
            // Loop through all keys and check the values against our regex list
            for (var pair in parsedPayload) {
                piiRegex.forEach(function(pii) {
                    // The value is matching?
                    if (parsedPayload[pair].match(pii.regex)) {
                        // Let's replace the key value based on the regex
                        parsedPayload[pair] = parsedPayload[pair].replace(pii.regex, pii.group + '[REDACTED ' + pii.name + ']');
                    }
                });
            }
            // Build and send the payload back
            return Object2QueryString(parsedPayload);
        };
        if (!window.__piiRedact) {
            window.__piiRedact = !0;
            // Monkey Patch, sendBeacon
            var proxied = window.navigator.sendBeacon;
            window.navigator.sendBeacon = function() {
                try {
                    // Only touch GA4 hits (v=2); any other beacon goes through untouched
                    if (typeof arguments[0] === 'string' && arguments[0].match(/google-analytics\.com.*v\=2\&/)) {

                        var endpoint = arguments[0].split('?')[0];
                        var query = arguments[0].split('?')[1];
                        var beacon = {
                            endpoint: endpoint,
                            // Check for PII
                            query: piiRedact(query),
                            events: []
                        };
                        // This is a multiple events hit (one event per line in the body)
                        if (typeof arguments[1] === 'string' && arguments[1]) {
                            arguments[1].split("\r\n").forEach(function(event) {
                                // Check for PII
                                beacon.events.push(piiRedact(event));
                            });
                        }

                        // We're all done, let's reassemble everything
                        arguments[0] = [beacon.endpoint, beacon.query].join('?');
                        if (beacon.events.length > 0) {
                            // Write the redacted events back into the request body
                            arguments[1] = beacon.events.join("\r\n");
                        }
                    }
                } catch (e) {
                    // In case something goes wrong, fall through and send the hit as it was received
                }
                return proxied.apply(this, arguments);
            };
        }
    }
    )();
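
    To make the effect of the redaction rules easier to see, here is a minimal standalone sketch of the same idea. It is not part of the snippet above: the helper name redactQueryString, the shop.test URL and the trimmed-down rule list are assumptions made only for illustration.

```javascript
// Hypothetical, trimmed-down rule list (same regex style as the full snippet)
var rules = [
    { name: 'EMAIL', regex: /[^\/]{4}(@|%40)(?!example\.com)[^\/]{4}/gi, group: '' },
    { name: 'TEL', regex: /((tel=)|(phone=))[\d\+\s][^&\/\?]+/gi, group: '$1' }
];

// Decode each value of a query string, apply every rule, and re-encode it
function redactQueryString(query) {
    return query.split('&').map(function(pair) {
        var idx = pair.indexOf('=');
        if (idx === -1) return pair;
        var key = pair.slice(0, idx);
        var value = decodeURIComponent(pair.slice(idx + 1));
        rules.forEach(function(rule) {
            value = value.replace(rule.regex, rule.group + '[REDACTED ' + rule.name + ']');
        });
        return key + '=' + encodeURIComponent(value);
    }).join('&');
}

// A made-up GA4-like payload whose page URL (dl) carries an email and a phone number
var sample = 'v=2&en=page_view&dl=' +
    encodeURIComponent('https://shop.test/thanks?email=john.doe@gmail.com&tel=600123123');
var redacted = redactQueryString(sample);

console.log(decodeURIComponent(redacted));
// → v=2&en=page_view&dl=https://shop.test/thanks?email=john[REDACTED EMAIL]l.com&tel=[REDACTED TEL]
```

    Note that, with this regex, the EMAIL rule only masks the four characters on each side of the @, so a piece of the address can survive; you may want to tune the rules to your own needs.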
    

  • HTML Media Elements Tracking Library

    Some years ago I wrote a post about how to track HTML5 videos, which has been widely used and copied around the web. Two years ago I wrote a totally new tracking code, which I never publicly released.

    Today I’m releasing a completely refactored version of that code for tracking HTML Media Elements. This means tracking <video> and <audio> elements.

    This is the first library I’ve built with the idea of it being a full library that can be used in any project, instead of a snippet to be used in a Google Tag Manager tag. Because of this I’m providing the library in the following formats: AMD, UMD, IIFE and ESM, so it can be used anywhere. At the same time I’m providing CDN access via jsDelivr.

    The library will take care of initializing the tracking and pushing the data back to Google Tag Manager ( using a dataLayer.push ), to Tealium ( using a utag.link ), or just to the console. Along with the event, a full data model will be sent with some details about the current event and the video ( the video title, duration, visibility status, etc. ).

    The current data model is based on Google Tag Manager’s YouTube Video trigger / model, which makes it possible to use the current built-in video variables in GTM.

    The library will take care of tracking the current videos on the page, but it will also be able to “detect” elements newly added to the page ( like videos added in modals, or loaded programmatically ), which will also be tracked with no hassle. Just setting the observe switch to true will enable the use of the Mutation Observer API ( where available ) to do this work for you.

    This is not all: along with this new library I’m releasing a Google Tag Manager Custom Template, which makes the setup even easier. Just add the template along with a DOM Ready trigger and you’ll be done.

    HTML Media Elements Custom Template


    Using a custom Video Title

    When using HTML Media Elements, we don’t have a native way to pass any video details; this library will allow you to customize the video title being reported.

    <video src="" data-html-media-element-title="Demo Video version 1">

    This will make the videoTitle be reported as “Demo Video version 1”. If there’s no data attribute, the library will use the current video file name.

    Passing back video details

    Not only can you pass the video title; the library also eases the work of passing extra details back to the events using data attributes.

    You can pass all the custom data you need about the video and have it passed back to the tracking events. To achieve this, we can add all the data we want to the videos using data attributes.

    This can be done using data-attributes with the following format:

    data-html-media-element-param-{{PARAM NAME}}="{{PARAM VALUE}}"

    All the data added to the <video> elements will be passed back to the events so you can use it.

    For example:

    <video width="400"
    controls 
    data-html-media-element-param-band="Neil Zaza"
    data-html-media-element-param-song-name="I'm Alright"
    data-html-media-element-param-category="Music"
    data-html-media-element-title="video test">
        <source src="mov_bbb.mp4" type="video/mp4">
        <source src="mov_bbb.ogg" type="video/ogg">
        Your browser does not support HTML video.
    </video>

    This will result in a videoData (or audioData) object carrying the data this way:

    {
         element:  video
         elementClasses:  ""
         elementId:  "vbst4f9ed29"
         elementTarget:  video
         elementUrl:  "https://local.dev/demo/mp3.html"
         event:  "video"
         videoCurrentTime:  2
         videoData:
        	 band:  "Neil Zaza"
        	 category:  "Music"
        	 songname:  "I'm Alright"
         videoDuration:  361
         videoElapsedTime:  2
         videoIsMuted:  false
         videoLoop:  false
         videoNetworkState:  1
         videoPercent:  0
         videoPlaybackRate:  1
         videoProvider:  "html5"
         videoStatus:  "pause"
         videoTitle:  "video test"
         videoUrl:  "mov_bbb.mp4"
         videoVisible:  true
         videoVolume:  1
     }
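
    As a rough illustration of how the data attributes above can end up inside videoData, here is a minimal sketch. It is not the library’s actual code: the helper name collectMediaData is hypothetical, and the key normalisation ( dropping the hyphens, so that song-name becomes songname ) is only an assumption inferred from the sample output above.

```javascript
// Hypothetical helper, not the library's real implementation
var PARAM_PREFIX = 'data-html-media-element-param-';

// Accepts anything array-like holding { name, value } pairs,
// e.g. videoElement.attributes in a browser
function collectMediaData(attributes) {
    var data = {};
    Array.prototype.forEach.call(attributes, function(attr) {
        if (attr.name.indexOf(PARAM_PREFIX) === 0) {
            // Strip the prefix and the remaining hyphens: "song-name" -> "songname"
            var key = attr.name.slice(PARAM_PREFIX.length).replace(/-/g, '');
            data[key] = attr.value;
        }
    });
    return data;
}

// Plain array standing in for the attributes of the <video> element shown earlier
var videoData = collectMediaData([
    { name: 'width', value: '400' },
    { name: 'data-html-media-element-param-band', value: 'Neil Zaza' },
    { name: 'data-html-media-element-param-song-name', value: "I'm Alright" },
    { name: 'data-html-media-element-param-category', value: 'Music' },
    { name: 'data-html-media-element-title', value: 'video test' }
]);

console.log(videoData);
```

    Only the attributes carrying the param prefix are collected; the title attribute is left out because it is reported separately as videoTitle.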

    Library Usage

    Web Page

    <script src="https://cdn.jsdelivr.net/npm/@analytics-debugger/html-media-elements@latest/dist/htmlMediaElementsTracker.min.js"></script>
    
    <script>
        window._htmlMediaElementsTracker.init({
            tms: 'debug',
            datalayerVariableNames: ['auto'],
            debug: true,
            observe: true,
            data_elements: true,        
            start: true,
            play: true,
            pause: true,
            mute: true,
            unmute: true,
            complete: true,
            seek: true,
            progress: true,
            error: true,
            progress_tracking_method: 'percentages',
            progress_percentages: [1,2,3,4,5,6,7,8,9,10],
            progress_thresholds: [],        
        });   
    </script>

    NPM

    npm i @analytics-debugger/html-media-elements

    Configuration Settings

    key name | value type | description
    tms | string | Tag Management System we are using. Accepted values: “gtm”, “tealium”, “debug”
    datalayerVariableNames | array | If the TMS is Google Tag Manager, we can push the data to a specific dataLayer; by default the library will search for the current dataLayer variable name
    debug | boolean | Enable debug output to console
    observe | boolean | Automatically track newly added video/audio elements
    data_elements | boolean | data-html-media-element-title attribute will be used for elementTitle if provided
    start | boolean | Track Audio/Video Start Event
    play | boolean | Track Audio/Video Play Event
    pause | boolean | Track Audio/Video Pause Event
    mute | boolean | Track Audio/Video Mute Event
    unmute | boolean | Track Audio/Video Unmute Event
    complete | boolean | Track Audio/Video End Event
    seek | boolean | Track Audio/Video Seek Event
    progress | boolean | Track Audio/Video Progress Events
    progress_tracking_method | string | ‘percentages’ or ‘thresholds’ // thresholds not available yet
    progress_percentages | array | Array of % where we should fire an event
    progress_thresholds | array | TBD

    We will be able to track the current HTML Media Elements Events ( Start, Play, Pause, Mute, Unmute, Complete, Seek, Progress ). We’ll just need to set to true the events we want to track within the init config variable.
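
    To give an idea of what percentage-based progress tracking involves, here is a minimal sketch of a tracker that fires each configured percentage only once. This is an illustration under my own assumptions ( the function name createProgressTracker is hypothetical ), and the library’s internal implementation may well differ.

```javascript
// Hypothetical sketch: returns a function that reports which configured
// percentages have just been reached, each one only once
function createProgressTracker(percentages) {
    var fired = {};
    return function(currentTime, duration) {
        // Duration is unknown (or zero) until the media metadata is loaded
        if (!duration || duration <= 0) return [];
        var percent = Math.floor((currentTime / duration) * 100);
        return percentages.filter(function(p) {
            if (percent >= p && !fired[p]) {
                fired[p] = true;
                return true;
            }
            return false;
        });
    };
}

var check = createProgressTracker([10, 25, 50, 75]);
console.log(check(12, 100)); // → [ 10 ]
console.log(check(30, 100)); // → [ 25 ]
console.log(check(30, 100)); // → []
console.log(check(80, 100)); // → [ 50, 75 ]
```

    In a browser, a function like this would typically be called from the media element’s timeupdate event with its currentTime and duration values.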

    Along with the events the library pushes some details about the video.

    Data Model

    Key | Value Example | Description
    event | gtm.audio/gtm.video | Current Media Element Type
    Provider | html5 | Fixed value, describes the current media element provider
    Status | start, pause, mute, unmute, progress, seek, completed, error | Current media element event name
    Url | http://www.dom.com | Current Video Holding URL ( iframe url reported if it’s the case )
    Title | Video Demo | Current video element data-html-media-element-title value, defaults to current video file name
    Duration | 230 | Media element duration in seconds
    CurrentTime | 230 | Media element current time in seconds
    ElapsedTime | 230 | Elapsed time since last pause/play event
    Percent | 15 | Media element current played %
    Visible | true|false | Reports if the video is visible within the current browser viewport
    isMuted | true|false | Is the current media element muted?
    PlaybackRate | 1 | Media Element PlaybackRate, default: 1
    Loop | true|false | Is the video set to loop?
    Volume | 0.8 | Current Video Volume
    NetworkState | | Network State
    Data | Object | List of custom video data coming from data-attributes tagging
    elementClasses | “” | Element Classes List
    elementId | “” | Element Id
    elementTarget | “” | Element Target
    elementUrl | “” | Element URL

    Configuring The

    JSDelivr CDN: https://www.jsdelivr.com/package/npm/@analytics-debugger/html-media-elements

    Template URL: https://tagmanager.google.com/gallery/#/owners/analytics-debugger/templates/gtm-html-media-elements-tracker

    GitHub: https://github.com/analytics-debugger/html-media-elements-tracking-library

    Demo Page: https://www.analytics-debugger.com/demos/gtm-html-media-elements/

  • Tracking Google Analytics 4 Events using Data Attributes

    I must admit it: I like to use data attributes for tracking user click interactions rather than delegating that work to the IT team or relying on the class/id attributes ( Data Attributes Tracking ).

    For Universal Analytics this was fairly easy work, since we had some fixed data attribute names ( category, action, label and value when talking about events, or pagepath if we wanted to use a virtual pageview ). With the new event-based tracking model in Google Analytics 4 ( GA4, formerly APP+WEB ) this has changed: we have a single hit type, which is going to be an “event” all the time, but then we have unlimited possibilities for parameter names.

    In this post I’ll be showing my approach to automating event tracking on Google Analytics 4 using data attributes. Let’s go for it.

    First we’ll need a data attribute named “data-ga4-event”; this one will allow us, in the next steps, to set up a CSS selector to trigger our tags.

    Then for the event parameters we’ll use the following format: data-ga4-param-{{PARAM_NAME}} .
    Note that data attributes use kebab-case, so we’ll be writing the parameter name as “clicked-link-url”.

    DATA ATTRIBUTES
    data-ga4-event | {{event name}}
    data-ga4-param-{{PARAM_NAME}} | one per each needed parameter
    Data Attributes Definition

    Let’s now see some examples. A simple event without parameters will look like this:

    <button id="cta"
    data-ga4-event="cta_click"
    >CHECKOUT</button>

    and if we need to pass some parameters it will look like:

    <a href="https://twitter.com/thyng"
        data-ga4-event="social_link_click"
        data-ga4-param-social-network-name="twitter"
        data-ga4-param-social-network-user="thyng"
    >Follow me on Twitter</a>

    You may now be thinking that we would need a separate JS snippet for each event, but we’ll be using some JS magic to convert this data attribute tagging into dataLayer pushes automatically.

    (function() {
        // Grab all tagged elements
        var events = document.querySelectorAll('[data-ga4-event]');
        var unCamelCase = function(str, separator) {
            separator = typeof separator === 'undefined' ? '_' : separator;
            return str.replace(/([a-z\d])([A-Z])/g, '$1' + separator + '$2').replace(/([A-Z]+)([A-Z][a-z\d]+)/g, '$1' + separator + '$2').toLowerCase();
        }
        for (var i = 0; i < events.length; i++) {
            events[i].addEventListener('click', function(event) { 
                var target = event.currentTarget;
                if(target){             
                    var dl = {} 
                    dl['event'] = target.dataset['ga4Event'];
                    Object.entries(target.dataset).forEach(function(e) {
                        var key = e[0];
                        var value = e[1]
                        var m = key.match('ga4Param(.+)');
                        if (m && m[1]) {
                            dl[unCamelCase(m[1],'_')] = value;
                        }
                    })                
                    // Make sure the dataLayer exists before pushing to it
                    window.dataLayer = window.dataLayer || [];
                    window.dataLayer.push(dl);
                }
                            
            });
        }
    })()

    The snippet above will take care of building a dataLayer push each time a click is performed on an element that has a data-ga4-event attribute, and will also take care of converting all data-ga4-param-X attributes into snake_case parameters within our event push.

    As an example, our previous link:

    <a href="https://twitter.com/thyng"
        data-ga4-event="social_link_click"
        data-ga4-param-social-network-name="twitter"
        data-ga4-param-social-network-user="thyng"
    >Follow me on Twitter</a>

    Will turn into the following dataLayer push:

    window.dataLayer.push({
        "event": "social_link_click",
        "social_network_name": "twitter",
        "social_network_user": "thyng"
    });

    Of course you could add some more features to this snippet, for example automatically sanitizing the values before the push is sent, or you could build an allow list to prevent any non-predefined event from getting into your reports.
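
    As a rough sketch of those two ideas, the function below builds the push from an element’s dataset, drops any event that is not in a predefined list, and applies a simple sanitizing step. The names ALLOWED_EVENTS and buildGa4Push, the list contents and the “trim + lowercase” sanitizing rule are all assumptions made just for the example.

```javascript
// Hypothetical list of the only event names we want to let through
var ALLOWED_EVENTS = ['social_link_click', 'cta_click'];

// "SocialNetworkName" -> "social_network_name"
function unCamelCase(str) {
    return str.replace(/([a-z\d])([A-Z])/g, '$1_$2').toLowerCase();
}

// dataset is expected to look like element.dataset (camelCased keys)
function buildGa4Push(dataset) {
    var eventName = String(dataset.ga4Event || '').trim();
    // Ignore anything that has not been predefined
    if (ALLOWED_EVENTS.indexOf(eventName) === -1) return null;
    var push = { event: eventName };
    Object.keys(dataset).forEach(function(key) {
        var m = key.match(/^ga4Param(.+)$/);
        if (m) {
            // Example sanitizing rule: trim and lowercase every value
            push[unCamelCase(m[1])] = String(dataset[key]).trim().toLowerCase();
        }
    });
    return push;
}

console.log(buildGa4Push({
    ga4Event: 'social_link_click',
    ga4ParamSocialNetworkName: ' Twitter ',
    ga4ParamSocialNetworkUser: 'thyng'
}));
console.log(buildGa4Push({ ga4Event: 'some_random_event' })); // → null
```

    Inside the click listener you would then push the result to the dataLayer only when it is not null.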

  • New Release: GTM/GA Debugger 0.4.0

    It’s been a long time since the last post, even more since the last extension update. To be exact it took me around 1 year to have this new version ready.

    The main reason for this delay was that I switched how the extension is built at least 5 times. I don’t consider myself a developer, which means that many times I end up not choosing the best stack. Anyway, this has been a real opportunity for me to learn a lot of new technologies/frameworks I didn’t know about or was never able to understand, just to mention some: React / Svelte, WebPack, Rollup, Git, Gulp, Travis. So at this point I’m really “happy” with all the time “wasted” on refactoring the extension so many times.

    In case you’re interested, after so many changes I ended up building the extension using Vue.js 2, with Bulma as the CSS framework. This has allowed me to build an extension that is faster and built on top of some good technologies ( instead of having thousands of inefficient lines of JS code ).

    I know that for most people most of the changes won’t be noticeable, mostly because I tried to keep the UI as it was in the previous version, but internally everything is different, and some new features were added as well.

    In the following video, I’m showing an overview of what the new extension has to offer:

    GTM/GA Debugger Features

    • GTM/GTAG Debug Support
    • Multiple dataLayer Support ( View all the dataLayer pushes and current state )
    • View all Universal Analytics Hits being sent
    • View all GA4 (App+Web) Hits being sent
    • Filter out the hits by the type or property/stream ids
    • Filter out the dataLayer pushes by their type ( core, ga4, custom, etc )
    • Parse Hits payload to see a human-friendly keys translation
    • Enhanced Ecommerce Report ( based on GA/GA4 hits )
    • All Reports are in Real Time
    • Copy any Hit/dataLayer push info to the clipboard in a friendly format with a mouse click
    • Trace any Hit/dataLayer push
    • Real Time GA hits Payload debugging
    • More …

    I really lost track of everything that was added in this specific release, so I’m providing a quick changelog.

    Changelog

    • [NEW] – Now it’s based on Vue.js 2 + Bulma
    • [NEW] – GA4 Hits Full Support
    • [NEW] – GA4 Ecommerce Support
    • [NEW] – Multiple dataLayer Support
    • [NEW] – Multiple GTAG/GTM Containers support
    • [NEW] – Copy hits as string
    • [NEW] – Hits Stack Trace Reporting
    • [NEW] – Hits Debug ( run the hits against the official GA debug endpoint )
    • [NEW] – DataLayer Pushes Stack Trace Reporting
    • [NEW] – GTM Preview Enhancer
    • [ENHANCEMENT] – Debugging can be started clicking on a button rather than needing to press F5
    • [ENHANCEMENT] – Improved GTM/GA/GA4 detection – Faster detection delay
    • [ENHANCEMENT] – Improved GTM/GA/GA4 detection – Better accuracy
    • [ENHANCEMENT] – Improved SSR/SPA/PWA pages debugging.
    • [ENHANCEMENT] – Pushes/Hits timing are now correct and are shown in the real order they are triggered
    • [ENHANCEMENT] – UI is now more responsive, showing a better interface when using it in the sidebar
    • [FIX] – All bugs reported ( sites where the tool was not working properly ) have been addressed. Thanks to everyone who helped by reporting them
    • [FIX] – Incognito Mode Support
    • [FIX] – GA detection for hits not sent to GA endpoints
    • [FIX] – GTM detection for locally served containers
    • [FIX] – +40 tickets of bugs.

    As you may have noticed, some tools are gone: the Data Attributes Inspector and the Profiler Tab Report. I removed these features for this release in order to focus on the tool’s reliability; they will be added back in the next releases.

    More news about the extension: it will be available for Firefox, Opera and Edge ( as soon as I can have it approved on their marketplaces ).

    Now I’m looking for some beta testers who will help me identify issues in new releases. Yay!

    The last big news is that the extension hit 40,000 users this past week. Yeah, according to Chrome Store data, the extension is being used by more than 40K users weekly. I’d never thought the tool would end up having this many users, but this has also created some “responsibility” on my side that I’m currently not sure how to handle.

    In the last year I declined all the offers to purchase the extension and I also didn’t accept any offer for adding ads within the tool. I really want to keep this tool free of ads, but it really takes a lot of time. Because of this I decided to start accepting donations via Ko-Fi. Getting some help this way will allow me to publish updates more regularly. This is totally optional: I’ll keep working on the extension anyway, but some people in the past asked for a way to help.

    Click on the button below if the extension has been helpful for your work:

    Buy Me a Coffee at ko-fi.com


    Now, if you are still not using the extension, you can get it for free at the following link: INSTALL EXTENSION

  • Tracking user’s IP Autonomous System Number and Organization details to prevent the spam

    Around the end of 2019, Google Analytics dropped support for the Network Domain and Service Provider dimensions from its reports, making an official announcement about it in February.

    These 2 dimensions were widely used to fight spam in Google Analytics, and there have been a lot of posts about this topic in the last months. Simo Ahava wrote about how to collect the ISP data using a third-party service, in case you want to check it.

    In this post we’ll be learning what an Autonomous System is and how we could use this info to try to fight spam. And the coolest part is that we’ll be able to use a free database for this. Continue reading 🙂

    There are some other services and commercial databases that will provide these details, but let’s be honest, there are some big handicaps:

    • If you use a free service, you will hit the quota limit quickly
    • If you have a high traffic website this is not going to be cheap

    There are basically 3 different types of subscriptions: SaaS ( they host the app and the database ), DB ( you host the database and the query system ) and WebService.

    I’m attaching a list of some of the providers available, in case you want to check them.

    Provider | SaaS | DB | WebService | Updates
    MaxMind | ✅ | ✅ | ✅ | Weekly/Monthly
    IP2Location | ✅ | ✅ | ✅ | Monthly
    IPStack | ❎ | ❎ | ✅ | Hourly
    ip-api | ❎ | ❎ | ✅ | ?
    ipgeolocation.io | ❎ | ✅ | ✅ | Weekly
    db-ip | ❎ | ❎ | ✅ | Monthly/Daily
    Abstract IP Geolocation | ✅ | ❎ | ✅ | Daily

    In any case there are a lot of posts around this topic on the web, and I’m trying to give this issue a new solution.

    MaxMind provides their GeoLite2 databases for free. These databases are updated weekly ( on Tuesdays to be exact ) and they provide info about:

    • Countries
    • Cities
    • ASN

    The main difference between these databases and the paid ones is how accurate they are and how often they get updated. This accuracy may be a problem when we need to target users based on their city, but this time that is not what we’re looking for: we’ll be looking at the ASN database.

    If you are wondering, ASN stands for Autonomous System Number. According to Wikipedia:

    An autonomous system (AS) is a collection of connected Internet Protocol (IP) routing prefixes under the control of one or more network operators on behalf of a single administrative entity or domain that presents a common, clearly defined routing policy to the internet.

    https://en.wikipedia.org/wiki/Autonomous_system_(Internet)

    Put simply, an AS works like a “big” router at an ISP or datacenter that is in charge of announcing the IP addresses it holds, in order to let other ASes know how to reach them ( sorry for this inaccurate description, I’m trying to keep this simple ).

    Each ISP usually has its own ASN ( it can have more than one ). For example, one of the main ASNs at Google is AS15169, registered to Google LLC, and this Autonomous System manages 9.5 million IPs from Google:

    https://ipinfo.io/AS15169

    This means that we can query any IP address and the ASN database will return the ASN it currently belongs to.

    For example we may query Google DNS’s IP address: 8.8.8.8 and the database will return the AS number and the organization name:

    Array
    (
        [autonomous_system_number] => 15169
        [autonomous_system_organization] => GOOGLE
    )

    As another example, let’s query for this Fastly CDN IP address: 151.101.134.133

    Array 
    (
       [autonomous_system_number] => 54113 
       [autonomous_system_organization] => FASTLY
    )

    Or let’s query for an IP at a dedicated server provider like LiquidWeb:

    Array
    (
        [autonomous_system_number] => 32244
        [autonomous_system_organization] => LIQUIDWEB
    )

    We could use the AS number and the organization name as a way to try to catch the spam, since most spam traffic is likely to come from co-location / VPN providers that we could identify this way.

    Since it’s a database, we’ll need to set up a small endpoint on our domain in order to be able to query it. This implies some IT development, but on the other side it has some big wins:

    • There will be NO query limits.
    • The cost of having this solution running is just the cost of the endpoint development.
    • We could have our website developers query this info server-side and have the data pushed to the dataLayer, instead of needing an extra XHR request and having to delay the hits, YAY!

    Now, on the other side of the road there are some handicaps:

    • The data is not as accurate as the network domain data in other databases
    • Data freshness won’t be premium, but as we all know GA’s wasn’t either.

    Getting the ASN DB

    As I’ve mentioned above, the GeoLite ASN database is free and you’ll be able to get it after signing up for a free account at: https://dev.maxmind.com/geoip/geoip2/geolite2/

    PHP Example

    Another good point is that MaxMind already provides libraries for PHP/NodeJS/Perl and other languages to help with reading and querying their GeoLite databases, which helps with setting up our endpoint.

    As usual I’m providing an example for PHP, since it’s the most widely used language and the one that is available on almost any hosting around the world.

    If we don’t have Composer installed yet, that’s going to be our first step:

    curl -sS https://getcomposer.org/installer | php

    Next, we’ll be installing the needed dependencies:

    php composer.phar require geoip2/geoip2:~2.0

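    <?php
    require_once 'vendor/autoload.php';
    use GeoIp2\Database\Reader;
    // Open the GeoLite2 ASN database file
    $reader = new Reader('geo/GeoLite2-ASN.mmdb');
    // Look up the ASN record for the given IP address
    $record = $reader->asn('8.8.8.8');
    $asn_details = array(
        'autonomous_system_number' => $record->autonomousSystemNumber,
        'autonomous_system_organization' => $record->autonomousSystemOrganization
    );
    // At this point we could build a JSON and send it back to the browser.
    print_r($asn_details);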

    Last step will be passing back this info to Google Analytics using a custom dimension, so we can use it in our filters or segments.
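
    On the browser side, that last step could look roughly like the sketch below. The response shape mirrors the array printed by the PHP example, while the function name buildAsnPush, the event name asn_details and the “(not set)” fallback are assumptions made only for illustration.

```javascript
// Hypothetical helper: turns the endpoint response into a dataLayer push
function buildAsnPush(details) {
    var number = details && details.autonomous_system_number;
    var org = details && details.autonomous_system_organization;
    return {
        event: 'asn_details',
        asn: number ? 'AS' + number : '(not set)',
        asn_organization: org || '(not set)'
    };
}

var push = buildAsnPush({
    autonomous_system_number: 15169,
    autonomous_system_organization: 'GOOGLE'
});
console.log(push);

// In the browser we would then hand it over to the tag manager:
// window.dataLayer = window.dataLayer || [];
// window.dataLayer.push(push);
```

    The two values can then be mapped to custom dimensions in the tag that sends the hit.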

    Extra – Grabbing the network domain

    I was about to publish the post and I decided to add a little extra , let’s also learn how to track the “network domain” .

    Google Analytics was using the IP’s PTR for the “network domain”. Again, you may wonder what “PTR” is: it stands for “Pointer record” and it basically resolves an IP to an FQDN ( fully-qualified domain name ). That is, it’s the inverse of an A DNS record.

    For example, we can do a reverse IP lookup for Google’s DNS and it will return “dns.google”.

    root@sd1:/# nslookup
    > set q=ptr
    > 8.8.8.8
    Server:         8.8.8.8
    Address:        8.8.8.8#53
    Non-authoritative answer:
    8.8.8.8.in-addr.arpa    name = dns.google.

    Or we may try with a Googlebot IP address, which most SEOs must be familiar with:

    > set q=ptr
    > 66.249.66.1
    Server:  dns.google
    Address:  8.8.8.8
    Non-authoritative answer:
    1.66.249.66.in-addr.arpa        name = crawl-66-249-66-1.googlebot.com

    As a last example, let’s query the google.com IP address:

    > set q=a
    > google.com
    Server:  dns.google
    Address:  8.8.8.8
    Non-authoritative answer:
    Name:    google.com
    Address:  172.217.17.14
    > set q=ptr
    > 172.217.17.14
    Server:  dns.google
    Address:  8.8.8.8
    Non-authoritative answer:
    14.17.217.172.in-addr.arpa      name = mad07s09-in-f14.1e100.net

    If we want to have the network domain info back in our GA reports, we’ll just need to parse the PTR hostname to grab just the root domain; in this last case it would be: 1e100.net .

    I wouldn’t advise tracking the full PTR hostname, for 2 reasons: first, most hostnames are a mix of the IP address + the ISP domain, which would go against the GDPR ( we cannot record the user’s IP address ), and second, it would create a high cardinality, which won’t help when analyzing the data.
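
    Just to illustrate the root-domain idea, here is a deliberately naive sketch that keeps only the last two labels of the PTR hostname. The function name naiveRootDomain is hypothetical, and the approach is an assumption for demo purposes only: it gives wrong results for multi-level suffixes such as co.uk, where a proper public suffix list is needed.

```javascript
// Naive on purpose: keeps the last two labels of a hostname
function naiveRootDomain(ptrHostname) {
    var labels = String(ptrHostname || '')
        .toLowerCase()
        .replace(/\.$/, '') // PTR answers usually end with a trailing dot
        .split('.')
        .filter(Boolean);
    if (labels.length < 2) return null;
    return labels.slice(-2).join('.');
}

console.log(naiveRootDomain('mad07s09-in-f14.1e100.net'));       // → 1e100.net
console.log(naiveRootDomain('crawl-66-249-66-1.googlebot.com')); // → googlebot.com
console.log(naiveRootDomain('dns.google.'));                     // → dns.google
```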

    Now, remember that we were building an endpoint in PHP to get the ASN details? Just a few more lines of code would allow us to have the network domain pushed into our dataLayer! 🙂

    $ip_ptr = gethostbyaddr('8.8.8.8');

    Getting the root domain can be a painful task due to all the new TLDs and the need to keep third-level suffixes in mind. In case you want to have this done easily, you can use the following PHP library: https://github.com/utopia-php/domains , which will let you grab the “registerable” domain name within a hostname.

    require_once '../vendor/autoload.php';
    use Utopia\Domains\Domain;
    // demo.example.co.uk
    $domain = new Domain('demo.example.co.uk');
    $domain->get(); // demo.example.co.uk
    $domain->getTLD(); // uk
    $domain->getSuffix(); // co.uk
    $domain->getRegisterable(); // example.co.uk
    $domain->getName(); // example
    $domain->getSub(); // demo
    $domain->isKnown(); // true
    $domain->isICANN(); // true
    $domain->isPrivate(); // false
    $domain->isTest(); // false

    I’m providing the examples in PHP, but that doesn’t mean you have to use it at all; this code/idea can be developed in almost any server-side language you may be using. As a last resort, you could run a small VM or VPS to have a PHP environment where you can host your endpoint :).

  • Tracking your visitors effective connection speed details

    I know this is currently just a draft, but the fact that it’s available on Chrome, Edge and Opera ( or any Chromium-based browser ) makes it really useful in my opinion.

    In those browsers, there’s an API that allows us to get details about the current connection of the current user. We can query some info like the current “estimated” connection link and the round-trip time ( latency ), based on the requests recently observed by the browser.

    All these details can be queried via the Network Information API on the supported browsers. I know it’s not widely adopted yet, but according to caniuse it’s supported by around 70% of browsers globally. It’s not perfect but I think it’s enough, and with time more browsers should end up adding support for it.

    We can query (at this moment) for the following details:

    Property | Value
    downlink |
    downlinkMax (available in workers) |
    rtt | round-trip time in milliseconds
    effectiveType | slow-2g , 2g , 3g , 4g
    type (available in workers) | bluetooth, cellular, ethernet, none, wifi, wimax, other, unknown

    In this post we’ll be focusing on effectiveType, since it’s the attribute that is most widely available in browsers. We need to keep in mind that it is NOT the real connection type of the user, but the current “effective” connection type, meaning that it’s an estimation based on the measured network performance of the previous/current requests. This value is actually calculated based on the maximum download speeds and the minimum RTT values recently observed.

    This means that a user may really be on a fiber connection, connected via Wi-Fi with a very bad link, and effectiveType may report 2g. But since we are talking about the “effective” type, we should be fine.

    This reported value is calculated based on the following table:

    effectiveType (ECT) | Min. RTT | Max. Down
    slow-2g | 2000ms | 50kbps
    2g | 1400ms | 70kbps
    3g | 270ms | 700kbps
    4g | 0ms | inf.
    https://developer.mozilla.org/en-US/docs/Glossary/Effective_connection_type
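
    One simplified way to read the table above is sketched below: going from the slowest type to the fastest, we pick the first row whose minimum RTT is reached or whose maximum downlink is not exceeded. This is only my interpretation for illustration purposes ( the function name estimateEffectiveType is hypothetical ), not the algorithm the browsers actually run.

```javascript
// Thresholds copied from the table above (4g is the fallback)
var ECT_TABLE = [
    { type: 'slow-2g', minRtt: 2000, maxDownKbps: 50 },
    { type: '2g', minRtt: 1400, maxDownKbps: 70 },
    { type: '3g', minRtt: 270, maxDownKbps: 700 }
];

// Simplified reading of the table, NOT the browser's real algorithm.
// Note: connection.downlink is reported in Mbps, so multiply it by 1000 first.
function estimateEffectiveType(rttMs, downlinkKbps) {
    for (var i = 0; i < ECT_TABLE.length; i++) {
        var row = ECT_TABLE[i];
        if (rttMs >= row.minRtt || downlinkKbps <= row.maxDownKbps) {
            return row.type;
        }
    }
    return '4g';
}

console.log(estimateEffectiveType(100, 5000));   // → 4g
console.log(estimateEffectiveType(300, 5000));   // → 3g
console.log(estimateEffectiveType(1500, 5000));  // → 2g
console.log(estimateEffectiveType(2500, 10000)); // → slow-2g
```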

    Code Snippet

    (function() {
        var connection = navigator.connection || navigator.mozConnection || navigator.webkitConnection;
        // The Network Information API is not available on every browser
        if (!connection) {
            return undefined;
        }
        return {
            effectiveType: connection.effectiveType,
            rtt: connection.rtt,
            downlink: connection.downlink
        };
    }
    )();

    onChange Event

    We can also listen for connection info changes, using the following listener:

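    // Only attach the listener where the API is supported
    if (navigator.connection) {
      navigator.connection.addEventListener('change', ()=>{
        window.dataLayer = window.dataLayer || [];
        window.dataLayer.push({
           'event': 'connection-changed'
        });
      });
    }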