Blog

  • Google Analytics 4 (GA4) Events Demystified

    At this point, many (if not all) of us have heard that Google Analytics is moving to an “events”-based tracking model with Google Analytics 4. But what does that really imply? Do we have to worry about it? To be honest, from the implementation side it’s not a big deal, since we have been using “events” all along; we just used to call them hit types. From the reporting side it may lead to some “hard times” when trying to use the data, not because it’s better or worse, but because it’s different.

    This post tries to explain Google Analytics 4 events from a technical perspective: how the current event model works, where the events can come from, the limitations, etc.

    I’d say that one of the most important things when working with GA4 is realizing how important the data model we define at the start is going to be, because it will condition the future of our implementation and data.

    But don’t worry about this for now. We’ll dig into it throughout the post 😊.

    How does Google Analytics 4 record the data

    Google Analytics 4 works much like Universal Analytics.

    We’ll be sending hits (network requests) to a specific endpoint ( https://endpoint.url/collect ). This shouldn’t be new to anyone; that’s how all analytics tools and pixels work. It applies to client-side tracking (gtag.js), server-side tracking (Measurement Protocol), and app tracking (Firebase Analytics SDK).

    Tracking endpoints

    I found there are 5 different endpoints that we could use to send the data to Google Analytics 4, these are:

    • https://www.google-analytics.com/g/collect
    • https://analytics.google.com/g/collect
    • https://custom.domain/g/collect (this will really forward the hits to the first one on this list)
    • https://app-measurement.com
    • https://www.google-analytics.com/mp/collect

    Depending on where we are doing the tracking we’ll be using one of them.

    We could see hits flowing to 4 different endpoints for GA4 + 1 for Firebase

    The first two endpoints are the ones used by client-side tracking, but you may wonder why we sometimes see the hits going through analytics.google.com and other times through the google-analytics.com domain. The reason is that if the current GA4 property has “Enable Google signals data collection” turned on, GA4 will use the *.google.com endpoint (so Google is able to use its own cookies to identify the users, I guess).
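
    As a rough illustration of the list above, here is a small sketch that labels a hit URL with the tracking source it most likely came from. The function name and the labels are made up for this example; only the hosts and paths come from the list, and treating any other host that serves /g/collect as a custom domain is my own assumption.

```javascript
// Hosts and paths taken from the endpoint list in this post; labels are illustrative.
const GA4_ENDPOINTS = [
  { host: 'www.google-analytics.com', path: '/g/collect', source: 'web (gtag.js)' },
  { host: 'analytics.google.com', path: '/g/collect', source: 'web (gtag.js, Google signals on)' },
  { host: 'www.google-analytics.com', path: '/mp/collect', source: 'measurement protocol' },
  { host: 'app-measurement.com', path: null, source: 'app (Firebase SDK)' },
];

// Hypothetical helper: returns a label for the given hit URL.
function classifyGa4Hit(hitUrl) {
  const url = new URL(hitUrl);
  for (const endpoint of GA4_ENDPOINTS) {
    const pathMatches = endpoint.path === null || url.pathname === endpoint.path;
    if (url.hostname === endpoint.host && pathMatches) return endpoint.source;
  }
  // Assumption: any other host exposing /g/collect is a custom first-party domain.
  if (url.pathname === '/g/collect') return 'web (custom domain)';
  return 'unknown';
}
```

    This can be handy when filtering the network tab export of a page to see which kind of tracking is actually firing.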

    JavaScript Client Library

    The page tracking is done using a library provided by Google, the same way we used to have the analytics.js, ga.js, or urchin.js libraries in past Google Analytics versions.

    The default code snippet will look like this:

    <!-- Global site tag (gtag.js) - Google Analytics -->
    <script async src="https://www.googletagmanager.com/gtag/js?id=G-THYNGSTER"></script>
    <script>
      window.dataLayer = window.dataLayer || [];
      function gtag(){dataLayer.push(arguments);}
      gtag('js', new Date());
    
      gtag('config', 'G-THYNGSTER');
    </script>

    As you may have noticed, the snippet loads a JavaScript file from the www.googletagmanager.com domain. This is because all gtag.js snippets are, in essence, a predefined Google Tag Manager template. It’s not just a plain GTM container, since it does some internal stuff, but it also works based on tags, triggers, and variables.

    Previous tracking libraries offered a public API to perform all the tracking on our end, i.e. they accepted some methods/calls and converted them into hits, handled the cross-domain tracking, and allowed us to use Tasks, while also taking care of generating the cookies and reading the browser details. That single library was shared across all users worldwide.

    It no longer works this way. Now each Data Stream / Measurement ID has its own snippet and loads a separate JS file. We may look at this as a performance penalty, but it’s done this way for a reason.

    Each gtag.js container is now built dynamically on Google’s end. It contains personalized code for the current property and also holds the settings for the current Data Stream / Measurement ID. That’s why the size is different for each container we check. Don’t worry, this is normal and expected. The container size will vary depending on many things, such as whether we have the Enhanced Measurement features enabled or the settings we defined in the admin interface for our property.

    GA4 Containers Sizes

    One thing that confused me when Google Analytics 4 arrived was thinking that lots of things were happening in the background that were almost impossible to debug, like the conversions or the created/modified events.

    Well, that’s not the way it works. Almost any setting or feature you enable in the admin is translated into code and executed client-side. This means that when you add a new event in the interface, some code is added to the gtag.js container to send that event, so you “may” end up seeing “ghost” events in the browser. Don’t waste your time, as I did, trying to figure out why your implementation is firing duplicated events :). The same happens, for example, when we define a conversion event, configure our internal domains, or set the ignored referrals.

    While this approach may help some people with common tracking tasks, it also prevents some advanced implementations, because some “loved” features like “customTasks” are now missing. I’m OK with Google trying to control how things are done, but there will always be sites that need custom/personalized implementations, and I really feel that Google should provide some public, documented API methods to easily perform the most common tasks, like cross-domain tracking, in Google Analytics 4.

    Let’s see some examples. When you “create a new event” from the Admin Interface, this event won’t be created server-side; what is happening is that GA4 adds some code logic to send that hit client-side.

    Google Analytics 4 events creation modal

    Another example is Enhanced Measurement: enabling it adds some code to your container. Remember we mentioned that GA4 is in essence a Google Tag Manager container? If you take a look at the current measurement categories, you’ll notice that they all match the triggers available in GTM (click tracking, scroll tracking, YouTube tracking).

    Enhanced measurement

    And that’s not all. When we change the session duration or the engagement time, some session_timeout variables are updated internally (engagementSeconds, sessionMinutes, sessionHours).

    Session Timeout Adjust

    We could keep going with examples, or build a full list, but that’s likely to get outdated sooner rather than later. The main idea to take from this part of the post is that GTAG is like a “predefined” GTM template and that all the tracking happens in the client’s browser.

    Firebase Analytics SDK

    Apps are usually tracked using the Firebase Analytics SDK. A good starting point is the following URL: https://firebase.google.com/docs/analytics/get-started?platform=android&hl=en

    App hits use their own endpoint and format: they go to https://app-measurement.com and the payload is sent in a binary format, which makes it really difficult to debug, even when using Charles, Fiddler, or any other MITM proxy app.

    If you want to debug your Firebase implementation, I recommend you use my Android Debugger for Windows. Once you install the app, you’ll be able to request a free lifetime license.

    Android Debugger Splash Screen

    Google Analytics 4 Measurement Protocol

    Google Analytics finally offers a proper Measurement “Protocol”, which at the time of writing this post is in Beta.

    This protocol uses the https://www.google-analytics.com/mp/collect endpoint. Rather than having developers build the request payloads using non-intuitive keys, it now accepts a POST request with a JSON string in the body, sent with the application/json Content-Type:

    fetch('https://www.google-analytics.com/mp/collect?measurement_id=G-THYNGSTER&api_secret=12zneF6DSDFSDFjJPgDAzzQ', {
      method: "POST",
      headers: {
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        "client_id": "12345678.87654321",
        "user_id": "RandomUserIdHash",
        "events": [{
          "name": "follow_me_at_twitter",
          "params": {
            "twitter_handle": "@thyng",
            "value": 7.77
          }
        },{
          "name": "follow_intent",
          "params": {
            "status": "success"
          }
        }]
      })
    });

    Key                    Type   Notes
    client_id              str    Required.
    user_id                str    Optional.
    timestamp_micros       int    Optional. Hit offset. Up to 3 days (2.592e+11 microseconds) in the past, relative to the current property’s defined timezone.
    user_properties        {}     Optional.
    non_personalized_ads   bool   Optional. (Whether to use this event for ads personalization.)
    events[]               []     Required. (Max 25 events per request.)
    events[].name          str    Required.
    events[].params        {}     Optional.

    In any case, there are some things to keep in mind. You should not expose your API Secret, which means this endpoint should not be used client-side. It is more likely to be used to track offline interactions (like refunds) or to track our transactions server-side.
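
    To make the rules above a bit more tangible, here is a minimal sketch of a helper that builds the request body on the server and checks the few constraints listed in this section (client_id required, at least one named event, max 25 events). The function name and the error messages are hypothetical; this is not part of any Google library.

```javascript
const MAX_EVENTS_PER_REQUEST = 25; // limit quoted in the table of keys

// Hypothetical helper: returns the JSON string to use as the POST body.
function buildMeasurementProtocolBody({ clientId, userId, events }) {
  if (!clientId) throw new Error('client_id is required');
  if (!Array.isArray(events) || events.length === 0) {
    throw new Error('at least one event is required');
  }
  if (events.length > MAX_EVENTS_PER_REQUEST) {
    throw new Error(`max ${MAX_EVENTS_PER_REQUEST} events per request`);
  }
  for (const event of events) {
    if (!event.name) throw new Error('every event needs a name');
  }
  const body = { client_id: clientId, events };
  if (userId) body.user_id = userId; // optional key, only added when present
  return JSON.stringify(body);
}
```

    The returned string can be passed as the body of the fetch call shown earlier, keeping the api_secret in server-side configuration only.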

    At the time of writing this post (Apr 2022), one of the biggest handicaps of this protocol is that it doesn’t support any sessionId parameter, meaning that you won’t be able to stitch the server-side hits to the client-side session. This should be fixed over the coming months.

    In the meantime, I’ve published the GA4 Payload Parameters CheatSheet, which you can use to send server-side hits the old-school way (like we used to do with the first Measurement Protocol for Universal Analytics), attaching the “&sid” parameter.

    There are, of course, some other points to keep in mind, such as GA4 having some reserved event and parameter names that you should not use. We’ll cover this later in the “events” section.

    Events Model / Hit Types

    Let’s start by saying that everything in Google Analytics 4 is an “event”. I’m sure it’s not the first time you’ve heard that, and it’s totally right. But if we look strictly at Universal Analytics, we were also sending “events”; we just used to call them “hit types”.

    In technical terms, nothing has changed at all. We have network requests to some endpoints. That’s it! If you want to learn a bit more about how the hits are built and sent from the web tracking library, take a look at the GA4: Google Analytics Measurement Protocol version 2 post.

    The main difference in GA4 is that Google no longer offers a fixed tracking data model beyond page_views and e-commerce, meaning that the responsibility for building a proper data model falls on us. While working on our definition, we need to keep in mind that there are some predefined/reserved event and parameter names, and some limits to take into account (total events, and name and value lengths).

    Universal Analytics Hit Types Model

    If we take a closer look, we’ve been using “events” for our tracking in Google Analytics since the Urchin days. Yep, I’m not joking: we had them, we just called them “hit types”.

    Just so you know, we could replicate the current Universal Analytics data model in Google Analytics 4 using the following table of events:

    Hit Type / Event                   Parameters
    pageview                           Location, Path, Title
    event                              Category, Action, Label, Value, Non Interaction
    timing                             Category, Variable, Label, Value
    social                             Network, Action, Opt. Target
    exception                          Description, Fatal
    screenview                         Screen Name
    transaction ( Legacy Ecommerce )   Id, Affiliation, Revenue, Tax, Shipping, Coupon
    item ( Legacy Ecommerce )          Id, Name, Brand, Category, Variant, Price, Quantity

    Google even offers a setting that will automatically convert your ga() calls into some predefined events in GA4. You can enable this feature from your Data Stream configuration, and all event, timing, and exception hits will be converted to GA4 events (it adds a listener to the ga('send', 'event|exception|timing') calls to do this).

    This tool will map the data in the following way:

    Event Name        Parameters
    [event_name]      This will take the current eventAction
                      eventCategory > event_category
                      eventAction > event
                      eventLabel > event_label
                      eventValue > value
    timing_complete   timingCategory > event_category
                      timingLabel > event_label
                      timingValue > value
                      timingVar > name
    exception         exDescription > description
                      exFatal > fatal

    Beware: since it converts every Event Action into an “event”, depending on your current event definitions in Universal Analytics you may end up hitting the unique event names limit (500).
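
    The mapping in the table can be sketched as a small function. This is only my reading of the table, not Google’s actual code, and the function name is made up for the example. It also shows why the unique event names limit is easy to hit: every distinct eventAction becomes its own event name.

```javascript
// Sketch of the UA -> GA4 mapping described in the table (hypothetical helper).
function mapUniversalHitToGa4(hitType, fields) {
  switch (hitType) {
    case 'event':
      return {
        name: fields.eventAction, // the GA4 event name is taken from the eventAction
        params: {
          event_category: fields.eventCategory,
          event_label: fields.eventLabel,
          value: fields.eventValue,
        },
      };
    case 'timing':
      return {
        name: 'timing_complete',
        params: {
          event_category: fields.timingCategory,
          event_label: fields.timingLabel,
          value: fields.timingValue,
          name: fields.timingVar,
        },
      };
    case 'exception':
      return {
        name: 'exception',
        params: { description: fields.exDescription, fatal: fields.exFatal },
      };
    default:
      return null; // other hit types are not covered by this setting
  }
}

// Counts how many distinct GA4 event names a list of UA event hits would produce.
function countUniqueEventNames(uaEventHits) {
  return new Set(uaEventHits.map((hit) => mapUniversalHitToGa4('event', hit).name)).size;
}
```

    Running countUniqueEventNames over an export of your existing Universal Analytics event actions gives a quick idea of how close you would be to the 500 limit.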

    Google Analytics 4 Events

    Event Sources

    The events on Google Analytics 4 can come from 4 different sources. These are:

    • Public Web/App endpoint.
    • Measurement Protocol ( Server Side )
    • Internal self-generated events
    • Admin defined events

    Public Web Endpoint

    We’ve already talked about the main origin of GA4 events. These are the events generated on our site by the GTAG.js container ( check the GA4 Payload Parameters CheatSheet here ).

    Measurement Protocol ( Server Side )

    Another source for our events is the Measurement Protocol. It works similarly to the public endpoint, but the hits are sent server-side and we’ll need to include an API Secret in our requests.

    Internal self-generated Events

    This one can be a bit confusing. GA4 auto-generates some of the events we see in the reports, which means that some events in our reports won’t be seen in our browser.

    This doesn’t mean that they’re being generated randomly or using some server-side logic. Most (if not all) of these events are created because a parameter was added to some event.

    Our event payloads may sometimes have some extra parameters attached that make GA4 internally spawn a separate event. As far as I’ve been able to identify, this is the list of internally generated events and the parameter that triggers each of them.

    Event Name        Trigger
    session_start     &_ss
    first_visit       &_fv
    user_engagement   &seg

    For example, if the current event payload contains a &_ss parameter, a session_start event will be generated; if it contains a &_fv parameter, we should see a first_visit event, and so on. This list may grow in the future (and it may be missing some events that I’ve not been able to spot yet).
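
    As a quick debugging aid, the trigger table can be turned into a lookup. The sketch below takes the query string of a hit and returns the internal events it should spawn, assuming the three triggers listed above are the only ones; the function name is hypothetical.

```javascript
// Trigger parameter -> internally generated event, as listed in the table above.
const INTERNAL_EVENT_TRIGGERS = {
  _ss: 'session_start',
  _fv: 'first_visit',
  seg: 'user_engagement',
};

// Hypothetical helper: lists the internal events a hit payload should generate.
function internalEventsFor(hitQueryString) {
  const params = new URLSearchParams(hitQueryString);
  return Object.keys(INTERNAL_EVENT_TRIGGERS)
    .filter((key) => params.has(key))
    .map((key) => INTERNAL_EVENT_TRIGGERS[key]);
}
```

    Pasting the payload of a page_view hit into it is an easy way to explain why a session_start shows up in the reports without a matching request in the network tab.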

    If we’ve enabled Enhanced Measurement, we may also see some additional events in our reports (this time the events will be visible in the browser requests). These are:

    Event Name            Parameters
    click                 link_id, link_classes, link_url, link_domain, outbound
    file_download         link_id, link_text, link_url, file_name, file_extension
    video_play            video_url, video_title, video_provider, video_current_time,
    video_pause           video_duration, video_percent, visible
    video_seek            (all the video events share these parameters)
    video_buffering
    video_progress
    video_complete
    view_search_results   search_term
    scroll                percent_scrolled
    page_view             page_referrer ( URL and Title are Shared Parameters )


    On the other hand, the Firebase Analytics SDK will automatically track a lot of events without us needing to explicitly define them.

    Here is the current list of event names auto-generated by Firebase:

    ad_activeview             APP
    ad_click                  APP
    ad_exposure               APP
    ad_impression             APP
    ad_query                  APP
    adunit_exposure           APP
    app_clear_data            APP
    app_install               APP
    app_update                APP
    app_remove                APP
    error                     APP
    first_open                APP
    in_app_purchase           APP
    notification_dismiss      APP
    notification_foreground   APP
    notification_open         APP
    notification_receive      APP
    os_update                 APP
    screen_view               APP
    user_engagement           APP,

    Note: These events will not count towards the unique event names limit.

    Admin defined events

    We’ve already talked about these. When we create or modify an event within the admin section, these settings are translated into the client-side tracking.

    This means the following:

    • You may see events being fired in the browser that you didn’t define in Google Tag Manager or GTAG. This is normal, don’t go crazy over it. If you see a duplicate event or a new event and you don’t know where it’s coming from, take a look at the Data Stream settings.
    • You may have some unexpected parameters or event names if a “modify” rule is being used.

    Events Limitations

    Google Analytics 4 is full of limitations in many areas, and it’s a bit difficult to keep track of them all, even more so when the limits keep changing.

    We have limits on the length of event names and values, and the same goes for event parameters and user properties. We also have a limit on how many parameters and properties we can attach to each event. These limits may vary between the free and 360 versions.

    There are also some export limitations (the free version is capped at 1M daily hits exported to BigQuery) and data retention limits, where the free version tops out at 14 months while 360 allows holding up to 50 months of data.

    But these are not all the limits we’ll have: there are also limits on the total conversions, audiences, insights, and funnels we can set. These are not directly related to events, so if you’re interested you can visit the official Configuration Limits Information.

    Collecting and Names Limitations

    We can attach up to 25 event parameters (100 on GA4 360) to each event. We can easily identify these values in our hits: they are the ones whose keys match “^ep(|n).*”. Event parameters are meant to add some metadata to our events.

    ep.event_origin: gtag

    Each of these parameters should have a name no longer than 40 characters and a value no longer than 100 characters.

    At the same time, we have the “user properties”. We can attach up to 25 user properties to each hit; these are attributes that describe segments of our users. For example, we could record the current user’s newsletter sign-up status, or the total purchases made by the current user. We can identify this data in our hits because the keys match “^up(|n).*”:

    up.newsletter_opt_in: yes
    upn.user_total_purchases: 43

    Each of these properties should have a name no longer than 24 characters and a value no longer than 36 characters.

    Logged item         Limit                   Free             360
    Events              Event Name              40 chars
                        Event parameter Name    40 chars
                        Event parameter Value   100 chars
                        Params per event        25               100
    User properties     Total per Property      25
                        Property Name           24 chars
                        Property Value          36 chars
    User-ID                                     256 characters
    Custom dimensions   Event Scope             50               125
                        Item Scope              10
                        User Scope              25               100
    Custom Metrics      Event Scope             50               125
    Events Offset                               3 days
    Full Limits Table
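
    Since these limits are easy to forget, it can help to check them before an event leaves the site. Below is a minimal sketch of such a check using the free-version numbers from the table; the function and constant names are hypothetical, and the limits may change, so treat the numbers as a snapshot.

```javascript
// Free-version limits copied from the table above (they may change over time).
const FREE_LIMITS = {
  eventNameLength: 40,
  paramNameLength: 40,
  paramValueLength: 100,
  paramsPerEvent: 25,
};

// Hypothetical helper: returns a list of human-readable problems (empty when all is fine).
function checkEventLimits(eventName, params = {}, limits = FREE_LIMITS) {
  const problems = [];
  if (eventName.length > limits.eventNameLength) {
    problems.push(`event name too long: ${eventName}`);
  }
  const entries = Object.entries(params);
  if (entries.length > limits.paramsPerEvent) {
    problems.push(`too many params: ${entries.length}`);
  }
  for (const [name, value] of entries) {
    if (name.length > limits.paramNameLength) problems.push(`param name too long: ${name}`);
    if (String(value).length > limits.paramValueLength) problems.push(`param value too long: ${name}`);
  }
  return problems;
}
```

    For a 360 property you could pass a different limits object as the third argument, for example with paramsPerEvent set to 100.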

    Event Values Typing

    You may have noticed that some of the parameters start with up, ep, upn, or epn. This is because an event parameter/user property can be either a string or a number. The good news is that we don’t need to declare the type, since GA4 types them automatically. Just take a look at the logic used to decide whether a parameter is a string or a number:

    var value = 'something';
    if(typeof(value) === "number" && !isNaN(value)){
        console.log("is a number parameter");   
    }else{
        console.log("is a string parameter");
    }
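
    Building on that check, here is a small sketch of how the ep/epn and up/upn prefixes could be derived for a set of values. The helper names are invented for this example, and the real container may do more than this.

```javascript
// scope is 'ep' for event parameters or 'up' for user properties.
function toPayloadKey(scope, name, value) {
  const isNumber = typeof value === 'number' && !isNaN(value);
  // Numeric values get the extra "n" (epn. / upn.), everything else is a string.
  return `${scope}${isNumber ? 'n' : ''}.${name}`;
}

// Hypothetical helper: converts a plain object into prefixed payload keys.
function encodeParams(scope, params) {
  const encoded = {};
  for (const [name, value] of Object.entries(params)) {
    encoded[toPayloadKey(scope, name, value)] = String(value);
  }
  return encoded;
}
```

    Note that a numeric string such as "43" stays a string parameter under this logic; only real JavaScript numbers get the numeric prefix.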

    SGTM – Google Analytics 4 Hits

    The last thing I want to call out is that GA4 hits sent via Server-Side Google Tag Manager are able to do two things that we won’t see on regular hits.

    The first is that hits sent server-side are able to set first-party cookies in the user’s browser. This is achieved by adding a Set-Cookie header to the response.

    The second is that they may contain a response body, which is used to send some pixels back client-side. I.e., SGTM builds up a pixel request and hands it back to the browser to be sent from there, for example when it is missing some third-party cookie value (where sending it server-side wouldn’t make any difference).

    More Questions

    How can I identify a conversion?

    If the current event has a &_c=1 parameter, it will be counted as a conversion.

    Are there any e-commerce limits?

    Yes, there are, as far as I’ve been able to deduce from the code.

    • A max of 200 items can be sent within a single event; any item above that will be skipped
    • A max of 10 item-scoped parameters; any parameter above this limit will be removed from the item

    It takes some seconds to see my hits

    Google Analytics 4 can delay firing the hits by up to 5 seconds. This is because it uses an internal queue to batch the events and save some hit requests. At this time there is no way to “force” the queue dispatch, although there are some situations where the queue is skipped and the events are sent right away. This happens, for example, the first time a visitor comes to your site (i.e. when there’s no cookie present).

    Why can’t I use any of my parameters on the reports?

    You can send ANY parameters along with your events, but this doesn’t mean that you’ll be able to use them in your exploration reports. This can be confusing, because while you’ll see the parameters in the Real-Time reports, you’ll need to set them up as dimensions in the admin in order to use them. If you think about it, it makes sense: the real-time report is just a streaming report where no data is being parsed/processed at all, and we cannot expect GA4 to process all the data coming with the events, so it will only process the parameters that we’ve configured. We need to set them up in the Custom Definitions section.

    I’ve set-up my dimensions, but they show no data

    I’m not sure if this is only me, but it drove me crazy at times. If you add a new event with some parameters and then go straight to adding the dimension in the admin, the parameters won’t show up in the list, but you’ll be able to type the parameter name manually. Every time I did this, I got no data for that dimension. My advice is to wait some hours before creating the custom definition, and only create it once the parameter is offered for selection (rather than typing it manually). If you did it wrong, the only solution that worked for me was archiving the dimension and re-creating it.

  • Cross-Domain Tracking on Google Analytics 4 (GA4)

    Now that the Universal Analytics deprecation has been officially announced, it’s time to start writing some technical posts about how things work in Google Analytics 4.

    The first one is going to be about how Cross-Domain Tracking works (not only for GA4 but for any GTAG.JS-based tool).

    Some time ago (about three years back) I noticed a new “_gl” parameter being attached to some links by Google Analytics. I’d say it stands for “Google Linker”, given that it supports cross-domain tracking not only for Google Analytics but also for all the other pixels based on the gtag containers (AdWords, DoubleClick). At that point I started to reverse engineer the code to see how it works. If you’re interested, you can take a quick look at a draft of the notes I wrote at that time in this Gist:

    To date, Google has only officially published one way to perform cross-domain tracking: adding our domains in the Data Stream configuration section ( Admin > Data Streams > Select the stream > More Tagging Settings > Configure your domains ).

    We may think that some things are happening in the backend, but what is really happening when we add some configuration in our Admin section is that the GTAG container being served gets some extra settings.

    In this case, when we add some domains to our configuration, the GA4 GTAG container adds a click/submit listener with some conditions based on the domains we’ve added. If you’re used to working with Google Tag Manager, think of this as adding a Click Listener and a Form Submit trigger with a triggering condition based on this domain list.

    Please keep in mind that this will also affect the built-in outbound link tracking in Google Analytics 4 (it will not trigger the events if the link’s domain name is within the list defined in the admin side of our Data Stream).

    If you read between the lines of the previous paragraphs, you may have guessed that there must be some code in GTAG.js taking care of the cross-domain linking, and this means that we can reverse it.

    How the Cross Domain Tracking works on the GA4/GTAG

    The first thing is to look at what the new linker parameter looks like in GA4:

    _gl=1*tp0qzs*_ga*OTYxNDI4MjA4LjE2NDg1NzM2OTM.*_ga_RNYCK86MYK*MTY0OTE3NjMxMy41LjEuMTY0OTE4MDM1OC4w

    We can easily see that this can be split into several values using the “*” character:

    Key              Value
    1                tp0qzs
    _ga              OTYxNDI4MjA4LjE2NDg1NzM2OTM
    _ga_RNYCK86MYK   MTY0OTE3NjMxMy41LjEuMTY0OTE4MDM1OC4w

    The first value (if I’m not wrong) is a fingerprint based on the current user agent and browser plugins, plus a time-based hash. It checks that the browser receiving the linker is the same one that generated it, and that it was not generated long ago (it used to have a 2-minute expiration time on the Universal Analytics linker). This is done to prevent cookies from being overridden by mistake because someone shares a link with a linker value in it.

    Please note that we will have as many _ga_[A-Z]{6,9} keys as GA4 streams being used on our website, and that the linker will also hold some other cookie values, like the AdWords or DoubleClick ones. This will vary depending on your current setup. For now, we’ll focus on the Google Analytics 4 ones.
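
    To see what is inside those values, here is a rough sketch that splits a _gl value and decodes each cookie entry. The decoding step is my own observation from the sample above (the values look like URL-safe base64 with “.” used instead of “=” padding), not documented behaviour, and the function names are made up. It does not validate the fingerprint.

```javascript
// Assumption: values are URL-safe base64 where "." replaces the "=" padding.
function decodeLinkerValue(encoded) {
  const base64 = encoded.replace(/-/g, '+').replace(/_/g, '/').replace(/\./g, '=');
  return typeof atob === 'function'
    ? atob(base64)
    : Buffer.from(base64, 'base64').toString('utf8');
}

// Hypothetical helper: splits "1*fingerprint*key*value*key*value..." into its parts.
function parseLinkerParam(gl) {
  const [version, fingerprint, ...rest] = gl.split('*');
  const cookies = {};
  for (let i = 0; i + 1 < rest.length; i += 2) {
    cookies[rest[i]] = decodeLinkerValue(rest[i + 1]);
  }
  return { version, fingerprint, cookies };
}
```

    Applied to the sample linker, the _ga entry decodes to the same client ID that appears in the _ga cookie shown later in this post.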

    GA4 Cookies Info

    If we look back at Google Analytics history, we used to have the __utma, __utmb, __utmc, and __utmz cookies for Urchin (urchin.js) and the first Google Analytics version (ga.js). At that time, all the client/session tracking (a, b, c cookies) and the attribution info (z cookie) were calculated client-side. Then, when Universal Analytics was released, all this logic was moved server-side and Google switched to using just one cookie (_ga) to hold the clientId (&cid).

    Now it seems we have switched to a hybrid method, where the session calculation is done client-side again and the attribution is calculated server-side. That’s why we have a new cookie (in addition to the _ga one) that is used to hold the current session_id (&sid), session_count (&sct), and session_time, and, based on these, to report the session_start (&_ss) and first_visit (&_fv) internal events.

    Let’s take a look at a typical Google Analytics 4 cookie set:

    _ga:             GA1.1.961428208.1648573693
    _ga_THYNGSTER:  GS1.1.1649176313.5.1.1649181571.0

    There have not been any changes to our well-known “_ga” cookie. It still holds the current hostname segments count, a random ID, and lastly the time when the cookie was first created. The new one is the stream cookie (_ga_THYNGSTER in the example), and here is a table showing its current values (I need to double-check 2 of the values to be sure what they are for, which is why I’ve set them as TBD):

    Value        Description
    GS1          Fixed String: Google Stream 1
    1            Current hostname segments count
    1649176313   Current Session Start Time
    5            Sessions Count
    1            TBD
    1649181571   Last hit Time
    0            TBD

    This is how GA4 is able to determine when a session starts, the session duration, and the current session count; by checking the _ga cookie, it is even able to determine the first visit time.
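
    Following the table, the stream cookie can be read with a few lines of code. This is just a sketch based on the field positions listed above; the two TBD fields are skipped, the function name is hypothetical, and I’m assuming the timestamps are Unix seconds, which is what the sample values look like.

```javascript
// Hypothetical helper: parses a value like "GS1.1.1649176313.5.1.1649181571.0".
function parseGa4SessionCookie(value) {
  // Positions follow the table above; index 4 and 6 are the TBD fields.
  const [prefix, hostnameSegments, sessionStart, sessionCount, , lastHit] = value.split('.');
  return {
    prefix,
    hostnameSegments: Number(hostnameSegments),
    sessionStart: new Date(Number(sessionStart) * 1000), // assumed Unix seconds
    sessionCount: Number(sessionCount),
    lastHit: new Date(Number(lastHit) * 1000), // assumed Unix seconds
  };
}

// Seconds elapsed between the session start and the last recorded hit.
function sessionLengthSeconds(value) {
  const parsed = parseGa4SessionCookie(value);
  return (parsed.lastHit.getTime() - parsed.sessionStart.getTime()) / 1000;
}
```

    For the sample cookie above, the difference between the last hit and the session start gives a rough idea of how long the session has been running.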

    Looking Inside the Google Analytics 4 Cross-Domain Tracking

    If we take a look at the official docs, there’s not much info about how to customize the cross-domain tracking, besides telling us to add our domains in the admin.

    This can be an issue/limitation if our setup is not just based on clicks or form submits. Some examples I can think of are wanting to do cross-domain linking into an iframe, or a site that redirects to a dynamically generated destination page (for example, forms that do some validation and then redirect the user to a search listing page, without doing any form submit at all).

    These situations won’t be handled by GA4, and in these cases we’d want to build the linkerParam for GA4 ourselves so we can attach it wherever we want.

    The good news is that, even if it’s not documented, we can use some global variables to grab the data, and even make use of some available helpers to generate the linker, decorate links, or anything else we want 🙂

    The first thing we’ll learn about is the window.google_tag_data.gl object. This global variable holds ALL the cross-domain config and information, i.e. the current data model for the Google Linker config.

    It looks like this:

    {
        "decorators": [
            {
                "domains": [
                    /outgoing\.com/
                ],
                "fragment": false,
                "placement": 1,
                "forms": true,
                "sameHost": false
            }
        ],
        "data": {
            "query": {},
            "fragment": {}
        },
        "init": true
    }

    decorators [:decorator]   Holds the current decorators for the cross-domain tracking. We may have more than 1 decorator, since an AdWords gtag container may add one and GA4 another.
    :decorator.domains        An array of regexes to be matched against the current clicked link
    :decorator.fragment       Should we attach the _gl parameter to the fragment (#)
    :decorator.forms          Is this decorator for “submit” events
    :decorator.placement      The current order preference for applying the decorator
    :decorator.sameHost       N/A
    data {}                   Current Linker Info
    data.query                Cross-linking parameters read from the QueryString ( decoded )
    data.fragment             Cross-linking parameters read from the Fragment ( decoded )

    It’s simpler than it looks at first. This variable lets us know if there are any “decorators” configured (a decorator is meant to “add” the linkerParam (_gl) to any link or form that matches the domains on its list).

    Also, if there was a valid linker parameter on the current page load, the data object will show us which clientIds (cookies) were overridden. Nice!


    Note: In the event of an invalid linker (if it was generated on a different browser or some minutes ago), the linker won’t work and the data here will be empty.
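
    To illustrate how a decorator list like the one above could be evaluated, here is a small sketch that checks whether a destination URL matches any configured decorator. It is an approximation for debugging purposes: I’m assuming the regexes are tested against the link’s hostname, and the function name is hypothetical, so the real gtag.js logic may differ.

```javascript
// Hypothetical helper: would any decorator add the _gl parameter to this URL?
function shouldDecorate(decorators, linkUrl, isFormSubmit = false) {
  const hostname = new URL(linkUrl).hostname;
  return decorators.some((decorator) => {
    // Decorators flagged with forms: false are skipped for form submits.
    if (isFormSubmit && !decorator.forms) return false;
    // Assumption: each regex is matched against the hostname only.
    return decorator.domains.some((pattern) => pattern.test(hostname));
  });
}
```

    In a browser you could call it with window.google_tag_data.gl.decorators as the first argument, after checking that the object exists.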

    Trick #1 – Adding a new domain name for the auto-decoration

    There’s a small trick you can do to add new domains dynamically to GTAG Decorator. Remember we said that the current settings on the Admin were reflected on some code in our GTAG containers?, after checking the current code, we can push new domains programatically from JS, just like this:

    if(window.google_tag_data && window.google_tag_data.gl && window.google_tag_data.gl.decorators && window.google_tag_data.gl.decorators.length > 0){
         window.google_tag_data.gl.decorators[0].domains.push(/analytics-ninja\.com/)
    }

    glBridge Utilities

    In case we want to grab the current linker so we can add it ourselves to any link, there is also some good news: analytics.js exposes some utilities for performing this task.

    The utils are available on the window.google_tag_data.glBridge object

    As you can see, they are the same ones we used to have on Universal Analytics for setting the autoLinking, the decoration of links, and the linkerParam generation. We are just focusing on the generate and get ones: the first one is equivalent to getLinkerParam, and the second one allows us to “unhash” the linker values.

    google_tag_data.glBridge.generate({})

    This function takes an object of clientIds values as an argument and returns a valid “_gl” linker value that we could attach to our links.

    window.google_tag_data.glBridge.generate({
        _ga: '121321321321.2315648466',      
        _ga_THYNGSTER: '1649176313.5.1.1649183273.0'
    });

    As you can see, we’ll need to grab our current cookie values and just pass them to the function, and it will return our precious linkerParam 🙂
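
    Grabbing the cookie values could look roughly like the sketch below. The getLinkerCookies helper is hypothetical, and it assumes that the first two dot-separated chunks of each cookie ( like "GA1.1." or "GS1.1." ) are a version prefix that has to be dropped to get values like the ones in the example above.

```javascript
// Hypothetical helper: builds the object expected by glBridge.generate()
// from a raw cookie string such as document.cookie.
function getLinkerCookies(cookieString) {
  var ids = {};
  cookieString.split(';').forEach(function (chunk) {
    var separator = chunk.indexOf('=');
    if (separator === -1) return;
    var name = chunk.slice(0, separator).trim();
    var value = chunk.slice(separator + 1).trim();
    // Only the _ga and _ga_<ID> cookies are handled in this sketch
    if (!/^_ga(_|$)/.test(name)) return;
    // Assumption: the first two dot-separated chunks are a version prefix,
    // not part of the ID itself
    var parts = value.split('.');
    if (parts.length > 2) ids[name] = parts.slice(2).join('.');
  });
  return ids;
}

var ids = getLinkerCookies(
  '_ga=GA1.1.121321321321.2315648466; other=1; _ga_THYNGSTER=GS1.1.1649176313.5.1.1649183273.0'
);
console.log(ids);
```

    In the browser the result would then be passed along as window.google_tag_data.glBridge.generate(getLinkerCookies(document.cookie)).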


    google_tag_data.glBridge.get()

    This one is also pretty self-explanatory: it grabs the linker param from the current URL ( if present ) and returns the decoded client cookie/ID values held in the linker.
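
    Just to illustrate what “decoding” means here, this is a rough sketch that splits a linker value and decodes its key/value pairs. The parseLinkerParam and decodeLinkerValue names are hypothetical, and the sketch does not validate the fingerprint at all, so unlike the real get() it will happily decode an expired or foreign linker. It relies on the layout used further down in this post: a version, a fingerprint and then pairs of name/encoded value joined by “*”, encoded with a web-safe base64 alphabet where “.” is the padding character.

```javascript
// Decode one linker value: "-" and "_" replace "+" and "/", and "." is the padding
function decodeLinkerValue(encoded) {
  var standard = encoded.replace(/-/g, '+').replace(/_/g, '/').replace(/\./g, '=');
  return typeof atob === 'function'
    ? atob(standard)
    : Buffer.from(standard, 'base64').toString('binary');
}

// Hypothetical helper: splits "1*fingerprint*name*value*name*value..."
function parseLinkerParam(linker) {
  var parts = String(linker).split('*');
  var result = { version: parts[0], fingerprint: parts[1], values: {} };
  for (var i = 2; i + 1 < parts.length; i += 2) {
    result.values[parts[i]] = decodeLinkerValue(parts[i + 1]);
  }
  return result;
}

var parsed = parseLinkerParam('1*abc123*_ga*YWJj*_gid*YWI.');
console.log(parsed);
```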

    Advice

    Please note that analytics.js ( Universal Analytics ) is likely to be gone in about 1.5 years. I don’t expect Google to remove the analytics.js library; more likely it will just stop processing hits at some point ( or maybe modify the library so it doesn’t fire hits at all ). At the moment the gtag.js container doesn’t expose these bridge functions, but it may do so in the near future.

    If you’re looking for some guidance about how to implement this functionality without relying on that library, I’m providing some examples ( reversed from the analytics.js code ).

    If the current linkerParam value is “invalid” ( i.e. it was not generated from the same browser or it was generated long ago ), this function will just return an empty object {}

    NOTE: I’m working on a library that fully replicates this analytics.js Google Linker Bridge functionality, to have a future-proof solution for when Universal Analytics is sunset. It will be published in the coming weeks.

    If you’re interested in all this, I’m publishing some proof-of-concept functions that you could use as a base for your coding. This code would need to be adapted to support AdWords, DoubleClick, multiple GA4 cookies, Google Signals IDs ( _gid cookie ) and Google Remarketing cookies ( _gac ) before it could be called a good replacement. But at this point I’m offering these snippets ( all of them reversed/copied from the analytics.js source code ):

    var decrypt_cookies_ids = function(a, b) {
            // Copy the own properties of b into a
            var m = function(a, b) {
                for (var c in b) b.hasOwnProperty(c) && (a[c] = b[c])
            };

            // Return window.google_tag_data.gl, creating it when it doesn't exist yet
            var H = function() {
                var a = {};
                var b = window.google_tag_data;
                window.google_tag_data = void 0 === b ? a : b;
                a = window.google_tag_data;
                b = a.gl;
                b && b.decorators || (b = {
                    decorators: []
                }, a.gl = b);
                return b
            };

            // Nothing in this snippet parses the URL, so when the linker data is
            // missing we only create the empty structure ( b is not used as an input )
            b = H();
            b.data || (b.data = {
                query: {},
                fragment: {}
            });
            var c = {};
            // Merge the query values and, when a is truthy, the fragment ones too
            if (b = b.data) m(c, b.query), a && m(c, b.fragment);
            return c
        };
    var generateLinkerParam = function(a) {
        // Function to properly grab ID's from Cookies
        var getCookiebyName = function(name) {
            var pair = document.cookie.match(new RegExp(name + '=([^;]+)'));
            return !!pair ? pair[1].match(/GA1\.[0-9]\.(.+)/)[1] : undefined;
        };
    
        // These are the 3 values used by the new linker
        var cookies = {
            _ga: getCookiebyName("_ga"),
            // Google Analytics GA ID
            _gac: undefined,
            // Google Remarketing
            _gid: getCookiebyName("_gid")// Google ID
        };
    
        // Calculate current browser_fingerprint based on UA, time, timezone and language
        // ( a CRC32 checksum returned as a base 36 string )
        var browser_fingerprint = (function(a, b) {
            // Cached lookup table for the checksum. It has to start out undefined,
            // otherwise the table below is never built
            var F;
            a = [window.navigator.userAgent, (new Date).getTimezoneOffset(), window.navigator.userLanguage || window.navigator.language, Math.floor((new Date).getTime() / 60 / 1E3) - (void 0 === b ? 0 : b), a].join("*");
            if (!(b = F)) {
                b = Array(256);
                for (var c = 0; 256 > c; c++) {
                    for (var d = c, e = 0; 8 > e; e++)
                        d = d & 1 ? d >>> 1 ^ 3988292384 : d >>> 1;
                    b[c] = d
                }
            }
    
            F = b;
            b = 4294967295;
            for (c = 0; c < a.length; c++)
                b = b >>> 8 ^ F[(b ^ a.charCodeAt(c)) & 255];
            return ((b ^ -1) >>> 0).toString(36);
        }
        )();
    
        // Function to hash the cookie value
        // The following functions takes a string and returns a hash value.
        var hash_cookie_value = function(val) {
            var A, C, D = function(a) {
                A = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.";
                C = {
                    "0": 52,
                    "1": 53,
                    "2": 54,
                    "3": 55,
                    "4": 56,
                    "5": 57,
                    "6": 58,
                    "7": 59,
                    "8": 60,
                    "9": 61,
                    "A": 0,
                    "B": 1,
                    "C": 2,
                    "D": 3,
                    "E": 4,
                    "F": 5,
                    "G": 6,
                    "H": 7,
                    "I": 8,
                    "J": 9,
                    "K": 10,
                    "L": 11,
                    "M": 12,
                    "N": 13,
                    "O": 14,
                    "P": 15,
                    "Q": 16,
                    "R": 17,
                    "S": 18,
                    "T": 19,
                    "U": 20,
                    "V": 21,
                    "W": 22,
                    "X": 23,
                    "Y": 24,
                    "Z": 25,
                    "a": 26,
                    "b": 27,
                    "c": 28,
                    "d": 29,
                    "e": 30,
                    "f": 31,
                    "g": 32,
                    "h": 33,
                    "i": 34,
                    "j": 35,
                    "k": 36,
                    "l": 37,
                    "m": 38,
                    "n": 39,
                    "o": 40,
                    "p": 41,
                    "q": 42,
                    "r": 43,
                    "s": 44,
                    "t": 45,
                    "u": 46,
                    "v": 47,
                    "w": 48,
                    "x": 49,
                    "y": 50,
                    "z": 51,
                    "-": 62,
                    "_": 63,
                    ".": 64
                };
                for (var b = [], c = 0; c < a.length; c += 3) {
                    var d = c + 1 < a.length
                      , e = c + 2 < a.length
                      , g = a.charCodeAt(c)
                      , f = d ? a.charCodeAt(c + 1) : 0
                      , h = e ? a.charCodeAt(c + 2) : 0
                      , p = g >> 2;
                    g = (g & 3) << 4 | f >> 4;
                    f = (f & 15) << 2 | h >> 6;
                    h &= 63;
                    e || (h = 64,
                    d || (f = 64));
                    b.push(A[p], A[g], A[f], A[h])
                }
                return b.join("")
            };
            return D(String(val));
        };
    
        // Now we have all the data Let's build the linker String! =)
        // First value is a fixed "1" value, the current GA code does the same. May change in a future
        return ["1", browser_fingerprint, "_ga", hash_cookie_value(cookies._ga), "_gid", hash_cookie_value(cookies._gid)].join('*');
    };
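
    The fingerprint part of the snippet above is easier to follow ( and to test ) when it is isolated from the browser. The sketch below is my own standalone take on it, not the library code: it is a plain CRC32 with the same polynomial, and the linkerFingerprint function and its env argument are hypothetical names that take the user agent, timezone offset, language and timestamp as inputs instead of reading them from window.

```javascript
// CRC32 lookup table (reflected polynomial 0xEDB88320, i.e. 3988292384 in the snippet above)
function buildCrcTable() {
  var table = new Array(256);
  for (var n = 0; n < 256; n++) {
    var value = n;
    for (var bit = 0; bit < 8; bit++) {
      value = value & 1 ? (value >>> 1) ^ 0xEDB88320 : value >>> 1;
    }
    table[n] = value >>> 0;
  }
  return table;
}
var CRC_TABLE = buildCrcTable();

function crc32(text) {
  var crc = 0xFFFFFFFF;
  for (var i = 0; i < text.length; i++) {
    crc = (crc >>> 8) ^ CRC_TABLE[(crc ^ text.charCodeAt(i)) & 255];
  }
  return (crc ^ -1) >>> 0;
}

// Hypothetical wrapper: same input layout as the snippet above, joined by "*"
function linkerFingerprint(payload, env, minutesAgo) {
  var minute = Math.floor(env.now / 60 / 1000) - (minutesAgo || 0);
  var input = [env.userAgent, env.timezoneOffset, env.language, minute, payload].join('*');
  return crc32(input).toString(36);
}

var env = { userAgent: 'TestAgent/1.0', timezoneOffset: -60, language: 'en-US', now: 1649176313000 };
var fingerprint = linkerFingerprint('_ga*abc', env, 0);
console.log(fingerprint);
```

    Since the current minute is part of the input, the same browser gets a different fingerprint one minute later. The minutesAgo offset mirrors the second argument of the original function and lets you recompute the value for an earlier minute.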
  • Android Debugger 0.0.1 Release. Debugging Native Apps on Android

    This past February, I attended the SuperWeek Conference in Egerszalók / HU, where I decided to participate in the Punchcard Prize with a new tool that I had been using internally over the past years. The only difference is that I used to use it through a command-line interface, and for this release I built my first Windows application ever ( besides an NFO viewer for a European demogroup back in the day ).

    After a few weeks in closed beta, and two weeks fighting to get my company certified for a code signing certificate, I’m releasing my Android Debugger to everyone.




    This new tool will allow anyone to easily debug any Firebase ( GA4 ), GTM ( limited info ) and Universal Analytics ( GA3 ) implementation on any Android app.

    I created some videos showing all the current version’s features and how to use it, in detail and even with some voice-over. These are the main features:

    Firebase ( GA4 )

    View the event batches in real time ( including the autogenerated events: session_start, app_background, etc ), with all their info: event parameters, user properties, audiences, and any other internal payload data.

    Google Tag Manager

    This report is a work in progress. Right now it lets you see which containers are being loaded, which events are being triggered and the list of variables being evaluated. ( note this is not app based and it will report all the hits coming from the currently connected device )

    Google Analytics ( GA3 )

    Most of you should be used to the data in this report: it basically reports the Universal Analytics ( GA3 ) hit payloads ( note this is not app based and it will report all the hits coming from the currently connected device )

    More Features

    But these are not the only features it includes. Two of the most painful points when you need to debug an Android app are:

    1. Installing the USB Drivers
    2. The need of using libraries or modifying the manifest file

    I’ve got some good news: this is no longer an issue with Android Debugger. You will be able to debug any app installed on your physical device or emulator ( yep, I said ANY ), without needing to modify anything or ask the developers to make any updates or deploy any debug version.

    And if you’re using Android 11+, you’ll be able to pair your device with the debugger just by scanning a QR code. Yep, that’s right: no more risky searches for USB drivers on forums, no more issues trying to get your computer to properly recognize your device!

    Of course, if you still prefer it, you can install the drivers and debug your device using the USB connection, whatever you prefer!

    And last but not least, you will be able to cast/control your device screen from the debugger, which will allow you to record your debugging sessions ( remember the debugger shows the data in real time ) or share your debugging with clients/co-workers. How cool is that? I have to mention that this feature is provided by scrcpy ( I wish I was remotely capable of writing something like that! ).

    How to use it

    Installation

    You can DOWNLOAD the installer from the following link:
    https://www.analytics-debugger.com/tools/android/ . The tool will open as soon as it’s installed.

    Note: The current installer/tool is signed, but Windows needs to see some installs before it can verify the application, so some of you may get this message. Just click on the Run Anyway button.

    The first time you run the tool, you’ll be asked to enter a license key. Don’t worry, it’s FREE: just click on the link provided to create a new account and get a free license.

    More notes

    Some people may get a Windows notification about the app wanting to use the network. This happens when you want to do the Wireless Pairing, which is basically what Windows alerts you about. It will only be shown once.

    Support

    I’m offering this tool “as is”, and I’ll be trying to make improvements and updates as long as my daily work allows me to work on it. If you have any bug report/comment please reach me on Twitter.

    Links

    Tool Page / Download: https://www.analytics-debugger.com/tools/android/

    Extra (not directly) Related

    Punchcard Prize. It reads: “THE VERY FIRST POST-COVID19 BRONZE PUNCHCARD PRIZE #SUPERWEEK ANALYTICS SUMMIT”
    Me, trying not to collapse during my presentation at SuperWeek 2022. I ended up in 3rd position.

  • GTM/GA Debugger 1.0.0

    It took me almost a year ( once again ) to release a new version, but this time I did things properly: I restarted the tool totally from scratch, re-did the dev environment, re-coded the detection code, changed the CSS framework and updated the JS backend.

    There have been hundreds of commits, hundreds of hours, and so much fun and learning in the process. If you ask me, the main features, besides the more accurate reporting, are the new Preview Enhancer and the new full support for Google Analytics 4.

    GTM/GA Debug 1.0.0 Splash Screen



    On the other hand, I’ve restructured the whole tool, which now allows me to publish standalone functions/fixes individually ( this was not possible before, and it was the main reason for not having regular updates ).

    There is a lot of cool new stuff in the upcoming features queue, so stay tuned :).

    Now, before listing some of the new stuff, I would like to mention that you can support this tool in several ways:

    Buying me a coffee: Yep, I can’t stop eating or even drinking, but I won’t go far without my daily coffee doses.

    Leaving a Review at CWS: I want to leave a review.

    OVERALL

    • Much more curated layout ( now most things look fine in responsive mode )
    • Less memory usage
    • Improved reporting quality
    • Special work on “SPA” pages reporting ( a new “virtual” page block is created )
    • Upgraded from Vue2 to Vue3
    • Moved from Vuex to Pinia
    • Moved from Bulma.io to TailwindCSS
    • Hundreds of fixes
    • New GA / GA4 Blocking tool
    • Better error management
    • Better UX

    GTM

    1. Google Tag Manager detection is now much more accurate and faster.
    2. It now supports any Google Tag Manager type container ( including GTAG/AMP ones )
    3. It now even supports multiple dataLayers.
    4. All dataLayer push types are now supported ( Functions, [], etc )
    5. Improved GTAG pushes report, they now show the commands and parameters
    6. New Preview Mode Enhancer: go into preview with a single click, avoiding the race condition created by the native GTM preview
    7. New active previews report, shows all containers on preview including the current workspace name and preview timestamp
    8. Now you can preview AMP Containers!

    GA

    1. Added support for server side hits
    2. Added support for AMP hits
    3. Consent Mode reporting ( shows if the hits contain the Google Consent mode data )

    GA4

    • New Reporting
    • Support for Server Side Hits
    • Better Items Reporting
    • Report on GA4 Server Side Response hits
    • Report on GA4 Server Side Set Cookies
    • Better batching waiting report ( it shows a spinner )

    EEC

    • Better Ecommerce detection
    • Improved data tables
    • Impressions are now shown grouped
    • Promotions and impressions are updated in real time

    Check some demonstration videos below

  • Storage-Less Session Tracking with Google Analytics

    This weekend I was doing a long-overdue room cleanup and, buried at the bottom of a drawer, I found an old hard drive I was using some years ago. Among its contents I found a folder named “WIP” ( Work In Progress ), and in it some experiments, tools and proofs of concept I was working on 4-5 years ago, which got lost in the desk drawer when I upgraded my computer to SSD drives.

    It seems that at some point I was playing around with a way of doing “storage-less” session tracking for Universal Analytics.

    We’ll be relying on window.name to keep our clientId across our user’s navigation journey. There will be some handicaps of course, but at some point someone may find these handicaps a reasonable price for keeping their users’ privacy in place.

    The window.name property is used for setting targets for hyperlinks ( if you ever wondered how some sites can open links in a specific window/tab ).


    Another good point is that it is widely supported. While in JavaScript 1.0 it was a read-only property, since JavaScript 1.1 it’s read/write.

    window.name browser compatibility

    With all this said, let’s start setting up everything for our tracking in Google Tag Manager. We’ll only need 2 variables: a simple Custom JavaScript Variable that we’ll use for reading the clientId from the window.name property back to the tracker ( Universal Analytics Tag ), and a customTask Variable for writing the clientId.

    The first one tries to read the clientId from the window.name property and returns undefined if it’s not set. For sanity reasons, we’re encoding the clientId using BASE64 and using a prefix to properly detect whether the currently stored value is valid. We’ll be using this variable as the clientId field in our tags:

    The second one is a pretty simple customTask variable, which grabs the clientId from the tracker model and writes it to our window.name property.

    cjs – customTask – set window.name

    function(){  
      return function(model) {
          window.name = "CLT:" + btoa(String(model.get('clientId')));
      } 
    }


    cjs – clientId

    function(){
      if(window.name.match(/^CLT/)){
        return atob(window.name.split('CLT:')[1]);
      }else{
         return undefined
      };  
    }
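
    If you want to try the encoding/decoding logic outside of GTM, the same idea can be written as two plain functions. This is just a sketch: the encodeClientId and decodeClientId names are mine, and compared with the variables above it also returns undefined when the stored value is not valid BASE64 instead of throwing.

```javascript
var PREFIX = 'CLT:';

// Same format the customTask above writes into window.name
function encodeClientId(clientId) {
  return PREFIX + btoa(String(clientId));
}

// Returns the stored clientId, or undefined when the value is not ours
function decodeClientId(windowName) {
  if (typeof windowName !== 'string' || windowName.indexOf(PREFIX) !== 0) return undefined;
  try {
    return atob(windowName.slice(PREFIX.length)) || undefined;
  } catch (e) {
    // Not valid BASE64: treat it as if nothing was stored
    return undefined;
  }
}

var stored = encodeClientId('1234567890.1649176313');
console.log(stored, decodeClientId(stored));
```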

    Now that we have everything, let’s configure our tags. We’ll need to set the “storage” field to none and the “cookieUpdate” field to false to prevent our tracker from setting any cookie.

    Since we want to make this 100% GDPR/privacy compliant, we’re setting storeGac to false and switching on IP anonymization.
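
    Put together, the fields described above would look something like the object below. The buildStoragelessFields helper is hypothetical and only there to show the values in one place; in GTM each entry goes into the tag's “Fields to Set” section, with the clientId coming from the variable created before.

```javascript
// Hypothetical helper: the tracker fields this setup relies on
function buildStoragelessFields(clientId) {
  var fields = {
    storage: 'none',     // do not write the _ga cookie
    cookieUpdate: false, // do not refresh the cookie either
    storeGac: false,     // do not store campaign ( GAC ) information
    anonymizeIp: true    // IP anonymization
  };
  // Only pass a clientId when one was recovered from window.name;
  // otherwise the library generates a new one
  if (clientId) fields.clientId = clientId;
  return fields;
}

console.log(buildStoragelessFields('1234567890.1649176313'));
```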

    We’re all set now. Our clientId will be kept as long as the user stays on the current tab. This means that the clientId won’t be kept if:

    • The user opens a link in a new tab/window ( target: _blank or right-clicking on a link )

    I know this tracking approach won’t be enough for most users, who need to keep the user’s state across sessions, but it will be enough if we only want to track sessions.

    Also, keep in mind that the main purpose of this post is showing a way

    Video Demo

  • How to redact PII Data from Google Analytics 4 hits

    If I were asked about some missing feature on Google Analytics 4 ( a.k.a. APP+WEB, New Web Analytics ), I would say it would be the lack of the customTask functionality that my friend Simo has leveraged in the last years.

    Sadly, at the moment there’s nothing similar available ( I really hope to have something in the future ). In the past I collaborated on Brian Clifton’s post/code about How to Remove PII from Google Analytics, so I decided to base the redacting logic on it, because a lot of people may already have a custom regex list and setup that could be re-used here.

    How it works

    Google Analytics 4 bases its tracking on navigator.sendBeacon for sending the hits, falling back to the old-fashioned new Image() functionality if for any reason the current browser doesn’t support the first one.

    What we are doing is Monkey Patching the browser’s sendBeacon functionality using a Proxy Pattern, in order to remove any PII ( Personally Identifiable Information ) from the hit payloads before they reach the Google Analytics 4 endpoint.

    Monkey patching is a technique to add, modify, or suppress the default behavior of a piece of code at runtime without changing its original source code. It has been extensively used in the past by libraries, such as MooTools, and developers to add methods that were missing in JavaScript.

    https://www.audero.it/blog/2016/12/05/monkey-patching-javascript/
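
    In its most basic form, the pattern looks like the sketch below. It is a generic illustration, not the code used later in this post: patchMethod is a hypothetical helper, and fakeNavigator is a stand-in object so the example can run outside of a browser.

```javascript
// Wraps target[methodName] so that `rewriteArgs` can change the arguments
// before the original function runs. Returns a function that undoes the patch.
function patchMethod(target, methodName, rewriteArgs) {
  var original = target[methodName];
  target[methodName] = function () {
    var args = Array.prototype.slice.call(arguments);
    try {
      args = rewriteArgs(args) || args;
    } catch (e) {
      // If the rewrite fails, fall back to the untouched arguments
      args = Array.prototype.slice.call(arguments);
    }
    return original.apply(this, args);
  };
  return function restore() {
    target[methodName] = original;
  };
}

// Stand-in for window.navigator so the sketch can run outside a browser
var fakeNavigator = {
  sent: [],
  sendBeacon: function (url, data) {
    this.sent.push({ url: url, data: data });
    return true;
  }
};

var restore = patchMethod(fakeNavigator, 'sendBeacon', function (args) {
  args[0] = args[0].replace(/email=[^&]+/, 'email=[REDACTED]');
  return args;
});

fakeNavigator.sendBeacon('https://example.com/collect?email=me%40site.com&v=2');
restore();
console.log(fakeNavigator.sent[0].url);
```

    Keeping a reference to the original function and always calling it at the end is what makes the patch transparent for the rest of the page.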

    I don’t expect GA4 to fall back to new Image() hits very often, but I’m currently working on adding support for also redacting the hits sent using this method.

    Before going forward

    Monkey Patching is “never” the right way to go, but neither Google Analytics 4 nor sendBeacon offers anything to achieve this functionality, so it’s the last resort.

    The current code only tries to override the hits going to the Google Analytics 4 endpoint, and lets any other hits pass through transparently. I’ve also tried to check everything I was able to think of in order to prevent any issues.

    Setting Up Everything

    The only thing you need to do is run the attached code on your site, “before” GA4 fires any hit.

    If you are using Google Tag Manager you should use Tag Sequencing to fire the code before the Config tag is fired; refer to the next screenshot for more details:

    If you’re using Tealium, you should run this as a “Pre Loader” extension for example.

    Example of Redacted GA4 Payload Hit

    The Code

    (function() {
    
        /*
        *  
        * Analytics Debugger S.L.U. 2021 ( David Vallejo @thyng )
        *  MIT  License
        * All redact Logic is ran within this function
        * 
        */
        window.__piiRedact = window.__piiRedact || false;
        var piiRedact = function piiRedact(payload) {
            // Regex List
            var piiRegex = [{
                name: 'EMAIL',
                regex: /[^\/]{4}(@|%40)(?!example\.com)[^\/]{4}/gi,
                group: ''
            }, {
                name: 'SELF-EMAIL',
                regex: /[^\/]{4}(@|%40)(?=example\.com)[^\/]{4}/gi,
                group: ''
            }, {
                name: 'TEL',
                regex: /((tel=)|(telephone=)|(phone=)|(mobile=)|(mob=))[\d\+\s][^&\/\?]+/gi,
                group: '$1'
            }, {
                name: 'NAME',
                regex: /((firstname=)|(lastname=)|(surname=))[^&\/\?]+/gi,
                group: '$1'
            }, {
                name: 'PASSWORD',
                regex: /((password=)|(passwd=)|(pass=))[^&\/\?]+/gi,
                group: '$1'
            }, {
                name: 'ZIP',
                regex: /((postcode=)|(zipcode=)|(zip=))[^&\/\?]+/gi,
                group: '$1'
            }];
    
            // Helper Convert QueryString to an Object 
            var queryString2Object = function queryString2Object(str) {
                return (str || document.location.search).replace(/(^\?)/, "").split("&").map(function(n) {
                    return n = n.split("="),
                    this[n[0]] = decodeURIComponent(n[1]),
                    this;
                }
                .bind({}))[0];
            };
            // Helper Convert an Object to a QueryString
            var Object2QueryString = function Object2QueryString(obj) {
                return Object.keys(obj).map(function(key) {
                    return key + '=' + encodeURIComponent(obj[key]);
                }).join('&');
            };
            // Convert the current payload into an object
            var parsedPayload = queryString2Object(payload);
            // Loop through all keys and check the values agains our regexes list
            for (var pair in parsedPayload) {
                piiRegex.forEach(function(pii) {
                    // The value is matching?
                    if (parsedPayload[pair].match(pii.regex)) {
                        // Let's replace the key value based on the regex
                        parsedPayload[pair] = parsedPayload[pair].replace(pii.regex, pii.group + '[REDACTED ' + pii.name + ']');
                    }
                });
            }
            // Build and send the payload back
            return Object2QueryString(parsedPayload);
        };
        if (!window.__piiRedact) {
            window.__piiRedact = !0;
            try {
                // Monkey Patch, sendBeacon 
                var proxied = window.navigator.sendBeacon;
                window.navigator.sendBeacon = function() {
                    if (arguments && arguments[0].match(/google-analytics\.com.*v\=2\&/)) {
    
                        var endpoint = arguments[0].split('?')[0];
                        var query = arguments[0].split('?')[1];
                        var beacon = {
                            endpoint: endpoint,
                            // Check for PII
                            query: piiRedact(query),
                            events: []
                        };
                        // This is a multiple events hit
                        if (typeof arguments[1] === 'string' && arguments[1]) {
                            arguments[1].split("\r\n").forEach(function(event) {
                                // Check for PII
                                beacon.events.push(piiRedact(event));
                            });
                        }
    
                        // We're all done, let's reassemble everything
                        arguments[0] = [beacon.endpoint, beacon.query].join('?');
                        if (beacon.events.length > 0) {
                            // Write the redacted events back into the request body
                            arguments[1] = beacon.events.join("\r\n");
                        }
                    }
                    return proxied.apply(this, arguments);
                }
                ;
            } catch (e) {
                // In case something goes wrong, let's apply back the arguments to the original function
                return proxied.apply(this, arguments);
            }
        }
    }
    )();
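
    To get a feeling for what the regexes do, here is a tiny standalone sketch that applies two of the rules above to a single value. The redactValue helper is hypothetical ( the real code works on the whole payload ), but the regexes and the replacement format are copied from the list above.

```javascript
// Two rules copied from the piiRegex list above
var rules = [{
  name: 'TEL',
  regex: /((tel=)|(telephone=)|(phone=)|(mobile=)|(mob=))[\d\+\s][^&\/\?]+/gi,
  group: '$1'
}, {
  name: 'EMAIL',
  regex: /[^\/]{4}(@|%40)(?!example\.com)[^\/]{4}/gi,
  group: ''
}];

// Hypothetical helper: applies every rule to one value
function redactValue(value, rules) {
  return rules.reduce(function (current, rule) {
    // "$1" in the replacement keeps the matched key ( e.g. "tel=" ) in place
    return current.replace(rule.regex, rule.group + '[REDACTED ' + rule.name + ']');
  }, value);
}

var redactedTel = redactValue('/thanks?tel=600111222&x=1', rules);
var redactedEmail = redactValue('john.doe@gmail.com', rules);
console.log(redactedTel);   // /thanks?tel=[REDACTED TEL]&x=1
console.log(redactedEmail); // john[REDACTED EMAIL]l.com
```

    Note how the EMAIL rule only removes the 4 characters around the @ sign, so you may want to tune the regexes to your own needs.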
    

  • HTML Media Elements Tracking Library

    Some years ago I wrote a post about how to track HTML5 videos, which has been widely used and copied around the web. Two years ago I wrote a totally new tracking code, which I never publicly released.

    Today I’m releasing a totally new, refactored code for tracking HTML Media Elements. This means tracking <video> and <audio> elements.

    This is the first library I’ve built as a full library to be used in any project, instead of as a snippet for a Google Tag Manager Tag. Because of this I’m providing the library in the following formats: AMD, UMD, IIFE and ESM, so it can be used anywhere. I’m also providing CDN access via jsDelivr.

    The library will take care of initializing the tracking and pushing the data back to Google Tag Manager ( using a dataLayer.push ), to Tealium ( using a utag.link ), or just to the console. Along with the event a full data model will be sent, with some details about the current event and the video ( the video title, duration, visibility status, etc ).

    The current data model is based on Google Tag Manager’s YouTube tracking trigger/model, which makes it possible to use the built-in video variables on GTM.

    The library will take care of tracking the current videos on the page, but it will also be able to “detect” newly added elements ( like videos added in modals, or loaded programmatically ), which will also be tracked with no hassle. Just setting the observe switch to true will enable the use of the Mutation Observer API ( where available ) to do this work for you.
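
    For those curious about how this kind of detection can work, here is a rough sketch using the Mutation Observer API. It is not the library's actual code: collectMediaElements, observeMediaElements and the onMediaElement callback are hypothetical names used only for this example.

```javascript
// Collects <video>/<audio> elements from a list of added nodes ( and their descendants )
function collectMediaElements(addedNodes) {
  var found = [];
  Array.prototype.forEach.call(addedNodes, function (node) {
    if (!node || node.nodeType !== 1) return; // element nodes only
    if (/^(VIDEO|AUDIO)$/i.test(node.tagName)) found.push(node);
    if (typeof node.querySelectorAll === 'function') {
      var nested = Array.prototype.slice.call(node.querySelectorAll('video, audio'));
      Array.prototype.push.apply(found, nested);
    }
  });
  return found;
}

// Browser-only wiring; `onMediaElement` is a hypothetical callback
function observeMediaElements(onMediaElement) {
  if (typeof MutationObserver === 'undefined' || typeof document === 'undefined') return null;
  var observer = new MutationObserver(function (mutations) {
    mutations.forEach(function (mutation) {
      collectMediaElements(mutation.addedNodes).forEach(onMediaElement);
    });
  });
  observer.observe(document.documentElement, { childList: true, subtree: true });
  return observer;
}
```

    Where MutationObserver is not available the function simply returns null, so only the elements already present on the page would be tracked.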

    This is not all: along with this new library I’m releasing a Google Tag Manager Custom Template, which makes the setup even easier. Just add the template along with a DOM Ready trigger and you’ll be done.

    HTML Media Elements Custom Template


    Using a custom Video Title

    When using HTML Media Elements, we don’t have a native way to pass any video details; this library allows you to customize the video title being reported.

    <video src="" data-html-media-element-title="Demo Video version 1">

    This will make the videoTitle be reported as “Demo Video version 1”. If there’s no data-attribute, the library will use the current video file name.

    Passing back video details

    Not only can you pass the video title: the library also eases the work of passing details back to the events using data attributes.

    You can pass all the custom data you need about the video and have it passed back to the tracking events. To achieve this, we can add all the data we want to the videos using data-attributes.

    This can be done using data-attributes with the following format:

    data-html-media-element-param-{{PARAM NAME}}="{{PARAM VALUE}}"

    All the data added to the <video> elements will be passed back to the events so you can use it.

    For example:

    <video width="400" 
    controls 
    data-html-media-element-param-band="Neil Zaza"
    data-html-media-element-param-song-name="I'm Alright"
    data-html-media-element-param-category="Music"
    data-html-media-element-title="video test">
        <source src="mov_bbb.mp4" type="video/mp4">
        <source src="mov_bbb.ogg" type="video/ogg">
        Your browser does not support HTML video.
    </video>

    This will result in a videoData ( or audioData ) object carrying the data this way:

    {
         element:  video
         elementClasses:  ""
         elementId:  "vbst4f9ed29"
         elementTarget:  video
         elementUrl:  "https://local.dev/demo/mp3.html"
         event:  "video"
         videoCurrentTime:  2
         videoData:
        	 band:  "Neil Zaza"
        	 category:  "Music"
        	 songname:  "I'm Alright"
         videoDuration:  361
         videoElapsedTime:  2
         videoIsMuted:  false
         videoLoop:  false
         videoNetworkState:  1
         videoPercent:  0
         videoPlaybackRate:  1
         videoProvider:  "html5"
         videoStatus:  "pause"
         videoTitle:  "video test"
         videoUrl:  "mov_bbb.mp4"
         videoVisible:  true
         videoVolume:  1
     }
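
    The mapping from attributes to the videoData object could be sketched like this. It is only an illustration and not the library's code: extractMediaParams is a hypothetical name, it works on a plain map of attributes so it can run anywhere, and dropping the hyphens from the param name is an assumption based on the example output above ( "song-name" is reported as "songname" ).

```javascript
var PARAM_PREFIX = 'data-html-media-element-param-';

// `attributes` is a plain { name: value } map of the element's attributes
function extractMediaParams(attributes) {
  var params = {};
  Object.keys(attributes).forEach(function (name) {
    if (name.indexOf(PARAM_PREFIX) !== 0) return;
    // Assumption based on the example output: hyphens are dropped from the key
    var key = name.slice(PARAM_PREFIX.length).replace(/-/g, '');
    if (key) params[key] = attributes[name];
  });
  return params;
}

var videoData = extractMediaParams({
  'width': '400',
  'controls': '',
  'data-html-media-element-param-band': 'Neil Zaza',
  'data-html-media-element-param-song-name': "I'm Alright",
  'data-html-media-element-param-category': 'Music',
  'data-html-media-element-title': 'video test'
});
console.log(videoData);
```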

    Library Usage

    Web Page

    <script src="https://cdn.jsdelivr.net/npm/@analytics-debugger/html-media-elements@latest/dist/htmlMediaElementsTracker.min.js"></script>
    
    <script>
        window._htmlMediaElementsTracker.init({
            tms: 'debug',
            datalayerVariableNames: ['auto'],
            debug: true,
            observe: true,
            data_elements: true,        
            start: true,
            play: true,
            pause: true,
            mute: true,
            unmute: true,
            complete: true,
            seek: true,
            progress: true,
            error: true,
            progress_tracking_method: 'percentages',
            progress_percentages: [1,2,3,4,5,6,7,8,9,10],
            progress_thresholds: [],        
        });   
    </script>

    NPM

    npm i @analytics-debugger/html-media-elements

    Configuration Settings

    key name | value type | description
    tms | string | Tag Management System we are using. Accepted values: “gtm”, “tealium”, “debug”
    datalayerVariableNames | array | If the TMS is Google Tag Manager, we can push the data to a specific dataLayer; by default the library will search for the current dataLayer variable name
    debug | boolean | Enable debug output to console
    observe | boolean | Automatically track newly added video/audio elements
    data_elements | boolean | data-html-media-element-title attribute will be used for elementTitle if provided
    start | boolean | Track Audio/Video Start Event
    play | boolean | Track Audio/Video Play Event
    pause | boolean | Track Audio/Video Pause Event
    mute | boolean | Track Audio/Video Mute Event
    unmute | boolean | Track Audio/Video Unmute Event
    complete | boolean | Track Audio/Video End Event
    seek | boolean | Track Audio/Video Seek Event
    progress | boolean | Track Audio/Video Progress Events
    progress_tracking_method | string | ‘percentages’ or ‘thresholds’ // thresholds not available yet
    progress_percentages | array | Array of % where we should fire an event
    progress_thresholds | array | TBD

    We will be able to track the current HTML Media Elements Events ( Start, Play, Pause, Mute, Unmute, Complete, Seek, Progress ). We’ll just need to set to true the events we want to track within the init config variable.
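
    To make the progress_percentages setting a bit more concrete, here is a minimal sketch of how percentage-based progress events could be derived from a media element’s current time and duration. This is not the library’s actual implementation; the pendingProgressEvents function and the fired set are hypothetical names used only for illustration.

    ```javascript
    // Hypothetical helper (not part of the library): returns the configured
    // percentages reached for the first time, and remembers them in `fired`
    // so that each one is only reported once.
    function pendingProgressEvents(currentTime, duration, percentages, fired) {
        if (!duration || duration <= 0) {
            return [];
        }
        var percent = Math.floor((currentTime / duration) * 100);
        var pending = [];
        percentages.forEach(function (p) {
            if (percent >= p && !fired.has(p)) {
                fired.add(p);
                pending.push(p);
            }
        });
        return pending;
    }

    var fired = new Set();
    // 90 of 360 seconds played: the 10% and 25% marks are reached
    var first = pendingProgressEvents(90, 360, [10, 25, 50, 75], fired);
    // 200 of 360 seconds played: only the 50% mark is new
    var second = pendingProgressEvents(200, 360, [10, 25, 50, 75], fired);
    console.log(first, second);
    ```

    Keeping track of the percentages that already fired is what prevents the same progress event from being sent again on every time update.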

    Along with the events the library pushes some details about the video.

    Data Model

    Key | Value Example | Description
    event | gtm.audio/gtm.video | Current Media Element Type
    Provider | html5 | Fixed value, describes the current media element provider
    Status | start, pause, mute, unmute, progress, seek, completed, error | Current media element event name
    Url | http://www.dom.com | Current Video Holding URL ( iframe url reported if it’s the case )
    Title | Video Demo | Current video element data-media-element-title value, defaults to current video file name
    Duration | 230 | Media element duration in seconds
    CurrentTime | 230 | Media element current time in seconds
    ElapsedTime | 230 | Elapsed time since last pause/play event
    Percent | 15 | Media element current played %
    Visible | true/false | Reports if the video is visible within the current browser viewport
    isMuted | true/false | Is the current media element muted?
    PlaybackRate | 1 | Media Element PlaybackRate, default: 1
    Loop | true/false | Is the video set to loop?
    Volume | 0.8 | Current Video Volume
    NetworkState | | Network State
    Data | Object | List of custom video data coming from data-attributes tagging
    elementClasses | “” | Element Classes List
    elementId | “” | Element Id
    elementTarget | “” | Element Target
    elementUrl | “” | Element URL

    Configuring The

    JSDelivr CDN: https://www.jsdelivr.com/package/npm/@analytics-debugger/html-media-elements

    Template URL: https://tagmanager.google.com/gallery/#/owners/analytics-debugger/templates/gtm-html-media-elements-tracker

    GitHub: https://github.com/analytics-debugger/html-media-elements-tracking-library

    Demo Page: https://www.analytics-debugger.com/demos/gtm-html-media-elements/

  • Tracking Google Analytics 4 Events using Data Attributes

    I must admit it: I like to use data attributes for tracking user click interactions rather than delegating that work to the IT team or relying on class/id attributes ( Data Attributes Tracking ).

    For Universal Analytics this was fairly easy work, since we had some fixed data attribute names ( category, action, label, value when talking about events, or pagepath if we wanted to use a virtual pageview ). With the new event-based tracking model in Google Analytics 4 ( GA4, formerly APP+WEB ), this has changed: we have a single hit type, which is always an “event”, but then we have unlimited possibilities for parameter names.

    In this post I’ll be showing my approach to automating event tracking in Google Analytics 4 using data attributes. Let’s go for it.

    First we’ll need a data attribute named “data-ga4-event”; in the next steps this one will allow us to set up a CSS selector to trigger our tags.

    Then, for the event parameters, we’ll use the following format: data-ga4-param-{{PARAM_NAME}} .
    Note that data attributes use kebab-case, so we’ll write the parameter name as “clicked-link-url”.

    DATA ATTRIBUTES
    data-ga4-event | {{event name}}
    data-ga4-param-{{PARAM_NAME}} | one per each needed parameter
    Data Attributes Definition
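
    As a quick illustration of the naming convention, the sketch below builds the attribute name for a given snake_case parameter and shows the camelCase key the browser exposes for it through element.dataset. Both helper names are hypothetical and are not part of any library.

    ```javascript
    // Hypothetical helper: snake_case parameter name -> kebab-case data attribute name
    function toDataAttributeName(paramName) {
        return 'data-ga4-param-' + paramName.toLowerCase().replace(/_/g, '-');
    }

    // Hypothetical helper: mirrors how browsers map a data attribute name to its
    // dataset key (drop the "data-" prefix, then camelCase each "-x" pair)
    function toDatasetKey(attributeName) {
        return attributeName
            .replace(/^data-/, '')
            .replace(/-([a-z])/g, function (match, letter) {
                return letter.toUpperCase();
            });
    }

    var attributeName = toDataAttributeName('clicked_link_url');
    var datasetKey = toDatasetKey(attributeName);
    console.log(attributeName, datasetKey);
    ```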

    Let’s now see some examples. A simple event without parameters will look like this:

    <button id="cta"
    data-ga4-event="cta_click"
    >CHECKOUT</button>

    and if we need to pass some parameters it will look like:

    <a href="https://twitter.com/thyng"
        data-ga4-event="social_link_click"
        data-ga4-param-social-network-name="twitter"
        data-ga4-param-social-network-user="thyng"
    >Follow me on Twitter</a>

    You may now be thinking that you would need a separate JS snippet for each event, but we’ll be using some JS magic to automatically convert this data attribute tagging into dataLayer pushes.

    (function() {
        // Make sure the dataLayer exists before we push to it
        window.dataLayer = window.dataLayer || [];
        // Grab all tagged elements
        var events = document.querySelectorAll('[data-ga4-event]');
        // Convert camelCase (or PascalCase) strings into lowercase, separator-joined words
        var unCamelCase = function(str, separator) {
            separator = typeof separator === 'undefined' ? '_' : separator;
            return str.replace(/([a-z\d])([A-Z])/g, '$1' + separator + '$2').replace(/([A-Z]+)([A-Z][a-z\d]+)/g, '$1' + separator + '$2').toLowerCase();
        };
        for (var i = 0; i < events.length; i++) {
            events[i].addEventListener('click', function(event) {
                var target = event.currentTarget;
                if (target) {
                    var dl = {};
                    dl['event'] = target.dataset['ga4Event'];
                    // Copy every data-ga4-param-* attribute as a snake_case parameter
                    Object.entries(target.dataset).forEach(function(e) {
                        var key = e[0];
                        var value = e[1];
                        var m = key.match(/^ga4Param(.+)/);
                        if (m && m[1]) {
                            dl[unCamelCase(m[1], '_')] = value;
                        }
                    });
                    window.dataLayer.push(dl);
                }
            });
        }
    })();

    The snippet above will take care of building a dataLayer push each time a click is performed on an element that has a data-ga4-event attribute, and it will also take care of converting all the data-ga4-param-X attributes into snake_case parameters within our event push.

    For example, our previous markup:

    <a href="https://twitter.com/thyng"
        data-ga4-event="social_link_click"
        data-ga4-param-social-network-name="twitter"
        data-ga4-param-social-network-user="thyng"
    >Follow me on Twitter</a>

    will turn into the following dataLayer push:

    window.dataLayer.push({
        "event": "social_link_click",
        "social_network_name": "twitter",
        "social_network_user": "thyng"
    });

    Of course, you could add some more features to this snippet, for example automatically sanitizing the values before the push is sent, or you could build an allow list to prevent any non-predefined event from going into your reports.
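
    A minimal sketch of those two ideas could look like the following. The ALLOWED_EVENTS list, the helper names and the 100-character cap are all assumptions made for this example; adapt them to your own data model.

    ```javascript
    // Hypothetical list of the event names we accept; anything else is dropped
    var ALLOWED_EVENTS = ['cta_click', 'social_link_click'];

    // Trim, lowercase and cap the value length (the cap is an arbitrary choice)
    function sanitizeValue(value) {
        return String(value).trim().toLowerCase().slice(0, 100);
    }

    // Returns a cleaned copy of the push, or null when the event is not allowed
    function prepareEventPush(dl) {
        if (ALLOWED_EVENTS.indexOf(dl.event) === -1) {
            return null;
        }
        var clean = { event: dl.event };
        Object.keys(dl).forEach(function (key) {
            if (key !== 'event') {
                clean[key] = sanitizeValue(dl[key]);
            }
        });
        return clean;
    }

    var accepted = prepareEventPush({ event: 'social_link_click', social_network_name: '  Twitter ' });
    var rejected = prepareEventPush({ event: 'some_unknown_event' });
    console.log(accepted, rejected);
    ```

    In the click listener, the push would then only happen when prepareEventPush returns something other than null.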

  • New Release: GTM/GA Debugger 0.4.0

    It’s been a long time since the last post, even more since the last extension update. To be exact it took me around 1 year to have this new version ready.

    The main reason for this delay was that I switched how the extension is built at least 5 times. I don’t consider myself a developer, which means that many times I end up not choosing the best stack. Anyway, this has been a real opportunity for me to learn a lot of new technologies/frameworks I didn’t know about or was never able to understand, just to mention some: React / Svelte, WebPack, Rollup, Git, Gulp, Travis. So at this point I’m really “happy” about all the time “wasted” on refactoring the extension so many times.

    In case you’re interested, after so many changes I ended up building the extension using Vue.js 2, with Bulma as the CSS framework. This has allowed me to build an extension that is faster and is built on top of some good technologies ( instead of having thousands of lines of inefficient JS code ).

    I know that for most people most of the changes won’t be noticeable, mostly because I tried to keep the UI as it was in the previous version, but internally everything is different, and some new features were also added.

    In the following video, I’m showing an overview of what the new extension has to offer:

    GTM/GA Debugger Features

    • GTM/GTAG Debug Support
    • Multiple dataLayer Support ( View all the dataLayer pushes and current state )
    • View all Universal Analytics Hits being sent
    • View all GA4 (App+Web) Hits being sent
    • Filter out the hits by the type or property/stream ids
    • Filter out the dataLayer pushes by their type ( core, ga4, custom, etc )
    • Parse Hits payload to see a human-friendly keys translation
    • Enhanced Ecommerce Report ( based on GA/GA4 hits )
    • All Reports are in Real Time
    • Copy any Hit/dataLayer push info to the clipboard in a friendly format with a single mouse click
    • Trace any Hit/dataLayer push
    • Real Time GA hits Payload debugging
    • More …

    I really lost track of everything that was added in this specific release, so I’m providing a quick Changelog

    Changelog

    • [NEW] – Now it’s based on Vue.js 2 + Bulma
    • [NEW] – GA4 Hits Full Support
    • [NEW] – GA4 Ecommerce Support
    • [NEW] – Multiple dataLayer Support
    • [NEW] – Multiple GTAG/GTM Containers support
    • [NEW] – Copy hits as string
    • [NEW] – Hits Stack Trace Reporting
    • [NEW] – Hits Debug ( run the hits against the official GA debug endpoint )
    • [NEW] – DataLayer Pushes Stack Trace Reporting
    • [NEW] – GTM Preview Enhancer
    • [ENHANCEMENT] – Debugging can be started clicking on a button rather than needing to press F5
    • [ENHANCEMENT] – Improved GTM/GA/GA4 detection – Faster detection delay
    • [ENHANCEMENT] – Improved GTM/GA/GA4 detection – Better accuracy
    • [ENHANCEMENT] – Improved SSR/SPA/PWA pages debugging.
    • [ENHANCEMENT] – Pushes/Hits timing are now correct and are shown in the real order they are triggered
    • [ENHANCEMENT] – UI is now more responsive, showing a better interface when using it in the sidebar
    • [FIX] – All reported bugs ( sites where the tool was not working properly ) have been addressed. Thanks to everyone who helped by reporting them
    • [FIX] – Incognito Mode Support
    • [FIX] – GA detection for hits not sent to GA endpoints
    • [FIX] – GTM detection for locally served containers
    • [FIX] – +40 bug tickets.

    As you may have noticed, some tools are gone: the Data Attributes Inspector and the Profiler Tab Report. I removed these features for this release in order to focus on the tool’s reliability; they will be added back in the next releases.

    More news about the extension: it will be available for Firefox, Opera and Edge ( as soon as I can have it approved on their marketplaces ).

    Now I’m looking for some beta testers who will help me identify issues in new releases. Yay!

    The last big news is that the extension hit 40.000 users this past week. Yeah, according to Chrome Store data, the extension is being used by more than 40K users weekly. I never thought the tool would end up having this many users, but this has also created some “responsibility” on my side that I’m currently not sure how to handle.

    In the last year I declined all the purchase offers for the extension, and I also didn’t accept any offer for adding ads within the tool. I really want to keep this tool free of ads, but it really takes a lot of time. Because of this I decided to start accepting donations via Ko-Fi; getting some help this way will allow me to publish updates more regularly. This is totally optional, and I’ll keep working on the extension anyway, but some people in the past asked for a way to help.

    Click on the button below if the extension has been helpful for your work:

    Buy Me a Coffee at ko-fi.com


    If you are not using the extension yet, you can get it for free at the following link: INSTALL EXTENSION

  • Tracking user’s IP Autonomous System Number and Organization details to prevent the spam

    Around the end of 2019, Google Analytics dropped support for the Network Domain and Service Provider dimensions from their reports, making an official announcement about it in February.

    These 2 dimensions were widely used to fight spam in Google Analytics, and there have been a lot of posts on this topic in the last months. Simo Ahava wrote about how to collect the ISP data using a third-party service, in case you want to check it.

    In this post we’ll be learning what an Autonomous System is and how we could use this info to try to fight spam. And the coolest part is that we’ll be able to use a free database for this. Continue reading 🙂

    There are some other services and commercial databases that will provide these details, but let’s be honest, there are some big handicaps:

    • If you use a free service, you will hit the quota limit quickly
    • If you have a high-traffic website, this is not going to be cheap

    There are basically 3 different types of subscriptions: SaaS ( they host the app and the database ), DB ( you host the database and the query system ), and WebService.

    I’m attaching a list of some of the providers available, in case you want to check them.

    Provider | SaaS | DB | WebService | Updates
    MaxMind | ✅ | ✅ | ✅ | Weekly/Monthly
    IP2Location | ✅ | ✅ | ✅ | Monthly
    IPStack | ❎ | ❎ | ✅ | Hourly
    ip-api | ❎ | ❎ | ✅ | ?
    ipgeolocation.io | ❎ | ✅ | ✅ | Weekly
    db-ip | ❎ | ❎ | ✅ | Monthly/Daily
    Abstract IP Geolocation | ✅ | ❎ | ✅ | Daily

    In any case there are a lot of posts around this topic on the web, and I’m trying to give this issue a new solution.

    MaxMind provides their GeoLite databases for free. These databases are updated weekly ( on Tuesdays, to be exact ) and they provide info about:

    • Countries
    • Cities
    • ASN

    The main difference between these databases and the paid ones is how accurate they are and how often they get updated. This accuracy may be a problem when we need to target users based on their city, but this time that is not what we’re looking for; we’ll be looking at their ASN database.

    If you are wondering, ASN stands for Autonomous System Number. According to Wikipedia:

    An autonomous system (AS) is a collection of connected Internet Protocol (IP) routing prefixes under the control of one or more network operators on behalf of a single administrative entity or domain that presents a common, clearly defined routing policy to the internet.

    https://en.wikipedia.org/wiki/Autonomous_system_(Internet)

    You can think of Autonomous Systems as the “big” routers at the ISPs and datacenters that are in charge of announcing the IP addresses they hold, in order to let other ASs know how to reach those IP addresses ( sorry for this inaccurate description, I’m trying to keep this simple ).

    Each ISP usually has its own ASN ( they can have more than one ). For example, one of Google’s main ASNs is AS15169, registered to Google LLC, and this Autonomous System manages 9.5 million Google IPs:

    https://ipinfo.io/AS15169

    This means that we could query any IP address, and the ASN database will return the ASN it currently belongs to.

    For example we may query Google DNS’s IP address: 8.8.8.8 and the database will return the AS number and the organization name:

    Array
    (
        [autonomous_system_number] => 15169
        [autonomous_system_organization] => GOOGLE
    )

    As another example, let’s query for this Fastly CDN IP address: 151.101.134.133

    Array 
    (
       [autonomous_system_number] => 54113 
       [autonomous_system_organization] => FASTLY
    )

    Or let’s query for an IP at a dedicated servers provider like LiquidWeb

    Array
    (
        [autonomous_system_number] => 32244
        [autonomous_system_organization] => LIQUIDWEB
    )

    We could use the AS Number and the Organization name as a way to try to catch spam, since most spam traffic is likely going to come from co-location / VPN providers that we could identify this way.
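
    As a rough sketch of that idea, we could keep a list of the AS numbers we consider hosting networks and flag any hit coming from them. The HOSTING_ASNS list below is purely hypothetical ( it just reuses the Fastly and LiquidWeb numbers from the lookups above ); a real list would have to be built from your own traffic.

    ```javascript
    // Hypothetical list of AS numbers treated as hosting / datacenter networks.
    // 54113 (Fastly) and 32244 (LiquidWeb) come from the example lookups.
    var HOSTING_ASNS = [54113, 32244];

    // Returns 'hosting' when the ASN is on our list, 'other' otherwise
    function classifyTraffic(asnDetails) {
        var asn = Number(asnDetails.autonomous_system_number);
        return HOSTING_ASNS.indexOf(asn) !== -1 ? 'hosting' : 'other';
    }

    var liquidWeb = classifyTraffic({ autonomous_system_number: 32244, autonomous_system_organization: 'LIQUIDWEB' });
    var google = classifyTraffic({ autonomous_system_number: 15169, autonomous_system_organization: 'GOOGLE' });
    console.log(liquidWeb, google);
    ```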

    Since it’s a database, we’ll need to set up a small endpoint on our domain in order to be able to query it. This implies some IT development, but on the other hand it has some big wins:

    • There will be NO query limits.
    • The cost of having this solution running is just the cost of developing the endpoint.
    • We could have our website developer query this info server-side and push the data to the dataLayer, instead of needing an extra XHR request and having to delay the hits, YAY!

    Now, on the other side of the road, there are some handicaps:

    • The data is not as accurate as the network/domain data in other databases
    • Data freshness won’t be premium, but as we all know GA’s wasn’t either.

    Getting the ASN DB

    As I’ve mentioned above, the GeoLite ASN database is free, and you’ll be able to get it after signing up for a free account at: https://dev.maxmind.com/geoip/geoip2/geolite2/

    PHP Example

    Another good point is that MaxMind already provides libraries for PHP/NodeJS/Perl and other languages to help with reading and querying their GeoLite databases, which helps when setting up our endpoint.

    As usual I’m providing an example for PHP, since it’s the most widely used language and the one that is available on almost any hosting around the world

    If we don’t have composer installed yet, that’s gonna be our first step:

    curl -sS https://getcomposer.org/installer | php

    next, we’ll be installing the needed dependencies

    php composer.phar require geoip2/geoip2:~2.0

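    <?php
    require_once 'vendor/autoload.php';
    use GeoIp2\Database\Reader;
    // Open the GeoLite2 ASN database ( the path is relative to this script )
    $reader = new Reader('geo/GeoLite2-ASN.mmdb');
    // Look up the ASN record for the IP address
    $record = $reader->asn('8.8.8.8');
    $asn_details = [
        'autonomous_system_number' => $record->autonomousSystemNumber,
        'autonomous_system_organization' => $record->autonomousSystemOrganization,
    ];
    // At this point we could build a JSON and send it back to the browser.
    print_r($asn_details);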

    Last step will be passing back this info to Google Analytics using a custom dimension, so we can use it in our filters or segments.
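
    On the browser side, this could be sketched as follows, assuming the endpoint returns the lookup as JSON with the same keys shown in the examples. The /asn.php URL, the asn_details_ready event name and the asn_number / asn_organization keys are all hypothetical; in GTM they would then be mapped to custom dimensions.

    ```javascript
    // Builds the dataLayer push from the endpoint's JSON response.
    // The event name and the dataLayer keys are hypothetical.
    function buildAsnPush(details) {
        return {
            event: 'asn_details_ready',
            asn_number: String(details.autonomous_system_number || ''),
            asn_organization: details.autonomous_system_organization || '(not set)'
        };
    }

    // Browser usage, assuming a hypothetical /asn.php endpoint:
    // fetch('/asn.php')
    //     .then(function (response) { return response.json(); })
    //     .then(function (details) {
    //         window.dataLayer = window.dataLayer || [];
    //         window.dataLayer.push(buildAsnPush(details));
    //     });

    var asnPush = buildAsnPush({ autonomous_system_number: 15169, autonomous_system_organization: 'GOOGLE' });
    console.log(asnPush);
    ```

    If the data is already printed server-side into the dataLayer, the fetch step is not needed at all and only the key names matter.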

    Extra – Grabbing the network domain

    I was about to publish the post and I decided to add a little extra: let’s also learn how to track the “network domain”.

    Google Analytics was using the IP’s PTR for the “network domain”. Again, you may wonder what “PTR” is: it stands for “Pointer record”, and it basically resolves an IP to an FQDN ( fully-qualified domain name ). It’s the inverse of an A DNS record.

    For example, we can make a reverse IP lookup for Google’s DNS and it will return “dns.google”.

    root@sd1:/# nslookup
    > set q=ptr
    > 8.8.8.8
    Server:         8.8.8.8
    Address:        8.8.8.8#53
    Non-authoritative answer:
    8.8.8.8.in-addr.arpa    name = dns.google.

    Or we may try with one of the Googlebot IP addresses, which most SEOs must be familiar with

    > set q=ptr
    > 66.249.66.1
    Server:  dns.google
    Address:  8.8.8.8
    Non-authoritative answer:
    1.66.249.66.in-addr.arpa        name = crawl-66-249-66-1.googlebot.com

    As a last example, let’s query the google.com IP address

    > set q=a
    > google.com
    Server:  dns.google
    Address:  8.8.8.8
    Non-authoritative answer:
    Name:    google.com
    Address:  172.217.17.14
    > set q=ptr
    > 172.217.17.14
    Server:  dns.google
    Address:  8.8.8.8
    Non-authoritative answer:
    14.17.217.172.in-addr.arpa      name = mad07s09-in-f14.1e100.net
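
    The in-addr.arpa names in the answers above are simply the IPv4 octets in reverse order followed by the in-addr.arpa suffix. A small sketch of that transformation ( the function name is hypothetical, and IPv6 is not handled ):

    ```javascript
    // Builds the name that gets queried for an IPv4 PTR lookup
    function toReverseLookupName(ip) {
        var octets = ip.split('.');
        if (octets.length !== 4) {
            throw new Error('Expected an IPv4 address, got: ' + ip);
        }
        return octets.reverse().join('.') + '.in-addr.arpa';
    }

    var googlebotName = toReverseLookupName('66.249.66.1');
    console.log(googlebotName);
    ```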

    If we want to have the network domain info back in our GA reports, we’ll just need to parse the PTR hostname to grab only the root domain; in this last case it would be: 1e100.net .

    I wouldn’t advise tracking the full PTR hostname, for 2 reasons. First, most hostnames are a mix of the IP address plus the ISP domain, which would go against the GDPR ( we cannot record the user’s IP address ). Second, it would create high cardinality, which won’t help when analyzing the data.
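
    To illustrate the parsing step, here is a deliberately naive sketch that keeps the last two labels of the hostname, or the last three when the hostname ends in one of a few known two-label suffixes. The TWO_LABEL_SUFFIXES list is a hypothetical, incomplete sample; a robust solution needs the full public suffix list.

    ```javascript
    // Hypothetical, incomplete sample of two-label public suffixes
    var TWO_LABEL_SUFFIXES = ['co.uk', 'com.au', 'com.br'];

    // Naive root domain extraction from a PTR hostname
    function naiveRootDomain(hostname) {
        // Lowercase and drop the trailing dot that PTR answers may include
        var labels = hostname.toLowerCase().replace(/\.$/, '').split('.');
        if (labels.length <= 2) {
            return labels.join('.');
        }
        var lastTwo = labels.slice(-2).join('.');
        var keep = TWO_LABEL_SUFFIXES.indexOf(lastTwo) !== -1 ? 3 : 2;
        return labels.slice(-keep).join('.');
    }

    var networkDomain = naiveRootDomain('mad07s09-in-f14.1e100.net');
    console.log(networkDomain);
    ```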

    Now, remember that we were building an endpoint in PHP to get the ASN details; just a few more lines of code would allow us to have the network domain pushed into our dataLayer! 🙂

    $ip_ptr = gethostbyaddr('8.8.8.8');

    Getting the root domain can be a painful task, due to all the new TLDs and the need to keep multi-level suffixes ( such as co.uk ) in mind. In case you want to have this done easily, you can use the following PHP library: https://github.com/utopia-php/domains , which will let you grab the “registerable” domain name within a hostname

    require_once '../vendor/autoload.php';
    use Utopia\Domains\Domain;
    // demo.example.co.uk
    $domain = new Domain('demo.example.co.uk');
    $domain->get(); // demo.example.co.uk
    $domain->getTLD(); // uk
    $domain->getSuffix(); // co.uk
    $domain->getRegisterable(); // example.co.uk
    $domain->getName(); // example
    $domain->getSub(); // demo
    $domain->isKnown(); // true
    $domain->isICANN(); // true
    $domain->isPrivate(); // false
    $domain->isTest(); // false

    I’m providing the example in PHP, but that doesn’t mean you have to use it; this code/idea can be developed in almost any server-side language you may be using. As a last resort, you can run a small VM or VPS to have a PHP environment where you can host your endpoint :).