GA4 Migration and Session-Level Data Challenges:
We are now about six months into the big GA4 migration. At this point we've settled into what we can and can't do, but there is still more discomfort than we'd hoped for six months into a new product. One of the biggest issues is the lack of session-level information. The system was not built with sessions in mind, and some might argue that's fine. But a lot of the time, you want to understand how specific campaigns are performing right now. For instance, how are your Black Friday efforts doing on, you know, Black Friday? This data is really hard to come by in GA4. But fear not, we will be tackling this issue and more in this blog!
What this blog is!
The goal of this blog is to empower YOU to set up a real-time channel & campaign dashboard quickly and efficiently. It will equip you with two SQL queries that can be used to pull yesterday's data and the previous 30/60 minutes of data from GA4, and display it in a Looker Studio report.
What this blog is not!
This blog is not a comprehensive tutorial on GA4, BigQuery, or the combination of the two. It is a means to an end to get you up and running with a real-time dashboard. If you want to understand all of the detailed aspects of BigQuery & GA4, I'd advise taking the excellent Simmer course led by the brilliant Johan van de Werken. That course is a deep dive into GA4 & BigQuery. Our approach today is to give marketers a critical report that they might be missing. We'll be taking shortcuts and skipping some basic definitions. You've been warned! Now let's go!
The problem with GA4 data… (well, one of the problems)
There are a lot of issues with GA4, but there's one that comes up time and time again. One that causes heartbreak among so many of my clients. One that rips people's souls in half as I see them wince every time I mention it… yesterday's and today's data just doesn't work properly in GA4. The data doesn't fully exist in the platform until 48 hours go by.
What do you mean by that?
Well, let's have a look. According to Google Analytics documentation, GA4 can take up to 48 hours to process data from your website or app.
I've seen the data processing times be both fast and slow, but the truth is you can't count on the data populating quickly. This can be a real pain in the neck for any business, but especially for those in the ecommerce industry.
But Josh! How am I going to see how my holiday sale is performing? How can we make quick decisions about how our campaigns are converting YoY or even DoD? Can we at least go into real time to get a gut check on how our campaigns have performed in the last 30 minutes? We just sent an email out – are people clicking it?
Real-Time Fails in GA4
Well, that's kind of complicated too. GA4 currently displays only first-touch attribution data in real time, at an aggregate level.
So for instance, if I go to our website with this UTM link: https://ftf.co/?utm_source=josh&utm_medium=lookiminvisible&utm_campaign=trytofindme, you'd hope that this attribution source would at least show up somewhere in Realtime to show how many people are on the site. Let's see if that's the case.
First, we go to the link.

Now we navigate into Realtime. The first thing I notice in the attribution section of Realtime is that we only have access to First User data. This isn't super helpful, as I don't really care about how a person originally came to my website; I'm interested in why a user is there right now. Imagine the scenario where you send out 1,000 emails about your Black Friday sale and then check to see how many people are on the site from the email you just sent. Suddenly you see 99 people from an email about March Madness. Not super helpful, huh?

Well, does that data just not exist then? If I dive into the user snapshot and find my own session (by chance, as there is currently no way to find a specific user in the snapshot), you can see that the campaign, source, and medium data exists… so GA4 clearly recognizes that I'm on the site with those parameters.

Maybe we can find the session data in the comparison filters? I exist here, and therefore I should be able to use my source data to find myself even if it's not explicitly available in the dashboards. Nope, wrong again! Even though Google makes it available in your filters, you can't actually use session source as a real-time filter.

Thus, it's impossible to see which channels are driving your website traffic at this very moment. Unless… there were a way to get real-time attribution data down to almost the second. Well guess what, Bob – we can with BigQuery!
Connect GA4 to BigQuery
Okay, so let's start this process. The first thing you'll need to do is make sure that you connect GA4 to BigQuery. Navigate to the admin section of GA4 and make sure you have an active link with BigQuery. If you haven't started setting one up yet, you'll need to create a BigQuery account and then link your GA4 property to it.


Once you have an active link, it will look like this:

The BigQuery connection will take a day or two to start populating data into your BigQuery project, so be patient. Additionally, it will ONLY start populating data once you have linked it; there is currently no support for retroactively exporting historical GA4 data to BigQuery.
NOTE: You will be able to create a free link, but you'll definitely want to upgrade to the paid version of BigQuery (it doesn't cost very much to store the GA4 data). Without upgrading to paid, you will not be able to access the “streaming” export type, which we'll need for any kind of real-time dashboard. Additionally, you will not be able to store more than two months of data at a time without upgrading, so for the sake of getting access to full functionality, make sure to add a billing account.
You can choose whether or not to export User Data – I would suggest it. We're not going to be using it for this tutorial, but it will be helpful down the road.
The Belly of the BigQuery Beast
Now we'll want to navigate to BigQuery. Go to the Google Cloud Console: https://console.cloud.google.com/welcome?project=PUT-PROJECT-ID-HERE. This should bring you directly to the correct project.
You should see a new dataset under your project called analytics_{{property_id}} (it is named after your GA4 property ID).
If you selected “Daily Export”, “Streaming Export”, & “User Data Export” in your connector, you will see the following tables:

Let’s break them down:
- Events: This is your daily repository of historical analytics data. Once a day, BigQuery takes the intraday data and “deposits” it into the events table, where it lives from that point on.
- Events_intraday: This is your “real-time” data. It holds the last 12–24 hours' worth of data and continues to populate in near real time. You may see yesterday's data in there until BigQuery processes and packages it into the events table, but in general you'll see mostly the last 24 hours.
- Pseudonymous_users: This shows information about the “user” keyed to the GA4 cookie identifier (client ID). It holds information on device, geo, audiences, etc., all tied to that single user.
- Users (not shown): If you're sending in User IDs, you'd have another table with the same kind of information as the pseudonymous users table, indexed by the User ID.
For now we're going to focus on the events_intraday table for the purposes of this exercise, but the same logic can be applied to the events table.
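If you want a quick sanity check that the intraday export is flowing, a minimal query along these lines will count today's events and users. The project and dataset names below are placeholders you'd swap for your own:

```sql
-- Minimal sanity check against the intraday export.
-- `your-project` and `analytics_XXXXXXXX` are placeholders for your own IDs.
SELECT
  event_date,
  COUNT(*) AS events,
  COUNT(DISTINCT user_pseudo_id) AS users
FROM `your-project.analytics_XXXXXXXX.events_intraday_*`
-- Today's table suffix; note CURRENT_DATE() defaults to UTC, not your property's time zone.
WHERE _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d', CURRENT_DATE())
GROUP BY event_date;
```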
Manipulating and Condensing the Tables
If you dive into a preview of one of those tables, you'll see A LOT of columns.

The GA4/BigQuery connector gives us a lot of data to play around with, but that doesn't necessarily mean we need all of it all the time. Querying in BigQuery is what ends up costing the big bucks, and querying a large dataset like GA4 data can add up! Additionally, some of the data is broken out into separate columns that we're used to seeing as one (e.g. Session Source & Session Medium). To get to the data we need, in the shape we'd like to see it, we need to build our own tables from the existing data.
Building our own dedicated table with just the information we need will make finding the data & visualizing it in Looker Studio way easier, and it'll save you quite a bit of moola. It also makes the table much more lightweight, and therefore faster to update. Wins all around!
So now we need some SQL! “DID YOU SAY SQL?! OMG, I DIDN'T KNOW THERE WAS CODE INVOLVED. I'VE INVESTED ALL THIS TIME IN YOUR LONG-ASS ARTICLE AND NOW I HAVE TO CODE!?”
Wait, wait, before you get all huffy and puffy, I've got a simple solution. Ahmed Ali put out an incredibly useful website that does all of the work for you: GA4SQL.
There's a lot to play around with there, but let's take a look at two code snippets that get us the most critical real-time marketing data.

So what does this code do? A few things:
This SQL query is designed to classify and aggregate session data from an analytics dataset. It categorizes sessions into channels based on the source and medium of the session, using a series of CASE statements to determine whether the session is from the following:
‘Direct’, ‘Cross-network’, ‘Paid Shopping’, and various other defined channel types such as ‘Paid Search’, ‘Paid Social’, ‘Paid Video’, ‘Display’, ‘Organic Shopping’, ‘Organic Social’, ‘Organic Video’, ‘Organic Search’, ‘Email’, ‘Affiliates’, ‘Referral’, ‘Audio’, ‘SMS’, and ‘Mobile Push Notifications’. If none of these conditions match, it defaults to ‘Unassigned’.
The query then aggregates this information, along with other session details like the campaign name, region, country, landing page, device category, and date, to count distinct sessions, sum engaged sessions, purchases, and revenue, and count total users. This is done inside a subquery that extracts and prepares the data from event parameters, traffic sources, and user IDs, filtering the data to the previous day.
The result is a detailed breakdown of user sessions by marketing channel and other dimensions, with key performance metrics such as engagement, purchases, and revenue for a given date.
Great, so now that you have a basic understanding of what the code is doing, we need to modify it further to get what we're after for this report. Currently, the GA4SQL site doesn't have an option for selecting from the intraday table, which is the table we need for real-time reports. Additionally, because it doesn't pull from intraday, certain dimensions such as hour and minute are missing.
With that said, a few changes need to be made to the code to make it work for this purpose:
1. It assumes we want to pull yesterday's data. We really just want this to pull today's data. If you'd also like historical data, you can use the code above with some slight modifications (such as the date range) to build a fully comprehensive historical BigQuery report, but right now we're after the last 24 hours of data.
2. There's no mention of hours or minutes. What if we want to see the activity on our website in the last 30 minutes? We won't be able to do that with the code above.
So there's a little modification we need to make.
We'll want to build a view in BigQuery. A view is a virtual table created by a SQL query on top of our main table. This allows for a much smaller, more flexible data pull.
We will be building a “by the minute” view for our “last 30 minutes” Looker Studio report.
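For reference, a view is nothing more than a saved SELECT. Defining one in SQL looks like this; the project, dataset, and view names are placeholders, and the trivial body stands in for the full modified query:

```sql
-- Sketch only: the body below is a stand-in for the modified real-time query.
CREATE OR REPLACE VIEW `your-project.your_dataset.ga4_last_30_min` AS
SELECT
  event_name,
  COUNT(*) AS events
FROM `your-project.analytics_XXXXXXXX.events_intraday_*`
GROUP BY event_name;
```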
Last 30 minutes:
Some edits need to be made to the SQL query to produce the correct time data and accurately pull the last 30 minutes of data from your source; a rough sketch of what these fragments look like follows the notes below.
- Update line 186 with your table ID in the format: project_ID.Dataset_ID.Table_ID.
- Update lines 158 & 159 with the correct offset based on the time zone set in your Google Analytics account. You can find this in Analytics under Admin > Property > Property details > Reporting Time Zone. The value should be a string like “-08:00” or “+2:00”.
- Update line 165 with the correct offset. This interval should always be a positive integer. However, if your time zone is ahead of UTC you must also change the TIMESTAMP_SUB function to TIMESTAMP_ADD, along with updating the interval value. This line outputs the datetime that each event occurred in, according to the time zone set in your GA4 account.
To get the last 60 minutes of data rather than the last 30:
- On line 189 of the query, change “INTERVAL 30 MINUTE” to “INTERVAL 60 MINUTE”.
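I can't reprint the whole query here, but the pieces those line numbers point at look roughly like the snippet below. The table name is a placeholder and the example assumes a UTC-8 property:

```sql
-- Rough sketch of the time-handling pieces (example assumes a UTC-8 property).
SELECT
  -- Lines ~158-159: render the event time in your reporting time zone,
  -- passed as a string offset such as "-08:00".
  FORMAT_TIMESTAMP('%H:%M', TIMESTAMP_MICROS(event_timestamp), '-08:00') AS event_minute,
  -- Line ~165: shift the raw UTC timestamp by a positive hour interval;
  -- swap TIMESTAMP_SUB for TIMESTAMP_ADD if your time zone is ahead of UTC.
  TIMESTAMP_SUB(TIMESTAMP_MICROS(event_timestamp), INTERVAL 8 HOUR) AS event_datetime,
  event_name
FROM `your-project.analytics_XXXXXXXX.events_intraday_*`
-- Line ~189: keep only the last 30 minutes; change 30 to 60 for the hourly pull.
WHERE TIMESTAMP_MICROS(event_timestamp) >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 MINUTE);
```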
Okay, we have our code now.
Let's run the SQL. In BigQuery, navigate to your GA4 events_intraday table. Once you're there, open a new query.

In your query tab, paste the code above that you'd like to run. Ultimately, it would be great to have both the last-30-minutes version and the hourly one. The most important thing is to swap out the dataset in the SQL with YOUR dataset ID. You can cmd/ctrl+F and search for “intraday” within the SQL; this should get you to the section you need to change.

When that is done, you can save the query as a view, or save a copy of it as a view, depending on whether you've already saved the query.

In the top right, BigQuery will also tell you how much data the query will process. This is what you can use to estimate the cost of querying the dashboard every minute. The number for this query is 95 MB, which, if you pull it every minute of every day for a month, totals 4,104,000 MB. Just for this query, running it in real time would come to around $18 per month (not bad) according to Google Cloud Platform pricing, but it definitely gives you a sense of how important it is to consolidate your query weight and balance how often you pull the data.
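To spell the math out: 95 MB per run × 1,440 runs per day × 30 days is roughly 4,100,000 MB, or about 3.9 TiB scanned per month. At BigQuery's on-demand rate (around $6.25 per TiB at the time of writing, with the first TiB each month free) that lands in the neighborhood of $18, which lines up with the figure above; check current Google Cloud pricing before relying on these numbers.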
Once you save a copy as a view, or save it as a view, it will prompt you to add it to a project and an existing dataset. Choose your dataset and name it whatever you'd like.


Now you should have a view with a much lighter dataset. This is what we will be using to build the Looker Studio report.

Add Real-Time Data to Looker Studio
Copy Our Template:
Now we're ready to create a real-time dashboard in Looker Studio! You can either create a new report from scratch OR you can use this template:
BigQuery Realtime Dashboard Template [MAKE A COPY]
To use this template, click the link above. Then click the 3-dot menu in the top right corner and select “Make a copy.”

You'll be prompted to select a new data source:

Click the dropdown under “New Data Source” and select “Create Data Source”:

Build Your Own:
Want to build your own? Follow the steps below.
It works just like any other Looker Studio report. Navigate to Looker Studio and add a new data source. Search for the BigQuery connector and select it.

Find your project ID, and you should see your newly created views in the list to pick from.

Customize Your Report for Real-Time
Pick the real-time table view that you'd like to use. We'll use the 30-minute one, but you can pick any or all of them. The only difference is the suggested data freshness.
An important step before moving on is to change the data freshness on the data source.

Find the data source that you just added.

Click on the data freshness:

Now you can change how often the data refreshes from BigQuery (i.e. how often it re-runs the SQL query). Keep in mind that the faster this updates, the more you'll be paying. For the 30-minute dashboard, I'd suggest a data freshness of every 1 minute. For the hourly one, you can probably get away with once an hour or once every 15 minutes. It's really up to you!
If you'd like to pull every minute, you can click the custom radio button and change the interval to every 1 minute.

Now we're ready to start building reports. The original intention of this article was to get session data into a real-time report, so let's start there. There are really infinite combinations of visualizations you can do. I like building a time series of data by the minute and a few pie charts, but you can pull in whatever you'd like.

A few notes about the above data:
- Sometimes there are null values in the data for whatever reason… they correct themselves later on, so you'll just have to live with them, or you can filter them out if you'd like.
- A time series chart will restart at the beginning even though it's after the event. There's a SQL remedy for this too, which I'll note below.
All in all, this is a fairly complex and involved way to get to real-time marketing data, and while it's fundamentally cool that the data is yours and at your fingertips to do with what you want… you may want to look into a simpler solution like Piwik PRO or another alternative that doesn't have the same data lag.
More modifications:
Last day of data, pulled by hour
Some edits need to be made to the SQL query to produce the correct time data and accurately pull the last 24 hours of data from your source; again, a rough sketch follows these notes.
- Update line 186 with your table ID in the format: project_ID.Dataset_ID.Table_ID.
- Update lines 158 & 159 with the correct offset based on the time zone set in your Google Analytics account. You can find this in Analytics under Admin > Property > Property details > Reporting Time Zone. The value should be a string like “-08:00” or “+2:00”.
- Update line 165 with the correct offset. This interval should always be a positive integer. However, if your time zone is ahead of UTC you must also change the TIMESTAMP_SUB function to TIMESTAMP_ADD, along with updating the interval value. This line outputs the datetime that each event occurred in, according to the time zone set in your GA4 account.
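And here is a rough sketch of what the hourly version ends up producing, using the same placeholder names and example UTC-8 offset as before:

```sql
-- Sketch of an hourly rollup over roughly the last day (placeholder names, UTC-8 example offset).
SELECT
  FORMAT_TIMESTAMP('%H:00', TIMESTAMP_MICROS(event_timestamp), '-08:00') AS event_hour,
  COUNT(*) AS events,
  COUNT(DISTINCT user_pseudo_id) AS users
FROM `your-project.analytics_XXXXXXXX.events_intraday_*`
-- Keep only the last 24 hours of events.
WHERE TIMESTAMP_MICROS(event_timestamp) >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
GROUP BY event_hour
ORDER BY event_hour;
```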