By clicking “Accept”, you agree to the storing of cookies on your device to enhance site navigation, analyze site usage, and assist in our marketing efforts. View our Privacy Policy for more information.
Apr 27, 2023
Google Analytics

Do not analyze your pageview data in GA4 anymore!

Use the raw data export to BigQuery instead. By Sanu Maharjan

One of the frequent KPIs that our clients like to see is the page views of their website. Keeping that in mind, we always provide the following metrics, namely, page views, unique page views, landing page, page path, and exit page. In this way, our clients have an overview of their customers on the website.

Ready to get your hands dirty with a bit of code? Let's take a closer look at this piece of SQL and see what it can do! 

https://gist.github.com/sigmaError/bc8d4d5728480fe7195b690ebf8d58f1

Preparing the data

First, we have our prep_page_view sub-query, this block of the query is simply saying "Hey before we carry on with the rest of the code, let's do some preparation first." 

In this prep block of code, we are only extracting the required columns, for example, event_timestamp and user_pseudo_id. The session_id, page_title, page_location and is_landing_page are different and they need to be extracted using UNNEST function. Why do you ask? I have already made a blog post about it here

Just to recap, information like page_view, page_title, and others are stored in special “repeated fields” called event_params. In order to get the values from these repeated fields, we have to use UNNEST function. If we want to get data for page view then we filter out by event_name = ‘page_view. With this, we get the following information for each page view event:

  • event_timestamp: The time the event occurred
  • user_pseudo_id: A pseudonymous identifier for the user who triggered the event
  • session_id: The Google Analytics session ID for the page view event
  • page_title: The title of the page that was viewed
  • page_location: The URL or location of the page that was viewed
  • is_landing_page: A Boolean value indicating whether the page view event was the first page in the user's session

Define Landing and Exit Page

With this sub-query, we can already calculate metrics like total page view and also unique page view, but we want to get a step further and see the landing and exit page path. To configure the landing and exit pages, we have to tweak our table further. For the landing page, we are first checking if the page_location column is TRUE, if it has a TRUE value then we provide the page URL. 

To get exit_page we use the window function and partition it by user_pseudo_id and session_id and order the data by event_timestamp in ascending and then if we take the last value of page_location, it will give us the exit_page.

The previous and next pages can be obtained by using a LAG and LEAD function. This is another window function, which will basically look at data and goes one step before (or after) respectively. In our case, it will give us the URL of the previous (or next) pages. With this kind of data structure, we can look at the journey of a customer over the session.

When you run the query, you should have a table, which will look something like the one below:

Now you can already play around with the data and make more intuitive tables and graphs. Below I have used Looker Studio to look at views and unique views for different pages. I have also made a chart for the landing page count for different pages.

If you want to take this to another level, then you can even make a funnel chart out of it. The funnel chart provides more insight into customer behavior. For example, 

  • how many customers were on the home page
  • how many of them looked at the different products
  • how many added the products to carts
  • how many added paid details 
  • how many bought the products

We might make another series on how to create a funnel chart in the future, but if you are in urgent need, don’t hesitate to contact us.

Differences to GA4

Why is it better now to make your analytics in BigQuery and not in GA4?

All in all you have more flexibilities in BigQuery to filter and transform your pageview data. E.g. there is a Landing Page dimension available in GA4, but not an Exit Page (so far). To get an idea of your exit behaviour in GA4, you have to add the number of exits to your Exploration, but still then the rate of exits is missing. 

Additionally, with the LEAD and LAG functions you can create a better funnel-like chart and more in depths analysis then in GA4.

Conclusion

Overall, the query provides a detailed view of user behavior on a website, including the sequence of pages visited and whether certain pages are frequently used as landing pages or exit pages. Now you can get your hands dirty with a bit of code and find out just how popular your website is.

Further Links

Check out our LinkedIn account, to get insights into our daily working life and get important updates about BigQuery, Data Studio, and marketing analytics.

We also started with our own YouTube channel. We talk about important DWH, BigQuery, Data Studio, and many more topics. Check out the channel here.

If you want to learn more about how to use Google Data Studio and take it to the next level in combination with BigQuery, check our Udemy course here.

If you are looking for help to set up a modern and cost-efficient data warehouse or analytical dashboards, send us an email at hello@datadice.io and we will schedule a call.