Filtering Organic Google Analytics Data For Bot Traffic

If you’ve been looking at your website’s Google Analytics data and are excited to notice a significant spike in organic traffic, it may well be due to your marketing efforts. But it may also be from bot traffic, perhaps with a specifically reported ‘’ keyword.

This guide will show you how to identify bot traffic that presents as organic search traffic, and how to filter it from your Google Analytics data.

The Problem

We recently worked with a digital marketing client who saw an unusual organic traffic spike on the first day of the week. So much for not liking Mondays. But when we looked closely, it was clear the spike was due to spam traffic identifiable by the keyword ‘’.

No problem, right? Just create a filter to ensure this spike and future spikes don’t affect our data. One problem though – the traffic that the bot generated on the site appeared as Google Organic traffic. This meant that we couldn’t set up a filter based on organic keywords. Yes, we could exclude search terms to filter it out, but the data would then simply be reported as Direct traffic instead.

We needed a way to isolate this data from the rest so that we could then create an appropriate filter. But this spam traffic is quite clever in that it appears exactly as organic traffic, so trying the usual indicators of spam traffic yielded no results.

For example, a typical way to spot spam traffic is to add ‘hostname’ and ‘full referrer’ as Secondary Dimensions. Both, however, reported that this traffic had come from Google Organic.



After a quick Google search failed to yield any answers, we persevered with different secondary dimensions until we uncovered the solution.

What We Found

We found one key secondary dimension that gave us a unique identifier we could use to expose the spam. That dimension was ‘Page Title’, which the spam displayed as (not set). This was because the traffic never technically landed on the page for long enough to generate Page Title data.


Now that we had the unique identifier, the next problem was how to filter out sessions whose Page Title was blank, which reports show as (not set).

Before we went any further, and to ensure that this Page Title data was the correct unique identifier, we carried out a more extensive historical search, looking at the three months prior. We wanted to see if that traffic had come through on other occasions. This search proved the spike was only evident on that Monday, indicating that we had found the most effective way of identifying the traffic.
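A historical check like this one can also be done offline against an exported report. The following is a minimal sketch, assuming the export has been loaded as rows with hypothetical `date`, `page_title` and `sessions` fields; the field names, dates and numbers are illustrative only and are not taken from our client's data or from any Google Analytics export format.

```python
from collections import defaultdict

# Hypothetical exported rows; field names and values are illustrative only.
rows = [
    {"date": "2024-01-01", "page_title": "(not set)", "sessions": 480},
    {"date": "2024-01-01", "page_title": "Home", "sessions": 120},
    {"date": "2024-01-02", "page_title": "Home", "sessions": 110},
    {"date": "2024-01-02", "page_title": "(not set)", "sessions": 2},
    {"date": "2024-01-03", "page_title": "Contact", "sessions": 35},
]


def not_set_sessions_by_day(rows):
    """Sum sessions per day where the page title is reported as (not set)."""
    totals = defaultdict(int)
    for row in rows:
        if row["page_title"] == "(not set)":
            totals[row["date"]] += row["sessions"]
    return dict(totals)


def suspicious_days(rows, threshold):
    """Return the days whose (not set) session count reaches the threshold."""
    totals = not_set_sessions_by_day(rows)
    return sorted(day for day, count in totals.items() if count >= threshold)


if __name__ == "__main__":
    print(not_set_sessions_by_day(rows))
    print(suspicious_days(rows, threshold=100))
```

The threshold is a judgment call: a handful of (not set) sessions on a normal day is not unusual, so the aim is only to make a single-day spike stand out.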

Fortuitously, the search uncovered more spam in the form of direct traffic. Again, it was always on the same Monday.


The Solution

The solution was to create an Include filter within Google Analytics, with ‘Page Title’ as the Filter Field and a regular expression consisting of a single period (.) as the Filter Pattern. In a regular expression, the period matches any single character, so the filter includes any session where the page title has text in it and excludes those where it has none, which is the traffic reported as (not set). In short: Filter Type is Include, Filter Field is Page Title, and Filter Pattern is a single period.
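To see why a single period is enough, the matching logic can be tried outside Google Analytics. This is a minimal sketch in Python, assuming the filter sees an empty value wherever reports display (not set); the function name and sample titles are hypothetical, and Python's `re` module stands in for the regex engine Google Analytics uses.

```python
import re

# A single period matches any one character (other than a newline),
# so it matches every title that contains at least one character.
INCLUDE_PATTERN = re.compile(r".")


def passes_include_filter(page_title):
    """Return True when the title has at least one character to match."""
    # Assumption: a missing title arrives as an empty string or None.
    return bool(page_title) and INCLUDE_PATTERN.search(page_title) is not None


# Hypothetical page titles, including two with no title data at all.
titles = ["Home | Example Site", "", None, "Contact"]
kept = [title for title in titles if passes_include_filter(title)]

if __name__ == "__main__":
    print(kept)
```

Note that the pattern is deliberately loose: it does not care what the title says, only that there is one.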


In the end, it’s a simple solution. In this case, it’s specific to ‘’ from a paid traffic website by the name of, but you can apply it to any other spam traffic with similar characteristics.

