James Holding, Head of analytics

A particular type of Google Analytics spam has become prominent recently, in which the language field is used to display a message.

The language field is visible on the default screen when logged into Analytics, making it a likely target for anyone wanting to broadcast a message.

The screenshot below is an example of the spam that has recently affected many Analytics accounts:

Spam showing in Analytics

Note that the ‘G’ in Google is not actually a ‘G’, but the Latin letter small capital G (ɢ), which has the hex code 0262 (Unicode U+0262).
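Lookalike characters such as this one are hard to spot by eye, but easy to spot in code. The following is a minimal Python sketch that lists any non-ASCII characters in a string together with their code points; the function name and the sample strings are our own illustrative choices, not anything taken from Analytics.

```python
import unicodedata


def non_ascii_chars(text):
    """Return (character, code point, Unicode name) for each non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ord(ch) > 127
    ]


# "\u0262" is the small capital G used in place of a normal 'G'.
print(non_ascii_chars("\u0262oogle.com"))
# A normal ASCII domain contains nothing unusual, so the list is empty.
print(non_ascii_chars("google.com"))
```

Any domain that returns a non-empty list here deserves a closer look before anyone clicks on it.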

Those behind this spam have likely bought the domain that uses this Latin character in place of a ‘G’ and will control the content, which is why they’ve included it – and why it would be a bad idea to access the link.

If you access the domain with a third-party testing tool, it appears to take you to: http://money.get.away.get.a.good.job.with.more.pay.and.you.are.okay.money.it.is.a.gas.grab.that.cash.with.both.hands.and.make.a.stash.new.car.caviar.four.star.daydream.think.i.ll.buy.me.a.football.team.money.get.back.i.am.alright.jack.ilovevitaly.com/.

ilovevitaly.com has appeared in historical Google Analytics spam of other types.

If we add ‘network domain’ or ‘city’ as a secondary dimension, we can see that these sessions come from some unusual sources for a UK-focused account:

Network domain:

Network Domain

If you’re seeing this spam in your Analytics, follow the steps below to stop it ruining your data.

Remove spam with a simple filter

Spam language traffic can be filtered out of Google Analytics fairly easily.

  • Language codes shouldn’t contain a dot or a comma, so excluding both characters catches anyone trying to spam a URL into this field.
  • Language codes should be relatively short. Our suggestion excludes any language of 10 characters or more, which still keeps “(not set)” (9 characters long) in case it occurs.
  • Finally, we suggest excluding any language of three words or more (anything that contains ‘space word space word’).

To create the spam filter:

  1. Navigate to filters in the admin screen in Google Analytics
  2. Create a new filter and give it a name (e.g. “Exclude – spam languages”)
  3. Instead of ‘predefined’, select ‘custom’
  4. Choose ‘Exclude’ (default setting)
  5. Set the filter field to ‘Language Settings’
  6. Set the filter pattern to (\s\S+){2,}|.{10,}|\.|,
  7. Apply to the views you wish to exclude the traffic from
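Before saving the filter, it can help to check the pattern against a few values. The following is a minimal Python sketch rather than anything built into Analytics: the sample language values are made up for illustration, and it assumes the filter pattern is applied as a partial match, which re.search mirrors.

```python
import re

# The pattern from step 6, made up of four alternatives:
#   (\s\S+){2,}  two or more 'space word' groups (three words or more)
#   .{10,}       10 characters or more
#   \.           a literal dot
#   ,            a comma
SPAM_LANGUAGE = re.compile(r"(\s\S+){2,}|.{10,}|\.|,")


def is_spam_language(value):
    """Return True if the filter pattern would exclude this language value."""
    return SPAM_LANGUAGE.search(value) is not None


# Illustrative values only; the spam strings are hypothetical.
samples = ["en-gb", "en-us", "(not set)", "a b c", "en,gb", "example.com visit this site"]
for value in samples:
    print(repr(value), "excluded" if is_spam_language(value) else "kept")
```

In this sketch the first three values are kept and the last three are excluded, one for each rule: three words, a comma, and a dot (plus excess length).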

This should then look similar to the below:

Add filter to view

Please note: we would always recommend leaving at least one view without any filters on as a backup.

What about historical data?

We can use the filter applied at a view level in the previous section to create a segment which will allow you to single out this spam traffic.

The following link creates a segment that shows any language containing the dot character. As before, it uses regex to find the dot, and you can edit the segment to see its settings: https://analytics.google.com/analytics/web/template?uid=IQKAhjl3SoSesEA_R26drQ

Similarly, the following will show all traffic excluding the spam languages: https://analytics.google.com/analytics/web/template?uid=DLYsIKGBSJuLG2y3QL5zPw
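If you work with exported data rather than inside Analytics, the same split can be reproduced offline. The following is a rough Python sketch under stated assumptions: the rows and the column names ‘language’ and ‘sessions’ are hypothetical, and it applies the full filter pattern from the previous section rather than the dot-only check.

```python
import re

# Same pattern as the view filter in the previous section.
SPAM_LANGUAGE = re.compile(r"(\s\S+){2,}|.{10,}|\.|,")

# Hypothetical exported rows; a real export will have different columns and values.
rows = [
    {"language": "en-gb", "sessions": 120},
    {"language": "example.com visit now", "sessions": 45},
    {"language": "(not set)", "sessions": 3},
]


def split_rows(rows):
    """Return (spam, clean) lists of rows based on the language value."""
    spam = [r for r in rows if SPAM_LANGUAGE.search(r["language"])]
    clean = [r for r in rows if not SPAM_LANGUAGE.search(r["language"])]
    return spam, clean


spam, clean = split_rows(rows)
print("spam sessions:", sum(r["sessions"] for r in spam))
print("clean sessions:", sum(r["sessions"] for r in clean))
```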

If you use any segments at the moment, you should consider also excluding this traffic. Spam in Google Analytics is a wider problem and we’ll be looking to expand on other options to identify and remove spam in the near future.