A new type of spam has been hitting Google Analytics accounts over the past few weeks. You can identify it in your Google Analytics language report as:
Secret.ɢoogle.com You are invited! Enter only with this ticket URL. Copy it. Vote for Trump!
Here is a screenshot taken from the Audience Overview report:
This bizarre new wave of language spam first started appearing in accounts around 8th November, just as the US presidential election was winding down, and it is believed to have come from Russia.
You will find it is also combined with referral spam, with multiple domains listed as source/medium, including abc.xyz, brateg.xyz, budilneg.xyz, begalka.xyz, bezlimitko.xyz, bukleteg.xyz, boltalko.xyz, biteg.xyz and others. So it is a two-pronged attack, trying to draw the user's attention both to the fake referrer domains and to the language report, probably because of its prominent placement on the Google Analytics report homepage.
This particular type of language spam only registers pageviews on your homepage, so metrics for internal pages should not be affected.
Ok so what’s the fix?
Unfortunately, there is no retrospective action you can take to eliminate the data that is already in your account. However, here are two filter fixes you should implement right away to make sure you're safe going forward. Please note: setting up a view-level filter is fairly simple, but it permanently changes the data collected in that view from then on, so be careful when using one, especially if you have little prior experience with view filters!
STEP 1. Implement a hostname filter for your domain. We recommend you have this set up anyway to stop spam traffic and fake referrals entering your account. To do this, go to Admin > View > Filters and create a custom filter to include your hostname. Make sure you include every hostname relevant to you, and be careful not to block any third-party checkouts or subdomains related to your website.
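Before saving a hostname include filter, it can help to check the pattern offline. The following is a minimal Python sketch, not the exact Google Analytics configuration: the hostnames (example.com and its subdomains) and the helper name is_valid_hostname are hypothetical placeholders, so swap in the hostnames your own site really uses.

```python
import re

# Hypothetical hostnames: replace these with the ones your own site uses,
# including any third-party checkout or subdomain you want to keep.
VALID_HOSTNAMES = ["example.com", "www.example.com", "shop.example.com"]

# Build one anchored pattern. re.escape makes each "." literal, and the
# ^...$ anchors stop look-alikes such as "example.com.evil.xyz" matching.
INCLUDE_PATTERN = re.compile(
    "^(" + "|".join(re.escape(h) for h in VALID_HOSTNAMES) + ")$"
)


def is_valid_hostname(hostname: str) -> bool:
    """Return True if a hit with this hostname would be kept."""
    return INCLUDE_PATTERN.search(hostname) is not None


if __name__ == "__main__":
    # The pattern string you could adapt for the filter's pattern field.
    print(INCLUDE_PATTERN.pattern)
    for host in ["www.example.com", "(not set)", "secret.ɢoogle.com"]:
        print(host, "->", "keep" if is_valid_hostname(host) else "drop")
```

Anchoring the pattern is a deliberate choice here: an unanchored pattern would also keep any hostname that merely contains one of your domains.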
STEP 2. Implement a Language Filter – The first step should do the job, but it's recommended that you implement this one as a backup. The following filter excludes any traffic (hits) where the language dimension contains 12 or more characters. Since most legitimate language settings sent by browsers are 5-6 characters long, and traffic with even 8-9 characters in this field is rare, it should only filter out language spam.
Enter the following Regular Expression into your custom language filter setting as below and hit save.
.{12,}|\s[^\s]*\s|\.|,|!|/
With these 2 filters applied you should not see this kind of spam entering your accounts anymore.
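To see what the language filter would catch, here is a small Python sketch that runs the pattern against a few language values. It assumes the pattern is meant with its escapes intact (\s for whitespace and \. for a literal dot) and that the filter matches anywhere in the field, which is why it uses re.search. The sample values and the helper name is_language_spam are illustrative only.

```python
import re

# Assumed form of the language filter pattern: 12 or more characters,
# OR two whitespace characters with no whitespace between them,
# OR any of the characters . , ! /
LANGUAGE_SPAM = re.compile(r".{12,}|\s[^\s]*\s|\.|,|!|/")


def is_language_spam(language: str) -> bool:
    """Return True if the language value would be excluded by the filter."""
    return LANGUAGE_SPAM.search(language) is not None


if __name__ == "__main__":
    # Illustrative values only: a few typical browser language codes
    # plus the spam string quoted at the top of this post.
    samples = [
        "en-us",
        "pt-br",
        "zh-hans-cn",
        "(not set)",
        "Secret.ɢoogle.com You are invited! Enter only with this ticket URL. "
        "Copy it. Vote for Trump!",
    ]
    for value in samples:
        verdict = "spam" if is_language_spam(value) else "ok"
        print(f"{verdict:4} {value}")
```

The escapes matter: without the backslash, the bare "." alternative matches any character, so the pattern would exclude every hit rather than just the spam.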
What to do about existing data
To filter language spam out of your existing reports, a good solution is to apply a segment that excludes language values matching the regular expression .{12,}|\s[^\s]*\s|\.|,|!|/. This will allow you to view all your account data without the spam. Here's how this segment looks for reference: