
WordPress site hacked: fake spam URLs indexed – RESOLVED

Written by Sukh Singh on 10th August 2018

If your WordPress site has been hacked, you aren’t alone: this happens quite often on sites that lack even basic plugins to scan for and block malware. Because it is so common, there are tried and tested fixes and preventative measures you can employ to minimise the chances of your site being hacked.

A WordPress site that we’re working on recently got hacked and a script on the site was creating fake URLs which then redirected to external sites. These URLs were all related to “dating” sites and there were over 2,000 of them – you would be surprised at the creativity involved in creating 2,000+ “dating”-related URLs!

How did we notice this spam?

We saw a significant increase in the number of pages indexed in the Google Search Console account – as we regularly check this for our SEO reports. This was the first indication that something was up.

WordPress Spam - Search Console indexation

We then checked the messages section of Search Console but we didn’t see anything, which was odd, as typically you would get a message like: “This site may be hacked”.

We then did a site search in Google using “site:website.com” to see all of the pages that were being indexed. After scrolling down past the legitimate pages on the site, we saw URLs relating to “dating” sites, like:

  • “website.com/single-men-dating”
  • “website.com/singles-dating”

…and so on.

All of the spam URLs included the word “dating”, so we did another site search, this time with the search operator “dating” site:website.com, which generated the following results:

We found over 2,000 results relating to these URLs so it was clear the problem was quite extensive.
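
If the common word in your own spam URLs is less obvious than “dating” was for us, counting the words that appear in the indexed URL paths can help to surface it. The following is a minimal Python sketch rather than part of our original process: the function name `common_slug_words` and the sample list are hypothetical, and it assumes you have already copied some indexed URLs into a list.

```python
import re
from collections import Counter
from urllib.parse import urlparse


def common_slug_words(urls, top=5):
    """Return the words that appear in the most URL paths, most frequent first."""
    counts = Counter()
    for url in urls:
        path = urlparse(url).path.lower()
        # Count each word once per URL so one long slug cannot dominate.
        counts.update(set(re.findall(r"[a-z]+", path)))
    return counts.most_common(top)


# Hypothetical sample based on the URL patterns described above.
sample = [
    "https://website.com/single-men-dating",
    "https://website.com/singles-dating",
    "https://website.com/about-us",
]
print(common_slug_words(sample, top=1))
```

The most frequent word is a candidate for the search operator, although it is worth checking that it does not also match many legitimate pages.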

How we resolved the WordPress hack

The first step was removing the malicious script from the server that was generating the spam. We enlisted the help of our development partners WebJuice to remove this from the server.

There are several plugins that scan for and remove spam, as well as monitor the site going forward. We went with WordFence as we are familiar with it and know it works well.

Remove malicious spam script from the server:

  1. Install the plugin – use this quick guide: https://www.wordfence.com/docs/
  2. Scan your site, check and compare file changes:
    • There are ways to do this in more detail manually via the server; here is a guide from the plugin website detailing these. We used the plugin on this occasion.
    • We scanned the site and found the result shown below. Based on the message and our developer’s knowledge of the site we concluded that this file didn’t belong and so we deleted it:
      WordFence scan result
  3. Check one of the spam URLs (from Google’s search results) to see if it still redirects to a spam website. If it returns a 404 error, the malware has been removed, and the 404 will tell Google to drop the URL from the index the next time it crawls the site
  4. We then added an email address to the plugin to alert us of any issues found in the future
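
With thousands of spam URLs, the check in step 3 can be scripted rather than done by hand. The following is a minimal Python sketch, not the method we used at the time: the function names are hypothetical, and treating a 410 response the same as a 404 is our assumption. Redirects are deliberately not followed, so a URL that still redirects shows up as a 3xx status.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so a 3xx status is reported as-is."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # The 3xx response is then raised as an HTTPError


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a URL without following redirects."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def classify(status):
    """Turn a status code into a short verdict for a spam URL."""
    if status in (404, 410):  # Assumption: 410 is treated like 404
        return "gone"
    if 300 <= status < 400:
        return "still redirecting"
    return "check manually"  # e.g. a 200 page could still redirect in the browser


def check_urls(urls, fetch=fetch_status):
    """Map each URL to a verdict; `fetch` can be swapped out for testing."""
    return {url: classify(fetch(url)) for url in urls}
```

Calling `check_urls` with a handful of the spam URLs returns a dictionary of verdicts. Anything marked “check manually” is worth opening in a browser, as a script like this cannot see redirects that happen in JavaScript.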

Remove spam from Google’s index

We know that the URLs now return a 404 error and that Google should see these 404s and eventually drop them from the index. However, as there are over 2,000 of them and the site is crawled and indexed every other day, with a varying number of URLs each time, this may take a month.

We want to speed up the crawling of these new spam 404 URLs to encourage Google to drop them from the SERPs, so let’s look at a couple of options for this.

Option 1 (slow to implement):

  1. If you are thinking of using Google Search Console’s “Remove URLs” tool for this, this will not only take an age as you submit one URL at a time, but it only temporarily removes them (90 days). If you want to combine this with the next option feel free, but I think you will find the next method a more definitive solution

Option 2 (quick to implement):

  1. Download all of the spam URLs from the Google SERPs
    • If you have a lot of spam results, you can install the Infinite Scroll add-on. This will allow you to scroll right down to the last page of results without clicking “next page”, so you can grab all of the URL links (link for Google Chrome browser, similar plugins available)
    • You can use a browser plugin called Link Klipper to help you download the spam URLs; it’s faster than copying and pasting (link for Google Chrome browser, similar plugins available)
  2. Use a search operator like “dating” site:website.com (replacing “dating” with whatever word your spam URLs have in common) to bring up the indexed spam URLs in Google
  3. Scroll down to the last result (it might be easier to use the “Page Down” button on your keyboard to do this quickly!)
  4. Once all results are fully loaded, right click and select “Link Klipper” and select “Extract all links” – which creates an Excel file
  5. Open the Excel file and sort all of the URLs in ascending order, then copy the ones relating to your domain, ignoring all other URLs like “https://webcache.googleusercontent.com”
  6. Now we want to create a sitemap file:
    • This is a free tool that you can paste your URL list into then download the XML file – http://www.timestampgenerator.com/tools/xml-sitemap-from-list/
    • Screaming Frog is a crawler tool which has an XML sitemap generator (there is a free version, but it limits the crawl to 500 URLs, so if you have more URLs you will need a paid version)
  7. Give your XML sitemap a distinctive name, then upload it to the root folder of the site via FTP and test it. The URL would be: “website.com/sitemap-name.xml”
  8. Now go to Google Search Console and under “crawl” and “sitemaps” submit the new sitemap to Google
  9. Refresh the page and you will see the status as “submitted”
  10. If you have over 1,000 spam URLs it’s unlikely you will download all of them from the SERPs in one go, so you need to repeat these steps every other day:
    • Download the spam URLs from the SERPs and add them to the same Excel sheet every time. This will then contain all the accumulated spam URLs discovered to date
    • Remove duplicates from the sheet every time you add new URLs until the number of URLs in your sheet matches the SERPs shown when you look for the spam pages with the search operator
    • Create and upload updated sitemap file, and re-submit to Google
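
Steps 5, 6 and 10 (filtering to your own domain, removing duplicates and building the sitemap) can also be done with a short script instead of Excel and an online generator. The following is a minimal Python sketch under a few assumptions: the function names are hypothetical, “website.com” stands in for your domain, and the extracted links have already been pasted into a list.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def own_domain_urls(urls, domain):
    """Keep unique URLs whose host is `domain` (with or without www.), sorted."""
    kept = set()
    for url in urls:
        url = url.strip()
        host = (urlparse(url).hostname or "").lower()
        if host in (domain, "www." + domain):
            kept.add(url)  # A set drops duplicates automatically
    return sorted(kept)


def build_sitemap(urls):
    """Return sitemap XML as a string, with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical links as they might come out of the extraction step.
extracted = [
    "https://website.com/singles-dating",
    "https://website.com/single-men-dating",
    "https://website.com/singles-dating",
    "https://webcache.googleusercontent.com/search?q=cache:example",
]
spam_urls = own_domain_urls(extracted, "website.com")
sitemap_xml = build_sitemap(spam_urls)
print(sitemap_xml)
```

Saving `sitemap_xml` to a file such as “sitemap-name.xml” gives you the file to upload in step 7. A single sitemap file can hold up to 50,000 URLs, so one file is enough for a spam problem of the size described here.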

Based on the 2,000+ spam URLs we found, this process took us about a month to fully remove all the spam from the index. We chose to do it daily and it only took about five minutes, so it’s not too difficult if you schedule it at the start of every day.

Monitor your spam removal

Use Google Search Console to monitor your spam removal:

  1. Check the number of indexed pages from the spam sitemap as this will increase every time it’s crawled
  2. Check the number of 404 errors, and that the URLs reported are the spammy ones; this number should rise in parallel with the spam URLs being submitted. If you see a correlation between them, this means the URLs are dropping out of the index
  3. Check Google with a site search to prove the number of returned results is decreasing

Avoid this happening in the first place

These are checks you should be performing as standard on a website you manage, which will help to avoid this happening in the future. Using the WordFence plugin or similar will also help prevent future spam attacks. It’s the responsibility of our Technical SEO department to perform checks on clients’ websites. If you are a website owner or manager, we recommend performing these at least every week, if not daily:

  1. Make sure you configure your spam defence plugin to run a scan every day and block multiple attempts to log in to the site
  2. Add your email address to the plugin to alert you of any attempts or issues
  3. If you notice continuous attempts to access the site (via alerts from the spam plugin), consider changing the URL of your login pages (e.g. “/admin”) to something unique, so that spammers would not know to try the new login URL
  4. Make sure you have access to the email address associated with the Search Console account so you see alerts for spam or crawl issues (set yourself up as a user so you don’t have to log in to the admin email account every time)
  5. Check the “messages” section in Search Console
  6. Check indexation and look for any strange spikes
  7. Check “Crawl Errors” in Search Console and look for any strange spikes in 500 and 400 errors
  8. Check the sitemap/s in Search Console to see if there are any large discrepancies between “pages crawled” and “pages indexed”
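
If you record the indexed-page or crawl-error counts each time you check, spotting a “strange spike” can be made less subjective. The following is a minimal Python sketch of one possible rule, not a Search Console feature: the function name, the window of four readings, the 1.5x threshold and the sample figures are all hypothetical and should be tuned to your own site.

```python
from statistics import median


def find_spikes(counts, window=4, factor=1.5):
    """Return the positions where a count exceeds `factor` times the median
    of the previous `window` readings."""
    spikes = []
    for i in range(window, len(counts)):
        baseline = median(counts[i - window:i])
        if baseline > 0 and counts[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Hypothetical weekly indexed-page counts for a site with about 800 pages.
weekly_indexed = [800, 805, 798, 810, 1900, 3100]
print(find_spikes(weekly_indexed))
```

Using the median of recent readings rather than the average means a single unusual week does not shift the baseline too much.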

Results from our work

These are some of our results from the work we did:

Spam indexation

You can see where the spam URL indexation spiked and when it started to normalise.

Spam sitemap indexation:

WordPress Spam - Search Console sitemap indexation

URLs were added to the sitemap daily at the start of the issue, then we left it to be crawled so that Google would pick up the 404s and drop the URLs from the index. There were a few legitimate URLs already on the site that mention the term “dating”, which is why you see 39 URLs as indexed: these do not return a 404.

Spam crawl errors

WordPress Spam - Search Console crawl errors

Here we see an increase in 404s due to the spam 404 pages being picked up in the sitemap. We will clear this out completely soon by “marking as fixed” to reset the report and only see legitimate 404 errors going forward.

How long will it take to resolve?

Here’s a log we kept while checking the spam issue which shows that based on a website with 800 legitimate pages, plus 2,303 fake/spam pages and a crawl/index rate of every day/every other day, the issue was resolved within a full month.

Therefore, based on the size of your site and its rate of indexation, you can get a rough idea of how long it will take to remove these URLs from the Google index.
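
As a rough illustration of that estimate, you can divide the number of spam URLs still indexed by the average number dropping out per check. The following is a minimal Python sketch: the function name is hypothetical, and the sample readings are invented for illustration (only the starting figure of 2,303 comes from our case). It assumes one reading per day and a roughly steady rate of removal, which may not hold in practice.

```python
import math


def days_to_clear(daily_remaining):
    """Estimate the days left until no spam URLs remain, from daily readings
    of how many are still indexed. Returns None if the count is not falling."""
    if len(daily_remaining) < 2:
        return None
    total_drop = daily_remaining[0] - daily_remaining[-1]
    average_drop = total_drop / (len(daily_remaining) - 1)
    if average_drop <= 0:
        return None
    return math.ceil(daily_remaining[-1] / average_drop)


# Hypothetical readings over four consecutive days.
readings = [2303, 2230, 2150, 2080]
print(days_to_clear(readings))
```

Re-running the estimate as you add new readings gives a better picture than any single calculation, as the crawl rate varies from day to day.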

We can help you resolve website spam issues

If you think your website has been hacked, whether it’s on WordPress, Magento or any other platform, and would like us to take a look, feel free to get in touch today and a member of our Technical SEO team will help you.