- The Search Agents - http://www.thesearchagents.com -

A Lesson on Google’s Webspam Guidelines


In my previous article, I gave an overview of the Google Search Quality Rating Guidelines [2] and how users rate the quality of Google’s search results. During the rating process, every page is also checked and flagged for inappropriate and malicious content, including spam. Understanding these guidelines helps you keep your own pages from being flagged as spam.

What is Webspam?

According to Google, “Webspam is the term for webpages that are designed by webmasters to trick search engines and draw users to their websites.” Google checks for three things: technical signals of Black Hat SEO practices used to inflate page ranking, spam webpages with little value, and pages created purely for commercial intent. Google also provides criteria describing what good pages contain.

Google states, “Pages that are merely annoying, junky, or low quality, such as pages with lots of pop-ups or ads, are not necessarily spam.” The page quality rating is not affected by the spam flag. It is possible for pages to be graded as “relevant” to a query and also flagged for spam.

Technical Signals

Hidden Text and Links are not visible to users and exist only to influence how the page is evaluated when it is crawled. Text and links that match the background color, use a tiny font size (like 1pt), or are shifted outside the normal viewing area are typically flagged as hidden. Hidden text is usually used to facilitate keyword stuffing.
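
To make the three hidden-text signals concrete, here is a minimal Python sketch that inspects an inline CSS style string. It is only an illustration, not how Google detects hidden text: the function names, the 4pt font cutoff, and the -1000px off-screen cutoff are all assumptions chosen for the example.

```python
import re

def parse_inline_style(style: str) -> dict:
    """Turn 'color: #fff; font-size: 1pt' into {'color': '#fff', 'font-size': '1pt'}."""
    declarations = {}
    for part in style.split(";"):
        if ":" in part:
            name, value = part.split(":", 1)
            declarations[name.strip().lower()] = value.strip().lower()
    return declarations

def hidden_text_signals(style: str, background: str = "#ffffff",
                        min_font_pt: float = 4.0,
                        off_screen_px: float = -1000.0) -> list:
    """Return the hidden-text signals found in one element's inline style.

    The thresholds are hypothetical values picked for this sketch.
    """
    css = parse_inline_style(style)
    signals = []

    # Signal 1: text is the same color as the page background.
    if css.get("color") == background.lower():
        signals.append("text color matches background")

    # Signal 2: font size too small to read (only 'pt' units handled here).
    size = re.fullmatch(r"(\d+(?:\.\d+)?)pt", css.get("font-size", ""))
    if size and float(size.group(1)) < min_font_pt:
        signals.append("tiny font size")

    # Signal 3: element pushed far outside the normal viewing area.
    for prop in ("left", "top", "text-indent"):
        offset = re.fullmatch(r"(-\d+(?:\.\d+)?)px", css.get(prop, ""))
        if offset and float(offset.group(1)) <= off_screen_px:
            signals.append(f"shifted off-screen via {prop}")

    return signals
```

For example, `hidden_text_signals("color: #FFFFFF; font-size: 1pt")` reports both the color match and the tiny font, while an ordinary style such as `"color: #333; font-size: 12pt"` returns an empty list.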

Keyword Stuffing is the use of an excessive number of keywords on a page to draw search traffic. Using keywords that are irrelevant to the page content is also considered stuffing. For example, using keywords like “mortgages,” “cell phones” and “gambling” on an automotive page is spam. Page content and URLs that appear to contain automatically generated text are also flagged.
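
One rough way to picture "an excessive number of keywords" is keyword density: the share of a page's words taken up by a given keyword. The sketch below is an assumption-laden illustration, not a rule from Google's guidelines. The function names are hypothetical, the 5% cutoff is arbitrary, and only single-word keywords are handled.

```python
import re
from collections import Counter

def keyword_density(text: str, keywords: list) -> dict:
    """Return each keyword's share of all words in the text (0.0 to 1.0)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return {kw: counts[kw.lower()] / total for kw in keywords}

def looks_stuffed(text: str, keywords: list, max_density: float = 0.05) -> bool:
    """Flag the text if any keyword exceeds the (hypothetical) density cutoff."""
    return any(d > max_density for d in keyword_density(text, keywords).values())
```

A line such as "Cheap mortgages, cheap mortgages, best mortgages: apply for mortgages now" gives "mortgages" a density of 0.4, far above the cutoff, so `looks_stuffed` returns `True` for it.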

Sneaky Redirects send the user to a URL on a different domain, with the intent of delivering spam. This includes pages that redirect to irrelevant content, pass the user through several URLs before reaching the landing page, and/or redirect to a different URL on every visit to the original page.
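
The redirect patterns described here can be sketched as checks on a redirect chain, meaning the list of URLs a browser passed through from the requested page to the landing page. This is a simplified illustration under stated assumptions: the function names and the two-hop limit are hypothetical, and hosts are compared exactly, so `www.example.com` and `example.com` count as different.

```python
from urllib.parse import urlsplit

def redirect_signals(chain: list, max_hops: int = 2) -> list:
    """Inspect one redirect chain (requested URL first, landing page last)."""
    signals = []
    if len(chain) < 2:
        return signals  # no redirect happened

    # Signal: the landing page lives on a different host than the requested page.
    start_host = urlsplit(chain[0]).hostname
    final_host = urlsplit(chain[-1]).hostname
    if start_host != final_host:
        signals.append("lands on a different domain")

    # Signal: the user is bounced through several URLs on the way.
    hops = len(chain) - 1
    if hops > max_hops:
        signals.append(f"passes through {hops} redirects")

    return signals

def varies_between_visits(landing_urls: list) -> bool:
    """True if repeated visits to the same page ended on different URLs."""
    return len(set(landing_urls)) > 1
```

A single redirect that stays on the same host produces no signals, while a chain of three hops ending on another host produces both.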

Cloaking is the practice of showing different content to users than to search engines. This can be done by serving content to users through JavaScript to hide it from the search engines, or by playing with frames to hide content from users.
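
Since cloaking comes down to two audiences seeing different content, one simple way to illustrate it is to compare the text a user sees with the text a crawler sees. The sketch below assumes you have already captured both versions as plain text. The function names and the 0.5 similarity cutoff are assumptions for the example, and word-level comparison with Python's `difflib` is just one possible measure.

```python
from difflib import SequenceMatcher

def content_similarity(user_view: str, crawler_view: str) -> float:
    """Word-level similarity between the two versions, from 0.0 to 1.0."""
    return SequenceMatcher(None, user_view.split(), crawler_view.split()).ratio()

def looks_cloaked(user_view: str, crawler_view: str,
                  min_similarity: float = 0.5) -> bool:
    """Flag the page if the two versions share too little text.

    The cutoff is a hypothetical value; legitimate pages can also differ
    slightly between visits (ads, timestamps, personalization).
    """
    return content_similarity(user_view, crawler_view) < min_similarity
```

Two identical versions score 1.0 and are not flagged. Versions with no words in common score 0.0 and are flagged.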

Spam Webpages

Pages with PPC ads that were obviously created only to drive ad revenue are considered spam. This includes pages that use scraped or copied content to appear relevant. Some examples of these pages are wikis, blogs, message boards and search pages that are fake or copied from another source. Legitimate blogs and message boards are not typically flagged for user-generated spam in comments and posts.

Commercial Intent

Thin affiliates are spammers who use sites with very little useful or original information to earn commissions. Affiliate sites that provide relevant content and offer useful features such as shopping carts, return policies and functional forums are not considered spam.

Pure PPC pages contain only ads or links to other spam pages. Fake directories are also in this category.

Parked (expired) domains are domains that once belonged to legitimate companies and were purchased by spammers after expiring, in order to leverage the existing name and link value. The spammers use the pre-existing links to drive traffic to their spam site. These pages typically consist entirely of paid links and have no relevance to the original domain name. Registering a domain with a similar spelling to a legitimate domain to drive revenue from accidental visitors is also considered spam.
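
The "similar spelling" case can be illustrated with edit distance, which counts how many single-character insertions, deletions or substitutions separate two names. This is a minimal sketch, not a description of how Google identifies such domains. The function names and the cutoff of two edits are assumptions for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]

def is_lookalike(candidate: str, legitimate: str, max_distance: int = 2) -> bool:
    """True if the candidate is close to, but not the same as, the legitimate name."""
    distance = edit_distance(candidate.lower(), legitimate.lower())
    return 0 < distance <= max_distance
```

With these settings, `is_lookalike("exampel.com", "example.com")` returns `True` because the two names are two edits apart, while an exact match or an unrelated name returns `False`.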

Good Pages

Good pages are organized well, contain ads that are clearly identified and not distracting, and provide value to the user. Google provides the following criteria for determining good pages. If you can answer “yes” to any of the following, the page is probably not spam:

As mentioned before, Google’s page ranking process is not completely transparent. However, these guidelines provide brands with valuable information to ensure their website pages align with Google’s recommendations and requirements.