Review: Google Patent on Web Spam

Author: Bill Slawski
Year: –
Published in: Available on author’s blog.
Importance: Medium

My Review

According to this blog article, Google may cluster pages in order to find spam and manipulative documents. A cluster contains both interlinked pages and doorway pages. To determine whether a page is manipulative, Google considers both local and non-local pages, and it grows the cluster as it finds more doorway pages. Once the cluster is built, the manipulation signals of each internal and external document are counted. These signals are not spelled out clearly, but some of them include:

  1. The text of the document (repeated, long, …)
  2. Meta tags (repeated, long, …)
  3. Redirects (scripts that redirect the page to another page)
  4. Text colored similarly to the background
  5. History of the document (e.g. a new owner)
  6. Anchor text (more links than text)
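
The cluster-growing step described above can be sketched roughly as follows. This is only an illustration of the idea as summarized in this review, not the method from the patent: the link graph, the page names, and the `looks_like_doorway` check are all hypothetical.

```python
from collections import deque

def grow_cluster(seed, links, looks_like_doorway):
    """Grow a cluster from a seed page by following links to doorway-like pages.

    `links` maps a page to the pages it links to (local or non-local).
    `looks_like_doorway` is a hypothetical predicate; how doorway pages
    are really detected is not stated in the reviewed article.
    """
    cluster = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for neighbor in links.get(page, ()):
            # The cluster only grows when another doorway-like page is found.
            if neighbor not in cluster and looks_like_doorway(neighbor):
                cluster.add(neighbor)
                queue.append(neighbor)
    return cluster

# Toy link graph with made-up page names spanning several hosts.
links = {
    "a.example/home": ["a.example/door1", "b.example/door2", "news.example/story"],
    "a.example/door1": ["a.example/home", "b.example/door2"],
    "b.example/door2": ["c.example/door3"],
    "c.example/door3": [],
    "news.example/story": [],
}
doorways = {"a.example/door1", "b.example/door2", "c.example/door3"}

cluster = grow_cluster("a.example/home", links, lambda page: page in doorways)
```

In this toy graph the cluster ends up spanning three hosts, while the ordinary news page is left out because it does not look like a doorway page.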

To sum up, all of these signals are counted and combined into an overall signal. Based on a threshold (as discussed in the article: ), if a page is marked as manipulative, the following actions may be taken:

  • Lowering the page's rank
  • Removing the page entirely
  • Other penalties and treatments
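
As a rough sketch of how per-page signals might be combined and compared against a threshold: the signal names, the equal weighting, and the threshold value of 0.5 below are all assumptions made for illustration, since the article does not give concrete values.

```python
# Hypothetical signal names, loosely following the list in this review.
SIGNAL_NAMES = (
    "repeated_text",
    "repeated_meta_tags",
    "script_redirect",
    "hidden_text",
    "ownership_change",
    "link_heavy_anchor_text",
)

# Assumed threshold; the real value is not given in the article.
THRESHOLD = 0.5

def overall_signal(signals):
    """Average the individual signals (each in [0, 1]) into one score.

    Equal weighting is an assumption; missing signals count as 0.
    """
    return sum(signals.get(name, 0.0) for name in SIGNAL_NAMES) / len(SIGNAL_NAMES)

def is_manipulative(signals, threshold=THRESHOLD):
    """Mark a page as manipulative when its overall signal exceeds the threshold."""
    return overall_signal(signals) > threshold

# Two made-up pages: one trips four of the six signals, the other none.
spammy_page = {
    "repeated_text": 1.0,
    "repeated_meta_tags": 1.0,
    "script_redirect": 1.0,
    "link_heavy_anchor_text": 1.0,
}
clean_page = {}

spammy_flag = is_manipulative(spammy_page)
clean_flag = is_manipulative(clean_page)
```

A page flagged this way would then be a candidate for one of the actions listed above; which action applies in which case is not specified in the article.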

