Review: Fighting Spam on Social Web Sites

May 9, 2008
Authors: Paul Heymann, Georgia Koutrika, and Hector Garcia-Molina
Year: 2007
Published in: IEEE Computer Society
Importance: High


In recent years, social Web sites have become important components of the
Web. With their success, however, has come a growing influx of spam. If left
unchecked, spam threatens to undermine resource sharing, interactivity, and
openness. This article surveys three categories of potential countermeasures —
those based on detection, demotion, and prevention. Although many of these
countermeasures have been proposed before for email and Web spam, the
authors find that their applicability to social Web sites differs. How should we
evaluate spam countermeasures for social Web sites, and what future challenges
might we face?

My Review

This is one of the most beautiful papers I have read recently. The authors do a very good job of classifying the current literature in the web spam field, and they try to demonstrate how each anti-spam method works on their example of a social community website (social bookmarking).

More dynamic content means more avenues for spamming!

3 main anti-spam strategies:

  1. Detection-based: text classification, link analysis, user behavior analysis, …
  2. Prevention-based: CAPTCHA, account fees, proof of work, …
  3. Demotion-based: spam-hardened queries, rank-based methods, …
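
To make the "proof of work" idea from the prevention-based group concrete, here is a minimal hashcash-style sketch. It is not taken from the paper; the challenge string format, the use of SHA-256, and the function names `solve` and `verify` are all assumptions made for illustration. The idea is that a client must spend some CPU time before posting, which is cheap for a normal user but costly for a spammer posting in bulk.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the number of zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # bit_length() gives the position of the highest set bit.
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Search for a nonce whose hash has at least `difficulty` leading zero bits.

    `challenge` is a hypothetical server-issued string (for example, one
    tied to the user and the post being submitted).
    """
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Check a submitted nonce with a single hash computation."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty
```

Solving takes about 2^difficulty hash attempts on average, while verifying takes one, so the cost falls almost entirely on whoever submits the content.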

It is well understood that little work has been done on strategies 2 and 3 (prevention and demotion).

2 ingredients for method evaluation:

  1. Spam model: captures whether content is spam or not
  2. Spam metric: provides a quantitative assessment of how spam affects a particular interface.
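
As a rough illustration of how the two ingredients fit together, the sketch below computes one possible metric: the fraction of spam among the top k results shown by an interface. This is an assumed example, not the metric defined in the paper; the function name `spam_fraction_at_k` and the `is_spam` callback (which plays the role of the spam model) are hypothetical.

```python
from typing import Callable, Sequence


def spam_fraction_at_k(
    results: Sequence[str],
    is_spam: Callable[[str], bool],
    k: int,
) -> float:
    """Fraction of the top-k results that the spam model labels as spam.

    `results` is the ranked list an interface would show to the user;
    `is_spam` stands in for the spam model.
    """
    top = results[:k]
    if not top:
        return 0.0
    return sum(1 for item in top if is_spam(item)) / len(top)


# Hypothetical ranked results for one tag query, with a fixed label set.
ranked = ["good-1", "spam-1", "good-2", "spam-2"]
spam_labels = {"spam-1", "spam-2"}
score = spam_fraction_at_k(ranked, spam_labels.__contains__, k=2)
```

A countermeasure can then be judged by how much it lowers this value for the same interface and the same spam model.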

2 spam models:

  1. Synthetic spam model: makes assumptions and defines malicious behavior
  2. Trace-driven spam model: based on real data and positive/negative examples of spam content

Since the content of social community websites is updated very rapidly, the authors used a synthetic model to develop their detection model.

All in all, the value of their work lies in its classification of existing approaches rather than in proposing a new method.