The massive increase of spam is posing a very serious threat to email which has become an important means of communication. Not only does it annoy users, but it also consumes much of the bandwidth of the Internet. Most spam filters in existence are based on the content of email one way or the other. While these anti-spam tools have proven very useful, they do not prevent the bandwidth from being wasted and spammers are learning to bypass them via clever manipulation of the spam content. A very different approach to spam detection is based on the behavior of email senders. In this paper, we propose a learning approach to spam sender detection based on features extracted from social networks constructed from email exchange logs. Legitimacy scores are assigned to senders based on their likelihood of being a legitimate sender. Moreover, we also explore various spam filtering and resisting possibilities.
The term “social network” as used in this paper refers to networks built from email transaction logs. The transaction logs of an SMTP server, which contain the sender address, IP address, sender email client, …, are parsed offline to construct email social networks. I mention this because the term has a different meaning here than in its usual sense.
In this paper, the authors first mention two types of email spam:
- Unsolicited commercial email (UCE) – emails sent without the recipient’s prior consent.
- Unsolicited bulk email (UBE) – emails which distribute viruses and spyware.
Email spam detection is based on two approaches:
- Spam text detection
- Whitelist and blacklist
Their suggested method belongs to the whitelist and blacklist approach. They provide a learning method for creating better blacklists and whitelists.
The detection method is based on 7 features. Each feature is counted, normalized and weighted, and then compared with valid feature data. By valid feature data I mean data that has already been classified as spam or non-spam. By comparing the similarity between these features, a sender can be classified as a spam or non-spam sender.
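To make the normalize, weight and compare steps concrete, here is a minimal sketch. It assumes min-max normalization, a weighted Euclidean distance and a k-nearest-neighbour vote; the review does not confirm that these are the paper's exact choices. The function names, the three toy features (the paper uses 7), the weights and the data are all hypothetical.

```python
import math

def normalize(rows):
    """Min-max scale every feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def weighted_distance(a, b, weights):
    """Euclidean distance in which each feature has its own weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def legitimacy_score(query, labeled, weights, k=3):
    """Fraction of the k most similar labeled senders that are legitimate."""
    nearest = sorted(
        labeled, key=lambda item: weighted_distance(query, item[0], weights)
    )[:k]
    return sum(1 for _, is_legit in nearest if is_legit) / len(nearest)

# Hypothetical feature vectors: [emails sent, reciprocity, clustering].
raw = [
    [120.0, 0.02, 0.00],  # known spam sender
    [300.0, 0.00, 0.01],  # known spam sender
    [15.0, 0.60, 0.40],   # known legitimate sender
    [8.0, 0.75, 0.30],    # known legitimate sender
    [200.0, 0.01, 0.00],  # unknown sender to be scored
]
labels = [False, False, True, True]  # True means legitimate

scaled = normalize(raw)
labeled = list(zip(scaled[:4], labels))
score = legitimacy_score(scaled[4], labeled, weights=[1.0, 2.0, 2.0], k=3)
print(f"legitimacy score: {score:.2f}")
```

A score near 0 would push the sender towards the blacklist and a score near 1 towards the whitelist; where to place the thresholds is a policy decision that this sketch leaves open.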
One drawback of this method is that some websites which send mass emails to users (such as mycareer, ebay, …) may be classified as spam senders. So there should be some other policies for these legitimate senders.
The features used by the method include:
- Communication Reciprocity
- Communication Interaction Average
- Clustering Coefficient
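As an illustration of how such features could be derived from an email log, the sketch below computes two of them. The definitions are assumptions, since this review does not spell them out: reciprocity is taken as the fraction of a sender's recipients who have also written back, and the clustering coefficient is the standard graph measure (the fraction of pairs of a node's neighbours that are themselves connected), with edge direction ignored. Communication Interaction Average is left out because its definition is not given here. The log format, the function names and the addresses are hypothetical.

```python
from itertools import combinations

def build_graph(log):
    """Map every address to the set of addresses it has sent mail to."""
    out_edges = {}
    for sender, recipient in log:
        out_edges.setdefault(sender, set()).add(recipient)
        out_edges.setdefault(recipient, set())
    return out_edges

def reciprocity(out_edges, node):
    """Assumed definition: share of node's recipients who wrote back."""
    recipients = out_edges.get(node, set())
    if not recipients:
        return 0.0
    replied = sum(1 for r in recipients if node in out_edges.get(r, set()))
    return replied / len(recipients)

def clustering_coefficient(out_edges, node):
    """Share of pairs of node's neighbours that are linked to each other."""
    neighbours = set(out_edges.get(node, set()))
    neighbours |= {n for n, outs in out_edges.items() if node in outs}
    neighbours.discard(node)
    if len(neighbours) < 2:
        return 0.0

    def linked(a, b):
        return b in out_edges.get(a, set()) or a in out_edges.get(b, set())

    links = sum(1 for a, b in combinations(sorted(neighbours), 2) if linked(a, b))
    possible = len(neighbours) * (len(neighbours) - 1) / 2
    return links / possible

# Hypothetical (sender, recipient) pairs parsed from a transaction log.
log = [
    ("alice", "bob"), ("bob", "alice"),
    ("alice", "carol"), ("carol", "alice"),
    ("bob", "carol"),
    ("spammer", "dave"), ("spammer", "erin"), ("spammer", "frank"),
]
graph = build_graph(log)
for sender in ("alice", "bob", "spammer"):
    print(sender, reciprocity(graph, sender), clustering_coefficient(graph, sender))
```

In this toy log the spam-like sender writes to unrelated addresses that never reply, so both values come out at zero, while the ordinary users score higher on both.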
Cite this article as
Critical review on “A Learning Approach to Spam Detection based on Social Networks” by P.Hayati. 8th Mar 2008. Available online: https://pi3ch.wordpress.com/2008/03/07/review-a-learning-approach-to-spam-detection-based-on-social-networks/