Removing web spam links from search engine results

Egele, Manuel;Kruegel, Christopher;Kirda, Engin
EICAR 2009, 18th European Institute for Computer Antivirus Research Annual Conference, May 11-12, 2009, Berlin, Germany

Web spam denotes the manipulation of web pages with the sole intent to raise their position in search engine rankings. Since a better position in the rankings directly and positively affects the number of visits to a site, attackers use different techniques to boost their pages to higher ranks. In the best case, web spam pages are a nuisance that provide undeserved advertisement revenues to the page owners. In the worst case, these pages pose a threat to Internet users by hosting malicious content and launching drive-by attacks against unsuspecting victims. When successful, these drive-by attacks then install malware on the victims' machines.

In this paper, we introduce an approach to detect web spam pages in the list of results that are returned by a search engine. In a first step, we determine the importance of different page features to the ranking in search engine results. Based on this information, we develop a classification technique that uses important features to successfully distinguish spam sites from legitimate entries. By removing spam sites from the results, more slots are available to links that point to pages with useful content. Additionally, and more importantly, the threat posed by malicious web sites can be mitigated, reducing the risk for users to get infected by malicious code that spreads via drive-by attacks.
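
The filtering idea in the abstract (classify each returned entry by its page features, then drop the entries flagged as spam) can be illustrated with a minimal sketch. This is not the authors' classifier: the abstract does not name the features or the model, so the `Result` type, the two features (`keyword_density`, `inbound_links`), and the thresholds below are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    keyword_density: float  # hypothetical feature: share of page words matching the query
    inbound_links: int      # hypothetical feature: number of links pointing at the page


def is_spam(result, max_density=0.2, min_links=5):
    """Toy rule (an assumption, not from the paper): flag keyword-stuffed
    pages that very few other pages link to."""
    return result.keyword_density > max_density and result.inbound_links < min_links


def filter_results(results):
    """Keep the original ranking order and drop entries flagged as spam."""
    return [r for r in results if not is_spam(r)]


results = [
    Result("a.example", 0.05, 120),
    Result("b.example", 0.45, 1),
    Result("c.example", 0.30, 40),
]
clean = filter_results(results)
print([r.url for r in clean])
```

In a real system the hand-written rule would be replaced by a classifier trained on labeled spam and legitimate pages, using whichever features turn out to matter for ranking.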


Type: Conference
City: Berlin
Date: 2009-05-11
Department: Digital Security
Eurecom Ref: 2779
Copyright: EICAR

PERMALINK : https://www.eurecom.fr/publication/2779