Scalable Detection of Server-Side Polymorphic Malware

Yehonatan Cohen, Danny Hendler

Research output: Contribution to journal › Article › peer-review

9 Scopus citations


Server-side polymorphism is used by malware distributors in order to evade detection by anti-virus (AV) scanners. It is difficult for traditional AVs to detect this type of malware because the transformation code is not visible for security analysis. Using a tera-scale dataset consisting of antivirus telemetry reports pertaining to more than half a billion files, we conduct what is, to the best of our knowledge, the most wide-scale analysis of the properties of web-borne polymorphic malware done to date. We cluster the file population based on locality-sensitive hash (LSH) values and analyze the resulting LSH clusters. Using ground truth labels, we identify benign and malicious clusters and analyze the differences between them in terms of the distributions of cluster size, file download numbers and activity period, and in terms of their web domain utilization patterns. The results of this analysis are then leveraged for devising SPADE, a scalable Server-side Polymorphic mAlware DEtector that provides high-quality detection of both malicious files and malicious web domains.
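The per-cluster properties named in the abstract (cluster size, download counts, activity period, and domain utilization) can be illustrated with a minimal Python sketch. This is not the SPADE implementation: the `Report` record, its field names, and the choice to place files with an identical LSH value in the same cluster are all assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Report:
    """One hypothetical telemetry record; all field names are assumptions."""
    file_id: str  # identifier of the downloaded file
    lsh: str      # locality-sensitive hash value of the file
    domain: str   # web domain the file was downloaded from
    day: int      # day index on which the download was reported


def cluster_stats(reports):
    """Group reports by LSH value (assumed clustering rule) and summarize each cluster."""
    clusters = defaultdict(list)
    for r in reports:
        clusters[r.lsh].append(r)

    stats = {}
    for lsh, rs in clusters.items():
        days = [r.day for r in rs]
        stats[lsh] = {
            "size": len({r.file_id for r in rs}),        # distinct files in the cluster
            "downloads": len(rs),                         # total download reports
            "activity_days": max(days) - min(days) + 1,   # span from first to last report
            "domains": len({r.domain for r in rs}),       # distinct hosting domains
        }
    return stats


# Toy data, invented purely to exercise the function.
reports = [
    Report("a1", "h1", "x.example", 1),
    Report("a2", "h1", "x.example", 2),
    Report("a3", "h1", "y.example", 4),
    Report("b1", "h2", "z.example", 3),
    Report("b1", "h2", "z.example", 3),
]
stats = cluster_stats(reports)
```

In this toy input, cluster `h1` holds three distinct files seen once each across two domains, while `h2` holds a single file reported twice from one domain. Comparing such summaries between labeled benign and malicious clusters is the kind of analysis the abstract describes.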

Original language: English
Pages (from-to): 113-128
Number of pages: 16
Journal: Knowledge-Based Systems
State: Published - 15 Sep 2018


Keywords

  • Locality-Sensitive Hashing
  • Malware Detection
  • Server-Side Polymorphism

ASJC Scopus subject areas

  • Software
  • Management Information Systems
  • Information Systems and Management
  • Artificial Intelligence

