Improving malicious email detection through novel designated deep-learning architectures utilizing entire email

Trivikram Muralidharan, Nir Nissim

Research output: Contribution to journal › Article › peer-review

Abstract

In today's email-dependent world, cybercriminals often target organizations using a variety of social engineering techniques and specially crafted malicious emails. When successful, such attacks can result in significant harm to physical and digital systems and assets, leakage of sensitive information, reputational damage, and financial loss. Despite the plethora of studies on the detection of phishing attacks and malicious links in emails, there are no solutions capable of coping effectively, quickly, and accurately with more complex email-based attacks, such as malicious email attachments. This paper presents the first fully automated malicious email detection framework that uses deep ensemble learning to analyze all email segments (body, header, and attachments), eliminating the need for human expert intervention in feature engineering. We also demonstrate that an ensemble of deep learning classifiers, each trained on a specific portion of an email (so that together they utilize the entire email), can generalize better than popular email analysis methods that analyze only one portion of the email. In a comprehensive evaluation, the proposed framework achieves an AUC of 0.993 and surpasses state-of-the-art malicious email detection methods, including machine learning models based on human expert features, by 5% in TPR.
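The segment-wise ensemble idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the per-segment classifiers are built or combined, so the function names `split_email` and `ensemble_score`, the placeholder scorers, and the simple score averaging are all assumptions made for illustration. Only Python's standard `email` package is used to separate header, body, and attachments.

```python
from email import message_from_string
from typing import Callable, Dict


def split_email(raw: str) -> Dict[str, object]:
    """Split a raw RFC 822 message into header, body, and attachments."""
    msg = message_from_string(raw)
    header = "\n".join(f"{key}: {value}" for key, value in msg.items())
    body_parts = []
    attachments = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no payload of their own
        payload = part.get_payload(decode=True) or b""
        if part.get_filename():
            # Any part carrying a filename is treated as an attachment.
            attachments.append(payload)
        elif part.get_content_type() == "text/plain":
            charset = part.get_content_charset() or "utf-8"
            body_parts.append(payload.decode(charset, errors="replace"))
    return {
        "header": header,
        "body": "\n".join(body_parts),
        "attachments": attachments,
    }


def ensemble_score(
    segments: Dict[str, object],
    scorers: Dict[str, Callable[[object], float]],
) -> float:
    """Average per-segment maliciousness scores (assumed combination rule)."""
    scores = [scorer(segments[name]) for name, scorer in scorers.items()]
    return sum(scores) / len(scores)


# Hypothetical stand-ins for trained per-segment deep learning classifiers.
toy_scorers = {
    "header": lambda header: 0.25,
    "body": lambda body: 0.5,
    "attachments": lambda files: 0.75,
}

raw_message = "From: a@example.com\nSubject: hi\n\nHello world\n"
segments = split_email(raw_message)
score = ensemble_score(segments, toy_scorers)
```

In this sketch each scorer sees only its own segment, so a weak signal in one segment (for example, a benign-looking body) can be offset by a strong signal in another (for example, a suspicious attachment). A real system would replace the placeholder lambdas with trained models and might combine their outputs differently.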

Original language: English
Pages (from-to): 257-279
Number of pages: 23
Journal: Neural Networks
Volume: 157
State: Published - 1 Jan 2023

Keywords

  • Analysis
  • Deep learning
  • Detection
  • Email
  • Malware
  • Phishing

ASJC Scopus subject areas

  • Cognitive Neuroscience
  • Artificial Intelligence
