Analyzing group E-mail exchange to detect data leakage

Research output: Contribution to journal › Article › peer-review

5 Scopus citations


Today's organizations spend a great deal of time and effort on e-mail leakage prevention. However, there are still no satisfactory solutions: addressing mistakes go undetected, and in some cases correct recipients are wrongly flagged as potential mistakes. In this article we present a new approach for preventing e-mail addressing mistakes in organizations. The approach is based on an analysis of e-mail exchanges among members of an organization and the identification of groups based on common topics. When a new e-mail is about to be sent, each recipient is analyzed. A recipient is approved if the e-mail's content belongs to at least one topic common to both the sender and the recipient. This check can be applied even if the sender and recipient have never communicated directly before. The new approach was evaluated using the Enron e-mail data set and was compared with a well-known method for the detection of e-mail addressing mistakes. The results show that the proposed approach is capable of detecting 87% of nonlegitimate recipients while incorrectly classifying only 0.5% of the legitimate recipients. These results outperform previous work, which reports a detection rate of 82% without reference to the false positive rate.
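The approval rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Topic` class, the keyword-overlap test in `email_topics`, the `min_overlap` threshold, and the sample names are all assumptions made for illustration. The article derives topic groups from an analysis of the e-mail exchanges themselves, whereas here the topics are simply given.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical topic: characteristic terms plus the users who exchanged e-mail on it."""
    name: str
    terms: set[str]
    members: set[str] = field(default_factory=set)


def email_topics(body: str, topics: list[Topic], min_overlap: int = 2) -> list[Topic]:
    """Assumed matching rule: a topic fits if it shares at least min_overlap words with the body."""
    words = set(body.lower().split())
    return [t for t in topics if len(words & t.terms) >= min_overlap]


def check_recipients(sender: str, recipients: list[str], body: str,
                     topics: list[Topic]) -> dict[str, bool]:
    """Approve a recipient if some topic of the e-mail includes both sender and recipient."""
    matched = email_topics(body, topics)
    return {
        r: any(sender in t.members and r in t.members for t in matched)
        for r in recipients
    }


# Illustrative data only; names and terms are invented.
topics = [
    Topic("gas trading", {"gas", "pipeline", "trading", "contract"}, {"alice", "bob", "carol"}),
    Topic("legal", {"lawsuit", "counsel", "contract", "filing"}, {"alice", "dave"}),
]
body = "Please review the gas pipeline contract before Friday"
decisions = check_recipients("alice", ["bob", "carol", "dave"], body, topics)
print(decisions)
```

In this toy data, `carol` is approved because she shares the matched topic with the sender, regardless of whether the two have ever exchanged mail directly, while `dave` is flagged because the only topic he shares with the sender does not match the message.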

Original language: English
Pages (from-to): 1780-1790
Number of pages: 11
Journal: Journal of the American Society for Information Science and Technology
Issue number: 9
State: Published - 1 Sep 2013


  • automatic classification
  • content analysis
  • e-mail

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Human-Computer Interaction
  • Computer Networks and Communications
  • Artificial Intelligence

