XML-AD: Detecting anomalous patterns in XML documents

Eitan Menahem, Alon Schclar, Lior Rokach, Yuval Elovici

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Many information systems use XML documents to store data and to interact with other systems. Abnormal documents, which can be the result of either an ongoing cyber attack or the actions of a benign user, can potentially harm the interacting systems and are therefore regarded as a threat. In this paper we address the problem of anomaly detection and localization in XML documents using machine learning techniques. We present XML-AD, a new XML anomaly detection framework. Within this framework, an automatic method for extracting features from XML documents and a practical method for transforming XML features into vectors of fixed dimensionality were developed. With these two methods in place, the XML-AD framework makes it possible to utilize general learning algorithms for anomaly detection. The core of the framework consists of a novel multi-univariate anomaly detection algorithm, ADIFA. The framework was evaluated using four XML document datasets obtained from real information systems. It achieved a true positive detection rate of over 89% with fewer than 0.2% false positives.
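
The pipeline outlined in the abstract (extract features from XML, map them to fixed-length vectors, then score each feature on its own) can be illustrated with a minimal sketch. This is not the XML-AD or ADIFA implementation, whose details the abstract does not give. The feature choice (text length per element path), the per-feature z-score test, the threshold of 3.0, and all function names are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET
from statistics import mean, pstdev


def extract_features(xml_text):
    """Hypothetical feature extractor: element path -> length of its text."""
    root = ET.fromstring(xml_text)
    features = {}

    def walk(node, path):
        current = f"{path}/{node.tag}"
        text = (node.text or "").strip()
        # If a path repeats, keep the longest text seen (an arbitrary choice).
        features[current] = max(features.get(current, 0), len(text))
        for child in node:
            walk(child, current)

    walk(root, "")
    return features


def to_vector(features, schema):
    """Project a feature dict onto a fixed, ordered list of paths."""
    return [float(features.get(path, 0)) for path in schema]


def fit(vectors):
    """Learn one (mean, standard deviation) pair per vector position."""
    return [(mean(col), pstdev(col)) for col in zip(*vectors)]


def anomalous_features(vector, model, schema, threshold=3.0):
    """Return the paths whose value deviates strongly from the training data."""
    flagged = []
    for value, (mu, sigma), path in zip(vector, model, schema):
        scale = sigma if sigma > 0 else 1.0  # avoid dividing by zero
        if abs(value - mu) / scale > threshold:
            flagged.append(path)
    return flagged
```

In this sketch the schema would be the sorted set of element paths seen in the training documents, so every document maps to a vector of the same length. Because each feature is scored separately, the returned paths also indicate where in the document the anomaly lies.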

Original language: English
Pages (from-to): 71-88
Number of pages: 18
Journal: Information Sciences
Volume: 326
State: Published - 1 Jan 2016

Keywords

  • Anomaly-detection
  • Machine-learning
  • Outliers detection
  • XML anomaly Detection
  • XML security

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management
  • Artificial Intelligence
