TY - GEN
T1 - Multi-lingual detection of terrorist content on the Web
AU - Last, Mark
AU - Markov, Alex
AU - Kandel, Abraham
PY - 2006/7/14
Y1 - 2006/7/14
N2 - Since the web is increasingly used by terrorist organizations for propaganda, disinformation, and other purposes, the ability to automatically detect terrorist-related content in multiple languages can be extremely useful. In this paper we describe a new, classification-based approach to multi-lingual detection of terrorist documents. The proposed approach builds upon the recently developed graph-based web document representation model combined with the popular C4.5 decision-tree classification algorithm. Evaluation is performed on a collection of 648 web documents in Arabic. The results demonstrate that documents downloaded from several known terrorist sites can be reliably discriminated from the content of Arabic news reports using a simple decision tree.
AB - Since the web is increasingly used by terrorist organizations for propaganda, disinformation, and other purposes, the ability to automatically detect terrorist-related content in multiple languages can be extremely useful. In this paper we describe a new, classification-based approach to multi-lingual detection of terrorist documents. The proposed approach builds upon the recently developed graph-based web document representation model combined with the popular C4.5 decision-tree classification algorithm. Evaluation is performed on a collection of 648 web documents in Arabic. The results demonstrate that documents downloaded from several known terrorist sites can be reliably discriminated from the content of Arabic news reports using a simple decision tree.
UR - http://www.scopus.com/inward/record.url?scp=33745780530&partnerID=8YFLogxK
U2 - 10.1007/11734628_3
DO - 10.1007/11734628_3
M3 - Conference contribution
AN - SCOPUS:33745780530
SN - 3540333614
SN - 9783540333616
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 16
EP - 30
BT - Intelligence and Security Informatics - International Workshop, WISI 2006, Proceedings
T2 - International Workshop on Intelligence and Security Informatics, WISI 2006
Y2 - 9 April 2006 through 9 April 2006
ER -