Abstract
In this paper we describe a classification method that allows the use of graph-based representations of data instead of traditional vector-based representations. We compare the vector approach combined with the k-Nearest Neighbor (k-NN) algorithm to the graph-matching approach when classifying three different web document collections, using the leave-one-out approach for measuring classification accuracy. We also compare the performance of different graph distance measures as well as various document representations that utilize graphs. The results show the graph-based approach can outperform traditional vector-based methods in terms of accuracy, dimensionality and execution time.
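As a rough illustration of the pipeline the abstract outlines (graph representations, a graph distance, k-NN, and leave-one-out evaluation), here is a minimal sketch. It rests on several assumptions that the abstract does not state: documents are turned into graphs whose nodes are unique terms and whose directed edges join adjacent terms, and the distance is a maximum-common-subgraph measure, which reduces to set intersection when node labels are unique. The helper names (`make_graph`, `mcs_distance`, `knn_predict`, `leave_one_out_accuracy`) and the toy documents are hypothetical and are not taken from the paper.

```python
from collections import Counter


def make_graph(words):
    """Build a toy document graph (assumed representation, not the paper's):
    nodes are unique terms, directed edges join adjacent terms."""
    nodes = frozenset(words)
    edges = frozenset((a, b) for a, b in zip(words, words[1:]) if a != b)
    return nodes, edges


def graph_size(graph):
    """Size of a graph, taken here as node count plus edge count."""
    nodes, edges = graph
    return len(nodes) + len(edges)


def mcs_distance(g1, g2):
    """Distance based on the maximum common subgraph (assumed measure).
    With unique node labels, the common subgraph is the intersection
    of the two node sets and the two edge sets."""
    common = len(g1[0] & g2[0]) + len(g1[1] & g2[1])
    denom = max(graph_size(g1), graph_size(g2))
    return 1.0 - common / denom if denom else 0.0


def knn_predict(query, training, k=1):
    """Majority vote among the k training graphs closest to the query."""
    ranked = sorted(training, key=lambda item: mcs_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(dataset, k=1):
    """Classify each graph using all remaining graphs as training data."""
    correct = 0
    for i, (graph, label) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]
        if knn_predict(graph, rest, k) == label:
            correct += 1
    return correct / len(dataset)


# Hypothetical toy collection, for illustration only.
docs = [
    ("graph matching distance".split(), "cs"),
    ("graph matching algorithm".split(), "cs"),
    ("stock market price".split(), "finance"),
    ("stock market crash".split(), "finance"),
]
dataset = [(make_graph(words), label) for words, label in docs]
accuracy = leave_one_out_accuracy(dataset, k=1)
```

Because the distance depends only on set intersections, no fixed-length term vector is needed; whether this matches the representations and distance measures actually compared in the paper cannot be determined from the abstract alone.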
Original language | English |
---|---|
Pages (from-to) | 475-496 |
Number of pages | 22 |
Journal | International Journal of Pattern Recognition and Artificial Intelligence |
Volume | 18 |
Issue number | 3 |
State | Published - 1 May 2004 |
Keywords
- Document classification
- Graph matching
- Graph representation
- k-nearest neighbors algorithm
ASJC Scopus subject areas
- Software
- Computer Vision and Pattern Recognition
- Artificial Intelligence