Abstract
We describe and evaluate an information-theoretic algorithm for the data-driven induction of classification models based on a minimal subset of the available features. The relationship between the input (predictive) features and the target (classification) attribute is modeled by a tree-like structure termed an information network (IN). Unlike other decision-tree models, the information network uses the same input attribute across all nodes of a given layer (level). The algorithm selects input attributes incrementally so as to maximize the global decrease in the conditional entropy of the target attribute. A pre-pruning approach is used: when no candidate attribute yields a statistically significant decrease in the entropy, network construction stops. Empirically, the algorithm is shown to produce much more compact models than other decision-tree learning methods while preserving nearly the same level of classification accuracy.
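The layer-wise construction described in the abstract can be sketched in a few lines: at each step, every node of the current layer is split on the same candidate attribute, the attribute maximizing the resulting drop in conditional entropy (equivalently, the conditional mutual information with the target) is selected, and a likelihood-ratio test supplies the pre-pruning stopping rule. The following is a minimal illustrative sketch, not the authors' implementation; in particular, the pooled degrees-of-freedom computation is a simplification of per-node significance testing, and the function names (`cond_entropy`, `build_info_network`) are ours.

```python
import numpy as np
from scipy.stats import chi2

def cond_entropy(y, groups):
    """H(Y | G) in nats, where `groups` assigns each record to a node."""
    n = len(y)
    h = 0.0
    for g in np.unique(groups):
        yg = y[groups == g]
        _, counts = np.unique(yg, return_counts=True)
        p = counts / counts.sum()
        h += (len(yg) / n) * -(p * np.log(p)).sum()
    return h

def build_info_network(X, y, alpha=0.05):
    """Greedy layer-wise attribute selection with entropy-based pre-pruning.

    X: (n, m) integer-coded feature matrix; y: (n,) integer-coded target.
    Returns the ordered list of selected feature indices (one per layer).
    """
    n, m = X.shape
    selected = []
    groups = np.zeros(n, dtype=int)            # all records start in the root node
    h_current = cond_entropy(y, groups)        # H(Y) at the root
    remaining = set(range(m))
    while remaining:
        best, best_gain, best_groups = None, 0.0, None
        for j in remaining:
            # split every node of the current layer on the same attribute j
            cand = groups * (X[:, j].max() + 1) + X[:, j]
            gain = h_current - cond_entropy(y, cand)   # I(A_j; Y | layer)
            if gain > best_gain:
                best, best_gain, best_groups = j, gain, cand
        if best is None:
            break
        # likelihood-ratio (G) statistic: 2 * N * I (in nats) ~ chi-square;
        # the pooled df below is a simplification of per-node testing
        g_stat = 2.0 * n * best_gain
        df = (len(np.unique(X[:, best])) - 1) * (len(np.unique(y)) - 1) \
             * len(np.unique(groups))
        if g_stat <= chi2.ppf(1 - alpha, df):
            break                              # pre-pruning: gain not significant
        selected.append(best)
        remaining.remove(best)
        groups = best_groups
        h_current -= best_gain
    return selected
```

Applied to an integer-coded dataset, `build_info_network(X, y)` returns the ordered attribute subset; the successive `groups` encodings play the role of the network's layers, and the loop halts exactly when no attribute passes the significance test.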
Original language | English
---|---
Pages (from-to) | 203-215
Number of pages | 13
Journal | IEEE Transactions on Knowledge and Data Engineering
Volume | 16
Issue number | 2
DOIs |
State | Published - 1 Feb 2004
Keywords
- Classification
- Data mining
- Decision trees
- Dimensionality reduction
- Feature selection
- Information theoretic network
- Information theory
- Knowledge discovery in databases
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Computational Theory and Mathematics