ConfDTree: Improving decision trees using confidence intervals

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Scopus citations

Abstract

Decision trees have three main disadvantages: reduced performance when the training set is small, rigid decision criteria, and the fact that a single "uncharacteristic" attribute might "derail" the classification process. In this paper we present ConfDTree, a post-processing method that enables decision trees to better classify outlier instances. This method, which can be applied to any decision tree algorithm, uses confidence intervals to identify these hard-to-classify instances and proposes alternative routes. The experimental study indicates that the proposed post-processing method consistently and significantly improves the predictive performance of decision trees, particularly for small, imbalanced, or multi-class datasets, for which an average improvement of 5%-9% in AUC is reported.
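
The abstract describes the idea only at a high level, so the following Python sketch is an illustration under stated assumptions and not the authors' algorithm. It assumes that each split stores the mean and standard deviation of the split attribute for the training instances sent to each branch, that an instance counts as "uncharacteristic" when its value falls outside mean ± z·std for the branch it would normally follow, and that the alternative route is combined with the original one by simple averaging. The names `Node`, `in_interval`, `classify`, and the toy `tree` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    # Leaf nodes set `dist`; internal nodes set `feature`, `threshold`, children.
    dist: Optional[Dict[str, float]] = None
    feature: Optional[int] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    # Assumption: (mean, std) of the split attribute among the training
    # instances that were routed to each branch.
    left_stats: Tuple[float, float] = (0.0, 1.0)
    right_stats: Tuple[float, float] = (0.0, 1.0)


def in_interval(value: float, stats: Tuple[float, float], z: float = 1.96) -> bool:
    """Return True if value lies within mean +/- z * std."""
    mean, std = stats
    return mean - z * std <= value <= mean + z * std


def classify(node: Node, x: List[float], z: float = 1.96) -> Dict[str, float]:
    """Return a class distribution, exploring the sibling branch when the
    instance looks atypical for the branch chosen by the rigid threshold."""
    if node.dist is not None:
        return dict(node.dist)

    value = x[node.feature]
    go_left = value <= node.threshold
    chosen, other = (node.left, node.right) if go_left else (node.right, node.left)
    stats = node.left_stats if go_left else node.right_stats

    result = classify(chosen, x, z)
    if in_interval(value, stats, z):
        return result

    # Hypothetical combination rule: average the original and alternative routes.
    alt = classify(other, x, z)
    labels = set(result) | set(alt)
    return {c: 0.5 * result.get(c, 0.0) + 0.5 * alt.get(c, 0.0) for c in labels}


# Toy one-split tree on attribute 0 with threshold 5.0 (made-up numbers).
tree = Node(
    feature=0,
    threshold=5.0,
    left=Node(dist={"a": 0.9, "b": 0.1}),
    right=Node(dist={"a": 0.2, "b": 0.8}),
    left_stats=(2.0, 1.0),
    right_stats=(8.0, 1.0),
)

typical = classify(tree, [2.5])     # inside the left interval: left leaf only
borderline = classify(tree, [4.8])  # outside the left interval: both leaves averaged
print(typical, borderline)
```

In this toy setup, an instance that sits comfortably inside the interval of its branch receives the usual leaf distribution, while one that is routed left but lies outside the left-branch interval receives a blend of both leaves. A fuller treatment would need a principled weighting of the two routes, which the abstract does not specify.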

Original language: English
Title of host publication: Proceedings - 12th IEEE International Conference on Data Mining, ICDM 2012
Pages: 339-348
Number of pages: 10
DOIs
State: Published - 1 Dec 2012
Event: 12th IEEE International Conference on Data Mining, ICDM 2012 - Brussels, Belgium
Duration: 10 Dec 2012 → 13 Dec 2012

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Conference

Conference: 12th IEEE International Conference on Data Mining, ICDM 2012
Country/Territory: Belgium
City: Brussels
Period: 10/12/12 → 13/12/12

Keywords

  • Confidence intervals
  • Decision trees
  • Imbalanced datasets

ASJC Scopus subject areas

  • General Engineering
