Automatic Feature Engineering for Learning Compact Decision Trees

Inbal Roshanski, Meir Kalech, Lior Rokach

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Decision forests are known to excel on tabular data once their hyperparameters are well tuned. In addition to being accurate and robust classifiers, these models can be easily converted into a collection of if-else rules that clearly describe the model's decision-making process. In practice, however, decision trees may reach enormous depth and size, and as a result the collection of rules becomes vast and complicated. Furthermore, the model may consume a significant amount of memory. Compact models make it possible to establish a modest set of rules that may be easier to understand, require less memory, and, by their nature, may improve decision-making ability and avoid overfitting. Previous studies attempted to reduce the size of the trees by altering their structure (e.g., oblique trees), but this undermines both their advantages and their simplicity because the resulting structure is much more complicated. In this research, we present FACET, a novel algorithm for keeping trees compact while treating the model as a black box. FACET uses automated feature engineering methods, which generate a new feature set by manipulating the existing features of the data set, resulting in a drastic reduction in the size of the decision trees while preserving, and even improving, the model's accuracy. FACET has been extensively tested on multiple datasets, models, and operators to demonstrate its effectiveness. On average, FACET achieves a 24% reduction in the size criteria of the tree-based model without sacrificing accuracy. This reduction in size leads to an average 44% reduction in memory for the dataset required for learning. These statistically significant results demonstrate the potential of FACET to enable more efficient and interpretable tree-based models, without compromising their accuracy, in practical applications.
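
The general idea that an engineered feature can shrink a tree can be illustrated with a toy sketch. This is not FACET itself: the synthetic data, the choice of subtraction as the operator, and every function name below (`gini`, `best_split`, `build_tree`, `count_nodes`) are assumptions made for illustration, and the tree learner is a minimal unpruned CART-style implementation rather than any model from the paper. The class label depends on whether the first feature exceeds the second, a diagonal boundary that axis-aligned splits can only approximate step by step.

```python
import random

def gini(labels):
    """Gini impurity of a sequence of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels):
    """Return the (feature, threshold) pair with the lowest weighted Gini,
    or None when no split improves on the unsplit node."""
    best, best_score = None, gini(labels) * len(labels)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = gini(left) * len(left) + gini(right) * len(right)
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels):
    """Grow an unpruned tree; stop at pure nodes or when no split helps.
    A leaf is None, an internal node is a (left, right) tuple."""
    split = best_split(rows, labels) if len(set(labels)) > 1 else None
    if split is None:
        return None
    f, t = split
    left_rows = [r for r in rows if r[f] <= t]
    left_labels = [y for r, y in zip(rows, labels) if r[f] <= t]
    right_rows = [r for r in rows if r[f] > t]
    right_labels = [y for r, y in zip(rows, labels) if r[f] > t]
    return (build_tree(left_rows, left_labels),
            build_tree(right_rows, right_labels))

def count_nodes(tree):
    """Total number of nodes (internal nodes plus leaves)."""
    if tree is None:
        return 1
    return 1 + count_nodes(tree[0]) + count_nodes(tree[1])

# Synthetic data (an assumption): label is 1 when x1 > x2.
random.seed(0)
points = [(random.random(), random.random()) for _ in range(200)]
labels = [int(x1 > x2) for x1, x2 in points]

# Hypothetical engineered feature set: the raw features plus x1 - x2.
engineered = [(x1, x2, x1 - x2) for x1, x2 in points]

raw_nodes = count_nodes(build_tree(points, labels))
engineered_nodes = count_nodes(build_tree(engineered, labels))
print("nodes on raw features:", raw_nodes)
print("nodes with engineered feature:", engineered_nodes)
```

With the difference feature available, a single threshold separates the two classes, so the second tree consists of one split and two leaves, while the tree on the raw features needs many axis-aligned splits to fit the same training data. The learner itself is unchanged between the two runs, which mirrors the black-box framing in the abstract: only the feature set differs.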

Original language: English
Article number: 120470
Journal: Expert Systems with Applications
Volume: 229
State: Published - 1 Nov 2023

Keywords

  • Decision-tree
  • Feature-engineering
  • Machine-learning
  • Space-complexity

ASJC Scopus subject areas

  • General Engineering
  • Computer Science Applications
  • Artificial Intelligence
