ExploreKit: Automatic feature generation and selection

Gilad Katz, Eui Chul Richard Shin, Dawn Song

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

114 Scopus citations

Abstract

Feature generation is one of the challenging aspects of machine learning. We present ExploreKit, a framework for automated feature generation. ExploreKit generates a large set of candidate features by combining information in the original features, with the aim of maximizing predictive performance according to user-selected criteria. To overcome the exponential growth of the feature space, ExploreKit uses a novel machine learning-based feature selection approach to predict the usefulness of new candidate features. This approach enables efficient identification of the new features and produces superior results compared to existing feature selection solutions. We demonstrate the effectiveness and robustness of our approach by conducting an extensive evaluation on 25 datasets and 3 different classification algorithms. We show that ExploreKit can achieve classification-error reduction of 20% overall. Our code is available at https://github.com/giladkatz/ExploreKit.
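The generate-then-rank idea described in the abstract can be illustrated with a minimal sketch. This is not the ExploreKit implementation: the operator set, the function names, and the toy data are all hypothetical, and the learned usefulness predictor from the paper is replaced here by a much simpler stand-in (absolute Pearson correlation with the label), used only to show where a ranking step would sit.

```python
from itertools import combinations
from math import sqrt

# Hypothetical binary operators for combining two numeric columns
# (an assumption for illustration, not the operator set from the paper).
OPERATORS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def generate_candidates(columns):
    """Combine every pair of original columns with every operator."""
    candidates = {}
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        for op_name, op in OPERATORS.items():
            key = f"{op_name}({name_a},{name_b})"
            candidates[key] = [op(a, b) for a, b in zip(col_a, col_b)]
    return candidates


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov / sqrt(var_x * var_y))


def rank_candidates(candidates, labels):
    """Order candidate names by the stand-in score, best first.

    The paper predicts usefulness with a learned model; this proxy
    only marks the place where such a predictor would be called.
    """
    return sorted(
        candidates,
        key=lambda name: abs_correlation(candidates[name], labels),
        reverse=True,
    )


# Toy data (hypothetical): two original features and a binary label.
columns = {"x1": [1, 2, 3, 4], "x2": [4, 3, 2, 1]}
labels = [0, 1, 1, 0]

candidates = generate_candidates(columns)
ranking = rank_candidates(candidates, labels)
print(ranking[0])
```

On this toy data the product of the two columns ranks first, since neither the sum (constant) nor the difference (uncorrelated with the label) carries signal. With n original features and k operators the candidate set already grows as k·n(n-1)/2 for pairs alone, which is why a cheap ranking step before full evaluation matters.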

Original language: English
Title of host publication: Proceedings - 16th IEEE International Conference on Data Mining, ICDM 2016
Editors: Francesco Bonchi, Josep Domingo-Ferrer, Ricardo Baeza-Yates, Zhi-Hua Zhou, Xindong Wu
Publisher: Institute of Electrical and Electronics Engineers
Pages: 979-984
Number of pages: 6
ISBN (Electronic): 9781509054725
DOIs
State: Published - 2 Jul 2016
Externally published: Yes
Event: 16th IEEE International Conference on Data Mining, ICDM 2016 - Barcelona, Catalonia, Spain
Duration: 12 Dec 2016 to 15 Dec 2016

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
Volume: 0
ISSN (Print): 1550-4786

Conference

Conference: 16th IEEE International Conference on Data Mining, ICDM 2016
Country/Territory: Spain
City: Barcelona, Catalonia
Period: 12/12/16 to 15/12/16

ASJC Scopus subject areas

  • General Engineering
