TRIO: Task-Agnostic Dataset Representation Optimized for Automatic Algorithm Selection

Noy Cohen-Shapira, Lior Rokach

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

4 Scopus citations

Abstract

With the growing number of machine learning (ML) algorithms, the selection of the top-performing algorithms for a given dataset, task, and evaluation measure is known to be a challenging task. The human expertise required for this task has fueled the demand for automatic solutions. Meta-learning is a popular approach for automatic algorithm selection based on dataset characterization. Existing meta-learning methods often represent the datasets using predefined features and thus cannot be generalized for various ML tasks, or alternatively, learn their representations in a supervised fashion, and thus cannot address unsupervised tasks. In this study, we first propose a novel learning-based task-agnostic method for dataset representation. Second, we present TRIO, a meta-learning approach based on the proposed dataset representation, which is capable of accurately recommending top-performing algorithms for unseen datasets. TRIO first learns graphical representations from the datasets and then utilizes a graph convolutional neural network technique to extract their latent representations. An extensive evaluation on 337 datasets and 195 ML algorithms demonstrates the effectiveness of our approach over state-of-the-art methods for algorithm selection for both supervised (classification and regression) and unsupervised (clustering) tasks.
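The two-step idea in the abstract (a graph built from the dataset, then a graph convolution that yields a latent representation) can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not say how the graphs are constructed or how the network is trained, so everything below is an assumption made for illustration: a k-nearest-neighbour graph over the rows, a single untrained graph-convolution layer with random weights using the common symmetric normalization, and mean pooling into one dataset vector. The names `knn_adjacency`, `gcn_layer`, and `embed_dataset` are hypothetical.

```python
import math
import random


def knn_adjacency(rows, k):
    """Symmetric k-nearest-neighbour adjacency matrix with self-loops (assumed graph construction)."""
    n = len(rows)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        # Rank the other rows by Euclidean distance to row i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(rows[i], rows[j]),
        )
        for j in others[:k]:
            adj[i][j] = adj[j][i] = 1.0
    return adj


def gcn_layer(adj, feats, weights):
    """One graph-convolution step: ReLU(D^-1/2 A D^-1/2 X W), with self-loops already in A."""
    n = len(adj)
    n_in, n_out = len(weights), len(weights[0])
    deg = [sum(row) for row in adj]  # at least 1 because of the self-loops
    norm = [
        [adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]
    # Aggregate neighbour features for every node.
    agg = [
        [sum(norm[i][m] * feats[m][f] for m in range(n)) for f in range(n_in)]
        for i in range(n)
    ]
    # Apply the linear map followed by ReLU.
    return [
        [max(0.0, sum(agg[i][f] * weights[f][h] for f in range(n_in)))
         for h in range(n_out)]
        for i in range(n)
    ]


def embed_dataset(rows, k=2, hidden=4, seed=0):
    """Map a numeric table to one fixed-length vector (untrained, illustrative only)."""
    rng = random.Random(seed)
    weights = [
        [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
        for _ in range(len(rows[0]))
    ]
    nodes = gcn_layer(knn_adjacency(rows, k), rows, weights)
    # Mean-pool the node embeddings into a single dataset-level vector.
    return [sum(col) / len(nodes) for col in zip(*nodes)]


# Toy dataset: four rows, two numeric features.
toy = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
embedding = embed_dataset(toy)
print(embedding)
```

In a meta-learning setting, vectors like `embedding` would be computed for many datasets and compared or fed to a recommender. Here the weights are random, so the output only demonstrates the data flow and carries no learned meaning.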
Original language: English
Title of host publication: Proceedings - 21st IEEE International Conference on Data Mining, ICDM 2021
Editors: James Bailey, Pauli Miettinen, Yun Sing Koh, Dacheng Tao, Xindong Wu
Publisher: Institute of Electrical and Electronics Engineers
Pages: 81-90
Number of pages: 10
ISBN (Electronic): 9781665423984
State: Published - 2021
Event: 21st IEEE International Conference on Data Mining, ICDM 2021 - Virtual, Online, New Zealand
Duration: 7 Dec 2021 to 10 Dec 2021

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
Volume: 2021-December
ISSN (Print): 1550-4786

Conference

Conference: 21st IEEE International Conference on Data Mining, ICDM 2021
Country/Territory: New Zealand
City: Virtual, Online
Period: 7/12/21 to 10/12/21

Keywords

  • algorithm selection
  • AutoML
  • meta-learning
  • task-agnostic dataset representation

ASJC Scopus subject areas

  • General Engineering
