Abstract
The goal of this paper is to improve the predictive accuracy of data streaming algorithms without increasing the processing time of the incoming data. We propose the EnHAT (Ensemble Combined with Hoeffding Adaptive Tree) algorithm, which combines the state-of-the-art Hoeffding Adaptive Tree (HAT) algorithm with an ensemble of J48 decision trees induced from sequential chunks of the data stream. The slack time of HAT's adaptation to a new window of incoming records is used, in parallel, to build the decision-tree ensemble. In our experiments on 4 benchmark streaming datasets and 4 synthetic datasets with different types of concept drift, EnHAT achieved the highest predictive accuracy in 23 out of 26 cases when compared to an ensemble of J48 trees, HAT, and a single J48 model induced from the last sliding window. We therefore conclude that the ensemble/HAT synergy yields better prediction results than either approach on its own. The higher accuracy does not require any computational effort beyond the model induction times of the combined algorithms.
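The general idea of pairing a chunk-based ensemble with an incremental learner can be illustrated with a minimal sketch. This is not the EnHAT implementation: the abstract does not state how the predictions are combined, so the unweighted majority vote below is an assumption, as are all class names, `chunk_size`, and `max_members`. The two stub learners are trivial stand-ins for J48 and HAT, and the parallel use of slack time is not modelled.

```python
from collections import Counter, deque


class MajorityStub:
    """Hypothetical stand-in for a batch learner such as a J48 tree.

    It simply memorises the majority label of the chunk it was built from.
    """

    def __init__(self, chunk):
        labels = [y for _, y in chunk]
        self.label = Counter(labels).most_common(1)[0][0]

    def predict(self, x):
        return self.label


class IncrementalStub:
    """Hypothetical stand-in for an adaptive incremental learner such as HAT.

    It predicts the majority label among the most recent records.
    """

    def __init__(self, memory=50):
        self.recent = deque(maxlen=memory)

    def learn_one(self, x, y):
        self.recent.append(y)

    def predict(self, x):
        if not self.recent:
            return None
        return Counter(self.recent).most_common(1)[0][0]


class ChunkEnsemblePlusIncremental:
    """Combines chunk-trained members with one incremental learner.

    Assumption: predictions are merged by unweighted majority vote.
    """

    def __init__(self, chunk_size=100, max_members=3):
        self.chunk_size = chunk_size
        self.members = deque(maxlen=max_members)  # oldest member is dropped first
        self.buffer = []
        self.incremental = IncrementalStub()

    def learn_one(self, x, y):
        # The incremental learner sees every record as it arrives.
        self.incremental.learn_one(x, y)
        # Records are also buffered; a full chunk yields a new ensemble member.
        self.buffer.append((x, y))
        if len(self.buffer) == self.chunk_size:
            self.members.append(MajorityStub(self.buffer))
            self.buffer = []

    def predict(self, x):
        votes = [m.predict(x) for m in self.members]
        inc_vote = self.incremental.predict(x)
        if inc_vote is not None:
            votes.append(inc_vote)
        if not votes:
            return None  # nothing has been learned yet
        return Counter(votes).most_common(1)[0][0]
```

In this sketch, a sudden change in the label distribution is picked up first by the incremental stub and then, chunk by chunk, by the ensemble as older members are evicted.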
Original language | English |
---|---|
Pages (from-to) | 397-404 |
Number of pages | 8 |
Journal | Information Fusion |
Volume | 89 |
State | Published - 1 Jan 2023 |
Keywords
- Concept drift
- Data streams
- Ensemble learning
- Hoeffding Adaptive Tree
- Online learning
ASJC Scopus subject areas
- Software
- Signal Processing
- Information Systems
- Hardware and Architecture