Abstract
Gradient boosting models such as XGBoost are among the most popular models for tabular classification problems. Unfortunately, the greediness of gradient boosting algorithms can cause them to rely too heavily on a few features, thereby starving the remaining features. We propose Iterative Feature eXclusion (IFX) to alleviate this problem by iteratively removing the most influential feature from the training data and continuing training. By forcing the model to learn from weaker features, IFX increases the diversity of the gradient boosting model and improves its predictive performance. Our experiments show that in most cases IFX improves XGBoost's predictive performance, sometimes by a large margin. All of the code and results from our experiments are freely available online. Iterative Feature eXclusion can be used as a drop-in replacement for XGBoost, easing the adoption of our work by machine learning researchers and practitioners.
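The abstract only outlines the idea, so the following is a minimal sketch of an iterative-exclusion loop, not the paper's actual implementation. It assumes that feature influence is measured by mean |SHAP| value (the keywords mention Shapley values), that "excluding" a feature means masking it to a constant so newly added trees can no longer split on it, and that training continues on the same booster after each exclusion. The dataset, round count, and hyperparameters are placeholders chosen only for illustration.

```python
# Illustrative sketch of iterative feature exclusion; the exact IFX
# procedure is described in the paper and may differ from this.
import numpy as np
import shap
import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

X_masked = X_train.copy()
excluded, booster = set(), None

for _ in range(4):  # one initial fit plus three exclusion rounds (illustrative)
    model = xgb.XGBClassifier(n_estimators=50, eval_metric="logloss")
    model.fit(X_masked, y_train, xgb_model=booster)  # continue training
    booster = model.get_booster()

    # Rank features by mean |SHAP| value and exclude the most influential one.
    shap_values = shap.TreeExplainer(model).shap_values(X_masked)
    importance = np.abs(shap_values).mean(axis=0)
    for name in excluded:                        # never re-select a feature
        importance[X_masked.columns.get_loc(name)] = -np.inf
    top = X_masked.columns[int(importance.argmax())]
    excluded.add(top)
    X_masked[top] = 0.0                          # mask it for later rounds

print("excluded features:", sorted(excluded))
print("test accuracy:", float((model.predict(X_test) == y_test).mean()))
```

Masking a feature to a constant is only one way to keep the booster's feature layout intact while preventing new splits on it; the published method and its combination of rounds may use a different mechanism.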
Original language | English
---|---
Article number | 111546
Journal | Knowledge-Based Systems
Volume | 289
DOIs | 
State | Published - 8 Apr 2024
Keywords
- Decision trees
- Feature selection
- Gradient boosting machine
- Shapley values
- Tabular classification
ASJC Scopus subject areas
- Software
- Management Information Systems
- Information Systems and Management
- Artificial Intelligence