TY - GEN
T1 - Diagnosis for Post Concept Drift Decision Trees Repair
AU - Almog, Shaked
AU - Kalech, Meir
N1 - Publisher Copyright: © 2023 Proceedings of the International Conference on Knowledge Representation and Reasoning. All rights reserved.
PY - 2023/1/1
Y1 - 2023/1/1
N2 - Decision trees are commonly used in machine learning since they are accurate and robust classifiers. After a decision tree is built, the data can change over time, causing the classification performance to decrease. This data distribution change is a known challenge in machine learning, referred to as concept drift. Once a concept drift has been detected, usually by experiencing a decrease in the model's performance, it can be handled by training a new model. However, this method does not explain the drift harming the performance but only handles the drift's effects. The main contribution of this paper is a novel two-step approach called APPETITE, which applies diagnosis techniques to identify the feature that has drifted and then adjusts the model accordingly. For the diagnosis step, we present two algorithms. We experimented on 73 known datasets from the literature and semi-synthesized drifts in their features. Both algorithms are better at handling concept drift than training a new model based on the samples after the drift. Combining the two algorithms can provide an explanation of the drift and yields a model that is competitive with a new model trained on the entire data from before and after the drift.
AB - Decision trees are commonly used in machine learning since they are accurate and robust classifiers. After a decision tree is built, the data can change over time, causing the classification performance to decrease. This data distribution change is a known challenge in machine learning, referred to as concept drift. Once a concept drift has been detected, usually by experiencing a decrease in the model's performance, it can be handled by training a new model. However, this method does not explain the drift harming the performance but only handles the drift's effects. The main contribution of this paper is a novel two-step approach called APPETITE, which applies diagnosis techniques to identify the feature that has drifted and then adjusts the model accordingly. For the diagnosis step, we present two algorithms. We experimented on 73 known datasets from the literature and semi-synthesized drifts in their features. Both algorithms are better at handling concept drift than training a new model based on the samples after the drift. Combining the two algorithms can provide an explanation of the drift and yields a model that is competitive with a new model trained on the entire data from before and after the drift.
UR - http://www.scopus.com/inward/record.url?scp=85176815050&partnerID=8YFLogxK
U2 - 10.24963/kr.2023/3
DO - 10.24963/kr.2023/3
M3 - Conference contribution
AN - SCOPUS:85176815050
T3 - Proceedings of the International Conference on Knowledge Representation and Reasoning
SP - 23
EP - 33
BT - Proceedings of the 20th International Conference on Principles of Knowledge Representation and Reasoning, KR 2023
A2 - Marquis, Pierre
A2 - Son, Tran Cao
A2 - Kern-Isberner, Gabriele
PB - Association for the Advancement of Artificial Intelligence
T2 - 20th International Conference on Principles of Knowledge Representation and Reasoning, KR 2023
Y2 - 2 September 2023 through 8 September 2023
ER -