TY - GEN
T1 - Mind the Gap
T2 - 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025
AU - Sarig, Tal
AU - Guy, Ido
AU - Tavory, Ami
AU - Weinsberg, Udi
AU - Ioannidis, Stratis
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s).
PY - 2025/8/3
Y1 - 2025/8/3
N2 - The purpose of an online electronic-payment risk detection system is to prevent leakage, i.e., the loss of revenue that occurs when users fail to pay for services or when transactions are reversed. Nonpayment prediction models are trained on datasets comprising features available when the model is triggered and the corresponding nonpayment labels. The latter are typically only observed several weeks or even months later. Furthermore, behavior indicative of future nonpayment is highly non-stationary, and the true model may drift significantly in the gap between trigger events and label collection. To address these challenges, we use post-transaction signals to generate pseudo-labels, i.e., short-term proxies [23] or surrogate indices [33]. Our framework attains a favorable tradeoff between ameliorating bias due to drift and introducing variance due to pseudo-label noise, as demonstrated by both offline and online experiments on several nonpayment-detection systems at Meta. Our deployment on live user traffic yields a statistically significant improvement in revenue, accounting also for leakage.
AB - The purpose of an online electronic-payment risk detection system is to prevent leakage, i.e., the loss of revenue that occurs when users fail to pay for services or when transactions are reversed. Nonpayment prediction models are trained on datasets comprising features available when the model is triggered and the corresponding nonpayment labels. The latter are typically only observed several weeks or even months later. Furthermore, behavior indicative of future nonpayment is highly non-stationary, and the true model may drift significantly in the gap between trigger events and label collection. To address these challenges, we use post-transaction signals to generate pseudo-labels, i.e., short-term proxies [23] or surrogate indices [33]. Our framework attains a favorable tradeoff between ameliorating bias due to drift and introducing variance due to pseudo-label noise, as demonstrated by both offline and online experiments on several nonpayment-detection systems at Meta. Our deployment on live user traffic yields a statistically significant improvement in revenue, accounting also for leakage.
KW - concept drift
KW - delayed feedback
KW - nonpayment models
UR - https://www.scopus.com/pages/publications/105014326792
U2 - 10.1145/3711896.3737247
DO - 10.1145/3711896.3737247
M3 - Conference contribution
AN - SCOPUS:105014326792
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 4784
EP - 4795
BT - KDD 2025 - Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
Y2 - 3 August 2025 through 7 August 2025
ER -