Drifter: Efficient Online Feature Monitoring for Improved Data Integrity in Large-Scale Recommendation Systems

Blaž Škrlj, Nir Ki-To, Lee Edelist, Natalia Silberstein, Hila Weisman-Zoha, Blaž Mramor, Davorin Kopic, Naama Ziporin

Research output: Contribution to journal › Conference article › peer-review

Abstract

Real-world production systems often grapple with maintaining data quality in large-scale, dynamic streams. We introduce Drifter, an efficient and lightweight system for online feature monitoring and verification in recommendation use cases. Drifter addresses limitations of existing methods by delivering agile, responsive, and adaptable data quality monitoring, enabling real-time root cause analysis, drift detection, and insights into problematic production events. Integrating state-of-the-art online feature ranking for sparse data with ideas from anomaly detection, Drifter is highly scalable and resource-efficient: each production deployment requires only two threads and less than a gigabyte of RAM while handling millions of instances per minute during model training. Drifter's effectiveness in alerting on and mitigating data quality issues was demonstrated on a real-life system that handles up to a billion predictions per second.

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 3549
State: Published - 1 Jan 2023
Externally published: Yes
Event: 6th Workshop on Online Recommender Systems and User Modeling, ORSUM 2023 - Singapore, Singapore
Duration: 19 Sep 2023 → …

Keywords

  • feature monitoring
  • online advertising
  • online learning
  • recommender systems

ASJC Scopus subject areas

  • General Computer Science
