Can we identify individuals at risk to develop multiple myeloma? A machine learning-based predictive model

Moshe Mittelman, Ariel Israel, Howard S. Oster, Michael Leshchinsky, Yatir Ben-Shlomo, Eldad Kepten, Osnat Jarchowsky Dolberg, Ran Balicer, Galit Shaham

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Multiple myeloma (MM) evolves unnoticed over years, and organ damage is common by the time it is diagnosed. Electronic health records (EHR) can help in developing predictive models that identify ‘healthy’ people at risk. MM patients from Clalit Health Services (2002–2019) were matched with healthy controls. Stage I: EHR from the 5 years prior to MM diagnosis were reviewed and >200 parameters were compared (patients vs. controls). Stage II: An XGBoost model predicting the 5-year risk for MM was established, with validation. Stage III: A simplified logistic regression (LR) model for community use was developed, requiring 20 variables (age; Hb; RBC; MCV; RDW; WBC; neutrophils; lymphocytes; monocytes; basophils; glucose; creatinine; total protein; albumin; calcium; uric acid; bilirubin; HDL-C; LDL-C; triglycerides). EHR from the pre-MM period of 4256 patients were compared to controls. Future MM patients had higher ESR; lower Hb, ANC and neutrophil/lymphocyte ratio; higher globulins and ferritin; and more immune deficiencies, MDS and FMF. They took fewer tranquilizers, anti-diabetics and statins. Using labs from future MM (n = 19 129) and controls (n = 382 580, 20:1), a predictive model was developed (ROC AUC = 0.836). The simple LR model provided individual risk prediction for MM within 5 years (AUC = 0.72). Two machine learning models predict the risk of myeloma in ‘healthy’ individuals within 5 years. The models can be used in practice.
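For readers less familiar with the metric, the reported ROC AUC values (0.836 and 0.72) can be read as the probability that a randomly chosen future MM record receives a higher risk score than a randomly chosen control record. The sketch below illustrates that rank-based definition in plain Python. It is not the authors' code: the function name `roc_auc` and the toy scores are hypothetical and made up for illustration, and only the two sample sizes are taken from the abstract.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC via the rank (Mann-Whitney U) formulation.

    Equals the probability that a randomly chosen case (label 1) gets a
    higher score than a randomly chosen control (label 0); ties count 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy, made-up risk scores (NOT study data): 2 cases and 4 controls.
toy_scores = [0.9, 0.4, 0.8, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)  # 7 of 8 case/control pairs ranked correctly

# Sanity check of the matching ratio quoted in the abstract (20:1).
controls_per_case = 382_580 / 19_129
```

The pairwise loop is quadratic and only suited to small illustrations; at the scale quoted in the abstract a sort-based or library implementation would be the practical choice.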

Original language: English
Journal: British Journal of Haematology
State: Accepted/In press - 1 Jan 2025
Externally published: Yes

Keywords

  • computer modelling
  • disease prediction
  • gradient boosted
  • logistic regression
  • multiple myeloma

ASJC Scopus subject areas

  • Hematology
