TY - JOUR
T1 - Development and Validation of a Colorectal Cancer Prediction Model
T2 - A Nationwide Cohort-Based Study
AU - Isakov, Ofer
AU - Riesel, Dan
AU - Leshchinsky, Michael
AU - Shaham, Galit
AU - Reis, Ben Y.
AU - Keret, Dan
AU - Levi, Zohar
AU - Brener, Baruch
AU - Balicer, Ran
AU - Dagan, Noa
AU - Hayek, Samah
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/1/1
Y1 - 2024/1/1
N2 - Background: Early diagnosis of colorectal cancer (CRC) is critical to increasing survival rates. Computerized risk prediction models hold great promise for identifying individuals at high risk for CRC. In order to utilize such models effectively in a population-wide screening setting, development and validation should be based on cohorts that are similar to the target population. Aim: Establish a risk prediction model for CRC diagnosis based on electronic health records (EHR) from subjects eligible for CRC screening. Methods: A retrospective cohort study utilizing the EHR data of Clalit Health Services (CHS). The study includes CHS members aged 50–74 who were eligible for CRC screening from January 2013 to January 2019. The model was trained to predict receiving a CRC diagnosis within 2 years of the index date. Approximately 20,000 EHR demographic and clinical features were considered. Results: The study includes 2,935 subjects with a CRC diagnosis and 1,133,457 subjects without a CRC diagnosis. CRC incidence among subjects with risk scores in the top 1% was higher than baseline (2.3% vs 0.3%; lift 8.38; P value < 0.001). Cumulative event probabilities increased with higher model scores. Model-based risk stratification among subjects with a positive fecal occult blood test (FOBT) identified subjects with more than twice the risk for CRC compared to FOBT alone. Conclusions: We developed an individualized risk prediction model for CRC that can be utilized as a complementary decision support tool for healthcare providers to precisely identify subjects at high risk for CRC and refer them for confirmatory testing.
AB - Background: Early diagnosis of colorectal cancer (CRC) is critical to increasing survival rates. Computerized risk prediction models hold great promise for identifying individuals at high risk for CRC. In order to utilize such models effectively in a population-wide screening setting, development and validation should be based on cohorts that are similar to the target population. Aim: Establish a risk prediction model for CRC diagnosis based on electronic health records (EHR) from subjects eligible for CRC screening. Methods: A retrospective cohort study utilizing the EHR data of Clalit Health Services (CHS). The study includes CHS members aged 50–74 who were eligible for CRC screening from January 2013 to January 2019. The model was trained to predict receiving a CRC diagnosis within 2 years of the index date. Approximately 20,000 EHR demographic and clinical features were considered. Results: The study includes 2,935 subjects with a CRC diagnosis and 1,133,457 subjects without a CRC diagnosis. CRC incidence among subjects with risk scores in the top 1% was higher than baseline (2.3% vs 0.3%; lift 8.38; P value < 0.001). Cumulative event probabilities increased with higher model scores. Model-based risk stratification among subjects with a positive fecal occult blood test (FOBT) identified subjects with more than twice the risk for CRC compared to FOBT alone. Conclusions: We developed an individualized risk prediction model for CRC that can be utilized as a complementary decision support tool for healthcare providers to precisely identify subjects at high risk for CRC and refer them for confirmatory testing.
KW - Colorectal cancer
KW - Colorectal cancer screening
KW - Machine learning
UR - http://www.scopus.com/inward/record.url?scp=85191305009&partnerID=8YFLogxK
U2 - 10.1007/s10620-024-08427-4
DO - 10.1007/s10620-024-08427-4
M3 - Article
C2 - 38662163
AN - SCOPUS:85191305009
SN - 0163-2116
JO - Digestive Diseases and Sciences
JF - Digestive Diseases and Sciences
ER -