Deep Learning of the Electronic Health Record, Imaging and Genetic Data for Risk Prediction of Post Colonoscopy Colorectal Cancer
Colonoscopy screening reduces the incidence and mortality of colorectal cancer (CRC) through removal of precursor lesions (polyps) and detection of early cancers. After screening, individuals are recommended to undergo colonoscopy surveillance at variable intervals, depending on the presence and severity of the findings. However, the current clinical guidelines for colonoscopy surveillance are based on expert interpretation of the literature and lack sufficient supporting evidence. There is an urgent need to develop data-driven risk stratification tools that could tailor colonoscopy surveillance for better prevention of CRC. In this project, we propose to integrate rich electronic health record and imaging data to develop a prediction model for post-colonoscopy colorectal neoplasia. We will leverage a longitudinal cohort of patients undergoing colonoscopy in a large integrated health care system, Mass General Brigham (MGB). The study outcome is the incidence of CRC and/or advanced polyps (collectively termed advanced neoplasia) after the index colonoscopy. We have assembled detailed endoscopic and polyp data from index and follow-up colonoscopies, supplemented by validated natural language processing algorithms; curated routine clinical and laboratory data; and developed and validated a fully automated deep learning model for body composition analysis using abdominal computed tomography (CT) scans at MGB. Leveraging these readily available data, we propose to develop an electronic health record-based risk assessment tool using deep learning to predict advanced neoplasia (Aim 1) and then to assess the added value of further including CT-derived body composition characteristics (Aim 2). In summary, our project uses a novel deep learning approach in a large health care system to develop a risk prediction model for personalized colonoscopy surveillance.


