Validation of Diagnosis Codes to Identify Side of Colon in an Electronic Health Record Registry
Health Services
Background: Using real-world data to generate evidence requires careful assessment and validation of critical variables before clinical conclusions are drawn. Prospective clinical trial data suggest that the anatomic origin of colon cancer affects prognosis and treatment effectiveness. As an initial step in validating this observation in routine clinical settings, we explored the feasibility and accuracy of obtaining information on tumor sidedness from electronic health record (EHR) billing codes.
Methods: A total of 9,403 patients with metastatic colorectal cancer (mCRC) were selected from the Flatiron Health database, which is derived from de-identified EHR data. This study included a random sample of 200 of these mCRC patients. Tumor site data derived from International Classification of Diseases (ICD) codes were compared with data abstracted from unstructured documents in the EHR (e.g., surgical and pathology notes). Concordance was assessed using observed agreement and Cohen's kappa coefficient (κ). The accuracy of ICD codes for each tumor site (left, right, transverse) was assessed by calculating sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), with corresponding 95% confidence intervals, using the abstracted data as the gold standard.
Results: Study patients had characteristics and a side-of-colon distribution similar to those of the full mCRC dataset. Among all sampled patients, the observed agreement between ICD codes and abstracted data for tumor site was 0.58 (κ = 0.41). When the analysis was restricted to the 62% of patients with a side-specific ICD code, the observed agreement was 0.84 (κ = 0.79). The specificity of structured data for tumor location was high (92-98%), whereas sensitivity (49-63%), PPV (64-92%), and NPV (72-97%) were lower. Demographic and clinical characteristics were similar between patients with specific and non-specific side-of-colon ICD codes.
Conclusions: ICD codes are a highly reliable indicator of tumor location when a side-specific location code is entered in the EHR. However, non-specific side-of-colon ICD codes are present for a sizable minority of patients, and structured data alone may not be adequate to support testing of some research hypotheses. Key variables should be carefully assessed to determine whether clinical abstraction is needed to supplement structured data when generating real-world evidence from EHRs.
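The concordance and accuracy measures named in the Methods can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy labels, and the "unspecified" category are hypothetical, and the Wilson score interval is an assumption, since the abstract does not state which method was used for the 95% confidence intervals.

```python
from collections import Counter
from math import sqrt


def observed_agreement(coded, gold):
    """Fraction of patients whose ICD-derived site equals the abstracted site."""
    return sum(c == g for c, g in zip(coded, gold)) / len(gold)


def cohens_kappa(coded, gold):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(gold)
    p_o = observed_agreement(coded, gold)
    coded_freq, gold_freq = Counter(coded), Counter(gold)
    # Chance agreement from the marginal frequency of each category.
    p_e = sum(coded_freq[k] * gold_freq[k] for k in set(coded) | set(gold)) / n**2
    if p_e == 1:
        return float("nan")  # kappa is undefined when both sources use one category
    return (p_o - p_e) / (1 - p_e)


def proportion_with_ci(successes, total, z=1.96):
    """Point estimate and Wilson score interval (assumed CI method)."""
    if total == 0:
        nan = float("nan")
        return nan, nan, nan
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return p, centre - half, centre + half


def site_accuracy(coded, gold, site):
    """One-vs-rest accuracy of ICD codes for one site, abstraction as gold standard."""
    pairs = list(zip(coded, gold))
    tp = sum(c == site and g == site for c, g in pairs)
    fp = sum(c == site and g != site for c, g in pairs)
    fn = sum(c != site and g == site for c, g in pairs)
    tn = sum(c != site and g != site for c, g in pairs)
    return {
        "sensitivity": proportion_with_ci(tp, tp + fn),
        "specificity": proportion_with_ci(tn, tn + fp),
        "ppv": proportion_with_ci(tp, tp + fp),
        "npv": proportion_with_ci(tn, tn + fn),
    }


# Toy data only; these labels are invented for illustration.
coded = ["left", "left", "right", "right", "unspecified", "transverse"]
gold = ["left", "left", "right", "left", "right", "transverse"]

print("agreement:", observed_agreement(coded, gold))
print("kappa:", cohens_kappa(coded, gold))
for name, (est, lo, hi) in site_accuracy(coded, gold, "left").items():
    print(f"{name}: {est:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Treating each site as a one-vs-rest binary outcome is one common way to obtain per-site sensitivity, specificity, PPV, and NPV from a multi-category comparison; a patient with a non-specific code then counts as a false negative for the site recorded in the abstracted data.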