PMID: 35538298

Clinical Annotations for Prostate Cancer Research: Defining Data Elements, Creating a Reproducible Analytical Pipeline, and Assessing Data Quality

Abstract

Background: Routine clinical data from clinical charts are indispensable for retrospective and prospective observational studies and clinical trials. Their reproducibility is often not assessed. We developed a prostate cancer-specific database for clinical annotations and evaluated data reproducibility.

Methods: For men with prostate cancer who had clinical-grade paired tumor-normal sequencing at a comprehensive cancer center, we performed team-based retrospective data collection from the electronic medical record using a defined source hierarchy. We developed an open-source R package for data processing. With blinded repeat annotation by a reference medical oncologist, we assessed data completeness, reproducibility of team-based annotations, and impact of measurement error on bias in survival analyses.

Results: Data elements on demographics, diagnosis and staging, disease state at the time of procuring a genomically characterized sample, and clinical outcomes were piloted and then abstracted for 2261 patients (with 2631 samples). Completeness of data elements was generally high. Compared with the repeat annotation by a medical oncologist blinded to the database (100 patients/samples), reproducibility of the team-based annotations was high overall, although T stage, metastasis date, and presence and date of castration resistance had lower reproducibility. Impact of measurement error on estimates for strong prognostic factors was modest.

Conclusions: With a prostate cancer-specific data dictionary and quality control measures, manual clinical annotations by a multidisciplinary team can be scalable and reproducible. The data dictionary and the R package for reproducible data processing are freely available to increase data quality and efficiency in clinical prostate cancer research.
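
As an illustration of the reproducibility assessment described above, the following is a minimal sketch of how agreement between team-based annotations and a blinded reference annotation could be quantified for a categorical data element such as T stage. The abstract does not state which agreement statistics the authors used; percent agreement and Cohen's kappa are assumptions chosen for illustration, the function names are hypothetical, and the data are made-up toy values, not study results. The sketch is in Python for self-containment and is not the authors' R package.

```python
from collections import Counter


def percent_agreement(team, reference):
    """Fraction of records on which the two annotations match exactly."""
    if len(team) != len(reference) or not team:
        raise ValueError("annotation lists must be non-empty and equal in length")
    return sum(a == b for a, b in zip(team, reference)) / len(team)


def cohens_kappa(team, reference):
    """Chance-corrected agreement (Cohen's kappa) for categorical labels."""
    n = len(team)
    p_observed = percent_agreement(team, reference)
    team_counts = Counter(team)
    ref_counts = Counter(reference)
    # Agreement expected by chance if the two annotators labeled independently,
    # each keeping their own marginal category frequencies.
    categories = team_counts.keys() | ref_counts.keys()
    p_expected = sum(team_counts[c] * ref_counts[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        # Both annotators used one identical category; kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy example (hypothetical values): T stage as abstracted by the team
# and by the blinded reference annotator for six patients.
team_t_stage = ["T2", "T3", "T3", "T2", "T4", "T3"]
reference_t_stage = ["T2", "T3", "T2", "T2", "T4", "T3"]

agreement = percent_agreement(team_t_stage, reference_t_stage)
kappa = cohens_kappa(team_t_stage, reference_t_stage)
print(f"percent agreement: {agreement:.2f}, Cohen's kappa: {kappa:.2f}")
```

Kappa is shown alongside raw agreement because a data element dominated by one category can reach high raw agreement by chance alone. Date elements such as metastasis date would need a different measure, for example the distribution of differences in days.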

Citing Articles

Automated real-world data integration improves cancer outcome prediction.

Jee J, Fong C, Pichotta K, Tran T, Luthra A, Waters M Nature. 2024; 636(8043):728-736.

PMID: 39506116 PMC: 11655358. DOI: 10.1038/s41586-024-08167-5.


promotes oncogenesis and lethal progression of prostate cancer.

Su X, Stopsack K, Schmidt D, Ma D, Li Z, Scheet P Proc Natl Acad Sci U S A. 2024; 121(36):e2405543121.

PMID: 39190349 PMC: 11388324. DOI: 10.1073/pnas.2405543121.


Microsatellite Instability, Tumor Mutational Burden, and Response to Immune Checkpoint Blockade in Patients with Prostate Cancer.

Lenis A, Ravichandran V, Brown S, Alam S, Katims A, Truong H Clin Cancer Res. 2024; 30(17):3894-3903.

PMID: 38949888 PMC: 11371520. DOI: 10.1158/1078-0432.CCR-23-3403.


Identification of Key Elements in Prostate Cancer for Ontology Building via a Multidisciplinary Consensus Agreement.

Moreno A, Solanki A, Xu T, Lin R, Palta J, Daugherty E Cancers (Basel). 2023; 15(12).

PMID: 37370731 PMC: 10295832. DOI: 10.3390/cancers15123121.


The Impact of PIK3R1 Mutations and Insulin-PI3K-Glycolytic Pathway Regulation in Prostate Cancer.

Chakraborty G, Nandakumar S, Hirani R, Nguyen B, Stopsack K, Kreitzer C Clin Cancer Res. 2022; 28(16):3603-3617.

PMID: 35670774 PMC: 9438279. DOI: 10.1158/1078-0432.CCR-21-4272.

