PMID: 31542521

Making Work Visible for Electronic Phenotype Implementation: Lessons Learned from the eMERGE Network

Abstract

Background: Implementing phenotype algorithms requires phenotype engineers to interpret human-readable algorithms and translate their descriptions (text and flowcharts) into computable phenotypes, a process that can be labor-intensive and error-prone. To reduce this implementation effort, it is important to develop portable algorithms.

Methods: We conducted a retrospective analysis of phenotype algorithms developed in the Electronic Medical Records and Genomics (eMERGE) network and identified common customization tasks required for implementation. A novel scoring system was developed to quantify portability from three aspects: Knowledge conversion, clause Interpretation, and Programming (KIP). Tasks were grouped into twenty representative categories. Experienced phenotype engineers were asked to estimate the average time spent on each category and to evaluate the time savings enabled by a common data model (CDM), specifically the Observational Medical Outcomes Partnership (OMOP) model, for each category.

Results: A total of 485 distinct clauses (phenotype criteria) were identified from 55 phenotype algorithms, corresponding to 1153 customization tasks. In addition to 25 non-phenotype-specific tasks, 46 tasks are related to interpretation, 613 tasks are related to knowledge conversion, and 469 tasks are related to programming. Each aspect is assigned a score between 0 and 2 (0 for easy, 1 for moderate, and 2 for difficult portability), yielding a total KIP score range of 0 to 6. The average clause-wise KIP score, which reflects portability, is 1.37 ± 1.38. Specifically, the average knowledge (K) score is 0.64 ± 0.66, the interpretation (I) score is 0.33 ± 0.55, and the programming (P) score is 0.40 ± 0.64. Five percent of the categories can be completed within one hour (median), whereas 70% of the categories take from days to months to complete. The OMOP model can assist with vocabulary mapping tasks.
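The scoring rule described in the Results can be illustrated with a minimal Python sketch. It assumes only what the abstract states: each clause receives a score of 0, 1, or 2 for each of the three aspects, and the clause-wise KIP score is their sum (0 to 6). The class name `ClauseScore`, the helper `mean_kip`, and the three example clauses are hypothetical and are not taken from the study's data or tooling.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ClauseScore:
    """Per-clause portability scores (hypothetical container, not the authors' code)."""
    knowledge: int       # K: knowledge conversion
    interpretation: int  # I: clause interpretation
    programming: int     # P: programming

    def __post_init__(self):
        # Each aspect must be 0 (easy), 1 (moderate), or 2 (difficult).
        for name in ("knowledge", "interpretation", "programming"):
            value = getattr(self, name)
            if value not in (0, 1, 2):
                raise ValueError(f"{name} score must be 0, 1, or 2, got {value!r}")

    @property
    def kip(self) -> int:
        # Total KIP score is the sum of the three aspects, so it lies in 0..6.
        return self.knowledge + self.interpretation + self.programming


def mean_kip(clauses):
    """Average clause-wise KIP score over a collection of clauses."""
    return mean(c.kip for c in clauses)


# Made-up example clauses, for illustration only.
clauses = [
    ClauseScore(knowledge=0, interpretation=0, programming=0),
    ClauseScore(knowledge=1, interpretation=0, programming=1),
    ClauseScore(knowledge=2, interpretation=1, programming=2),
]
totals = [c.kip for c in clauses]
average = mean_kip(clauses)
```

Because the total is a plain sum, the mean KIP score equals the sum of the per-aspect means, which is consistent with the reported figures (0.64 + 0.33 + 0.40 = 1.37).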

Conclusion: This study presents firsthand knowledge of the substantial implementation efforts in phenotyping and introduces a novel metric (KIP) to measure the portability of phenotype algorithms and to quantify such efforts across the eMERGE Network. Phenotype developers are encouraged to analyze and optimize portability with regard to knowledge, interpretation, and programming. CDMs can be used to improve portability for some 'knowledge-oriented' tasks.

Citing Articles

Real-World Evidence BRIDGE: A Tool to Connect Protocol With Code Programming.

Royo A, Elbers Jhj R, Weibel D, Hoxhaj V, Kurkcuoglu Z, Sturkenboom M Pharmacoepidemiol Drug Saf. 2024; 33(12):e70062.

PMID: 39603653 PMC: 11602246. DOI: 10.1002/pds.70062.


Representing and utilizing clinical textual data for real world studies: An OHDSI approach.

Keloth V, Banda J, Gurley M, Heider P, Kennedy G, Liu H J Biomed Inform. 2023; 142:104343.

PMID: 36935011 PMC: 10428170. DOI: 10.1016/j.jbi.2023.104343.


Evaluation of the portability of computable phenotypes with natural language processing in the eMERGE network.

Pacheco J, Rasmussen L, Wiley Jr K, Person T, Cronkite D, Sohn S Sci Rep. 2023; 13(1):1971.

PMID: 36737471 PMC: 9898520. DOI: 10.1038/s41598-023-27481-y.


Multimodal machine learning in precision health: A scoping review.

Kline A, Wang H, Li Y, Dennis S, Hutch M, Xu Z NPJ Digit Med. 2022; 5(1):171.

PMID: 36344814 PMC: 9640667. DOI: 10.1038/s41746-022-00712-8.


Translating and evaluating historic phenotyping algorithms using SNOMED CT.

Elkheder M, Gonzalez-Izquierdo A, Qummer Ul Arfeen M, Kuan V, Lumbers R, Denaxas S J Am Med Inform Assoc. 2022; 30(2):222-232.

PMID: 36083213 PMC: 9846670. DOI: 10.1093/jamia/ocac158.

