Ordinal Labels in Machine Learning: a User-centered Approach to Improve Data Validity in Medical Settings

Overview

Journal BMC Med Inform Decis Mak

Publisher BioMed Central

Specialty Medical Informatics

Date 2020 Aug 22

PMID 32819345

Citations 1

Authors

Andrea Seveso

Andrea Campagner

Davide Ciucci

Federico Cabitza


Abstract

Background: Vagueness and uncertainty are intrinsic to any medical act, interpretation, and decision, including the reporting of data and the representation of relevant medical conditions, yet little research has focused on how to take this uncertainty into account explicitly. In this paper, we focus on a general and widespread medical terminology, grounded in a traditional and well-established convention, for representing the severity of health conditions (for instance, pain or visible signs) on a scale ranging from Absent to Extreme. Specifically, we study how both potential patients and doctors perceive the different levels of this terminology, in quantitative and qualitative terms, and whether the embedded user knowledge can improve the representation of ordinal values in the construction of machine learning models.

Methods: To this aim, we conducted a questionnaire-based study involving a relatively large sample of 1,152 potential patients and 31 clinicians, who were asked to represent numerically the perceived meaning of standard, widely applied labels used to describe health conditions. Using the collected values, we present and discuss several possible fuzzy-set-based representations that address the vagueness of medical interpretation by taking into account the perceptions of domain experts. We also apply the findings of this user study to evaluate the impact of different encodings on the predictive performance of common machine learning models on a real-world medical prognostic task.

Results: We found significant differences in the perception of pain levels between the two user groups. We also show that the proposed encodings can improve the performance of specific classes of models, and we discuss when this is the case.

Conclusions: We hope that the proposed techniques for ordinal scale representation and ordinal encoding will be useful to the research community, and that our methodology will be applied to other widely used ordinal scales to improve the validity of datasets and the results of machine learning tasks.
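
To make the idea in the Methods concrete, the following is a minimal sketch, not the authors' implementation, of how numeric ratings collected from respondents might be turned into a fuzzy-set representation and a perception-based encoding of ordinal labels. Everything specific here is an assumption made for illustration: the intermediate label names (only Absent and Extreme are named in the abstract), the 0-100 rating scale, the sample ratings, the triangular shape of the membership function, and the helper names `triangular_fuzzy_set` and `membership`.

```python
from statistics import mean

# Hypothetical ratings on an assumed 0-100 scale. The intermediate labels
# and all numbers are illustrative, not data from the study.
RATINGS = {
    "Absent":   [0, 0, 2, 5, 1],
    "Mild":     [15, 20, 25, 30, 22],
    "Moderate": [40, 50, 55, 45, 50],
    "Severe":   [70, 75, 80, 85, 78],
    "Extreme":  [95, 100, 98, 90, 100],
}


def triangular_fuzzy_set(values):
    """Summarise ratings as (low, peak, high) of a triangular fuzzy set.

    Using min, mean and max is one simple assumed choice among many.
    """
    return min(values), mean(values), max(values)


def membership(x, tri):
    """Degree (0.0 to 1.0) to which the value x belongs to the fuzzy set."""
    low, peak, high = tri
    if x == peak:
        return 1.0
    if x <= low or x >= high:
        return 0.0
    if x < peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)


# Naive encoding: equally spaced integer ranks 0..4.
rank_encoding = {label: i for i, label in enumerate(RATINGS)}

# Perception-based encoding: mean perceived value rescaled to [0, 1],
# so the spacing between labels reflects how respondents rated them.
perceived_encoding = {label: mean(v) / 100 for label, v in RATINGS.items()}

fuzzy_sets = {label: triangular_fuzzy_set(v) for label, v in RATINGS.items()}

if __name__ == "__main__":
    for label in RATINGS:
        print(label, rank_encoding[label],
              round(perceived_encoding[label], 3), fuzzy_sets[label])
```

The rank encoding treats the labels as equally spaced, whereas the perception-based encoding lets the gaps between labels differ. Either column could then be fed to a model as a numeric feature.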

Citing Articles

Electronic Health Record and Semantic Issues Using Fast Healthcare Interoperability Resources: Systematic Mapping Review.

Amar F, April A, Abran A. J Med Internet Res. 2024; 26:e45209.

PMID: 38289660 PMC: 10865191. DOI: 10.2196/45209.
