
Quality of Big Data in Health Care

Overview
Date 2015 Jul 10
PMID 26156435
Citations 14
Abstract

Purpose: The current trend in Big Data analytics and in particular health information technology is toward building sophisticated models, methods and tools for business, operational and clinical intelligence. However, the critical issue of data quality required for these models is not getting the attention it deserves. The purpose of this paper is to highlight the issues of data quality in the context of Big Data health care analytics.

Design/methodology/approach: The insights presented in this paper are the results of analytics work that was done in different organizations on a variety of health data sets. The data sets include Medicare and Medicaid claims, provider enrollment data sets from both public and private sources, and electronic health records from regional health centers accessed through partnerships with health care claims processing entities under health privacy protection guidelines.

Findings: Assessment of data quality in health care has to consider: first, the entire lifecycle of health data; second, problems arising from errors and inaccuracies in the data itself; third, the source(s) and the pedigree of the data; and fourth, how the underlying purpose of data collection impacts the analytic processing and the knowledge expected to be derived. Automation in the form of data handling, storage, entry and processing technologies is to be viewed as a double-edged sword. At one level, automation can be a good solution, while at another level it can create a different set of data quality issues. Implementation of health care analytics with Big Data is enabled by a road map that addresses the organizational and technological aspects of data quality assurance.

Practical Implications: The value derived from the use of analytics should be the primary determinant of data quality. Based on this premise, health care enterprises embracing Big Data should have a road map for a systematic approach to data quality. Health care data quality problems can be so specific that organizations might have to build their own custom software or data quality rule engines.
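To make the idea of a custom data quality rule engine concrete, the following is a minimal illustrative sketch and not the authors' implementation. The record layout (`provider_id`, `amount`, `service_date`, `submitted_date`), the rule names and the sample claims are all hypothetical assumptions introduced only for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class Rule:
    """A named data quality check; `check` returns True when a record passes."""
    name: str
    check: Callable[[Record], bool]


def run_rules(records: List[Record], rules: List[Rule]) -> Dict[str, List[int]]:
    """Return, for each rule name, the indices of the records that fail it."""
    failures: Dict[str, List[int]] = {rule.name: [] for rule in rules}
    for index, record in enumerate(records):
        for rule in rules:
            if not rule.check(record):
                failures[rule.name].append(index)
    return failures


def dates_in_order(record: Record) -> bool:
    """Pass when both dates exist and service precedes or equals submission.

    Assumes ISO 8601 date strings (YYYY-MM-DD), which sort chronologically.
    """
    service = record.get("service_date")
    submitted = record.get("submitted_date")
    return service is not None and submitted is not None and service <= submitted


# Hypothetical rules for a hypothetical claims record layout.
RULES = [
    Rule("provider_id_present", lambda r: bool(r.get("provider_id"))),
    Rule(
        "amount_non_negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
    Rule("service_before_submission", dates_in_order),
]

# Made-up sample records, for illustration only.
claims = [
    {"provider_id": "P001", "amount": 120.0,
     "service_date": "2015-01-05", "submitted_date": "2015-01-09"},
    {"provider_id": "", "amount": 80.0,
     "service_date": "2015-01-20", "submitted_date": "2015-01-22"},
    {"provider_id": "P003", "amount": -15.0,
     "service_date": "2015-02-10", "submitted_date": "2015-02-01"},
]

report = run_rules(claims, RULES)
for rule_name, failing in report.items():
    print(rule_name, failing)
```

Keeping each rule as a small named function is one possible design choice: organization-specific checks can be added or retired without changing the engine, and the per-rule failure lists show which kind of problem dominates a given data source.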

Originality/value: Today, data quality issues are diagnosed and addressed in a piecemeal fashion. The authors recommend a data lifecycle approach and provide a road map that is better suited to the dimensions of Big Data and fits the different stages in the analytical workflow.

Citing Articles

Identifying High-Priority Ethical Challenges for Precision Emergency Medicine: Nominal Group Study.

Rose C, Shearer E, Woller I, Foster A, Ashenburg N, Kim I. JMIR Form Res. 2025; 9:e68371.

PMID: 39916376 PMC: 11825900. DOI: 10.2196/68371.


Patterns of availability and accuracy of risk factor data for cardiovascular diseases among people initiated on antiretroviral therapy at selected health facilities in Khomas region, Namibia: a retrospective, cross-sectional, quantitative study.

Mahalie R, Angula P, Mitonga K, Oladimeji O. Pan Afr Med J. 2024; 47:33.

PMID: 38586067 PMC: 10998256. DOI: 10.11604/pamj.2024.47.33.25340.


Digital Health Data Quality Issues: Systematic Review.

Syed R, Eden R, Makasi T, Chukwudi I, Mamudu A, Kamalpour M. J Med Internet Res. 2023; 25:e42615.

PMID: 37000497 PMC: 10131725. DOI: 10.2196/42615.


Big Data Analytics to Reduce Preventable Hospitalizations-Using Real-World Data to Predict Ambulatory Care-Sensitive Conditions.

Schulte T, Wurz T, Groene O, Bohnet-Joschko S. Int J Environ Res Public Health. 2023; 20(6).

PMID: 36981600 PMC: 10049041. DOI: 10.3390/ijerph20064693.


Nurse-led Telehealth Intervention for Rehabilitation (Telerehabilitation) Among Community-Dwelling Patients With Chronic Diseases: Systematic Review and Meta-analysis.

Lee A, Wong A, Hung T, Yan J, Yang S. J Med Internet Res. 2022; 24(11):e40364.

PMID: 36322107 PMC: 9669889. DOI: 10.2196/40364.