
Big Data Application in Biomedical Research and Health Care: A Literature Review

Overview
Publisher Sage Publications
Date 2016 Feb 5
PMID 26843812
Citations 98
Abstract

Big data technologies are increasingly used for biomedical and health-care informatics research. Large amounts of biological and clinical data have been generated and collected at an unprecedented speed and scale. For example, the new generation of sequencing technologies enables the processing of billions of DNA sequence data per day, and the application of electronic health records (EHRs) is documenting large amounts of patient data. The cost of acquiring and analyzing biomedical data is expected to decrease dramatically with the help of technology upgrades, such as the emergence of new sequencing machines, the development of novel hardware and software for parallel computing, and the extensive expansion of EHRs. Big data applications present new opportunities to discover new knowledge and create novel methods to improve the quality of health care. The application of big data in health care is a fast-growing field, with many new discoveries and methodologies published in the last five years. In this paper, we review and discuss big data application in four major biomedical subdisciplines: (1) bioinformatics, (2) clinical informatics, (3) imaging informatics, and (4) public health informatics. Specifically, in bioinformatics, high-throughput experiments facilitate the research of new genome-wide association studies of diseases, and with clinical informatics, the clinical field benefits from the vast amount of collected patient data for making intelligent decisions. Imaging informatics is now more rapidly integrated with cloud platforms to share medical image data and workflows, and public health informatics leverages big data techniques for predicting and monitoring infectious disease outbreaks, such as Ebola. In this paper, we review the recent progress and breakthroughs of big data applications in these health-care domains and summarize the challenges, gaps, and opportunities to improve and advance big data applications in health care.

Citing Articles

The impacts on population health by China's regional health data centers and the potential mechanism of influence.

Cai J, Li Y, Coyte P. Digit Health. 2025; 11:20552076251314102.

PMID: 39830144 PMC: 11742170. DOI: 10.1177/20552076251314102.


Navigating ethics in HIV data and biomaterial management within Black, African, and Caribbean communities in Canada.

Souleymanov R, Akinyele-Akanbi B, Njeze C, Ukoli P, Migliardi P, Larcombe L. BMC Med Ethics. 2025; 26(1):5.

PMID: 39815313 PMC: 11737225. DOI: 10.1186/s12910-025-01161-0.


Annotated corpus for traditional formula-disease relationships in biomedical articles.

Yea S, Jang H, Kim S, Lee S, Kim J. Sci Data. 2025; 12(1):26.

PMID: 39774689 PMC: 11707285. DOI: 10.1038/s41597-025-04377-2.


: a Python package for simplifying cBioPortal data access in cancer research.

Valerio M, Inno A, Gori S. JAMIA Open. 2024; 8(1):ooae146.

PMID: 39735786 PMC: 11671144. DOI: 10.1093/jamiaopen/ooae146.


Demographic factors, knowledge, attitude and perception and their association with nursing students' intention to use artificial intelligence (AI): a multicentre survey across 10 Arab countries.

Al Omari O, Alshammari M, Al Jabri W, Al Yahyaei A, Aljohani K, Sanad H. BMC Med Educ. 2024; 24(1):1456.

PMID: 39696341 PMC: 11653676. DOI: 10.1186/s12909-024-06452-5.

