MetaMap: an Atlas of Metatranscriptomic Reads in Human Disease-related RNA-seq Data

Overview
Journal Gigascience
Specialties Biology, Genetics
Date 2018 Jun 15
PMID 29901703
Citations 15
Abstract

Background: With the advent of the age of big data in bioinformatics, large volumes of data and high-performance computing power enable researchers to re-analyze publicly available datasets at an unprecedented scale. A growing number of studies implicate the microbiome in both normal human physiology and a wide range of diseases. RNA sequencing technology (RNA-seq) is commonly used to infer global eukaryotic gene expression patterns under defined conditions, including human disease-related contexts; however, its generic nature also enables the detection of microbial and viral transcripts.

Findings: We developed a bioinformatic pipeline to screen existing human RNA-seq datasets for the presence of microbial and viral reads by re-inspecting the non-human-mapping read fraction. We validated this approach by recapitulating outcomes from six independent, controlled infection experiments of cell line models and compared the results with those of an alternative metatranscriptomic mapping strategy. We then applied the pipeline to close to 150 terabytes of publicly available raw RNA-seq data from more than 17,000 samples from more than 400 studies relevant to human disease using state-of-the-art high-performance computing systems. The resulting data from this large-scale re-analysis are made available in the presented MetaMap resource.
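The core idea of re-inspecting the non-human-mapping read fraction can be illustrated with a minimal sketch. The abstract does not name the aligner or the classifier used, so everything below is an assumption for illustration: the host alignments are assumed to be in SAM text format (where FLAG bit 0x4 marks an unmapped read), and `toy_classify` with its `TOY_LOOKUP` table is a hypothetical stand-in for a real metatranscriptomic classifier, not the MetaMap method.

```python
from collections import Counter

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_reads(sam_lines):
    """Yield (read_name, sequence) for reads that did not align to the host."""
    for line in sam_lines:
        # Skip blank lines and SAM header lines (they start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if flag & UNMAPPED_FLAG:
            yield name, seq


def count_taxa(reads, classify):
    """Tally reads per taxon; reads the classifier cannot assign are ignored."""
    counts = Counter()
    for _name, seq in reads:
        taxon = classify(seq)
        if taxon is not None:
            counts[taxon] += 1
    return counts


# Toy input (invented for this example): one host-mapped read, three unmapped.
TOY_SAM = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tGGGGCCCC\tIIIIIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTAAAA\tIIIIIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tNNNNNNNN\tIIIIIIII",
]

# Hypothetical lookup standing in for a real taxonomic classifier.
TOY_LOOKUP = {"GGGGCCCC": "toy_bacterium", "TTTTAAAA": "toy_virus"}


def toy_classify(seq):
    """Return a taxon label for a sequence, or None if it is unclassified."""
    return TOY_LOOKUP.get(seq)


counts = count_taxa(unmapped_reads(TOY_SAM), toy_classify)
print(dict(counts))
```

In a real setting the per-sample taxon counts produced this way would be what gets collected across studies; the filtering and classification steps would be handled by dedicated alignment and classification tools rather than the toy functions shown here.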

Conclusions: Our results demonstrate that common human RNA-seq data, including those archived in public repositories, might contain valuable information to correlate microbial and viral detection patterns with diverse diseases. The presented MetaMap database thus provides a rich resource for generating hypotheses about the role of the microbiome in human disease. Additionally, code to process new datasets and perform statistical analyses is made available.

Citing Articles

Three modes of viral adaption by the heart.

Griffiths C, Shah M, Shao W, Borgman C, Janes K Sci Adv. 2024; 10(46):eadp6303.

PMID: 39536108 PMC: 11559625. DOI: 10.1126/sciadv.adp6303.


Three Modes of Viral Adaption by the Heart.

Griffiths C, Shah M, Shao W, Borgman C, Janes K bioRxiv. 2024.

PMID: 38585853 PMC: 10996681. DOI: 10.1101/2024.03.28.587274.


Pathogen detection in RNA-seq data with Pathonoia.

Liebhoff A, Menden K, Laschtowitz A, Franke A, Schramm C, Bonn S BMC Bioinformatics. 2023; 24(1):53.

PMID: 36803415 PMC: 9938591. DOI: 10.1186/s12859-023-05144-z.


Metadata retrieval from sequence databases with ffq.

Galvez-Merchan A, Min K, Pachter L, Booeshaghi A Bioinformatics. 2023; 39(1).

PMID: 36610997 PMC: 9883619. DOI: 10.1093/bioinformatics/btac667.


Hypothesis of a potential BrainBiota and its relation to CNS autoimmune inflammation.

Elkjaer M, Simon L, Frisch T, Bente L, Kacprowski T, Thomassen M Front Immunol. 2022; 13:1043579.

PMID: 36532064 PMC: 9756883. DOI: 10.3389/fimmu.2022.1043579.

