
Identifying Cancer Mutation Targets Across Thousands of Samples: MuteProc, a High Throughput Mutation Analysis Pipeline

Overview
Publisher Biomed Central
Specialty Biology
Date 2013 May 30
PMID 23714400
Citations 1
Abstract

Background: In the past decade, bioinformatics tools have matured enough to reliably perform sophisticated primary analysis of Next Generation Sequencing (NGS) data, such as mapping, assembly and variant calling. However, there is still a dire need for improvement in higher-level analysis, such as NGS data organization, analysis of mutation patterns and Genome Wide Association Studies (GWAS).

Results: We present a high throughput pipeline for identifying cancer mutation targets, capable of processing billions of variations across thousands of samples. This pipeline is coupled with our Human Variation Database to provide more complex downstream analysis of the variations hosted in the database. Most notably, these analyses include finding significantly mutated regions across multiple genomes and regions with mutational preferences within certain types of cancers. The results of the analyses are presented in HTML summary reports that incorporate gene annotations from various resources for the reported regions.
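
To make the idea of "significantly mutated regions across multiple genomes" concrete, the following is a minimal illustrative sketch and not MuteProc's actual method, which the abstract does not describe. It assumes variants arrive as hypothetical (sample_id, chrom, pos) tuples, bins them into fixed-size windows, and flags windows whose variant count is unexpectedly high under a uniform Poisson background with a Bonferroni correction. The function names, the window size and the background model are all assumptions made for illustration.

```python
from collections import defaultdict
from math import exp


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), via the lower tail."""
    if k <= 0:
        return 1.0
    term = exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i  # P(X = i) from P(X = i - 1)
        cdf += term
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 1.0 - cdf)


def mutated_windows(variants, genome_length, window=1000, alpha=0.05):
    """Flag windows with more variants than a uniform background predicts.

    variants: iterable of (sample_id, chrom, pos) tuples (hypothetical layout).
    Returns (chrom, start, end, count, p_value) tuples sorted by p-value.
    Assumption: variants are spread uniformly over the genome under the null.
    """
    counts = defaultdict(int)
    total = 0
    for _sample, chrom, pos in variants:
        counts[(chrom, pos // window)] += 1
        total += 1

    n_windows = max(1, genome_length // window)
    lam = total / n_windows          # expected variants per window
    threshold = alpha / n_windows    # Bonferroni-corrected cutoff

    hits = []
    for (chrom, idx), n in counts.items():
        p = poisson_sf(n, lam)
        if p < threshold:
            hits.append((chrom, idx * window, (idx + 1) * window, n, p))
    return sorted(hits, key=lambda h: h[4])


if __name__ == "__main__":
    # Toy data: 20 samples share a hotspot; 10 variants are scattered.
    hotspot = [(f"s{i}", "chr1", 5000 + i) for i in range(20)]
    scattered = [(f"s{i}", "chr1", 100000 + i * 10000) for i in range(10)]
    for hit in mutated_windows(hotspot + scattered, genome_length=1_000_000):
        print(hit)
```

A production pipeline would need a more realistic background model (for example one that accounts for sequence context, coverage and per-sample mutation rates) and would typically count distinct samples per region rather than raw variants. The sketch only shows the general shape of a recurrence test.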

Conclusion: MuteProc is available for download through the Vancouver Short Read Analysis Package on Sourceforge: http://vancouvershortr.sourceforge.net. Instructions for use and a tutorial are provided on the accompanying wiki pages at https://sourceforge.net/apps/mediawiki/vancouvershortr/index.php?title=Pipeline_introduction.

Citing Articles

Applying Expression Profile Similarity for Discovery of Patient-Specific Functional Mutations.

Meng G. High Throughput. 2018; 7(1).

PMID: 29485617 PMC: 5876532. DOI: 10.3390/ht7010006.

References
1.
Huang F, Hodis E, Xu M, Kryukov G, Chin L, Garraway L . Highly recurrent TERT promoter mutations in human melanoma. Science. 2013; 339(6122):957-9. PMC: 4423787. DOI: 10.1126/science.1229259. View

2.
Sherry S, Ward M, Kholodov M, Baker J, Phan L, Smigielski E . dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2000; 29(1):308-11. PMC: 29783. DOI: 10.1093/nar/29.1.308. View

3.
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A . The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010; 20(9):1297-303. PMC: 2928508. DOI: 10.1101/gr.107524.110. View

4.
Pasqualucci L, Neumeister P, Goossens T, Nanjangud G, Chaganti R, Kuppers R . Hypermutation of multiple proto-oncogenes in B-cell diffuse large-cell lymphomas. Nature. 2001; 412(6844):341-6. DOI: 10.1038/35085588. View

5.
Haider S, Ballester B, Smedley D, Zhang J, Rice P, Kasprzyk A . BioMart Central Portal--unified access to biological data. Nucleic Acids Res. 2009; 37(Web Server issue):W23-7. PMC: 2703988. DOI: 10.1093/nar/gkp265. View