PMID: 23139896

Next Generation Sequence Analysis and Computational Genomics Using Graphical Pipeline Workflows

Abstract

Whole-genome and exome sequencing have proven to be essential and powerful methods for identifying the genes responsible for simple Mendelian disorders. These methods can also be applied to complex disorders, and they have become a mainstream approach in population genetics. These achievements were made possible by next generation sequencing (NGS) technologies, which require substantial bioinformatics resources to analyze the dense and complex sequence data. The analytical burden of genome sequencing data may be a bottleneck that currently slows the publication of NGS studies, especially in psychiatric genetics. We review existing methods for processing NGS data to put the design rationale for a computational resource in context. We then describe our method, the Graphical Pipeline for Computational Genomics (GPCG), which performs the computational steps required to analyze NGS data. The GPCG implements flexible workflows for basic sequence alignment, sequence data quality control, single nucleotide polymorphism analysis, copy number variant identification, annotation, and visualization of results. These workflows cover all the analytical steps required for NGS data, from processing the raw reads to variant calling and annotation. The current version of the pipeline is freely available at http://pipeline.loni.ucla.edu. Such applications of NGS analysis (e.g., identifying miRNA signatures in diseases) may gain clinical utility in the near future, once the bioinformatics approach is made feasible. Taken together, the annotation tools and strategies developed to retrieve information and test hypotheses about the functional role of variants in the human genome will help to pinpoint the genetic risk factors for psychiatric disorders.
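
The workflow stages named in the abstract can be pictured as a dependency graph in which each step runs only after the steps it depends on. The sketch below is illustrative only and is not the GPCG implementation: the step names are taken from the abstract, but the dependencies between them and the `execution_order` helper are assumptions made for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key is a workflow step; its value is the set of steps it depends on.
# Step names follow the abstract; the dependency edges are assumed here.
WORKFLOW = {
    "alignment": set(),
    "quality_control": {"alignment"},
    "snp_analysis": {"quality_control"},
    "cnv_identification": {"quality_control"},
    "annotation": {"snp_analysis", "cnv_identification"},
    "visualization": {"annotation"},
}


def execution_order(workflow):
    """Return one valid order in which the steps could be run."""
    return list(TopologicalSorter(workflow).static_order())


order = execution_order(WORKFLOW)
print(order)
```

Under these assumed dependencies, SNP analysis and copy number variant identification do not depend on each other, so a scheduler would be free to run them in either order or in parallel.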

Citing Articles

Local data commons: the sleeping beauty in the community of data commons.

Jeong J, Hands I, Kolesar J, Rao M, Davis B, Dobyns Y. BMC Bioinformatics. 2022; 23(Suppl 12):386.

PMID: 36151511 PMC: 9502580. DOI: 10.1186/s12859-022-04922-5.


Maternal regulation of biliary disease in neonates via gut microbial metabolites.

Jee J, Yang L, Shivakumar P, Xu P, Mourya R, Thanekar U. Nat Commun. 2022; 13(1):18.

PMID: 35013245 PMC: 8748778. DOI: 10.1038/s41467-021-27689-4.


Three-dimensional self-attention conditional GAN with spectral normalization for multimodal neuroimaging synthesis.

Lan H, Toga A, Sepehrband F. Magn Reson Med. 2021; 86(3):1718-1733.

PMID: 33961321 PMC: 9070032. DOI: 10.1002/mrm.28819.


Global and Regional Changes in Perivascular Space in Idiopathic and Familial Parkinson's Disease.

Donahue E, Murdos A, Jakowec M, Sheikh-Bahaei N, Toga A, Petzinger G. Mov Disord. 2021; 36(5):1126-1136.

PMID: 33470460 PMC: 8127386. DOI: 10.1002/mds.28473.


Volumetric distribution of perivascular space in relation to mild cognitive impairment.

Sepehrband F, Barisano G, Sheikh-Bahaei N, Choupan J, Cabeen R, Lynch K. Neurobiol Aging. 2021; 99:28-43.

PMID: 33422892 PMC: 7902350. DOI: 10.1016/j.neurobiolaging.2020.12.010.

