
A Primer on Machine Learning Techniques for Genomic Applications

Overview
Specialty: Biotechnology
Date: 2021 Aug 25
PMID: 34429852
Citations: 8
Abstract

High-throughput sequencing technologies have enabled the study of complex biological phenomena at single-nucleotide resolution, opening the big data era. The analysis of large volumes of heterogeneous "omic" data, however, requires novel and efficient computational algorithms based on the paradigm of Artificial Intelligence. In this review, we introduce and describe the most common machine learning methodologies and, more recently, deep learning approaches applied to a variety of genomics tasks, emphasizing their capabilities, strengths, and limitations in simple and intuitive language. We highlight the power of machine learning in handling big data through a real-life example, and underline how the described methods can be relevant wherever large amounts of multimodal genomic data are available.

Citing Articles

When less is more: sketching with minimizers in genomics.

Ndiaye M, Prieto-Banos S, Fitzgerald L, Yazdizadeh Kharrazi A, Oreshkov S, Dessimoz C. Genome Biol. 2024; 25(1):270.

PMID: 39402664. PMC: 11472564. DOI: 10.1186/s13059-024-03414-4.


TIRESIA and TISBE: Explainable Artificial Intelligence Based Web Platforms for the Transparent Assessment of the Developmental Toxicity of Chemicals and Drugs.

Togo M, Mastrolorito F, Gambacorta N, Trisciuzzi D, Tondo A, Cutropia F. Methods Mol Biol. 2024; 2834:373-391.

PMID: 39312175. DOI: 10.1007/978-1-0716-4003-6_18.


Missing genotype imputation in non-model species using self-organizing maps.

Mora-Marquez F, Nuno J, Soto A, Lopez de Heredia U. Mol Ecol Resour. 2024; 25(3):e13992.

PMID: 38970328. PMC: 11887599. DOI: 10.1111/1755-0998.13992.


Explainable artificial intelligence and microbiome data for food geographical origin: the Mozzarella di Bufala Campana PDO Case of Study.

Magarelli M, Novielli P, De Filippis F, Magliulo R, Di Bitonto P, Diacono D. Front Microbiol. 2024; 15:1393243.

PMID: 38887708. PMC: 11180736. DOI: 10.3389/fmicb.2024.1393243.


Deep learning in bioinformatics.

Yousef M, Allmer J. Turk J Biol. 2024; 47(6):366-382.

PMID: 38681776. PMC: 11045206. DOI: 10.55730/1300-0152.2671.

