A Primer on Machine Learning Techniques for Genomic Applications
Overview
High-throughput sequencing technologies have enabled the study of complex biological phenomena at single-nucleotide resolution, ushering in the big data era. Analyzing large volumes of heterogeneous "omic" data, however, requires novel and efficient computational algorithms based on the paradigm of artificial intelligence. In this review, we introduce and describe the most common machine learning methodologies, and more recently deep learning, applied to a variety of genomics tasks, emphasizing their capabilities, strengths, and limitations in simple and intuitive language. We highlight the power of the machine learning approach in handling big data by means of a real-life example, and underline how the described methods could be relevant wherever large amounts of multimodal genomic data are available.