
Deep Polygenic Neural Network for Predicting and Identifying Yield-associated Genes in Indonesian Rice Accessions

Overview
Journal: Sci Rep
Specialty: Science
Date: 2022 Aug 15
PMID: 35970979
Abstract

As the fourth most populous country in the world, Indonesia must increase its annual rice production to achieve national food security by 2050. One possible solution comes from the nanoscopic level: genetic variants called Single Nucleotide Polymorphisms (SNPs), which can indicate significant yield-associated genes. The prior benchmark for this study was a statistical genetics model that used neither SNP position information nor an attention mechanism. To address these limitations, we developed a novel deep polygenic neural network, named the NucleoNet model. The NucleoNets combine several prominent components: positional SNP encoding, a context vector, wide models, Elastic Net, and Shannon's entropy loss. This polygenic modeling achieved a Mean Squared Error (MSE) as low as 2.779 with a Symmetric Mean Absolute Percentage Error (SMAPE) of 47.156%, while revealing 15 new important SNPs. Furthermore, the NucleoNets reduced the MSE by up to 32.28% compared with the Ordinary Least Squares (OLS) model. Through the ablation study, we learned that combining Xavier-distributed weight initialization with Normal-distributed bias initialization yielded a more diverse set of important SNPs across the 12 chromosomes. Our findings confirm that the NucleoNet model outperformed the OLS model and identified SNPs important to Indonesian rice yields.
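For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of how MSE, SMAPE, and a relative MSE reduction are commonly computed. It is not the authors' code. The function names are hypothetical, and the SMAPE variant shown (absolute error divided by the mean of the absolute true and predicted values, reported in percent) is an assumption, since the abstract does not state which definition was used.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Squared Error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def smape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Symmetric MAPE in percent (assumed variant, range 0 to 200)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2
        if denom == 0:
            # Both values are zero, so this pair contributes no error.
            continue
        total += abs(t - p) / denom
    return 100 * total / len(y_true)


def relative_reduction(baseline: float, model: float) -> float:
    """Percentage by which `model` lowers an error score relative to `baseline`."""
    return 100 * (baseline - model) / baseline


# Toy yield values for illustration only; these are not data from the study.
observed = [2.0, 4.0, 6.0]
predicted = [2.0, 5.0, 4.0]
print(mse(observed, predicted))
print(smape(observed, predicted))
print(relative_reduction(baseline=4.0, model=3.0))
```

Under this reading, a statement such as "reduced the MSE by up to 32.28% compared with OLS" corresponds to `relative_reduction` applied to the OLS MSE and the NucleoNet MSE for the same evaluation setting.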

Citing Articles

Transformer Architecture and Attention Mechanisms in Genome Data Analysis: A Comprehensive Review.

Choi S, Lee M. Biology (Basel). 2023; 12(7).

PMID: 37508462 PMC: 10376273. DOI: 10.3390/biology12071033.
