
Partial Least Squares Regression, Support Vector Machine Regression, and Transcriptome-based Distances for Prediction of Maize Hybrid Performance with Gene Expression Data

Overview
Publisher Springer
Specialty Genetics
Date 2011 Nov 22
PMID 22101908
Citations 20
Abstract

The performance of hybrids can be predicted with gene expression data from their parental inbred lines. Implementing such prediction approaches in breeding programs promises to increase the efficiency of hybrid breeding. The objectives of our study were to compare the accuracy of prediction models employing multiple linear regression (MLR), partial least squares regression (PLS), support vector machine regression (SVM), and transcriptome-based distances (D(B)). For a factorial of 7 flint and 14 dent maize lines, the grain yield of the hybrids was assessed and the gene expression of the parental lines was profiled with a 56k microarray. The accuracy of the prediction models was measured by the correlation between predicted and observed yield employing two cross-validation schemes. The first modeled the prediction of hybrids when testcross data are available for both parental lines (type 2 hybrids), and the second modeled the prediction of hybrids when no testcross data for the parental lines were available (type 0 hybrids). MLR, SVM, and PLS resulted in a high correlation between predicted and observed yield for type 2 hybrids, whereas for type 0 hybrids D(B) had greater prediction accuracy. The regression methods were robust to the choice of the set of profiled genes and required only a few hundred genes. In contrast, for an accurate hybrid prediction with D(B), 1,000-1,500 genes were required, and the prediction accuracy depended strongly on the set of profiled genes. We conclude that for prediction within one set of genetic material MLR is a promising approach, and for transferring prediction models from one set of genetic material to a related one, the transcriptome-based distance D(B) is most promising.
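The two cross-validation schemes described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the function names (`type0_split`, `type2_split`, `prediction_accuracy`) are hypothetical, a Euclidean distance between parental expression profiles stands in for D(B) (whose definition is not given in the abstract), and a simple linear regression of yield on distance stands in for the prediction model. Only the 7 x 14 factorial layout and the use of the correlation between predicted and observed yield as the accuracy measure are taken from the abstract.

```python
import math
import random


def euclidean_distance(x, y):
    """Euclidean distance between two expression profiles (stand-in for D(B))."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def type0_split(hybrids, test_flint, test_dent):
    """Type 0: neither parent of a test hybrid occurs in any training hybrid."""
    train = [i for i, (f, d) in enumerate(hybrids)
             if f not in test_flint and d not in test_dent]
    test = [i for i, (f, d) in enumerate(hybrids)
            if f in test_flint and d in test_dent]
    return train, test


def type2_split(hybrids, held_out):
    """Type 2: both parents of a test hybrid occur in training hybrids."""
    held = set(held_out)
    train = [i for i in range(len(hybrids)) if i not in held]
    flints = {hybrids[i][0] for i in train}
    dents = {hybrids[i][1] for i in train}
    test = [i for i in held_out
            if hybrids[i][0] in flints and hybrids[i][1] in dents]
    return train, test


def prediction_accuracy(dist, grain_yield, train, test):
    """Fit on training hybrids; correlate predicted and observed test yield."""
    slope, intercept = fit_line([dist[i] for i in train],
                                [grain_yield[i] for i in train])
    predicted = [slope * dist[i] + intercept for i in test]
    observed = [grain_yield[i] for i in test]
    return pearson(predicted, observed)


# Synthetic stand-in data: 7 flint x 14 dent lines, 200 genes (all assumed).
rng = random.Random(0)
n_flint, n_dent, n_genes = 7, 14, 200
flint_expr = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_flint)]
dent_expr = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_dent)]
hybrids = [(f, d) for f in range(n_flint) for d in range(n_dent)]
dist = [euclidean_distance(flint_expr[f], dent_expr[d]) for f, d in hybrids]
# Simulated yield: linear in the parental distance plus noise (an assumption).
grain_yield = [2.0 * x + rng.gauss(0, 0.5) for x in dist]

train0, test0 = type0_split(hybrids, test_flint={0, 1}, test_dent={0, 1, 2, 3})
train2, test2 = type2_split(hybrids, held_out=list(range(0, len(hybrids), 5)))
r0 = prediction_accuracy(dist, grain_yield, train0, test0)
r2 = prediction_accuracy(dist, grain_yield, train2, test2)
print(f"type 0 accuracy: {r0:.2f}, type 2 accuracy: {r2:.2f}")
```

In this sketch the type 0 split discards every hybrid that shares exactly one parent with the test set, so the training set is smaller than under the type 2 split. That is the price of guaranteeing that no information on the test parents enters the model.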

Citing Articles

Using phenomic selection to predict hybrid values with NIR spectra measured on the parental lines: proof of concept on maize.

Rincent R, Solin J, Lorenzi A, Nunes L, Griveau Y, Pirus L Theor Appl Genet. 2025; 138(1):28.

PMID: 39797978 PMC: 11724800. DOI: 10.1007/s00122-024-04809-4.


High-dimensional multi-omics measured in controlled conditions are useful for maize platform and field trait predictions.

Ali B, Huguenin-Bizot B, Laurent M, Chaumont F, Maistriaux L, Nicolas S Theor Appl Genet. 2024; 137(7):175.

PMID: 38958724 DOI: 10.1007/s00122-024-04679-w.


Correlation between Parental Transcriptome and Field Data for the Characterization of Heterosis in Chinese Cabbage.

Li R, Tian M, He Q, Zhang L Genes (Basel). 2023; 14(4).

PMID: 37107533 PMC: 10137735. DOI: 10.3390/genes14040776.


Genomic prediction of rice mesocotyl length indicative of directing seeding suitability using a half-sib hybrid population.

Chen L, Liu J, He S, Cao L, Ye G PLoS One. 2023; 18(4):e0283989.

PMID: 37018326 PMC: 10075464. DOI: 10.1371/journal.pone.0283989.


De novo genome assembly and analyses of 12 founder inbred lines provide insights into maize heterosis.

Wang B, Hou M, Shi J, Ku L, Song W, Ning Q Nat Genet. 2023; 55(2):312-323.

PMID: 36646891 DOI: 10.1038/s41588-022-01283-w.

